Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#9301 closed Bug (fixed)

Complete DTD for HTML5

Reported by: Piotrek Koszuliński Owned by: Garry Yao
Priority: Normal Milestone: CKEditor 4.0
Component: Core : DTD Version: 4.0
Keywords: Cc:

Description (last modified by Piotrek Koszuliński)

Intro

Googling... googling... That's right - there's no DTD for HTML5 and there won't be any, because HTML5 cannot be defined by poor DTD. There are other schema languages like RELAX NG in which HTML5 may be defined, but AFAIK WHATWG won't create schema for HTML5.

Subject of this task

We need to update our pseudo-DTD with HTML5's new stuff. See e.g. https://github.com/ckeditor/ckeditor-dev/pull/6

The purpose of CKEDITOR.dtd is mainly to validate DOM tree structure + few more specific for editor things (like the list of editable elements). Unfortunately in current CKEDITOR.dtd's format we won't be able to define HTML5's precisely.

I see two main problems:

  1. Transparent elements (http://dev.w3.org/html5/markup/terminology.html#transparent) E.g. if <a> is a child of <div>, then <a> may contain block elements (in HTML5's terminology - flow elements (http://dev.w3.org/html5/markup/common-models.html#common.elem.flow)) and inline elements (phrasing elements). If <a> is a child of <p>, then it may contain phrasing elements only.
  2. Nesting rules dependent on attributes/sth else. E.g. if <audio> doesn't have src attr it may contain <source> elements. Otherwise it may not.

We are not able to handle 1. and I believe that we've got to live now with the current state of things. However, for 2., rules of our DTD should just be as tolerant as possible. That's the solution for now.

Attachments (1)

diffs.js (10.8 KB) - added by Piotrek Koszuliński 7 years ago.

Download all attachments as: .zip

Change History (19)

comment:1 Changed 7 years ago by Piotrek Koszuliński

Description: modified (diff)

comment:2 Changed 7 years ago by Piotrek Koszuliński

Description: modified (diff)

comment:3 Changed 7 years ago by Piotrek Koszuliński

I pushed t/9301 with HTML5's DTD. I created it manually basing on http://dev.w3.org/html5/markup/elements.html#elements. However, it may be outdated now, since I did it in February.

I included tool allowing to compare current sets with the HTML5's ones. It shows how much we miss now. Question is - should we update the new DTD format (which is quite readable) with legacy stuff or the current one with the new missing things.

comment:4 Changed 7 years ago by Garry Yao

If it's to be considered in the 4.0 milestone, we shall just have:

  • the current DTD a little bit revised
  • fix the parser for HTML5 exceptions.

comment:5 Changed 7 years ago by Garry Yao

Owner: set to Garry Yao
Status: newassigned

Created branch t/9301b, features the following changes:

  1. Go expanded - with the current XHTML-based DTD kept untouched, absorbed the html5 DTD created by Reinmar in t/9301, by selectively merging.
  2. Go dynamic - No exposed tag list on CKEDITOR.dtd, DTD check should instead query CKEDITOR.dtd.checkChild( parent, child), read as "no dtd", the method will yield variant result based on the context of the criteria element (html5 semantic), e.g. block-level <a>
  3. Go classification - Revised and introduced various "$-prefixed" dictionary, $textBlock, $phrasingBlock, $structural, etc. Making dictionaries inferred with set operation, from some basic sets, instead of hard-coded.

comment:6 Changed 7 years ago by Garry Yao

Component: GeneralCore : DTD
Status: assignedreview

Review request opened at t/9301b on both repos.

comment:7 Changed 7 years ago by Piotrek Koszuliński

Owner: changed from Garry Yao to Piotrek Koszuliński

Reviewed t/9301b.

  • It is not optimized - the DTD is huge because of unnecessary merging.
  • DTD hasn't been corrected (v3's one has bugs, so merging v3+v4 gives incorrect result).
  • Proposed change is what we need to do in the future, to have full HTML5 support. Thus, I propose to archive this branch. However, it's unsafe now to do such a huge change. What's more - I wasn't even able to rebase this branch, because it touches so much code.

I pushed t/9301c branches with solution which IMO is more adequate to the beta state. I took HTML5 DTD and easily extended it with most important things from v3's DTD. I also corrected few things basing on Garry's proposal and careful review. I'm attaching file with diffs between v1 (old DTD), v3 (Garry's DTD) and v2 (mine one).

Of course mine solution doesn't handle transparent elements (for which we'll need to use Garry's). So I made a decision that all transparent elements, except audio and video are inline ones. That's how it was in v3 and has to be for some time still.

I had to do some changes to tests too, since <caption> in HTML5 can contain block elements and case table>caption was used in few tests. So I found a new case - details>summary, however, to make it work HTML5 shim was needed for IE<9 - I added it to wysiwygarea plugin.

Last edited 7 years ago by Piotrek Koszuliński (previous) (diff)

comment:8 Changed 7 years ago by Piotrek Koszuliński

Status: reviewassigned

Oups. One moment please.

comment:9 Changed 7 years ago by Piotrek Koszuliński

Status: assignedreview

Ok. I pushed small correction for audio and video (finally fixing bug reported on Github :).

Changed 7 years ago by Piotrek Koszuliński

Attachment: diffs.js added

comment:10 Changed 7 years ago by Garry Yao

Status: reviewreview_failed

This new DTD is basically good, I've pushed some fixes as well as compensations to t/9301c for issues that I found, please check those changes and put it on review if you agree with them.

comment:11 Changed 7 years ago by Piotrek Koszuliński

comment:12 Changed 7 years ago by Garry Yao

Owner: changed from Piotrek Koszuliński to Garry Yao
Status: review_failedreview
  • The html5 validator is not accepting command inside of head, but ok as for the spec it says so, I've reverted that change.
  • I've pushed a fix targeting elementPath::isContextFor to fix the tests' REDs.

comment:13 Changed 7 years ago by Piotrek Koszuliński

Status: reviewreview_passed

comment:14 Changed 7 years ago by Piotrek Koszuliński

Resolution: fixed
Status: review_passedclosed

comment:15 Changed 7 years ago by Piotrek Koszuliński

PS. I left t/9301b branches with Garry's proposal of the full HTML5 support. They may be useful for the future reference.

comment:16 in reply to:  15 Changed 7 years ago by Garry Yao

Replying to Reinmar:

PS. I left t/9301b branches with Garry's proposal of the full HTML5 support. They may be useful for the future reference.

I've moved branches t/9250b to #9457.

comment:17 Changed 7 years ago by Garry Yao

I've reverted the html5 shim on wysiwyg plugin:

  1. We're not intended to provide the shim
  2. The shim is not complete enough.
  3. It's not working for inline

https://github.com/ckeditor/ckeditor-dev/commit/3838a01cd910d8428ddba7e93c619df1e55d3551

comment:18 Changed 7 years ago by Piotrek Koszuliński

I think you forgot to push that change, because 3838a01 doesn't exist. And that's good, because it'd nice to first discuss this revert.

  1. Yes we are - this was discussed before.
  2. I know, but IMO enough for our needs. Lack of printing support isn't really a huge problem.
  3. That's right - because on inline page it's developer's obligation to enable HTML5 elements, since editor is part of the page. In framed editor we create our environment and we're obligated to do this. Especially that it'll be very hard for other developer to include any of the shims because it has to be placed in the <head> element.
Note: See TracTickets for help on using tickets.
© 2003 – 2019 CKSource – Frederico Knabben. All rights reserved. | Terms of use | Privacy policy