#10018 closed Bug (wontfix)
IE8 not cleaning up invalid HTML
Reported by: | Jon Sykes | Owned by: | |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | Core : Parser | Version: | |
Keywords: | Cc: |
Description
We had HTML corruption. During testing we found that in IE8 some broken HTML isn't cleaned up, which allows the corrupt HTML structure to be saved.
If you open up the demo page and paste the code snippet below (for example from a web page that contains this code), then switch to source, you'll see that the empty tags don't get cleaned up.
They do get cleaned up in Chrome and Firefox, no problem.
<p><//><//></p>
Change History (9)
comment:1 Changed 12 years ago by
Keywords: | ie8 invalid html xhtml parse removed |
---|---|
Status: | new → pending |
Version: | 3.6.4 |
comment:2 Changed 12 years ago by
Sorry, I simplified it too much. I just took one of the corrupt sample content blocks we have found and tried to reduce it down to the minimum.
<ul dir="ltr"> <li align="justify"> </>abc <li align="justify"> </></></>123</></><//><//><//><//><//></li> <//></li> </ul> <p> </></></></></></> <p> </></></></></></> <p> </></></></></></> <p> </></></></></></> <p> </></></></></></> <p> </></></></></></> <p> </></></></></> <p> </></></></></></> <p> </></></></></> <p> </></></> <p> </></></></> <ul> </ul> <p> </p> <p> </p> <p> </p> <p> </p> <p> </p> <p> </p> <p> </p> <p> </p> <p> </p> <p> </p> <//><//><//><//></p> <//><//><//></p> <//><//><//><//><//></p> <//><//><//><//><//><//></p> <//><//><//><//><//></p> <//><//><//><//><//><//></p> <//><//><//><//><//><//></p> <//><//><//><//><//><//></p> <//><//><//><//><//><//></p> <//><//><//><//><//><//></p> <//><//><//><//><//><//></p>
If you drop this into ckeditor 3.6.4 in IE8 and toggle back and forth from source to edit view, you'll see each time the html gets more and more crazy without doing any other actions. I'll try and boil this down even further into an even shorter example.
comment:3 Changed 12 years ago by
<ul dir="ltr"> <li align="justify"> abc</li> <li align="justify"> </></></>123</></><//><//><//><//><//></li> </ul>
This doesn't generate additional corrupt html, but it also retains the <> tags and doesn't strip or convert to comments.
comment:4 Changed 12 years ago by
The root cause of these additional weird tags might be related to http://dev.ckeditor.com/ticket/6789 but even if empty tags are generated I was under the impression ckeditor validated markup.
comment:5 Changed 12 years ago by
Resolution: | → wontfix |
---|---|
Status: | pending → closed |
- Your first example is not valid. You have p inside p, which is not valid HTML (<> are also invalid). You can always say that the editor should not react the way it does, but there is always a way to break the editor with really complicated and invalid HTML.
- Second example: as you already know, this is also invalid HTML (entities should be used here). In this case the editor does no cleaning. It pastes what the browser gives it. In WebKit, FF and Opera you get comments, in IE9 these are removed, and in IE7 and IE8 they are left untouched.
- This ignoring may happen (my assumption only) because sometimes users want to paste their own custom inline tags (not containing children). Such tags are then kept, which allows better flexibility.
- In CKEditor 4.1 we will introduce the allowedContent filter for filtering tags, styles and attributes. My colleague has checked your first example on a test branch: once <> is filtered, it turns into two lists and a bunch of paragraphs below them.
To summarize: in CKE 3.6.x this is a won't fix. In CKEditor 4.1 this will be handled by a new feature.
To solve this problem in CKE 3.x you can attach listeners on Ctrl+V (keydown, checking the keystroke) and on the paste event, and filter it there. You could also try attaching filters to the insertHtml event - http://docs.cksource.com/ckeditor_api/symbols/CKEDITOR.editor.html#event:insertHtml.
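The paste-time filtering suggested here could look roughly like the sketch below. It assumes that the 3.x paste event carries the pasted markup in `evt.data.html`. `stripNamelessTags` is a hypothetical helper, not a CKEditor API, and its regular expression only targets name-less tags such as <>, </> and <//>.

```javascript
// Hypothetical helper (not part of CKEditor): removes nameless "tags"
// such as <>, </> and <//> from an HTML string.
function stripNamelessTags(html) {
  // "<", then only slashes or whitespace, then ">" (no tag name at all).
  return html.replace(/<[\/\s]*>/g, '');
}

// Assumption: the 3.x 'paste' event exposes the pasted markup as evt.data.html.
// The guard keeps the snippet loadable outside a page that includes CKEditor.
if (typeof CKEDITOR !== 'undefined') {
  CKEDITOR.on('instanceReady', function (evt) {
    evt.editor.on('paste', function (pasteEvt) {
      if (pasteEvt.data && typeof pasteEvt.data.html === 'string') {
        pasteEvt.data.html = stripNamelessTags(pasteEvt.data.html);
      }
    });
  });
}
```

Normal tags such as <p> or </li> contain a name and are left alone by this pattern.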
If this is not a problem in the editor but in your application, then you can filter data when it is submitted to your app (the getData() method or event can be used here - http://docs.cksource.com/ckeditor_api/symbols/CKEDITOR.editor.html#event:getData).
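A minimal sketch of the getData approach, assuming the getData event passes the output as `evt.data.dataValue`. Instead of dropping the nameless tags it turns them into entities, in line with the note above that entities should be used; `escapeNamelessTags` is a hypothetical name chosen for illustration.

```javascript
// Hypothetical helper (not part of CKEditor): rewrites nameless "tags"
// such as <>, </> and <//> as entities so they become plain text.
function escapeNamelessTags(html) {
  return html.replace(/<[\/\s]*>/g, function (match) {
    return match.replace(/</g, '&lt;').replace(/>/g, '&gt;');
  });
}

// Assumption: the 'getData' event carries the outgoing markup in
// evt.data.dataValue, and editor.getData() returns the modified value.
if (typeof CKEDITOR !== 'undefined') {
  CKEDITOR.on('instanceReady', function (evt) {
    evt.editor.on('getData', function (dataEvt) {
      dataEvt.data.dataValue = escapeNamelessTags(dataEvt.data.dataValue);
    });
  });
}
```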
comment:6 Changed 12 years ago by
I understand it's not valid HTML; that's why it's such a concern. In our situation it's not our users entering malformed HTML (they only have the WYSIWYG view). Our suspicion is that the <> and </> are being caused by another CKEditor bug (like 6789). What I was hoping this ticket might achieve is that, if that malformed HTML is generated by CKEditor, it wouldn't then save it.
It does sound like we might have to write our own parser on save to ensure that HTML is cleaned up or rejected, I just thought one of the features of Ck was that it validated against a DTD and always generated valid HTML.
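A check on save along those lines could start from something like the sketch below. It is not a DTD validator: it only detects the nameless tags seen in this ticket, and both function names are hypothetical.

```javascript
// Hypothetical helper: lists every nameless "tag" (<>, </>, <//> ...)
// found in an HTML string, with its offset.
function findNamelessTags(html) {
  var pattern = /<[\/\s]*>/g;
  var found = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    found.push({ index: match.index, text: match[0] });
  }
  return found;
}

// Hypothetical save-time gate: returns the HTML unchanged when it is
// clean and throws when nameless tags are present.
function assertSaveable(html) {
  var problems = findNamelessTags(html);
  if (problems.length > 0) {
    throw new Error('Rejected: ' + problems.length +
      ' nameless tag(s), first at offset ' + problems[0].index);
  }
  return html;
}
```

Rejecting rather than silently cleaning makes it easier to spot which editing action produced the corrupt markup.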
comment:7 Changed 12 years ago by
In CKEditor 4.1 we will introduce the allowedContent filter for filtering tags, styles and attributes. My colleague has checked your first example on a test branch: once <> is filtered, it turns into two lists and a bunch of paragraphs below them.
> It does sound like we might have to write our own parser on save to ensure that HTML is cleaned up or rejected, I just thought one of the features of Ck was that it validated against a DTD and always generated valid HTML.
No need to write anything. Wait for 4.1.
comment:9 Changed 12 years ago by
@jonsykes your initial report was talking about CKEditor leaving invalid code, not creating invalid code. If you are convinced that CKEditor is creating such code, could you please provide a sample file that causes this error?
Please also note that #6789 creates entities (&lt;&gt;), which in WYSIWYG mode show up as <>, while you claim that CKEditor creates <> in HTML (Source mode). This doesn't look like the same issue.
I have tried pasting your code directly in source mode in CKEditor or displaying it first on HTML page, then copying and pasting in WYSIWYG area.
In all cases either such 'tags' were removed or changed to HTML comments.
Could you please provide exact steps to reproduce this problem and/or a working sample page that helps to reproduce it?
I have tested editor 3.6.4, 3.6.6 and 4.0.1.