Opened 8 years ago

Last modified 8 years ago

#14239 confirmed Bug

Spurious nodes added to document when toggling source view control when certain ACF rules applied

Reported by: Kevin Kamel Owned by:
Priority: Normal Milestone:
Component: General Version: 4.0
Keywords: Cc:

Description

Steps to reproduce

  1. Create custom CKEditor ACF configuration which permits non-html4 tags into the document, for example:

CKEDITOR.config.extraAllowedContent = 'summary';

  1. Instantiate browser instance using this custom configuration
  2. Repeatedly toggle the source view control
  3. Observe the <p>&nbsp;</p> nodes being rapidly appended to the document. Seemingly these nodes duplicate every single time you toggle.

Expected result

I do not expect to see random <p>&nbsp;</p> added to the document.

Actual result

<p>&nbsp;</p> nodes are rapidly added to the document, corrupting it.

Other details (browser, OS, CKEditor version, installed plugins)

A demo of this bug is present at the following URL: https://jsfiddle.net/kamelkev/hqLbhzxz/1/

I have observed this issue with current Firefox (42), current Safari (9.0.1) and current Chrome (47.0.2526.80).

I have observed this issue with the most recent release (4.5.5), but replicated it with older versions as well.

Note that the issue does not appear to manifest itself if you use the ENTER_BR entermode.

Note the issue does not appear if you disable auto-paragraphing.

Note the issue does not appear if you apply the fix from the following stackoverflow article: http://stackoverflow.com/questions/24283528/line-break-and-paragraph-has-doubled-itself-every-time-i-save-the-document-ckedi

Attachments (1)

replacebycode4.html (2.7 KB) - added by Jakub Ś 8 years ago.

Download all attachments as: .zip

Change History (17)

comment:1 Changed 8 years ago by Jakub Ś

Status: newconfirmed
Version: 4.5.54.0

I have been able to reproduce this problem from CKEditor 4.0 in all browsers.

Just in case I’m attaching sample file.

Changed 8 years ago by Jakub Ś

Attachment: replacebycode4.html added

comment:2 Changed 8 years ago by Marek Lewandowski

Resolution: invalid
Status: confirmedclosed

Given markup is invalid, as summary needs to be nested within details element, like so:

<p>The issue is entirely absent when switching enter mode over to BR.</p>

<details>
<summary>Summary HTML5 tag, below this tag is where the horrible duplication occurs</summary>
</details>

<p>Fake tags (in this case &quot;broken&quot;) also causes the issue</p>

In that case, given problem does not occur.

comment:3 Changed 8 years ago by Kevin Kamel

I think there was a miscommunication here.

I demonstrated that entering html5 (invalid or not, depends on the circumstances) will cause CKEditor to insert spurious nodes. The editor is effectively unusable as a result. Since when does CKEditor take markup and make it worse?

My example is just that, an example. If you play with HTML5 and custom ACF rules you are going to quickly see that there are other cases that are even worse. This is a very severe bug that prevents all but perfect HTML5 from being used with the editor.

The entire world is moving over to HTML5. Nobody wants to use an editor that breaks if there is a markup mistake. The HTML4 support was very good in this regard.

Last edited 8 years ago by Kevin Kamel (previous) (diff)

comment:4 Changed 8 years ago by Marek Lewandowski

If you play with HTML5 and custom ACF rules you are going to quickly see that there are other cases that are even worse.

ACF has nothing to do with this problem, my assumption here is that it might have something to do with DTD.

I demonstrated that entering html5 (invalid or not, depends on the circumstances) will cause CKEditor to insert spurious nodes. The editor is effectively unusable as a result. Since when does CKEditor take markup and make it worse?

This is just a demo of invalid HTML5 markup. If same thing happens in any different case with a valid HTML (be it HTML4 or HTML5) markup then please report a seaprate ticket, and we'll definitely check this out.

comment:5 Changed 8 years ago by Kevin Kamel

Of course ACF is related to this issue, I'm adding tags to it via extraAllowedContent. Without doing so the tag would be aggressively stripped.

If anything this behavior is highly inconsistent with what happens in HTML4.

For example, in HTML4 <table>, <tr> and <td> share the same relationship that <detail> and <summary> share. In both cases you must nest the tags properly in order to use the tag.

However in the case of HTML4, failure to nest the tags does not result in spurious <span>&nbsp;</span> being added to the end of the document. Instead the improperly nested tag is converted to a <p>, which is the expected behavior in HTML5, but simply isn't happening.

Whether or not there is an HTML5 syntax or error is largely irrelevant to the issue anyway. Under no circumstances should the editor "break" the document when you press the source view button.

The observed behavior between HTML5 and HTML4 is entirely inconsistent now.

I'm not the only one who has noticed this either. Stackoverflow has a series of questions related to this behavior.

comment:6 Changed 8 years ago by Jakub Ś

Resolution: invalid
Status: closedreopened

ACF doesn't apply here. It only allows certain tags to be used inside the editor but it doesn't create extra P.

Custom tags don't cause problems themselves.

The problem are summary tag and menuitem tags which should be placed inside "some other" tags. When they are placed outside of their parents, their handling not only creates extra P but also creates extra P for every tag placed below them.

While this is a problem in CKEditor, this is kind of an edge case. Unless someone types invalid code directly, there is no other way to enter it. HTML generator or Plugin authors should make sure that HTML provided into CKEditor is correct.

Issue is valid however it looks like a low priority because of the reasons mentioned above. @kamelkev have you perhaps found other examples or see this issue differently (I'm asking in terms of entering HTML by manually)?

comment:7 Changed 8 years ago by Jakub Ś

Status: reopenedconfirmed

comment:8 Changed 8 years ago by Kevin Kamel

@j.swiderski: Thanks for relooking at this issue.

There is another way for the bad html to be entered, and that's via copy/paste.

Our product service provides a UI to create and edit newsletters. My clients tend to have their HTML already generated, prepared using external tools, and then copy/paste it into the source view of the editor.

HTML5 is gradually making it's way into the email newsletter world, and as such I've seen customers trying to do more things with it.

Currently I have CKEditor configured to strip all of these tags, but seemingly I'm in the minority, as many of my competitors have already started accepting HTML5 using CKEditor. Some of them appear to have modified the software to prevent this particular issue from occurring.

I think the HTML5 is a little bit of a distraction here though. Proper entry of HTML5 is not really the problem. The problem is the spurious nodes that appear at the end of the document. These are very difficult to programmatically fix, as there is no way to know if a person put that data there, or if it was magically added by the tool. That part of the issue is what I have been trying (unsuccessfully) to focus on within this dialog.

Thanks again for looking.

comment:9 Changed 8 years ago by Jakub Ś

Thank you for the feedback. I understand the problem but there is one thing which in my opinion makes it a marginal issue (it’s all about making it non-marginal). If you could shed some light:

My clients tend to have their HTML already generated, prepared using external tools,

Who creates HTML generator that doesn't follow HTML rules? Are there many such clients (please answer honestly). If there is a company which has a faulty generator, why not talking to the company? I mean it is hard to expect from the single editor to fix every issue there is especially that generator author should have or could have easily avoid it.

Having said all the above, I will talk to the team. On one hand, it is the generator author who should have taken care of this. On second hand we take care or td, tr and legend tags by simply removing them if they are not located in the proper context. Perhaps we should also remove these ones.

comment:10 Changed 8 years ago by Kevin Kamel

Thanks again for following up on this issue.

It is extremely common to find various syntactical quirks related to HTML5 in just about every single tool out there.

Examples of editors where I know there have been HTML5 issues in the past (some got fixed, some remain unresolved): Dreamweaver, Coda, Wix, Google Pages. Both desktop and web based tools are affected. In some cases (like Coda) the page templates themselves start as invalid, so any code you generate using those templates remains invalid.

This shouldn't be surprising, HTML5 support is still being merged into many tools, so support is not really mature for it yet. That being said, these tools are fixing these bugs, and things are getting better.

My experience is based upon the handling of several thousand clients, many of which use whatever tool they are most comfortable with to generate their content, even if that tool isn't really "good".

With regard to following up with the authors of the above mentioned tools: I have done so whenever possible, although some of these tools are not open source, so filing and tracking bugs is not always possible.

My thinking here is that if we handle invalid HTML5 in the same way that the table tags are supported, we will have a decent solution. Hopefully others agree.

thanks

comment:11 Changed 8 years ago by arjenm

I just encountered this issue as well. I'm researching CKEditor for the CMS we use for articles (news bulletins, reviews and more).

We have some special on-the-fly elements that are only included via php-code upon display of the article. For instance a video-player or information about the lowest price for a certain product. Some of those can even change fairly often, like the that 'lowest price of product X' (even multiple times a day).

Besides, we have been doing this site for almost 18 years... So we know things change, the video-player has seen many versions (from flash-embed to iframe+video-tag). So I really don't want the specific html for a video player or the 'lowest price' hard coded in the article.

But I do want to provide the editors with a means to include those and that should be with a useful preview of those specific elements. When researching this, I encountered this tutorial about custom elements in html5.

Such custom elements seem to be really useful, especially with the shadow dom; the html in the article can be something like this:

<p>See the video below</p>
<our-video data-video-id="34567"></our-video>

I actually have that working via a widget+dialog. When that custom tag is inserted, it will actually display a really good preview of the video including the correct holder image and of the correct size. But when you switch to source-view, you just see the above html snippet: no img-tags, iframes or video-tags. I.e. we don't have to remove any preview-html from the source prior to storing it in the database. When such an article is displayed, the custom elements can be replaced with the actual current version of the video player html.

Even in (most) browsers that don't support custom elements, those custom elements can still be styled using CSS. So while the preview will be less accurate, it still gets a preview.

Unfortunately, that custom element triggers this bug. When you switch to source-display, it looks fine. When you switch back, there is much more white space. And viewing the source again reveals several added sets of these: '<p>&nbsp;</p>'

As far as I know, the above snippet is valid html5. Even if the tag is not registered or in browsers that don't support custom elements. And CKEditor is configured (via the widget's allowedContent) to allow above custom element.

Unfortunately, this bug makes CKEditor a much less viable option for our CMS (at least if we choose to use custom elements).

Last edited 8 years ago by arjenm (previous) (diff)

comment:12 Changed 8 years ago by Jakub Ś

@arjenm I'm not sure if your issue is the same as @kamelkev.

@arjenm please note that CKEditor is not any tags editor but HTML editor. Now there are two ways for having custom tags inside the editor.

  1. http://docs.ckeditor.com/#!/api/CKEDITOR.config-cfg-protectedSource - If you have limited number of custom tags, just define them in protectedSource regex and they will be left untouched. You will not see them in design mode and in source mode they will be left as entered.
  1. Modify dtd. I need to start by saying that CKEditor doesn't support custom block-level tags but it does support inline custom tags provided that you modify dtd and ACF. Please see:
    http://dev.ckeditor.com/ticket/14418
    http://dev.ckeditor.com/ticket/10340
    http://dev.ckeditor.com/ticket/11562
  1. There is another different solution. One thing that I can't understand is why developers tend to create various custom tags which, as you mentioned yourself, are not even supported in browsers when you have data- attributes. How simpler would it all be if instead of our-video a standard div element could be used (provided that our-video is a block) with a bunch o data-video-* attributes. Browsers will understand it, editors will have no problem with it and the only change which needs to be done is on the server side.
    I would strongly strongly recommend this one as data-* gives you the ability to do code almost anything custom.

comment:13 Changed 8 years ago by arjenm

As I said, I was experimenting. I haven't fully explored all options available via CKeditor. I do have to mention that I created a widget that inserted the specific custom video-tag. So rather than a 'ignore this, it should not be editable' it had a actual function in the wysiwyg-section as well.

I do think this is the same bug (or very similar in its result). To make it more clear, this is what happens. The first time the source displays this:

<twk-video data-video-id="1234566"></twk-video>

<p>A piece of text</p>

I then switch back to wysiwyg and it shows more space between the video and the text. And the source now displays:

<twk-video data-video-id="1234566"></twk-video>

<p>&nbsp;</p>

<p>A piece of text</p>

<p>&nbsp;</p>

http://static.tweakers.net/ext/f/rAQxU4v1hGKnSfMqc5Z4MTWO/full.png

When repeating that, the source becomes:

<twk-video data-video-id="1234566"></twk-video>

<p>&nbsp;</p>

<p>&nbsp;</p>

<p>&nbsp;</p>

<p>A piece of text</p>

<p>&nbsp;</p>

<p>&nbsp;</p>

<p>&nbsp;</p>

And so on.


Btw the custom tag functionality is actually more or less standardized nowadays.

The custom video-tag works pretty well (see screenshot). The only issue I couldn't fix is the that switching to source-mode and back adds empty paragraphs. That seems to be the same issue as described here. The html should not change when switching back and forth between source-mode and wysiwyg-mode. And more importantly, it should stabilize after one such iteration, not continu on changing.

To answer your third point. In the editor this is intended for, we prefer to have more semantic value in the content compared to its current incarnation. A <twk-video data-videoId=X> is much more explicit than <div data-type="video" data-videoId=X>. The earlier linked standards (custom element and shadow dom) make adding a visual representation to those tags very complete, yet keep things clean (see screenshot compared to the html). The visual representation required no change to CKeditor at all.

There are probably multiple solutions to achieve this, shadow dom can be added to div's as well or the innerHtml of div's can be changed and later removed, etc. I don't know yet which one is the best solution. But that additional <p>&nbsp;</p> that appears upon toggling to source is currently the only reason I wouldn't consider my solution as being complete in CKeditor.

Btw, if you want to, we can move this part of the discussion to the forum.

comment:14 Changed 8 years ago by Jakub Ś

Btw the custom tag functionality is actually more or less standardized nowadays.

The first note in the article you have linked says this is "this is an experimental technology" so I would be careful with talking about standards here. Even this, more up to date document, says Working Draft - https://www.w3.org/TR/custom-elements/.

I asked around the guys. Some think more conservative like me that HTML should not be extended and its list should be kept closed and some think like you. The point of views are devided.
Anyway the ticket is confirmed and it will get fixed at some point. In the meantime if you want to use the editor please use <div data-type="video" data-videoId=X> instaed of <twk-video data-videoId=X> as block-level custom tags are not supported #10340 (you can only get inline to work #14615)

comment:15 Changed 8 years ago by arjenm

I agree that part is experimental. Only one browser has its support enabled, although there is a working polyfill-library that adds (most of?) the functionality to other browsers.

But the html5 spec already allows for unknown elements:

User agents must treat elements and attributes that they do not understand as semantically neutral; leaving them in the DOM (for DOM processors), and styling them according to CSS (for CSS processors), but not inferring any meaning from them.

https://www.w3.org/TR/html5/infrastructure.html#extensibility-0

Anyway, whether browsers support the experimental custom elements or not isn't really relevant for this report.

CKEditor supports it: One can define a tag to be allowed, regardless of whether that tag is part of html or not. And that actually works; the tag is not removed upon submit, the tag is properly converted into a widget upon load, etc.

But when you switch to html-view, those additional elements appear. They were not in the dom-tree prior to that switch. Whether you think my specific use case is a bad idea, adding those elements should not happen should it?

That sounds like a bug to me.

Version 0, edited 8 years ago by arjenm (next)

comment:16 Changed 8 years ago by Jakub Ś

Yes, it is a confirmed bug as I have mentioned earlier.

Note: See TracTickets for help on using tickets.
© 2003 – 2022, CKSource sp. z o.o. sp.k. All rights reserved. | Terms of use | Privacy policy