Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#12740 closed Bug (fixed)

Pastefromword terminates with error message if word markup contains style prop "mso-list: none;"

Reported by: Henning Owned by: Tade0
Priority: Normal Milestone:
Component: Plugin : Paste from Word Version: 4.0.1
Keywords: Cc:

Description

In CKEditor 4.3.5 and IE9 or Chrome (at least), the paste from word plugin terminates with its error message when pasting a Word content via clipboard that contains style prop "mso-list: none;". See attached Word file. The content then is pasted to the editor, but mostly uncleaned. The style prop "mso-list: none;" seems to be generated by Word for the heading of the TOC (for example, in our case). Analyzing the file plugins/pastefromword/filter/default.js, I found that the code expects a property value containing at least one space, and will throw an exception otherwise:

val = val.split(' ');
var listId = Number( val[ 0 ].match( /\d+/ ) ),
   indent = Number( val[ 1 ].match( /\d+/ ) );

Hence, adding the following line after the split cures the issue:

if( !val || val.length < 2 ) return;

Can you please check whether this line does any other harm and, if not, add it to the code for future versions?

Moreover, it would be helpful if the paste from word filters would be more robust in the sense that an exception does not stop the filtering completely ...

Best, Henning

Attachments (2)

WSGCN-3550 Test document minimal.docx (17.7 KB) - added by Henning 9 years ago.
word file causing the error message
CKEditor - Internal Error on Paste as CTRL-V(1).docx (13.1 KB) - added by Jakub Ś 9 years ago.
Minimum file to reproduce the problem.

Download all attachments as: .zip

Change History (25)

Changed 9 years ago by Henning

word file causing the error message

comment:1 Changed 9 years ago by Jakub Ś

Status: newconfirmed
Version: 4.3.54.0.1

Problem can be reproduced from CKEditor 4.0.1 (works fine in 4.0) in every browser.

comment:2 Changed 9 years ago by Jakub Ś

#13339 was marked as duplicate.

comment:3 Changed 9 years ago by Jakub Ś

Other issues that might be dupcates or related to same source of the problem.

"mso-list: none;": #11215, #8780, #9685 (same part of code that causes problems)

"mso-list: Ignore;": #9685 <- there is another sample attached where there is no "mso-list: none; but there is "mso-list: Ignore;. Could be also causing problems.

"mso-list: skip;":#12762<- there is another sample attached where there is no "mso-list: none; but there is "mso-list: skip;. Could be also causing problems.

comment:4 Changed 9 years ago by Piotrek Koszuliński

Changed 9 years ago by Jakub Ś

Minimum file to reproduce the problem.

comment:5 Changed 9 years ago by Jakub Ś

After CKEditor 4.5, error message is no longer displayed however result is different to the original.

Please use CKEditor - Internal Error on Paste as CTRL-V(1).docx​ to check the problem.

comment:6 Changed 9 years ago by Piotrek Koszuliński

Milestone: CKEditor 4.5.2

OK, we'll check it then. But I'm afraid there may be nothing we can do with it, because I know cases where it was not possible to recognise a list.

comment:7 Changed 9 years ago by Jakub Ś

Problem can be reproduced in IE and Blink. Firefox looks ok IMHO.

HTML which Chrome sees (for CKEditor - Internal Error on Paste as CTRL-V(1).docx​):

<p class="SkyBullet" style="margin-top:0cm;margin-right:0cm;margin-bottom:8.0pt;
margin-left:0cm;text-indent:0cm;line-height:107%;mso-list:none"><em><span style="font-size:11.0pt;mso-bidi-font-size:10.0pt;line-height:107%;font-family:
&quot;Calibri&quot;,&quot;sans-serif&quot;;mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:
minor-latin;mso-bidi-font-family:Arial;font-style:normal">Test1<o:p></o:p></span></em></p>

<p class="MsoNormal"><em><span lang="IT" style="font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;
mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-font-family:
Arial;font-style:normal;mso-bidi-font-style:italic">Test2</span></em><i><span lang="IT"><o:p></o:p></span></i></p>

HTML which IE9 sees:

<P style="LINE-HEIGHT: 107%; TEXT-INDENT: 0cm; MARGIN: 0cm 0cm 8pt; mso-list: none" class=SkyBullet><EM><SPAN style="LINE-HEIGHT: 107%; FONT-STYLE: normal; FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-bidi-font-size: 10.0pt; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Arial">Test1<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /><o:p></o:p></SPAN></EM></P>
<P style="MARGIN: 0cm 0cm 8pt" class=MsoNormal><EM><SPAN style="FONT-STYLE: normal; FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Arial; mso-bidi-font-style: italic" lang=IT>Test2</SPAN></EM><I style="mso-bidi-font-style: normal"><SPAN lang=IT><o:p></o:p></SPAN></I></P>

HTML which IE1 sees:

<p style="margin: 0cm 0cm 8pt; line-height: 107%; text-indent: 0cm; mso-list: none;"><em><span style='line-height: 107%; font-family: "Calibri","sans-serif"; font-size: 11pt; font-style: normal; mso-bidi-font-size: 10.0pt; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Arial;'>Test1</span></em></p>

<p style="margin: 0cm 0cm 8pt;"><em><span lang="IT" style='font-family: "Calibri","sans-serif"; font-style: normal; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Arial; mso-bidi-font-style: italic;'>Test2</span></em></p>

comment:8 Changed 9 years ago by Tade0

Owner: set to Tade0
Status: confirmedassigned

comment:9 Changed 9 years ago by Tade0

Checked Firefox today and managed to reproduce the problem.

The trouble starts with this line: https://github.com/ckeditor/ckeditor-dev/blob/master/plugins/pastefromword/filter/default.js#L805

The condition is truthy and therefore a mso-list: Ignore style is added. After that the paragraph is treated as a bullet point.

How should we normally interpret values of mso-list such as "none", or "skip"?

comment:10 Changed 9 years ago by Tade0

Status: assignedpending

comment:11 Changed 9 years ago by Piotrek Koszuliński

How should we normally interpret values of mso-list such as "none", or "skip"?

That's a question to you. I will not even try to guess what this style may mean because it is pointless in case of MSWord. You must check when it appears and what does it mean by comparing to the document from which you copied content. In this case I can see that the document contained no list, so after pasting there should be no list. Perhaps, we should check value of mso-list in that condition.

comment:12 Changed 9 years ago by Tade0

I found a very well hidden manual for these attributes and most of the other stuff generated by MSWord: https://msdn.microsoft.com/en-us/library/Aa155477

It's a little dated, but contains descriptions and sometimes even rationale for some of the elements.

comment:13 Changed 9 years ago by Tade0

Status: pendingreview

I added a check that makes sure that a paragraph with the style mso-list: none is not turned into a list.

I turned the data that's produced by pasting the contents of the attachments into tests.

Changes pushed to branch:t/12740.

comment:14 Changed 9 years ago by Piotrek Koszuliński

Status: reviewreview_failed

Pushed branch:t/12740. I moved the created tests into the default file, as there's no reason to separate them.

However, they do not pass on Firefox and IE8 at least ;/

Last edited 9 years ago by Piotrek Koszuliński (previous) (diff)

comment:15 Changed 9 years ago by Tade0

Status: review_failedassigned

On Firefox this is a problem of two additional spans, that are not filtered out. One contains a lang attribute, the other some styles. Not sure what is the right behavior here.

On the other hand on IE this test breaks more severely - the result is completely empty. I'll have look deeper into this one.

comment:16 Changed 9 years ago by Tade0

The error in IE occurs because the test is faulty.

comment:17 Changed 9 years ago by Piotrek Koszuliński

What do you mean that it's faulty?

comment:18 Changed 9 years ago by Tade0

I'm using the getHtml() method instead of getValue(), which returns an empty string in IE.

Curiously enough, getValue() seems to return undefined only for the textareas that I created.

comment:19 Changed 9 years ago by Tade0

Status: assignedreview

I fixed the tests, but in order to do that I had to retype the textarea tags manually. Apparently there were some hidden whitespace characters(apart from tabs) that had to be removed.

Changes pushed to branch:t/12740.

Version 0, edited 9 years ago by Tade0 (next)

comment:20 Changed 9 years ago by Piotrek Koszuliński

Maybe typo in the <textarea> tag name? :D

comment:21 Changed 9 years ago by Tade0

I would wish. :D

Surprisingly, the browser recognized the textareas. The only difference was that the content in my textareas was interpreted as DOM within the tag(instead of being recognized as one giant text node).

Literally the same content, e.g. <textarea id="someList"><p>test</p></texarea> worked when I typed it and did not work if I simply modified the textarea created a while earlier. Magic.

comment:22 Changed 9 years ago by Piotrek Koszuliński

Milestone: CKEditor 4.5.2
Resolution: fixed
Status: reviewclosed

Unfortunately, when I started testing it manually, it turned out that this ticket is a mess. The reported bug (the one with mso-list) was fixed in 4.5.0 and this ticket should have been closed. Therefore I'm closing it.

However, there may be (I haven't checked because I don't have MSWord on Windows) two or even three more issues, all separate:

  1. That content of the original file (http://dev.ckeditor.com/attachment/ticket/12740/WSGCN-3550%20Test%20document%20minimal.docx) is not properly transformed into semantic HTML.
  2. The content of the original file still causes some error - this time a different one (tested on Firefox - got error in https://github.com/ckeditor/ckeditor-dev/blob/master/plugins/pastefromword/filter/default.js#L777)
  3. Some problems (errors?) with the file uploaded later (http://dev.ckeditor.com/attachment/ticket/12740/CKEditor%20-%20Internal%20Error%20on%20Paste%20as%20CTRL-V(1).docx).

Additionally, there are set of problems with different mso-list values (comment:3).

All this should be fixed in an organised way, supported by tests. And here goes a crucial detail - we must have HTML for each browser engine in the tests (IE, Firefox, Chrome). See http://tests.ckeditor.dev:1030/tests/tickets/9456/2 for instance.

Otherwise, we end up like now. When I tested both files manually I still got errors on Firefox. Errors which weren't caught by tests and which weren't mentioned and which had no direct connection to mso-list. Also, other, related cases weren't checked so we don't know if things got better or not.

What I'll do is I'll report an umbrella ticket for all these cases in 4.5.3. We will check all these files one by one and verify which tickets can be closed now and how to close the rest.

comment:23 Changed 9 years ago by Piotrek Koszuliński

Reported #13590.

Note: See TracTickets for help on using tickets.
© 2003 – 2022, CKSource sp. z o.o. sp.k. All rights reserved. | Terms of use | Privacy policy