#12740 closed Bug (fixed)
Pastefromword terminates with error message if word markup contains style prop "mso-list: none;"
Reported by: | Henning | Owned by: | Tade0 |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | Plugin : Paste from Word | Version: | 4.0.1 |
Keywords: | Cc: |
Description
In CKEditor 4.3.5 and IE9 or Chrome (at least), the paste from word plugin terminates with its error message when pasting a Word content via clipboard that contains style prop "mso-list: none;". See attached Word file. The content then is pasted to the editor, but mostly uncleaned. The style prop "mso-list: none;" seems to be generated by Word for the heading of the TOC (for example, in our case). Analyzing the file plugins/pastefromword/filter/default.js, I found that the code expects a property value containing at least one space, and will throw an exception otherwise:
val = val.split(' '); var listId = Number( val[ 0 ].match( /\d+/ ) ), indent = Number( val[ 1 ].match( /\d+/ ) );
Hence, adding the following line after the split cures the issue:
if( !val || val.length < 2 ) return;
Can you please check whether this line does any other harm and, if not, add it to the code for future versions?
Moreover, it would be helpful if the paste from word filters would be more robust in the sense that an exception does not stop the filtering completely ...
Best, Henning
Attachments (2)
Change History (25)
Changed 10 years ago by
Attachment: | WSGCN-3550 Test document minimal.docx added |
---|
comment:1 Changed 10 years ago by
Status: | new → confirmed |
---|---|
Version: | 4.3.5 → 4.0.1 |
Problem can be reproduced from CKEditor 4.0.1 (works fine in 4.0) in every browser.
comment:3 Changed 10 years ago by
Other issues that might be dupcates or related to same source of the problem.
"mso-list: none;": #11215, #8780, #9685 (same part of code that causes problems)
"mso-list: Ignore;": #9685 <- there is another sample attached where there is no "mso-list: none;
but there is "mso-list: Ignore;
. Could be also causing problems.
"mso-list: skip;":#12762<- there is another sample attached where there is no "mso-list: none;
but there is "mso-list: skip;
. Could be also causing problems.
comment:4 Changed 10 years ago by
It seems that we fixed this bug in 4.5.0 or 4.4.8 with: https://github.com/ckeditor/ckeditor-dev/commit/0d92dbad05b71c2d06228976a7bcd6998faddc7b
Changed 10 years ago by
Attachment: | CKEditor - Internal Error on Paste as CTRL-V(1).docx added |
---|
Minimum file to reproduce the problem.
comment:5 Changed 10 years ago by
After CKEditor 4.5, error message is no longer displayed however result is different to the original.
Please use CKEditor - Internal Error on Paste as CTRL-V(1).docx to check the problem.
comment:6 Changed 10 years ago by
Milestone: | → CKEditor 4.5.2 |
---|
OK, we'll check it then. But I'm afraid there may be nothing we can do with it, because I know cases where it was not possible to recognise a list.
comment:7 Changed 10 years ago by
Problem can be reproduced in IE and Blink. Firefox looks ok IMHO.
HTML which Chrome sees (for CKEditor - Internal Error on Paste as CTRL-V(1).docx):
<p class="SkyBullet" style="margin-top:0cm;margin-right:0cm;margin-bottom:8.0pt; margin-left:0cm;text-indent:0cm;line-height:107%;mso-list:none"><em><span style="font-size:11.0pt;mso-bidi-font-size:10.0pt;line-height:107%;font-family: "Calibri","sans-serif";mso-ascii-theme-font:minor-latin;mso-hansi-theme-font: minor-latin;mso-bidi-font-family:Arial;font-style:normal">Test1<o:p></o:p></span></em></p> <p class="MsoNormal"><em><span lang="IT" style="font-family:"Calibri","sans-serif"; mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-font-family: Arial;font-style:normal;mso-bidi-font-style:italic">Test2</span></em><i><span lang="IT"><o:p></o:p></span></i></p>
HTML which IE9 sees:
<P style="LINE-HEIGHT: 107%; TEXT-INDENT: 0cm; MARGIN: 0cm 0cm 8pt; mso-list: none" class=SkyBullet><EM><SPAN style="LINE-HEIGHT: 107%; FONT-STYLE: normal; FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-bidi-font-size: 10.0pt; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Arial">Test1<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /><o:p></o:p></SPAN></EM></P> <P style="MARGIN: 0cm 0cm 8pt" class=MsoNormal><EM><SPAN style="FONT-STYLE: normal; FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Arial; mso-bidi-font-style: italic" lang=IT>Test2</SPAN></EM><I style="mso-bidi-font-style: normal"><SPAN lang=IT><o:p></o:p></SPAN></I></P>
HTML which IE1 sees:
<p style="margin: 0cm 0cm 8pt; line-height: 107%; text-indent: 0cm; mso-list: none;"><em><span style='line-height: 107%; font-family: "Calibri","sans-serif"; font-size: 11pt; font-style: normal; mso-bidi-font-size: 10.0pt; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Arial;'>Test1</span></em></p> <p style="margin: 0cm 0cm 8pt;"><em><span lang="IT" style='font-family: "Calibri","sans-serif"; font-style: normal; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Arial; mso-bidi-font-style: italic;'>Test2</span></em></p>
comment:8 Changed 10 years ago by
Owner: | set to Tade0 |
---|---|
Status: | confirmed → assigned |
comment:9 Changed 10 years ago by
Checked Firefox today and managed to reproduce the problem.
The trouble starts with this line: https://github.com/ckeditor/ckeditor-dev/blob/master/plugins/pastefromword/filter/default.js#L805
The condition is truthy and therefore a mso-list: Ignore style is added. After that the paragraph is treated as a bullet point.
How should we normally interpret values of mso-list such as "none", or "skip"?
comment:10 Changed 10 years ago by
Status: | assigned → pending |
---|
comment:11 Changed 10 years ago by
How should we normally interpret values of mso-list such as "none", or "skip"?
That's a question to you. I will not even try to guess what this style may mean because it is pointless in case of MSWord. You must check when it appears and what does it mean by comparing to the document from which you copied content. In this case I can see that the document contained no list, so after pasting there should be no list. Perhaps, we should check value of mso-list in that condition.
comment:12 Changed 10 years ago by
I found a very well hidden manual for these attributes and most of the other stuff generated by MSWord: https://msdn.microsoft.com/en-us/library/Aa155477
It's a little dated, but contains descriptions and sometimes even rationale for some of the elements.
comment:13 Changed 10 years ago by
Status: | pending → review |
---|
I added a check that makes sure that a paragraph with the style mso-list: none is not turned into a list.
I turned the data that's produced by pasting the contents of the attachments into tests.
Changes pushed to branch:t/12740.
comment:14 Changed 10 years ago by
Status: | review → review_failed |
---|
Pushed branch:t/12740. I moved the created tests into the default file, as there's no reason to separate them.
However, they do not pass on Firefox and IE8 at least ;/
comment:15 Changed 10 years ago by
Status: | review_failed → assigned |
---|
On Firefox this is a problem of two additional spans, that are not filtered out. One contains a lang attribute, the other some styles. Not sure what is the right behavior here.
On the other hand on IE this test breaks more severely - the result is completely empty. I'll have look deeper into this one.
comment:18 Changed 10 years ago by
I'm using the getHtml() method instead of getValue(), which returns an empty string in IE.
Curiously enough, getValue() seems to return undefined only for the textareas that I created.
comment:19 Changed 10 years ago by
Status: | assigned → review |
---|
I fixed the tests, but in order to do that I had to retype the textarea tags manually. Apparently there were some hidden whitespace characters(apart from tabs) that had to be removed.
Tests green on Chrome, Firefox, IE8.
Changes pushed to branch:t/12740.
comment:21 Changed 10 years ago by
I would wish. :D
Surprisingly, the browser recognized the textareas. The only difference was that the content in my textareas was interpreted as DOM within the tag(instead of being recognized as one giant text node).
Literally the same content, e.g. <textarea id="someList"><p>test</p></texarea> worked when I typed it and did not work if I simply modified the textarea created a while earlier. Magic.
comment:22 Changed 10 years ago by
Milestone: | CKEditor 4.5.2 |
---|---|
Resolution: | → fixed |
Status: | review → closed |
Unfortunately, when I started testing it manually, it turned out that this ticket is a mess. The reported bug (the one with mso-list) was fixed in 4.5.0 and this ticket should have been closed. Therefore I'm closing it.
However, there may be (I haven't checked because I don't have MSWord on Windows) two or even three more issues, all separate:
- That content of the original file (http://dev.ckeditor.com/attachment/ticket/12740/WSGCN-3550%20Test%20document%20minimal.docx) is not properly transformed into semantic HTML.
- The content of the original file still causes some error - this time a different one (tested on Firefox - got error in https://github.com/ckeditor/ckeditor-dev/blob/master/plugins/pastefromword/filter/default.js#L777)
- Some problems (errors?) with the file uploaded later (http://dev.ckeditor.com/attachment/ticket/12740/CKEditor%20-%20Internal%20Error%20on%20Paste%20as%20CTRL-V(1).docx).
Additionally, there are set of problems with different mso-list values (comment:3).
All this should be fixed in an organised way, supported by tests. And here goes a crucial detail - we must have HTML for each browser engine in the tests (IE, Firefox, Chrome). See http://tests.ckeditor.dev:1030/tests/tickets/9456/2 for instance.
Otherwise, we end up like now. When I tested both files manually I still got errors on Firefox. Errors which weren't caught by tests and which weren't mentioned and which had no direct connection to mso-list. Also, other, related cases weren't checked so we don't know if things got better or not.
What I'll do is I'll report an umbrella ticket for all these cases in 4.5.3. We will check all these files one by one and verify which tickets can be closed now and how to close the rest.
word file causing the error message