#9456 closed Bug (fixed)
Properly paste bullet list style from MS-Word
Reported by: | JPG | Owned by: | Piotrek Koszuliński |
---|---|---|---|
Priority: | Normal | Milestone: | CKEditor 4.0.1 |
Component: | Plugin : Paste from Word | Version: | 3.0 |
Keywords: | Chrome | Cc: | Jeff Fournier |
Description (last modified by )
When you use the "Paste from Word" function to paste text containing a bullet list, there are a few problems in all the browsers I've tried (Chrome v.22, FF v.16 and IE v.9).
How to reproduce :
- Open the attached document (text-with-bullet-list-example.doc) ;
- Copy the text ;
- Click on the "Paste from Word" button or press Ctrl+V ;
What is the problem :
- Each element of the list is converted into a paragraph (<p>) instead of regular HTML unordered list tags (<ul> and <li>) ;
- Each element contains a character and a few spaces to visually represent the list. Depending on the level, you can get :
- Root level : "· " ;
- First indentation : "o " ;
- Second indentation : "§ ".
What would be expected :
A proper HTML unordered list with <ul> and <li> tags, containing only the text of each element, without any extra characters like "o" or "§" and without the extra spaces.
The issue #6662 is similar to this one but for numeric list style only.
This may be just a different test case (TC) for ticket #8734. Please see comment:10.
Please see comment:5, comment:2 and comment:1 for a better view of this problem.
Attachments (7)
Change History (30)
Changed 12 years ago by
Attachment: | html-result.png added |
---|
Changed 12 years ago by
Attachment: | text-with-bullet-list-example.doc added |
---|
Text to copy for testing
comment:1 Changed 12 years ago by
Component: | Core : Pasting → Core : Lists |
---|---|
Keywords: | Chrome added; Paste Word List removed |
Version: | 3.6.5 → 3.0 |
@jipolin what you have described occurs only in Chrome and looks like the continuation of #8734.
I have noticed that your list doesn't have normal list styles but has something like - "Paragraphe de liste". Could you tell me how this list was created?
comment:2 Changed 12 years ago by
Status: | new → confirmed |
---|
Lists don't get pasted in Chrome from Word 2010 (#8734), nor from Word 2003 when some custom (non-default) formatting is applied to the list.
As discussed with @wwalc, lists should be recognized in such cases as well and pasted as HTML lists, not as paragraphs.
The workaround is to use normal lists in Word 2003 and then paste them into CKEditor.
comment:3 Changed 12 years ago by
You're right, it's working on IE9 and FF.
"Paragraphe de liste" is the French translation of "List paragraph", which is the default style automatically added by MS-Word when a list is created (see http://office.microsoft.com/en-us/word-help/style-basics-in-word-HA010230882.aspx#BM2b).
In the video copy-past-demo.swf, you can see that if I copy/paste the list with its default style ("Paragraphe de liste"), it doesn't work, but if I change the style to "Normal", the list is handled properly.
Maybe CKEditor doesn't recognize the style because it's in French? Do we have the same problem with an English version of MS-Word 2010?
comment:4 Changed 12 years ago by
@jipolin sorry for the late response - the English version doesn't have this problem. It doesn't have such styles attached by default.
comment:5 Changed 12 years ago by
I have noticed that if you paste a normal MS Word list (without any custom style), Chrome sees it as a list: the browser sees ul and li tags.
If, for example, you attach an MS Word indented list style, then Chrome sees paragraphs with an 'MsoList...' class.
If you first style the text as indented text and then apply a list, Chrome sees paragraphs with the class MsoBodyTextIndent and some HTML comments about lists.
In short, it looks like lists are pasted when Chrome sees UL, but when it sees paragraphs it always pastes paragraphs (even if they have an 'MsoList...' class).
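To make the two cases above concrete, here is a minimal sketch of how a pasted paragraph could be flagged as a list candidate. The function name `looksLikeWordListItem` is hypothetical and not part of CKEditor, and the exact comment text `<!--[if !supportLists]-->` is an assumption taken from the markup quoted later in this ticket.

```javascript
// Hypothetical helper, not CKEditor API: flags a pasted <p> as a probable
// Word list item based on the two signals described above.
function looksLikeWordListItem(paragraphHtml) {
  // Signal 1: a class starting with "MsoList" (quoted or unquoted attribute).
  const hasMsoListClass = /<p\b[^>]*\bclass=["']?MsoList\w*/i.test(paragraphHtml);
  // Signal 2 (assumed comment text): Word's conditional comment about lists.
  const hasListComment = /<!--\[if !supportLists\]-->/i.test(paragraphHtml);
  return hasMsoListClass || hasListComment;
}

console.log(looksLikeWordListItem('<p class="MsoListParagraph">Milk</p>'));
console.log(looksLikeWordListItem('<p class="MsoNormal">Just text</p>'));
```

This only classifies a paragraph; rebuilding the actual ul/li structure would be a separate step.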
comment:6 Changed 12 years ago by
About your first comment, are you sure? As mentioned in the US English help page, you should have such a style in the English version of Word too.
About your last comment, when Chrome sees paragraphs with an 'MsoList...' class, would it be possible to treat those as actual lists in CKEditor? Do you know how IE and FF react in those cases: does the pasted text contain paragraphs too?
comment:7 Changed 12 years ago by
Description: | modified (diff) |
---|
Changed 12 years ago by
Attachment: | pasted.txt added |
---|
Export of the complete "Stuff to get" file pasted into a contentEditable div.
comment:8 Changed 12 years ago by
I have the same issue with both numbered and bulleted lists pasted from Word 2010 (MS Office Pro Plus, Word 14.0.6123.5000, 32bit, English), both when pasting in Chrome 22.0.1229.79 and IE 8.0.6001.18702. No difference between the official CKEditor 3 (Full) and 4 Beta (Inline) demos. This uses the default number and bullet styles. Doesn't matter which list type is used where though.
I've attached the file I created and copied (Ctrl+A, Ctrl+C) as "Stuff to get.docx" and the "raw" markup pasted into a simple contentEditable div in Chrome as "pasted.txt".
This is on a Win XP Pro machine in case that matters.
comment:9 Changed 12 years ago by
If I understand correctly "Stuff to get.docx" is what you expect. Where is the file that is causing the problem? Where is the file which is creating : "Root level : "• " ; First indentation : "o " ; Second indentation : "§ " ".
That is what was reported in original ticket.
If you are just getting paragraphs please refer to #8734.
I'm waiting for your comments.
comment:10 Changed 12 years ago by
#8734 appears to be the exact same issue.
The "Stuff to get.docx" file is not what I expect, it's the actual file causing the problem. I should have renamed it to represent that instead of the "todo list" it was used as.
The inserted leading character mentioned in the original post is the "bullet item" itself. I get the default · character inserted inside span tags before each bulleted-list-paragraph, and a number before each numbered-list-paragraph, as I did not change the list style.
The "bullet item" span, and the "indent spaces" also mentioned in the original post, are wrapped in <!--[if !supportLists]--> <!--[endif]--> comments. To me it looks like these represent "artificial" bulletpoints/listnumbers for renderers which do not have real lists. CKEditor appears to remove/ignore the pasted comments but leave the span intact, which makes the "bullet items" visible, and possibly also interfering with parsing the list as an actual list. Everything inside those comments should be removed, not just the comments themselves.
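As a rough illustration of "remove everything inside those comments", here is a minimal sketch. `stripSupportListsBlocks` is a hypothetical name, not a CKEditor function, and it works on a plain HTML string rather than on the editor's parsed fragment.

```javascript
// Hypothetical helper, not CKEditor API: removes each
// <!--[if !supportLists]--> ... <!--[endif]--> block, including the marker
// span and the padding spaces between the two comments.
function stripSupportListsBlocks(html) {
  // Non-greedy match so each block ends at its own nearest <!--[endif]-->.
  return html.replace(/<!--\[if !supportLists\]-->[\s\S]*?<!--\[endif\]-->/gi, '');
}

const pasted =
  '<p class="MsoListParagraph"><!--[if !supportLists]--><span>·</span>' +
  '<span>&nbsp;&nbsp;</span><!--[endif]-->Milk</p>';

console.log(stripSupportListsBlocks(pasted));
// prints: <p class="MsoListParagraph">Milk</p>
```

Note that stripping the block also throws away the marker character, so anything that wants to know the bullet or number style would have to read it before this step.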
If you change the list style, you can get any character or numerical representation to show up as the "bullet item" in each paragraph. The screencast attached in #8734 also shows that the "bullet item" has been included inside each of the paragraphs, but as a number since he used numbered lists.
I did not have a tool to inspect the "raw" clipboard data at the time of testing this, so I used a contentEditable div to at least get the HTML version of what was in there, after copying from the .docx file. I created "pasted.txt" by simply putting basic HTML markup including the contentEditable div in a file, opened that in my browser, pasted the contents from "Stuff to get.docx" inside the div, used the browser's developer tools to copy the current document's markup, and finally put that in the text file. Trivia note: Looks like either MS Word creates the weird markup on the fly during copying, or Libre Office filters it out when opening the file. If I open the same .docx file from Libre Office, copy everything and paste it into the contentEditable div I used to create "pasted.txt", I get nearly perfect markup to begin with. This is obviously a lot easier for CKEditor to work with, so the lists turn out great.
I attached the markup generated when copying from Libre Office and pasting into the plain contentEditable div as "pasted_libre.txt". That is much closer to what I expect when pasting into CKEditor from MS Word 2010.
Changed 12 years ago by
Attachment: | pasted_libre.txt added |
---|
"Stuff to get.docx" opened in Libre Office and copied to a contentEditable div.
comment:11 Changed 12 years ago by
@TwoD - Thank you for detailed description.
comment:12 Changed 12 years ago by
Description: | modified (diff) |
---|
comment:13 Changed 12 years ago by
I pushed t/9456 to tests repo.
I gathered results from FF, Chrome and IE8/9 for pasting contents of documents attached in this ticket (opened in Word 2007 and 2010).
comment:14 Changed 12 years ago by
Thanks for the feedback, and sorry for not being able to participate in this bug earlier; we've been working hard to launch version 4.
On topic: this is a recent WebKit regression. CKEditor has in fact worked very well with list transformation from MS-Word in the past. FYI, a list of tickets backing this feature:
- #5399: Lists pasted from Word do not maintain their nesting
- #6330: Roman list style are not pasted properly from Word
- #6658: Paste from Word in IE - Internal error of html formatter - Tabs in empty in lists
- #6662: Lists copied from Word are not pasted properly.
- #7131: Copy/Paste Word List should preserve list properties
- #7269: paste from word - footnote links link to document path in webkit based browsers
- #7480: Bulleted lists copied from MS Word are pasted into the Editor as Numbered lists
- #7872: two-level list pasted from word gets flattened or split
- #7898: Problem with Label for Lock ratio button not changing in High Contrast mode
Here is how a (circle) list bullet from MS-Word is presented before and after the WebKit regression recently introduced in Chrome (older build first, then the regressed one):
Mozilla/5.0 (Windows NT 6.1) AppleWebKit/534.57.2 (KHTML, like Gecko)
<!--[if !supportLists]--> <span style="mso-list:Ignore">o</span> <!--[endif]-->
Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.11 (KHTML, like Gecko)
<!--[if !supportLists]--> <span lang="EN-US" style="font-family:Wingdings;mso-fareast-font-family:Wingdings; mso-bidi-font-family:Wingdings">o</span> <!--[endif]-->
As seen above, the mso-list:Ignore style that we rely on to identify a list item is gone, which breaks this feature.
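For illustration only, here is a sketch of one way to identify the marker without depending on mso-list:Ignore, by keying on the conditional comments, which are present in both outputs above. The names `usesIgnoreStyle` and `findListMarker` are hypothetical, and this is not necessarily what the eventual fix does.

```javascript
// Hypothetical sketch, not the committed fix and not CKEditor API.
const SUPPORT_LISTS = /<!--\[if !supportLists\]-->([\s\S]*?)<!--\[endif\]-->/i;

// The check described above: true only when the mso-list:Ignore style is present.
function usesIgnoreStyle(html) {
  return /mso-list:\s*Ignore/i.test(html);
}

// Fallback: take whatever text sits between the conditional comments.
function findListMarker(html) {
  const block = SUPPORT_LISTS.exec(html);
  if (!block) return null; // no conditional comment found
  const text = block[1]
    .replace(/<[^>]*>/g, '') // drop the wrapping tags
    .replace(/&nbsp;/g, ' ') // normalise padding entities
    .trim();
  return text || null;
}

const before = '<!--[if !supportLists]--> <span style="mso-list:Ignore">o</span> <!--[endif]-->';
const after = '<!--[if !supportLists]--> <span lang="EN-US" style="font-family:Wingdings;' +
  'mso-fareast-font-family:Wingdings; mso-bidi-font-family:Wingdings">o</span> <!--[endif]-->';

console.log(usesIgnoreStyle(before), usesIgnoreStyle(after));
console.log(findListMarker(before), findListMarker(after));
```

The style check tells the two outputs apart, while the comment-based lookup finds the same marker text in both.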
comment:15 Changed 12 years ago by
Reported Chromium bug: http://code.google.com/p/chromium/issues/detail?id=162800
Meanwhile, we will work on an urgent hot fix for the editor, depending on the reaction on the Chromium side.
comment:16 Changed 12 years ago by
Thanks for your answers, it's crystal clear now that it's a Chrome regression. I hope you will be able to add a hot fix until the Chrome team fixes it.
comment:17 Changed 12 years ago by
Milestone: | → CKEditor 4.0.1 |
---|---|
Owner: | set to Garry Yao |
Status: | confirmed → review |
Opened 7ac9b14 for a hot fix in v4, as Chrome does not seem to be actively reacting to this.
Changed 12 years ago by
Attachment: | list_paste_from_msword.docx added |
---|
comment:18 Changed 12 years ago by
Owner: | changed from Garry Yao to Piotrek Koszuliński |
---|---|
Status: | review → assigned |
comment:20 Changed 12 years ago by
Status: | assigned → review |
---|
Back on review, with a few more tests and fixes for WebKit.
comment:21 Changed 12 years ago by
Status: | review → review_passed |
---|
comment:22 Changed 12 years ago by
Resolution: | → fixed |
---|---|
Status: | review_passed → closed |
Fixed with git:e5090f2.
comment:23 Changed 11 years ago by
Component: | Core : Lists → Plugin : Paste from Word |
---|
HTML generated after the paste operation