Opened 8 years ago
Last modified 8 years ago
#16833 assigned Bug
IE11 malformed list pasted from Word
Reported by: | Tade0 | Owned by: | Marek Lewandowski |
---|---|---|---|
Priority: | Nice to have (we want to work on it) | Milestone: | |
Component: | Plugin : Paste from Word | Version: | 4.7.0 |
Keywords: | Cc: |
Description
Steps to reproduce
- Open
Full editor
demo from: http://nightly.ckeditor.com/ - Open the attached file in Word, copy and paste it into the editor.
Expected result
List structure is the same as in the Word file.
Actual result
Lists are malformed - they are shifted one level up.
Other details (browser, OS, CKEditor version, installed plugins)
IE11 Only.
Attachments (1)
Change History (17)
Changed 8 years ago by
Attachment: | Numbered.and.Bulleted.Lists.with.Headings.docx added |
---|
comment:1 Changed 8 years ago by
Component: | General → Plugin : Paste from Word |
---|
comment:2 Changed 8 years ago by
Status: | new → confirmed |
---|
comment:3 Changed 8 years ago by
Status: | confirmed → review |
---|
comment:4 Changed 8 years ago by
Status: | review → review_failed |
---|
Please add also other browsers to https://github.com/cksource/ckeditor-dev/tree/02e84ce223e6e0f345a4052f390797551028bcbf/tests/plugins/pastefromword/generated/_fixtures/Tickets/16818Numbered_lists - we need to ensure that it works well also for other browsers.
I'd say edge, chrome, firefox with one version of Word is enough.
comment:5 Changed 8 years ago by
Including chrome and firefox was straightforward, but there are still lingering problems in Edge, namely some list elements are not detected - this is a known problem in Edge for which there are heuristics in place, but they need to be adjusted.
A solution has been provided and is currently tested for regressions.
comment:6 Changed 8 years ago by
Status: | review_failed → review |
---|
Updated tests.
Changes pushed to branch:t/16833.
comment:7 follow-ups: 8 11 Changed 8 years ago by
Status: | review → review_failed |
---|
Regarding functionallity, the branch actually does a good job.
There are just a couple things to be fixed thoguh:
- Edge fixture was generated with word2013 - please update it as it will simplify test file a bit.
- We need filters for all headers, in HTML we have it all the way up to
h6
correctLevelShift
,isShifted
are missing API docs.- Calling multiple Array.concat have really poor readability, it would be better just to do:
var preceeding = isShiftedList( list ) ? [ list ] : list.children; return preceeding.concat( acc );
- I was also concerned about empty paragraphs being propagated by PFW in IE11, but these get filtered out later, so it could be extracted into a follow-up ticket.
comment:8 follow-up: 9 Changed 8 years ago by
Status: | review_failed → review |
---|
Replying to m.lewandowski:
- Calling multiple Array.concat have really poor readability, it would be better just to do:
var preceeding = isShiftedList( list ) ? [ list ] : list.children; return preceeding.concat( acc );
The rule I'm going by is: Don't introduce new names if you don't have to.. For me personally it reduces cognitive load.
Corrected everything according to the list. Changes pushed to branch:t/16833.
comment:9 Changed 8 years ago by
Resolution: | → fixed |
---|---|
Status: | review → closed |
Replying to Tade0:
Replying to m.lewandowski:
- Calling multiple Array.concat have really poor readability, it would be better just to do:
var preceeding = isShiftedList( list ) ? [ list ] : list.children; return preceeding.concat( acc );The rule I'm going by is: Don't introduce new names if you don't have to.. For me personally it reduces cognitive load.
Corrected everything according to the list. Changes pushed to branch:t/16833.
It's a good role, but it's usage strongly depends on given case. Here it contained simply unnecessary hacks to perform an operation that .could be conveyed more easily. As long as you're not working with more than 7 names at once.
Fix LGTM, good job.
Fixed with git:2d07c870ee91d1cecf42e1cac0a6bb8f875f7f84, merged to major.
comment:10 Changed 8 years ago by
Resolution: | fixed |
---|---|
Status: | closed → reopened |
comment:11 Changed 8 years ago by
Replying to m.lewandowski:
- I was also concerned about empty paragraphs being propagated by PFW in IE11, but these get filtered out later, so it could be extracted into a follow-up ticket.
Actually mentioned paragraphs are visible in IE11 on Windows 10. It's a critical issue and needs to be fixed.
comment:12 Changed 8 years ago by
Status: | reopened → confirmed |
---|
comment:13 Changed 8 years ago by
Owner: | changed from Tade0 to Marek Lewandowski |
---|---|
Status: | confirmed → assigned |
comment:14 Changed 8 years ago by
There were three ways to fix this markup:
- use CKE4 parser (e.g. htmlParser.fromHtml method)
- use browser DOM parsing.
- use RegExp
RegExp
Regexp first: there's not much to elaborate as RegExps are not simply a good tool for parsing, it's error prone.
Browser DOM parser
The idea is simply to preformat data before it gets used by the CKEditor, so that:
- Internet Explorer pastes the data into our
dataValue
variable. dataValue
gets loaded into a Document Fragment in the memory, where we traverse the tree and do the necesarry modifications.- After our modifications are done, we get the output HTML, and feed it into the CKEditor.
Problem with this approach is that browsers do fix malformed HTML on it's own. You could check it on this demo: http://codepen.io/mlewand/pen/pRZoYG (see the console logs there).
See what happened to <strong><em>
paris.
And that's the very reason why CKEditor 4 comes with it's own HTML parser.
CKEditor4 parser
Ultimately the only option to fix this issue is to make a fix in CKEditor HTML parser.
In this case our parser does a good job of moving inline elements like strong
, but it does bad job with paragraphs because it moves it before the list (and this is exactly what you're seeing here).
This decission has been made in #3195 - it was meant to have the same handling for wrong elements directly inside of a table and lists.
What we could do here is to bring a fix for lists only, so that:
{{{<ul>
<li>1</li> <p>2</p> <li>3</li>
</ul>}}}
Gets transformed into:
{{{<ul>
<li>1</li> <li><p>2</p></li> <li>3</li>
</ul>}}}
Alternatively we could restrict the change only to remove empty paragraphs like that. So that:
{{{<ul>
<li>1</li> <p></p> <li>3</li>
</ul>}}}
Gets transformed into:
{{{<ul>
<li>1</li> <li>3</li>
</ul>}}}
But no paragraph is moved around.
comment:15 Changed 8 years ago by
So with that above, changing our parser is the only way to fix this issue. This brings very high risk and needs to be carried out extremely carefully.
Another thing here is that it's IE11 that produces totally malformed HTML - the only problem is that as a result this problem is very visible to the end user.
comment:16 Changed 8 years ago by
Milestone: | CKEditor 4.7.0 |
---|---|
Priority: | Normal → Nice to have (we want to work on it) |
Added a method that check whether the list levels are shifted and corrects that.
Added in Heuristics, because without the mentioned method the list is still technically valid.
Changes pushed to branch:t/16833.