#7646 closed Bug (fixed)
Paste with CKEDITOR.config.pasteFromWordRemoveFontStyles = false
Reported by: | naresh.sivaraman | Owned by: | Garry Yao |
---|---|---|---|
Priority: | Normal | Milestone: | CKEditor 4.6.0 |
Component: | Plugin : Paste from Word | Version: | 3.6.2 |
Keywords: | Oracle | Cc: |
Description
While trying to paste the following text from word, with config variable pasteFromWordRemoveFontStyles = false set, there seems to be few issues:
3/28現在、発生した全200件の障害のうち対応完了、対応不要、再現待ちを除く残事項は15件です。対応状況は、表1の通りです。
First Part till number 200 is fine. Pasted correctly. After 200, till the end, the font size increases from 10 to 10.5. And there is a font family change when 15 appears.
When copying without the numbers, the paste seems to be working fine. This has something to do with the numbers interspersed with the text. Please see the same.
Attachments (8)
Change History (41)
comment:1 Changed 14 years ago by
Status: | new → pending |
---|
comment:2 Changed 14 years ago by
Sorry, Saw your reply today only.
Have attached three files to the ticket. Please see them. This is what we see in our environment.
And It is not reproducible in the ckeditor demo because pasteFromWordRemoveFontStyles = false is not set by default in the ckeditor demo.
comment:3 Changed 14 years ago by
I see you are using IE7. I have checked that browser and all other IE versions but none of them produced what you have showed on screen shots. I have received the following results:
<span style="font-size: 10pt">3/28</span><span style="font-family: ms mincho"><span style="font-size: 10pt">現在、発生した全</span></span><span style="font-size: 10pt">200</span><span style="font-family: ms mincho"><span style="font-size: 10pt">件の障害のうち対応完了、対応不要、再現待ちを除く残事項は</span></span><span style="font-size: 10pt">15</span><span style="font-family: ms mincho"><span style="font-size: 10pt">件です。対応状況は、表</span></span><span style="font-size: 10pt">1</span><span style="font-family: ms mincho"><span style="font-size: 10pt">の通りです。</span></span></p>
Have you tried to download a clean version of CKEditor, set pasteFromWordRemoveFontStyles=false and than run your example?
I have and I've got what I've pasted above. This makes me rather convinced that there is no problem with CKEditor but with your environment. If you can prove me wrong and reproduce this issue in clean version of CKEditor please do so and provide me with a test case so that I could reproduce it to.
Waiting for your comments.
comment:4 Changed 14 years ago by
I have been using my FF3 and still I am able to reproduce the issue using a plain vanilla instance of CKEDITOR. Will just detail the steps that i did.
(1) Downloaded CKEDITOR 3.6 Zip.
(2) Changed config.js and added config.pasteFromWordRemoveFontStyles=false;
CKEDITOR.editorConfig = function( config ) {
config.pasteFromWordRemoveFontStyles=false; Define changes to default configuration here. For example: config.language = 'fr'; config.uiColor = '#AADC6E';
};
(3) Took up one of the samples provided and tried to paste this : 3/28現在、発生した全200件の障害のうち対応完了、対応不要、再現待ちを除く残事項は15件です。対応状況は、表1の通りです。(Originally MS Pincho, Size 10 in Word.) Will attach the word document along.
(4) Copied to the editor and copied back to word. Found the formatting had changed as I have mentioned earlier. The first part till 200 is fine, copied with MS Pincho Size 10. Second part from 200 to 15 is in MS Pincho Font, but size is different. Third part, after 15 is of a different font family altogether.
Please let me know If I am doing something wrong here.
Changed 14 years ago by
Attachment: | testcase.doc added |
---|
comment:6 Changed 14 years ago by
@krst and me also could not reproduce the issue. Perhaps this problem is related to OS font settings.
Have you tried to reproduce this problem on different OS perhaps different machine?
comment:7 Changed 14 years ago by
I am using Microsoft Windows XP(SP3) and using Microsoft Word 2003. I am using English Locale.
We have also been able to reproduce the issue on Windows 2003 server (SP2) and using MS Word 2007 with Japanese locale.
comment:8 Changed 14 years ago by
Status: | pending → confirmed |
---|---|
Version: | 3.4.2 → 3.0 |
I have received some confirming results but they are varying depending on a browser
IE has always been changing the font name, but the font size from CKEditor 3.1: IE6, IE7, IE8 - Name is changed to Arial Unicode MS and from CKEditor, size to 9.
IE9 and Webkit has always been changing font name to MS Gothic and from CKEditor 3.1, font size to 9.
FF and Opera leave the font name untouched but starting from CKEditor 3.1 have been changing font size to 10.5
comment:9 Changed 13 years ago by
Do you have any new updates on this ticket? It has been five months since the last update and it would be really helpful to get the fix for this bug asap.
comment:10 Changed 13 years ago by
It would be really helpful if you can give us an update on this. My customer has been waiting for the fix for months. Would you please at least let us know a tentative time by which this will be fixed.
comment:11 Changed 13 years ago by
This is most likely the side effect of having pasteFromWordRemoveFontStyles disabled and I don't think that we will be able to change there much.
Please bear in mind Word is not made for producing web contents. CKEditor makes its best to translate the information send from Word through the clipboard but this is not always possible to translate it 1 to 1.
As I was saying in comment:8, I didn't get the same results as you did - there were font-name or font-size changes but they concerned the whole text not parts of it. IMO those differences are not that critical.
You have set the version for CKEditor 3.4.2 - have you tried to use latest CKEditor 3.6.2 - we have done some improvements since then.
comment:12 follow-ups: 14 17 Changed 13 years ago by
It is reproducible in CKEditor 3.6.2 as well. We think this the bug in the CKEditor. I've uploaded the testdoc with bigger font so it will be easier to see the buggy behaviour in CKEditor also. To reproduce, disable the pasteFromWordRemoveFontStyles and paste the contents of the doc in the CKEditor.
We have tested this on other Online editors and they seem to be working fine. One example is http://developer.yahoo.com/yui/examples/editor/code_editor.html
After some debugging we found that removing Lang attribute in the pastefromword filter is causing this issue. We are able to get this bug fixed by removing the below rule from \_source\plugins\pastefromword\filter\default.js
***936*** - // Remove lang/language attributes. - [ ( /^lang/ ), '' ]
Can you please review this fix and let us know if it would have any side effects.
Thank you.
comment:13 Changed 13 years ago by
Version: | 3.0 → 3.6.2 |
---|
comment:14 follow-up: 15 Changed 13 years ago by
Replying to yaswanth.ravella:
We have tested this on other Online editors and they seem to be working fine. One example is http://developer.yahoo.com/yui/examples/editor/code_editor.html
Sorry, but this is out of comparison. There is no cleanup happening at all in the above editor. If you're happy with that, you can have the same results with CKEditor by adding this configuration line:
config.removePlugins = 'pastefromword';
But of course, this is strongly not recommended and there is no guarantee on the behavior of CKEditor when editing over the raw HTML sent by Word.
comment:15 follow-up: 20 Changed 13 years ago by
Replying to fredck:
Replying to yaswanth.ravella:
Hi Fred,
Thanks a lot for responding.
I understand that CKEditor is doing a cleanup, which is good and what we need. But during the cleanup, the span tags have clearly gone out of sync.
For example in my case. Below is the HTML data in CKEDITOR.cleanWord function before calling data.replace( /<span>/g, ):
<p> <span><span style="font-family:ms pmincho,serif;"><span style="font-size:36.0pt;">現在全</span></span></span> <span><span style="font-family:ms pmincho,serif;"><span style="font-size:36.0pt;">3/28<span>現在全</span>200<span>件の事項は</span>15<span>件です。対応状況は、</span></span></span></span> </p>
The above html is perfect and works fine. But then I see a call to replace( /<span>/g, ) on the above HTML which removed all the opening span tags but didn't bother removing the corresponding closing tags. Below is the result of the replace operation.
<p> <span style="font-family:ms pmincho,serif;"><span style="font-size:36.0pt;">現在全</span></span></span> <span style="font-family:ms pmincho,serif;"><span style="font-size:36.0pt;">3/28現在全</span>200件の事項は</span>15件です。対応状況は、</span></span></span></span> </p>
The above html is invalid and has clearly lost the formatting context. The ideal fix for this would have been to remove the corresponding closing tags for the empty span tags we just deleted and then the HTML would look like the below one, which is Valid.
<p> <span style="font-family:ms pmincho,serif;"><span style="font-size:36.0pt;">現在全</span></span> <span style="font-family:ms pmincho,serif;"><span style="font-size:36.0pt;">3/28現在全200件の事項は15件です。対応状況は、</span></span> </p>
Since, span tags in msWordHTML contained lang attribute and I didn't mind lang attributes lying around in the output(for the greater good of not loosing the formatting context), I thought it would be fine to leave the lang attribute during cleanup so that span tags won't be removed. May be we should fix this at the removal of empty span tags code and make sure we remove the corresponding closing tags properly :)
But for now. I am okey with lang attribute lying around in the output as long as it doesn't break the formatting and doesn't cause any regression. Can you please let me know if you think this fix would have any other side effects.
comment:17 follow-up: 19 Changed 13 years ago by
Owner: | set to Frederico Caldeira Knabben |
---|---|
Status: | confirmed → assigned |
Replying to yaswanth.ravella:
The problem is not really the "lang" attribute removal. That's something that just fixes your case, but the real bug will still remain there and we could have other cases that would bring the same issue to light.
I'll be proposing a patch to it.
Changed 13 years ago by
Attachment: | 7646.patch added |
---|
comment:18 Changed 13 years ago by
Component: | General → Core : Pasting |
---|---|
Keywords: | Oracle added |
Status: | assigned → review |
comment:19 Changed 13 years ago by
Replying to fredck:
The problem is not really the "lang" attribute removal. That's something that just fixes your case, but the real bug will still remain there and we could have other cases that would bring the same issue to light.
Yes. My fix was merely a hack to keep the span tags intact.
I'll be proposing a patch to it.
Thanks for taking this up. I will test the patch and will update this ticket accordingly.
comment:20 follow-up: 21 Changed 13 years ago by
Replying to yaswanth.ravella:
You understood the problem well. The issue you've described is exactly what the patch is fixing.
comment:21 Changed 13 years ago by
Replying to fredck:
You understood the problem well. The issue you've described is exactly what the patch is fixing.
I just got confirmation from the customer that the fix worked for them. Thanks for helping out in this.
comment:22 follow-up: 23 Changed 13 years ago by
Status: | review → review_failed |
---|
If this patch is taken, it would double the performance cost on paste from word, a better approach has to be found.
comment:23 Changed 13 years ago by
Milestone: | → CKEditor 3.6.3 |
---|---|
Owner: | Frederico Caldeira Knabben deleted |
Status: | review_failed → confirmed |
Replying to garry.yao:
If this patch is taken, it would double the performance cost on paste from word, a better approach has to be found.
That's the only reliable way I've found to replace the hack we had there in the code. Maybe you can propose something better?
comment:24 follow-up: 25 Changed 13 years ago by
comment:25 Changed 13 years ago by
Changed 13 years ago by
Attachment: | 7646_2.patch added |
---|
comment:26 Changed 13 years ago by
Owner: | set to Garry Yao |
---|---|
Status: | confirmed → review |
New patch is based on the previous one, while it makes it possible to run filter multiple times without reparsing, so it should be minimum on performance.
comment:27 Changed 13 years ago by
Status: | review → review_failed |
---|
I've checked the patch and the basic idea of it is interesting. Just the way it is proposed is not looking clean, in the API point of view.
The fact is that we're introducing new API stuff here. So let's do it in the right way.
The first thing to note is that the filters defined on beforeProcessData will work on element nodes only. Filters on text or comments will have no effect.
Then, we need to have coherence with the method purposes. The patch says that writeHtml can be used also NOT to write HTML, which sounds strange. In fact, we're hacking it to make it became a two purposes thing.
So, if we have two purposes, let's have two separated methods at this point: writeHtml() (the current signature) and runFilter(), which does the thing we need for this ticket. The first method should call the second.
Of course, this will involve a review on the way writeHtml works. But if the latest patch idea is really the one to follow, than this is the price we pay to go with it.
We have still another weak point in the patch. I see that it is implementing two new events: beforeProcessData and beforeProcessHtml.
These new events are ok, but again they were made to satisfy this ticket needs only, being poorly implement. They look quite generic instead and so enhancements need to be done to propose it in the right way.
As the name says, the events must be called BEFORE PROCESSING the data/html. This means that the event must be the very first thing to happen on those methods, BEFORE any PROCESSING could have taken place.
Additionally, we're just giving the possibility to set filters with those events, but to bring more sense to these generic need events, we must also pass the data/html which is about to be processed, so the event listeners can manipulate the data/html before processing.
Finally, to make it complete, the beforeProcessData/Html need its counterpart afterProcessData/Html event, again making it possible to change the data/html AFTER processing.
Changed 13 years ago by
Attachment: | 7646_3.patch added |
---|
comment:28 Changed 13 years ago by
Status: | review_failed → review |
---|
The new patch addresses the above review comments, and make it a step further, to avoid completely the HTML parsing happens inside of word paste file.
comment:29 Changed 13 years ago by
Milestone: | CKEditor 3.6.3 |
---|
This new patch is making the API too counter-intuitive. It's hard to understand what CKEDITOR.cleanWord does at this point. It started to do some magic that no one understands where it gonna be execute.
I'm not saying that the solution is not good. But due to its drastic changes, I would avoid having it on the upcoming v3 release.
comment:30 Changed 11 years ago by
Component: | Core : Pasting → Plugin : Paste from Word |
---|
comment:31 Changed 8 years ago by
Resolution: | → fixed |
---|---|
Status: | review → closed |
Fixed with new Paste From Word plugin in 4.6.0.
comment:32 Changed 8 years ago by
Milestone: | → CKEditor 4.6.0 |
---|
I could not reproduce this issue. CKEditor doesn't show font-size or family on selected text pasted form word so I have tried to switch between source and WYSIWYG, pasting form CKEditor back to word, checking text with FF font plug-in but none of this resulted in what you have described above. Line of text had one font-size and two font-families: one for text and the other for numbers.
Could you tell me what OS and browser are you using. Also have you tested this bug in the latest demo version http://nightly.ckeditor.com/demo? Maybe it has already been fixed?
If this issue is still reproducible could you provide me with more detailed test case. I'm also very interested in how did you managed to get that font size:)