Ticket #2291 (closed Bug: duplicate)

Opened 6 years ago

Last modified 6 months ago

[FF3] simple copy & paste from Word document - extra code not stripped

Reported by: icedblind Owned by:
Priority: Normal Milestone: FCKeditor 2.6.3
Component: Plugin : Paste from Word Version: FCKeditor 2.6.1
Keywords: Confirmed Cc:

Description

You can reproduce this bug by copying and pasting (CTRL+C/V) some text from a Microsoft Word document into the FCKeditor demo pages, first in FF2 and then in FF3.

In FF3, viewing the source of the pasted text shows extra information that is stripped in FF2 (meta tags, XML, and style definitions):

<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
<meta content="Word.Document" name="ProgId" />
<meta content="Microsoft Word 11" name="Generator" />
<meta content="Microsoft Word 11" name="Originator" />
<link href="file:///[...]" rel="File-List" /><!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
[...]
</xml><![endif]--><style type="text/css">
<!--
 /* Style Definitions */
[...]
</style>
<![endif]-->

Change History

comment:1 Changed 6 years ago by fredck

  • Keywords ff3 copy paste word removed
  • Status changed from new to closed
  • Resolution set to invalid
  • Milestone FCKeditor 2.6.2 deleted

You should use the "Paste from Word" button for better results. Normal pasting takes the clipboard data as is, and we can see that the Firefox team has worked to make it "better" in version 3.

comment:2 Changed 6 years ago by icedblind

I'm sorry fredck, but that does not resolve the problem: even with the "Paste from Word" button you get almost the same result (OK, a bit more is stripped, but the output still differs from what FF2 produces).

<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
<meta content="Word.Document" name="ProgId">
<meta content="Microsoft Word 11" name="Generator">
<meta content="Microsoft Word 11" name="Originator">
<link href="file:///[...]" rel="File-List" /><!--[if gte mso 9]><xml>
  Normal
  0
  false
  false
  false
  [...]
  MicrosoftInternetExplorer4
</xml><![endif]--><!--[if gte mso 9]><![endif]--><style type="text/css">
<!--
 /* Style Definitions */
[...]
</style><!--[if gte mso 10]>
<style>
 /* Style Definitions */
 [...]
</style>
<![endif]-->

comment:3 Changed 6 years ago by icedblind

  • Status changed from closed to reopened
  • Resolution invalid deleted

fredck, I've taken a look at the code in fckeditor/editor/dialog/fck_paste.html, inside the CleanWord function, and I propose modifying some of it, if you all don't see any problem (I'm not an expert JS programmer).

fckeditor/editor/dialog/fck_paste.html - line 248

Current line:

html = html.replace(/<\!--.*?-->/g, '' ) ;

Proposed lines:

html = html.replace( /<w:[^>]*>(.*?)<\/w:[^>]*>/gi, '' ) ;
html = html.replace( /<meta[^>]*>/gi, '' ) ;
html = html.replace( /<link[^>]*>/gi, '' ) ;
html = html.replace( /<style[^>]*>([\w|\W|\n]*?)<\/style>/gim, '' ) ;
html = html.replace( /<\!--([\w|\W|\n]*?)-->/gm, '' ) ;

This at least strips the lines that I reported in FF3, and also extends comment removal to multiple lines when using the "Paste from Word" button.
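To illustrate the multi-line point, here is a minimal sketch comparing the current comment pattern with the proposed one. The sample string is made up for illustration and is not real Word output.

```javascript
// Hypothetical sample: one single-line comment and one multi-line comment.
var sample = 'A<!-- one line -->B<!--\n /* Style Definitions */\n-->C';

// Current pattern: "." does not match line terminators in JavaScript,
// so only the single-line comment is removed.
var before = sample.replace(/<\!--.*?-->/g, '');

// Proposed pattern: the character class matches any character,
// newlines included, so both comments are removed.
var after = sample.replace(/<\!--([\w|\W|\n]*?)-->/gm, '');

console.log(JSON.stringify(before));
console.log(JSON.stringify(after));
```

With this input, the current pattern leaves the multi-line comment in place, while the proposed pattern leaves only the text outside the comments.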

comment:4 Changed 6 years ago by fredck

  • Keywords Confirmed added
  • Milestone set to FCKeditor 2.6.2

Your suggestion makes sense icedblind. Thanks for it.

The provided regexes are not definitive, though. The "catch all" for JavaScript is /[\s\S]*/. Also, <meta> and <link> could be caught by a single regex using /(?:meta|link)/.
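As a rough sketch of how the two suggestions above might be applied (this is not the actual CleanWord code): the function name stripWordHeadMarkup and the sample input are hypothetical, and using the lazy form [\s\S]*? so that each match stops at the first closing delimiter is an assumption on my part.

```javascript
// Hypothetical helper; the name is illustrative and not part of fck_paste.html.
function stripWordHeadMarkup(html) {
  // <meta> and <link> tags handled by one pattern via a non-capturing group.
  html = html.replace(/<(?:meta|link)[^>]*>/gi, '');
  // [\s\S] matches any character, newlines included; the lazy quantifier
  // stops at the first closing </style>.
  html = html.replace(/<style[^>]*>[\s\S]*?<\/style>/gi, '');
  // Multi-line comments, including Word's conditional comments.
  html = html.replace(/<!--[\s\S]*?-->/g, '');
  return html;
}

// Sample input modelled on the markup reported in this ticket.
var input =
  '<meta content="Word.Document" name="ProgId" />\n' +
  '<link href="file:///[...]" rel="File-List" /><!--[if gte mso 9]><xml>\n' +
  '<w:View>Normal</w:View>\n' +
  '</xml><![endif]--><style type="text/css">\n' +
  '<!--\n /* Style Definitions */\n-->\n' +
  '</style><p>Hello</p>';

var cleaned = stripWordHeadMarkup(input);
console.log(JSON.stringify(cleaned));
```

Because the conditional comment wraps the whole <xml> block, removing comments across lines also removes the <w:...> elements inside it in this sample. Input where those elements sit outside a comment would still need separate handling.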

We are aware that the Word cleanup procedure is to be refined with time. We can't catch all cases in a set of tests, so additions like yours would just make it better.

I had previously invalidated the ticket because it was making reference to the plain pasting operation, which is out of our control. Now, enhancements to Paste from Word are definitely acceptable.

comment:5 Changed 6 years ago by icedblind

I think it's great. Thank you, fredck!

comment:6 Changed 6 years ago by alfonsoml

#2272 already suggests some changes to the regexp cleaning.

comment:7 Changed 6 years ago by fredck

  • Status changed from reopened to closed
  • Resolution set to duplicate

Actually, the suggestions at #2272 make a lot of sense. Closing this one as DUP.

comment:8 Changed 6 months ago by fredck

  • Component changed from General to Plugin : Paste from Word