Opened 16 years ago

Closed 16 years ago

Last modified 10 years ago

#2272 closed Bug (fixed)

FF3: Paste from word leaves lots of garbage tags

Reported by: Josh Clark
Owned by: Martin Kou
Priority: Normal
Milestone: FCKeditor 2.6.3
Component: Plugin : Paste from Word
Version: FCKeditor 2.6.2
Keywords: Confirmed Firefox3 Review+
Cc:

Description

In Firefox 3 RC3, the "paste from word" feature leaves lots of garbage tags behind. Specifically:

  • It does not remove comments.
  • It does not remove <style> elements.
  • It does not remove <meta> elements.
  • It does not remove <link> elements.

The comments issue can be fixed by changing this line in fck_paste.html's CleanWord function:

html = html.replace(/<\!--.*?-->/g, '' ) ;

...to this:

html = html.replace(/<\!--[\s\S]*?-->/g, '' ) ;

(Because . does not match newline characters, multi-line comments are not removed; [\s\S] matches any character, newlines included, and does the trick instead.)
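The difference is easy to reproduce outside the editor. The following is only a minimal sketch: the `sample` string is a made-up input, not actual Word output, and the two patterns are the ones quoted above.

```javascript
// Hypothetical sample input: one HTML comment spanning two lines.
const sample = 'before<!-- line one\nline two -->after';

// Original pattern: "." cannot cross the newline, so nothing matches.
const withDot = sample.replace(/<\!--.*?-->/g, '');

// Proposed pattern: [\s\S] matches any character, newlines included.
const withClass = sample.replace(/<\!--[\s\S]*?-->/g, '');

console.log(withDot === sample); // true: the comment survived
console.log(withClass);          // beforeafter
```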

To be safe, I recommend making similar changes to all of the fck_paste.html instances where .* is used. Specifically, these lines:

html = html.replace(/<o:p>.*?<\/o:p>/g, '&nbsp;') ;
html = html.replace( /<SPAN\s*>(.*?)<\/SPAN>/gi, '$1' ) ;

html = html.replace( /<FONT\s*>(.*?)<\/FONT>/gi, '$1' ) ;
html = html.replace( /<(\w+)[^>]*\sstyle="[^"]*DISPLAY\s?:\s?none(.*?)<\/\1>/ig, '' ) ;
html = html.replace( /<(H\d)><FONT[^>]*>(.*?)<\/FONT><\/\1>/gi, '<$1>$2<\/$1>' );
html = html.replace( /<(H\d)><EM>(.*?)<\/EM><\/\1>/gi, '<$1>$2<\/$1>' );
var re = new RegExp( '(<P)([^>]*>.*?)(<\/P>)', 'gi' ) ;	// Different because of a IE 5.0 error

...should be changed respectively to:

html = html.replace(/<o:p>[\s\S]*?<\/o:p>/g, '&nbsp;') ;
html = html.replace( /<SPAN\s*>([\s\S]*?)<\/SPAN>/gi, '$1' ) ;

html = html.replace( /<FONT\s*>([\s\S]*?)<\/FONT>/gi, '$1' ) ;
html = html.replace( /<(\w+)[^>]*\sstyle="[^"]*DISPLAY\s?:\s?none([\s\S]*?)<\/\1>/ig, '' ) ;
html = html.replace( /<(H\d)><FONT[^>]*>([\s\S]*?)<\/FONT><\/\1>/gi, '<$1>$2<\/$1>' );
html = html.replace( /<(H\d)><EM>([\s\S]*?)<\/EM><\/\1>/gi, '<$1>$2<\/$1>' );
var re = new RegExp( '(<P)([^>]*>[\\s\\S]*?)(<\/P>)', 'gi' ) ;	// Different because of a IE 5.0 error
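One point worth checking for the last line, which builds its pattern from a string rather than a regex literal: the JavaScript string parser consumes single backslashes before the regex engine sees them, so `\s` inside a quoted string arrives as a plain `s`. The sketch below compares single and doubled backslashes; the `para` input is a hypothetical example.

```javascript
// In a string literal '\s' is just 's', so this character class becomes [sS].
const single = new RegExp('(<P)([^>]*>[\s\S]*?)(<\/P>)', 'gi');

// Doubling the backslashes delivers \s and \S to the regex engine intact.
const doubled = new RegExp('(<P)([^>]*>[\\s\\S]*?)(<\\/P>)', 'gi');

// Hypothetical paragraph whose content spans two lines.
const para = '<P class=x>one\ntwo</P>';

// Each regex is tested once; with the "g" flag, repeated test() calls
// would resume from lastIndex.
const singleMatches = single.test(para);
const doubledMatches = doubled.test(para);

console.log(single.source);  // shows the class that actually reached the engine
console.log(singleMatches);  // false
console.log(doubledMatches); // true
```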

Also, to get rid of the <meta>, <link> and <style> elements, I suggest adding these additional replacements:

// Remove meta/link tags
html = html.replace(/<(META|LINK)[^>]*>\s*/gi, '' ) ;

// Remove style tags
html = html.replace( /<STYLE[^>]*>([\s\S]*?)<\/STYLE[^>]*>/gi, '' ) ;
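As a rough check of these two replacements, the sketch below runs them over a small hand-written fragment. The `pasted` string is an assumption made for illustration and only loosely resembles what Word puts on the clipboard.

```javascript
// Hypothetical fragment loosely resembling Word clipboard HTML.
const pasted = [
  '<meta name="Generator" content="Microsoft Word 11">',
  '<link rel="File-List" href="filelist.xml">',
  '<style>',
  'p.MsoNormal { margin: 0 }',
  '</style>',
  '<p class="MsoNormal">Hello</p>'
].join('\n');

let cleaned = pasted;

// Remove meta/link tags together with any whitespace that follows them.
cleaned = cleaned.replace(/<(META|LINK)[^>]*>\s*/gi, '');

// Remove style elements, including their multi-line content.
cleaned = cleaned.replace(/<STYLE[^>]*>([\s\S]*?)<\/STYLE[^>]*>/gi, '');

console.log(JSON.stringify(cleaned));
```

Note that the style pattern does not consume whitespace after the closing tag, so the newline that followed `</style>` in this sample is left in place.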

Attachments (2)

2272.patch (3.5 KB) - added by Martin Kou 16 years ago.
2272_2.patch (3.7 KB) - added by Martin Kou 16 years ago.

Change History (11)

comment:1 Changed 16 years ago by Wojciech Olchawa

Keywords: Confirmed Firefox3 added
Version: FCKeditor 2.6.2

Confirmed using FCKeditor 2.6.2 and the latest SVN version in Firefox3

comment:2 Changed 16 years ago by Frederico Caldeira Knabben

Milestone: FCKeditor 2.6.3

#2291 has been marked as a duplicate of this ticket. The following regex was also proposed there:

html = html.replace( /<w:[^>]*>[\s\S]*?<\/w:[^>]*>/gi, '' ) ;

Also, many of the proposed patterns define capturing groups, like ([\s\S]*?), which have a performance cost. These can be avoided in many cases; where grouping is needed but the match does not have to be captured, the (?:) syntax should be used.
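To illustrate the three cases, here is a small sketch. The input strings are invented for the example, and it shows only which groups are needed, not how much time is saved.

```javascript
// Hypothetical input mixing a Word XML element with a bare SPAN.
const wordXml = 'a<w:data>x\ny</w:data>b<SPAN>keep</SPAN>';

// The replacement is '', so no group around [\s\S]*? is needed at all.
const noGroup = wordXml.replace(/<w:[^>]*>[\s\S]*?<\/w:[^>]*>/gi, '');

// Grouping is needed for the alternation only: (?:...) groups without capturing.
const metaLink = '<META a=1><link b=2>text'.replace(/<(?:META|LINK)[^>]*>\s*/gi, '');

// Here the capture is required, because the replacement refers to $1.
const unwrapped = noGroup.replace(/<SPAN\s*>([\s\S]*?)<\/SPAN>/gi, '$1');

console.log(noGroup);   // ab<SPAN>keep</SPAN>
console.log(metaLink);  // text
console.log(unwrapped); // abkeep
```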

comment:3 Changed 16 years ago by Martin Kou

Owner: set to Martin Kou
Status: new → assigned

Changed 16 years ago by Martin Kou

Attachment: 2272.patch added

comment:4 Changed 16 years ago by Martin Kou

Keywords: Review? added

comment:5 Changed 16 years ago by Frederico Caldeira Knabben

Keywords: Review- added; Review? removed

The regex at comment:2 should also be considered, shouldn't it?

Changed 16 years ago by Martin Kou

Attachment: 2272_2.patch added

comment:6 Changed 16 years ago by Martin Kou

Keywords: Review? added; Review- removed

comment:7 Changed 16 years ago by Frederico Caldeira Knabben

Keywords: Review+ added; Review? removed

comment:8 Changed 16 years ago by Martin Kou

Resolution: fixed
Status: assigned → closed

Fixed with [2174].

comment:9 Changed 10 years ago by Frederico Caldeira Knabben

Component: UI : Dialogs → Plugin : Paste from Word