Opened 8 years ago

Closed 8 years ago

Last modified 3 years ago

#2272 closed Bug (fixed)

FF3: Paste from word leaves lots of garbage tags

Reported by: joshclark
Owned by: martinkou
Priority: Normal
Milestone: FCKeditor 2.6.3
Component: Plugin : Paste from Word
Version: FCKeditor 2.6.2
Keywords: Confirmed Firefox3 Review+
Cc:

Description

In Firefox 3 RC3, the "paste from word" feature leaves lots of garbage tags behind. Specifically:

  • It does not remove comments.
  • It does not remove <style> elements.
  • It does not remove <meta> elements.
  • It does not remove <link> elements.

The comments issue can be fixed by changing this line in fck_paste.html's CleanWord function:

html = html.replace(/<\!--.*?-->/g, '' ) ;

...to this:

html = html.replace(/<\!--[\s\S]*?-->/g, '' ) ;

(Because . does not match line breaks, multi-line comments are not removed; [\s\S] matches any character, including line breaks, so it does the trick instead.)
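To illustrate the difference, here is a minimal sketch that runs both patterns against a made-up multi-line comment. The sample string and variable names are assumptions for the demo, not taken from fck_paste.html.

```javascript
// Hypothetical sample: a multi-line conditional comment similar to what Word emits.
var sample = '<p>Hello</p><!--[if gte mso 9]>\n<xml>junk</xml>\n<![endif]--><p>World</p>' ;

// Original pattern: "." stops at line breaks, so the comment is never matched.
var withDot = sample.replace( /<\!--.*?-->/g, '' ) ;

// Proposed pattern: [\s\S] matches any character, including line breaks.
var withClass = sample.replace( /<\!--[\s\S]*?-->/g, '' ) ;

console.log( withDot === sample ) ;   // true: nothing was removed
console.log( withClass ) ;            // <p>Hello</p><p>World</p>
```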

To be safe, I recommend making similar changes to all of the fck_paste.html instances where .* is used. Specifically, these lines:

html = html.replace(/<o:p>.*?<\/o:p>/g, '&nbsp;') ;
html = html.replace( /<SPAN\s*>(.*?)<\/SPAN>/gi, '$1' ) ;

html = html.replace( /<FONT\s*>(.*?)<\/FONT>/gi, '$1' ) ;
html = html.replace( /<(\w+)[^>]*\sstyle="[^"]*DISPLAY\s?:\s?none(.*?)<\/\1>/ig, '' ) ;
html = html.replace( /<(H\d)><FONT[^>]*>(.*?)<\/FONT><\/\1>/gi, '<$1>$2<\/$1>' );
html = html.replace( /<(H\d)><EM>(.*?)<\/EM><\/\1>/gi, '<$1>$2<\/$1>' );
var re = new RegExp( '(<P)([^>]*>.*?)(<\/P>)', 'gi' ) ;	// Different because of a IE 5.0 error

...should be changed respectively to:

html = html.replace(/<o:p>[\s\S]*?<\/o:p>/g, '&nbsp;') ;
html = html.replace( /<SPAN\s*>([\s\S]*?)<\/SPAN>/gi, '$1' ) ;

html = html.replace( /<FONT\s*>([\s\S]*?)<\/FONT>/gi, '$1' ) ;
html = html.replace( /<(\w+)[^>]*\sstyle="[^"]*DISPLAY\s?:\s?none([\s\S]*?)<\/\1>/ig, '' ) ;
html = html.replace( /<(H\d)><FONT[^>]*>([\s\S]*?)<\/FONT><\/\1>/gi, '<$1>$2<\/$1>' );
html = html.replace( /<(H\d)><EM>([\s\S]*?)<\/EM><\/\1>/gi, '<$1>$2<\/$1>' );
var re = new RegExp( '(<P)([^>]*>[\\s\\S]*?)(<\/P>)', 'gi' ) ;	// Different because of a IE 5.0 error
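One caveat applies whenever a pattern is passed to new RegExp as a string literal, as in the last line: the string parser consumes single backslashes before the regex engine sees them, so \s and \S have to be written as \\s and \\S. A small sketch of the effect follows; the sample paragraph and variable names are assumptions for the demo.

```javascript
// In a string literal "\s" is not a recognized escape and collapses to "s",
// so this character class reaches RegExp as [sS].
var single = new RegExp( '(<P)([^>]*>[\s\S]*?)(<\/P>)', 'gi' ) ;

// Doubling the backslashes keeps \s and \S intact.
var doubled = new RegExp( '(<P)([^>]*>[\\s\\S]*?)(<\/P>)', 'gi' ) ;

// Hypothetical paragraph whose content spans two lines.
var sample = '<P class=MsoNormal>line one\nline two</P>' ;

var singleMatches = single.test( sample ) ;
var doubledMatches = doubled.test( sample ) ;

console.log( singleMatches ) ;   // false
console.log( doubledMatches ) ;  // true
```

Regex literals ( /.../ ) are not affected, which is why the other replacements work as written.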

Also, to get rid of the <meta>, <link> and <style> elements, I suggest adding these additional replacements:

// Remove meta/link tags
html = html.replace(/<(META|LINK)[^>]*>\s*/gi, '' ) ;

// Remove style tags
html = html.replace( /<STYLE[^>]*>([\s\S]*?)<\/STYLE[^>]*>/gi, '' ) ;
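As a quick check, the two replacements can be run against a fragment that resembles Word's clipboard HTML. This is only a sketch: the fragment is invented for the demo and real Word output is considerably messier.

```javascript
// Hypothetical clipboard fragment resembling Word's HTML output.
var html =
	'<meta name=Generator content="Microsoft Word 12">\n' +
	'<link rel=File-List href="filelist.xml">\n' +
	'<style>\n<!--\n p.MsoNormal { margin: 0cm; }\n-->\n</style>\n' +
	'<p class=MsoNormal>Text</p>' ;

// Remove meta/link tags
html = html.replace( /<(META|LINK)[^>]*>\s*/gi, '' ) ;

// Remove style tags
html = html.replace( /<STYLE[^>]*>([\s\S]*?)<\/STYLE[^>]*>/gi, '' ) ;

console.log( JSON.stringify( html ) ) ;   // "\n<p class=MsoNormal>Text</p>"
```

Only the line break that followed the closing style tag is left over, since the style pattern does not consume trailing whitespace.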

Attachments (2)

2272.patch (3.5 KB) - added by martinkou 8 years ago.
2272_2.patch (3.7 KB) - added by martinkou 8 years ago.

Change History (11)

comment:1 Changed 8 years ago by w.olchawa

  • Keywords Confirmed Firefox3 added
  • Version set to FCKeditor 2.6.2

Confirmed with FCKeditor 2.6.2 and the latest SVN version in Firefox 3.

comment:2 Changed 8 years ago by fredck

  • Milestone set to FCKeditor 2.6.3

#2291 has been marked as a duplicate. The following regex was also proposed there:

html = html.replace( /<w:[^>]*>[\s\S]*?<\/w:[^>]*>/gi, '' ) ;

Also, many of the proposed regexes define capturing groups, such as ([\s\S]*?), which have a performance cost. These can be avoided in many cases; where grouping is needed but the match does not have to be captured, the non-capturing (?:) syntax should be used.
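A rough sketch of what that could look like for the patterns discussed in this ticket, assuming the replacement string never refers to the group. The variable names and sample input are made up for the demo, and this is not necessarily what the attached patches do.

```javascript
// No group at all: the replacement is '', so the style body need not be captured.
var styleTags = /<STYLE[^>]*>[\s\S]*?<\/STYLE[^>]*>/gi ;

// Grouping is needed for the alternation, but (?:...) avoids capturing it.
var metaLinkTags = /<(?:META|LINK)[^>]*>\s*/gi ;

// The <w:...> pattern quoted above already has no groups.
var wordTags = /<w:[^>]*>[\s\S]*?<\/w:[^>]*>/gi ;

// Hypothetical input with one element of each kind (no nested <w:...> elements).
var html = '<meta charset="x"><style>\np { color: red; }\n</style>' +
	'<p>Keep<w:View>\nNormal\n</w:View> this</p>' ;

html = html.replace( metaLinkTags, '' ) ;
html = html.replace( styleTags, '' ) ;
html = html.replace( wordTags, '' ) ;

console.log( html ) ;   // <p>Keep this</p>
```

Patterns whose replacement uses $1 or $2, such as the SPAN and FONT ones, still need their capturing groups.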

comment:3 Changed 8 years ago by martinkou

  • Owner set to martinkou
  • Status changed from new to assigned

Changed 8 years ago by martinkou

comment:4 Changed 8 years ago by martinkou

  • Keywords Review? added

comment:5 Changed 8 years ago by fredck

  • Keywords Review- added; Review? removed

The regex at comment:2 should also be considered, shouldn't it?

Changed 8 years ago by martinkou

comment:6 Changed 8 years ago by martinkou

  • Keywords Review? added; Review- removed

comment:7 Changed 8 years ago by fredck

  • Keywords Review+ added; Review? removed

comment:8 Changed 8 years ago by martinkou

  • Resolution set to fixed
  • Status changed from assigned to closed

Fixed with [2174].

comment:9 Changed 3 years ago by fredck

  • Component changed from UI : Dialogs to Plugin : Paste from Word