#2272 closed Bug (fixed)
FF3: Paste from word leaves lots of garbage tags
Reported by: | Josh Clark | Owned by: | Martin Kou |
---|---|---|---|
Priority: | Normal | Milestone: | FCKeditor 2.6.3 |
Component: | Plugin : Paste from Word | Version: | FCKeditor 2.6.2 |
Keywords: | Confirmed Firefox3 Review+ | Cc: |
Description
In Firefox 3 RC3, the "paste from word" feature leaves lots of garbage tags behind. Specifically:
- It does not remove comments.
- It does not remove <style> elements.
- It does not remove <meta> elements.
- It does not remove <link> elements.
The comments issue can be fixed by changing this line in fck_paste.html's CleanWord function:
html = html.replace(/<\!--.*?-->/g, '' ) ;
...to this:
html = html.replace(/<\!--[\s\S]*?-->/g, '' ) ;
(Because . does not match line breaks, multi-line comments are not removed; [\s\S] matches any character, line breaks included, so it does the trick instead.)
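The difference can be reproduced outside the editor. The following is a minimal sketch; the sample string is a made-up stand-in for pasted Word markup, not taken from an actual paste.

```javascript
// Hypothetical sample input: a Word-style comment spanning several lines.
var sample = 'before<!--\n  p.MsoNormal { margin: 0 }\n-->after';

// Original pattern: "." stops at line breaks, so the comment never matches.
var withDot = sample.replace(/<\!--.*?-->/g, '');

// Proposed pattern: [\s\S] matches any character, including line breaks.
var withClass = sample.replace(/<\!--[\s\S]*?-->/g, '');

console.log(withDot === sample); // true: the comment is left in place
console.log(withClass);          // "beforeafter"
```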
To be safe, I recommend making similar changes to all of the fck_paste.html instances where .* is used. Specifically, these lines:
html = html.replace(/<o:p>.*?<\/o:p>/g, ' ') ;
html = html.replace( /<SPAN\s*>(.*?)<\/SPAN>/gi, '$1' ) ;
html = html.replace( /<FONT\s*>(.*?)<\/FONT>/gi, '$1' ) ;
html = html.replace( /<(\w+)[^>]*\sstyle="[^"]*DISPLAY\s?:\s?none(.*?)<\/\1>/ig, '' ) ;
html = html.replace( /<(H\d)><FONT[^>]*>(.*?)<\/FONT><\/\1>/gi, '<$1>$2<\/$1>' );
html = html.replace( /<(H\d)><EM>(.*?)<\/EM><\/\1>/gi, '<$1>$2<\/$1>' );
var re = new RegExp( '(<P)([^>]*>.*?)(<\/P>)', 'gi' ) ; // Different because of a IE 5.0 error
...should be changed respectively to:
html = html.replace(/<o:p>[\s\S]*?<\/o:p>/g, ' ') ;
html = html.replace( /<SPAN\s*>([\s\S]*?)<\/SPAN>/gi, '$1' ) ;
html = html.replace( /<FONT\s*>([\s\S]*?)<\/FONT>/gi, '$1' ) ;
html = html.replace( /<(\w+)[^>]*\sstyle="[^"]*DISPLAY\s?:\s?none([\s\S]*?)<\/\1>/ig, '' ) ;
html = html.replace( /<(H\d)><FONT[^>]*>([\s\S]*?)<\/FONT><\/\1>/gi, '<$1>$2<\/$1>' );
html = html.replace( /<(H\d)><EM>([\s\S]*?)<\/EM><\/\1>/gi, '<$1>$2<\/$1>' );
var re = new RegExp( '(<P)([^>]*>[\\s\\S]*?)(<\/P>)', 'gi' ) ; // Different because of a IE 5.0 error; backslashes are doubled because the pattern is a string literal
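One detail to check in the RegExp constructor form: the pattern is a string literal, so the string parser consumes single backslashes first and '\s' reaches the regex engine as a plain s. The sketch below compares the two spellings on a made-up multi-line paragraph; the variable names and sample are illustrative only.

```javascript
// Hypothetical sample: a paragraph whose content spans two lines.
var multi = '<P class=x>line one\nline two</P>';

// Single backslashes: the string literal turns [\s\S] into [sS].
var fromSingle = new RegExp('(<P)([^>]*>[\s\S]*?)(<\/P>)', 'gi');

// Doubled backslashes: the regex engine receives [\s\S] as intended.
var fromDouble = new RegExp('(<P)([^>]*>[\\s\\S]*?)(<\\/P>)', 'gi');

var singleMatches = fromSingle.test(multi);
var doubleMatches = fromDouble.test(multi);

console.log(singleMatches); // false
console.log(doubleMatches); // true
```

A regex literal such as /<o:p>[\s\S]*?<\/o:p>/ is not affected, because its backslashes are never interpreted as string escapes.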
Also, to get rid of the <meta>, <link> and <style> elements, I suggest adding these additional replacements:
// Remove meta/link tags
html = html.replace(/<(META|LINK)[^>]*>\s*/gi, '' ) ;
// Remove style tags
html = html.replace( /<STYLE[^>]*>([\s\S]*?)<\/STYLE[^>]*>/gi, '' ) ;
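To see what these two replacements do, here is a small sketch. The pasted fragment is invented for illustration and only resembles what Firefox 3 hands over from Word; stripHeadElements is a hypothetical wrapper, not a function in fck_paste.html.

```javascript
// Hypothetical fragment resembling Word output pasted in Firefox 3.
var pasted =
  '<meta name="Generator" content="Microsoft Word 12">\n' +
  '<link rel="File-List" href="file:///clip_filelist.xml">\n' +
  '<style>\n<!--\n p.MsoNormal { margin: 0cm; }\n-->\n</style>\n' +
  '<p class="MsoNormal">Hello</p>';

// Hypothetical helper wrapping the two suggested replacements.
function stripHeadElements(html) {
  // Remove meta/link tags together with any whitespace that follows them.
  html = html.replace(/<(META|LINK)[^>]*>\s*/gi, '');
  // Remove style elements, including multi-line content.
  html = html.replace(/<STYLE[^>]*>([\s\S]*?)<\/STYLE[^>]*>/gi, '');
  return html;
}

var cleaned = stripHeadElements(pasted);
console.log(JSON.stringify(cleaned)); // "\n<p class=\"MsoNormal\">Hello</p>"
```

The line break that followed the closing style tag remains, since only the meta/link pattern consumes trailing whitespace.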
Attachments (2)
Change History (11)
comment:1 Changed 16 years ago by
Keywords: | Confirmed Firefox3 added |
---|---|
Version: | → FCKeditor 2.6.2 |
comment:2 Changed 16 years ago by
Milestone: | → FCKeditor 2.6.3 |
---|
#2291 has been marked as a duplicate of this ticket. The following regex was also proposed there:
html = html.replace( /<w:[^>]*>[\s\S]*?<\/w:[^>]*>/gi, '' ) ;
Also, many of the proposed replacements define capturing groups, like ([\s\S]*?), which has an impact on performance. The groups can be avoided in many cases, and where grouping is needed but the match does not have to be captured, the (?:) syntax should be used.
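As a rough illustration of that suggestion, the sketch below uses made-up sample strings and checks only that the results are the same; it makes no measurement of the performance difference.

```javascript
// Hypothetical sample: an <o:p> element whose content spans lines.
var input = '<o:p>\n&nbsp;\n</o:p><p>kept</p>';

// The captured text is never used, so the parentheses can simply be dropped.
var withCapture = input.replace(/<o:p>([\s\S]*?)<\/o:p>/g, '');
var withoutGroup = input.replace(/<o:p>[\s\S]*?<\/o:p>/g, '');

// Grouping is needed for the alternation, but not the captured text:
// (?:...) groups without capturing.
var headTags = '<meta charset="x"><link rel="y"><p>kept</p>';
var nonCapturing = headTags.replace(/<(?:META|LINK)[^>]*>\s*/gi, '');

console.log(withCapture === withoutGroup); // true
console.log(nonCapturing);                 // "<p>kept</p>"
```

Replacements that use $1 or a \1 backreference, such as the SPAN, FONT and heading patterns in the description, still need their capturing groups.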
comment:3 Changed 16 years ago by
Owner: | set to Martin Kou |
---|---|
Status: | new → assigned |
Changed 16 years ago by
Attachment: | 2272.patch added |
---|
comment:4 Changed 16 years ago by
Keywords: | Review? added |
---|
comment:5 Changed 16 years ago by
Keywords: | Review- added; Review? removed |
---|
The regex at comment:2 should also be considered, shouldn't it?
Changed 16 years ago by
Attachment: | 2272_2.patch added |
---|
comment:6 Changed 16 years ago by
Keywords: | Review? added; Review- removed |
---|
comment:7 Changed 16 years ago by
Keywords: | Review+ added; Review? removed |
---|
comment:8 Changed 16 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Fixed with [2174].
comment:9 Changed 11 years ago by
Component: | UI : Dialogs → Plugin : Paste from Word |
---|
Confirmed using FCKeditor 2.6.2 and the latest SVN version in Firefox3