Changes between Initial Version and Version 1 of MSWordFilter


Timestamp:
Nov 27, 2009, 3:15:21 PM (14 years ago)
Author:
Garry Yao
Comment:

Initialize paste from office reference

= Pasting from Microsoft Office Applications =
CKEditor automatically cleans up and transforms content when it detects that the pasted content comes from the MS Office application family, including Word, Excel, and Outlook. The result is more semantically correct markup that preserves as much of the original formatting as possible.
The following plugins are required, depending on how you paste:
 1. The '''pastefromword''' plugin must be present to deliver this functionality. With this plugin alone, you need to click the '''Paste From Word''' toolbar button to tell the editor that the clipboard content comes from an MS Office application.
 1. To auto-detect whether the content comes from an Office application, the '''clipboard''' plugin must also be configured. In this case, you just need to press 'Ctrl-V' or click the '''Paste''' toolbar button.
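
The auto-detection described in the list above can be pictured as a check for markers that Office-generated HTML usually carries. The following is a minimal Python sketch for illustration only, not the editor's actual implementation (CKEditor itself is written in JavaScript). The function name `looks_like_office_html` and the exact marker list are assumptions.

```python
import re

# Assumed markers: "Mso..." class names, "mso-" style properties,
# the <w:WordDocument> element, and the Office XML namespace declaration.
WORD_MARKERS = re.compile(
    r'class="?Mso'
    r'|style="[^"]*\bmso-'
    r'|<w:WordDocument'
    r'|xmlns:o="urn:schemas-microsoft-com:office',
    re.IGNORECASE,
)


def looks_like_office_html(html: str) -> bool:
    """Return True if the pasted markup carries typical MS Office markers."""
    return WORD_MARKERS.search(html) is not None
```

A heuristic like this can give false negatives when the browser has already stripped the Office-specific markup before the editor sees it, which may be one reason an explicit '''Paste From Word''' button is still useful.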

=== Browser Native Filtering ===
Each web browser has its own paste handling, so the result of pasting content from Word varies between browsers. In general, some styles, and even some text, are lost during pasting. The following table lists the known losses:[[BR]]

||Browsers||IE||Firefox||Safari/Chrome||Opera||
||Document Structure Styles||No||Yes||No||No||

=== CKEditor Filtering ===
The editor performs further processing on top of the browser's filtering result. The following table summarizes the rules that affect the end result:[[BR]]

{{{
#!html
<table class="collapse">

<tr class="trbgeven">
<th><strong>Format Removed</strong></th>
<th><strong>Result Affected</strong></th>
</tr>
<tr class="trbgodd">
<td>Content of downlevel conditional comments between

<pre><code>&lt;!--[</code></pre>

and

<pre><code>]--&gt;</code></pre>

<p><b class="bterm">Example</b></p>

<pre><code>&lt;!--[if gte mso 9]&gt;...&lt;![endif]--&gt;</code></pre>
</td>
<td>WordArt cannot be edited; only the resulting static image is left. These comments make some HTML markup invisible to browsers earlier than Microsoft Internet Explorer 5.

<p>For example, Office inserts XML blocks containing WordArt document properties inside these comments so that the contents of these XML elements do not show up as text in browsers earlier than Internet Explorer 5.</p>
</td>
</tr>
<tr class="trbgeven">
<td>Uplevel conditional comments between

<pre><code>
&lt;![
</code></pre>

and

<pre><code>
]&gt;
</code></pre>

<p><b class="bterm">Example</b></p>

<pre><code>
&lt;![if !vml]&gt;
</code></pre>
</td>
<td>These comments make some HTML markup visible in browsers earlier than Internet Explorer 5 but invisible in Internet Explorer 5 or later. When the comments are removed, the markup indicating that static images should not be loaded in Internet Explorer 5 or later is lost.

<p>For example, WordArt is saved as HTML in two parts. One part is an XML block that describes the image. The other part is an actual image that makes the picture visible in older browsers that don't interpret XML. The static image is put inside uplevel comments to hide it from Internet Explorer 5 or later.</p>
</td>
</tr>
<tr class="trbgodd">
<td>XML tags in the "o", "v", "w", "x", and "p" namespaces

<p><b class="bterm">Example</b></p>

<pre><code>
&lt;o:p&gt;&lt;/o:p&gt;
</code></pre>
</td>
<td>Paragraph mark formatting (if different from the paragraph) is lost. The <pre><code>&lt;o:p&gt;&lt;/o:p&gt;</code></pre> tags represent the character that Word treats as the paragraph mark.</td>
</tr>
<tr class="trbgeven">
<td>@-rule definitions

<p><b class="bterm">Example</b><br></p>

<pre><code>
@page Section1
               {size: 8.5in 11in }
</code></pre>
</td>
<td>Page settings, such as page dimensions and orientation, are lost:

<ul>
<li> @page contains document page setup information</li>

<li> @font-face contains document font definitions</li>

<li> @list contains Office-specific bulleted and numbered list style definitions</li>
</ul>

<p>To keep standard @-rule definitions, @page and @font-face, use the -a switch at the command prompt.</p>
</td>
</tr>
<tr class="trbgodd">
<td>CSS comments delimited by /* and */

<p><b class="bterm">Example</b><br></p>

<pre><code>
/* List Definitions */
</code></pre>
</td>
<td>Minimal impact on HTML document.</td>
</tr>
<tr class="trbgeven">
<td>VML attributes, or any attribute with a colon ( : ) in the attribute name

<p><b class="bterm">Example</b><br></p>

<pre><code>
v:shapes="_x000_i1025"
</code></pre>
</td>
<td>WordArt, clip art, and AutoShapes cannot be edited; only the resulting static image is left.</td>
</tr>
<tr class="trbgodd">
<td>ProgID

<pre><code>
&lt;meta&gt;
</code></pre>

tags

<p><b class="bterm">Example</b><br></p>

<pre><code>
&lt;meta name=ProgID content=Word.Document&gt;
</code></pre>
</td>
<td>Minimal impact on HTML document. ProgID identifies the application the file was created in.

<p>You can also remove <b class="bterm">GENERATOR</b> and <b class="bterm">ORIGINATOR</b> <b class="bterm">META</b> tags, which contain the information about the HTML document's originating program (for example, Word or Excel) and the latest generating program (Office HTML Filter). To remove the <b class="bterm">GENERATOR</b> and <b class="bterm">ORIGINATOR</b> <b class="bterm">META</b> tags, use the -m switch at the command prompt.</p>
</td>
</tr>
<tr class="trbgeven">
<td><b class="bterm">Link</b> elements with the <b class="bterm">rel</b> attribute set to any of the following:

<ul>
<li>"file-list"</li>

<li>"edit-time-data"</li>

<li>"ole-object-data"</li>

<li>"original-file"</li>

<li>"preview"</li>
</ul>

<p><b class="bterm">Example</b></p>

<pre><code>&lt;link rel=File-List href="./mydoc_files/filelist.xml"&gt;</code></pre>
</td>
<td>The association with all the special extra files that contain Office-specific data, such as OLE object binaries, is lost.</td>
</tr>
<tr class="trbgodd">
<td>The following XML namespace declarations, that is, the xmlns attribute setting:

<ul>
<li>"o"</li>

<li>"w"</li>

<li>"x"</li>

<li>"p"</li>

<li>"v"</li>
</ul>

<p><b class="bterm">Example</b></p>

<pre><code>
xmlns:v="urn:schemas-microsoft-com:vml"
</code></pre>
</td>
<td>The ability to render WordArt and clip art as vector images in the browser is lost. Instead, they become static images.

<p>To keep VML in the file, use the -v switch at the command prompt.</p>

<p>If either -o or -v is used at the command prompt, the XML namespace declarations remain in the file.</p>
</td>
</tr>
<tr class="trbgeven">
<td>Empty <b class="bterm">style</b> attributes, especially when they become empty as a result of processing their values

<p><b class="bterm">Example</b></p>

<pre><code>
style=""
</code></pre>
</td>
<td>Minimal impact on HTML document.</td>
</tr>
<tr class="trbgodd">
<td>"mso-" prefix properties

<p><b class="bterm">Example</b></p>

<pre><code>
mso-margin-top-alt: 12pt;
</code></pre>
</td>
<td>Office-specific formatting that stores Office document settings, used when the HTML document is opened in Office, is lost. Some features, such as footnotes and customized bullets and numbering, are lost. Word legacy frames become tables, and some edit-time language and font-formatting information is lost.

<p>To keep mso- prefix properties and other Office-specific properties, use the -o switch at the command prompt.</p>
</td>
</tr>
<tr class="trbgeven">
<td>Other non-standard properties such as:

<ul>
<li>"tab-stops"</li>

<li>"tab-interval"</li>

<li>"language"</li>

<li>"text-underline"</li>

<li>"text-effect"</li>

<li>"text-line-through"</li>

<li>"font-color"</li>

<li>"horiz-align"</li>

<li>"list-image-1"</li>

<li>"list-image-2"</li>

<li>"list-image-3"</li>

<li>"separator-image"</li>

<li>"table-border-color-dark"</li>

<li>"table-border-color-light"</li>

<li>"vert-align"</li>

<li>"vnd.ms-excel.numberformat"</li>
</ul>

<p><b class="bterm">Example</b></p>

<pre><code>
tab-interval: .5in;
</code></pre>
</td>
<td>Tab settings are lost. All text underline styles become single underline. All underline colors become black. Engraved text and embossed text are lost.</td>
</tr>
<tr class="trbgodd">
<td>Empty inline HTML elements: <b class="bterm">FONT</b>, <b class="bterm">EM</b>, <b class="bterm">STRONG</b>, <b class="bterm">SAMP</b>, <b class="bterm">ACRONYM</b>, <b class="bterm">CITE</b>, <b class="bterm">CODE</b>, <b class="bterm">DFN</b>, <b class="bterm">KBD</b>, <b class="bterm">TT</b>, <b class="bterm">B</b>, <b class="bterm">I</b>, <b class="bterm">U</b>, <b class="bterm">S</b>, <b class="bterm">SUB</b>, <b class="bterm">SUP</b>, <b class="bterm">INS</b>, <b class="bterm">DEL</b>, <b class="bterm">VAR</b>, <b class="bterm">SPAN</b>. An element is considered empty if it contains no displayable contents.

<p><b class="bterm">Example</b></p>

<pre><code>
&lt;FONT COLOR=blue&gt;&lt;B&gt;&lt;/B&gt;&lt;/FONT&gt;
</code></pre>
</td>
<td>No impact on the display of the HTML document.</td>
</tr>

</table>
}}}
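
Several of the rules in the table above can be sketched as plain text substitutions. The following Python snippet is a rough illustration under stated assumptions, not the editor's actual filter (which is written in JavaScript and works on parsed markup). The function name `strip_office_markup` and every regular expression here are hypothetical, and regex-based cleanup like this will misfire on unusual markup.

```python
import re


def strip_office_markup(html: str) -> str:
    """Apply a few of the table's removal rules to an HTML string (sketch)."""
    # Downlevel conditional comments: drop the comment and its content.
    html = re.sub(r'<!--\[if[^\]]*\]>.*?<!\[endif\]-->', '', html, flags=re.DOTALL)
    # Uplevel conditional comments: drop only the markers, keep the content.
    html = re.sub(r'<!\[(?:if[^\]]*|endif)\]>', '', html)
    # XML tags in the o, v, w, x, and p namespaces, e.g. <o:p></o:p>.
    html = re.sub(r'</?(?:o|v|w|x|p):\w+[^>]*>', '', html)
    # Attributes with a colon in the name, e.g. v:shapes or xmlns:v.
    html = re.sub(
        r'''\s+[\w-]+:[\w-]+\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)''', '', html
    )

    def clean_style(match: re.Match) -> str:
        # Drop "mso-" declarations; drop the attribute if nothing remains.
        decls = [d.strip() for d in match.group(1).split(';') if d.strip()]
        kept = [d for d in decls if not d.lower().startswith('mso-')]
        return ' style="%s"' % '; '.join(kept) if kept else ''

    return re.sub(r'\s+style="([^"]*)"', clean_style, html)
```

For instance, `strip_office_markup('<span style="mso-margin-top-alt: 12pt">x</span>')` leaves a bare `<span>`, because the style attribute becomes empty after its only declaration is removed and is then dropped as well.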