Changes between Initial Version and Version 1 of MSWordFilter

Nov 27, 2009, 3:15:21 PM (11 years ago)
Garry Yao

Initialize paste from office reference


  • MSWordFilter

    v1 v1  
     1= Pasting from Microsoft Office Applications =
     2CKEditor automatically cleans up, filters, and transforms content when it detects that pasted content comes from a Microsoft Office application, including Word, Excel, and Outlook. The result is more semantically correct markup that preserves as much of the original formatting as possible.
     3The following plugins are required, depending on how pasting is triggered:
     4 1. The '''pastefromword''' plugin must be present to provide this functionality. When using it, click the '''Paste From Word''' toolbar button to tell the editor that the clipboard content comes from an MS Office application.
     5 1. To auto-detect whether content comes from an Office application, the '''clipboard''' plugin must be configured. In this case, just press 'Ctrl-V' or click the '''Paste''' toolbar button.
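The auto-detection mentioned above can be illustrated with a minimal sketch. It assumes that Office content can be recognized by the Office-specific constructs described on this page (the ProgID meta tag, "mso-" style properties, mso conditional comments, and tags in the "o", "v", "w", "x", and "p" namespaces). The marker list and the function name `looks_like_office_html` are hypothetical and are not taken from the editor's actual implementation.

```python
import re

# Hypothetical markers, picked from the Office-specific constructs this page
# describes. This is an assumption, not the editor's real detection rule.
OFFICE_MARKERS = [
    re.compile(r"<meta\s+name=[\"']?ProgID", re.IGNORECASE),   # ProgID meta tag
    re.compile(r"style=[\"'][^\"']*\bmso-", re.IGNORECASE),    # "mso-" CSS property
    re.compile(r"<!--\[if\s[^\]]*\bmso\b", re.IGNORECASE),     # mso conditional comment
    re.compile(r"</?[ovwxp]:\w+", re.IGNORECASE),              # Office XML namespace tag
]


def looks_like_office_html(html: str) -> bool:
    """Return True if the HTML contains at least one Office-specific marker."""
    return any(pattern.search(html) for pattern in OFFICE_MARKERS)
```

A heuristic like this can give false positives on hand-written markup that happens to use the same constructs, which is one reason an explicit '''Paste From Word''' button is still useful.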
     9=== Browser Native Filtering ===
     10Each web browser has its own paste handling, so the result of pasting content from Word varies between browsers. In general, certain styles, and even text, are lost during pasting. The following table lists what is known to be filtered out:[[BR]]
     13||Document Structure Styles||NO||YES||NO||NO||
     15=== CKEditor Filtering ===
     16The editor then processes the browser's filtered result further. The following table summarizes the rules that affect the final output:[[BR]]
     21<table class="collapse">
     23<tr class="trbgeven">
     24<th><strong>Format Removed</strong></th>
     25<th><strong>Result Affected</strong></th>
     28<tr class="trbgodd">
     29<td>Downlevel conditional comments content within 
     37<p><b class="bterm">Example</b></p>
     39<pre><code>&lt;!--[if gte mso 9]&gt;...&lt;![endif]--&gt;</code></pre>
     41<td>WordArt cannot be edited; only the resulting static image is left. These comments make some HTML markup invisible to browsers earlier than Microsoft Internet Explorer 5.
     43<p>For example, Office inserts XML blocks containing WordArt document properties inside these comments so that the contents of these XML elements do not show up as text in browsers earlier than Internet Explorer 5.</p>
     47<tr class="trbgeven">
     48<td>Uplevel conditional comments within
     60<p><b class="bterm">Example</b></p>
     63&lt;![if !vml]&gt;
     66<td>These comments make some HTML markup visible in browsers earlier than Internet Explorer 5 but invisible in Internet Explorer 5 or later. When the comments are removed, the markup indicating that static images should not be loaded in Internet Explorer 5 or later is lost.
     68<p>For example, WordArt is saved as HTML in two parts. One part is an XML block that describes the image. The other part is an actual image that makes the picture visible in older browsers that don't interpret XML. The static image is put inside uplevel comments to hide it from Internet Explorer 5 or later.</p>
     72<tr class="trbgodd">
     73<td>XML tags in the "o", "v", "w", "x", and "p" namespaces
     75<p><b class="bterm">Example</b></p>
     81<td>Paragraph mark formatting (if different from the paragraph) is lost. The <pre><code>&lt;o:p&gt;&lt;/o:p&gt;</code></pre> tags represent the character that Word treats as the paragraph mark.</td>
     84<tr class="trbgeven">
     85<td>@-rule definitions
     87<p><b class="bterm">Example</b><br></p>
     90@page Section1
     91               {size: 8.5in 11in }
     94<td>Page settings, such as page dimensions and orientation, are lost:
     97<li> @page contains document page setup information</li>
     99<li> @font-face contains document font definitions</li>
     101<li> @list contains Office-specific bulleted and numbered list styles definitions</li>
     104<p>To keep standard @-rule definitions, @page and @font-face, use the -a switch at the command prompt.</p>
     108<tr class="trbgodd">
     109<td>CSS comments containing  /* and */
     111<p><b class="bterm">Example</b><br></p>
     114/* List Definitions */
     117<td>Minimal impact on HTML document.</td>
     120<tr class="trbgeven">
     121<td>VML attributes, or any attribute with a colon ( : ) in the attribute name
     123<p><b class="bterm">Example</b><br></p>
     129<td>WordArt, clip art, and AutoShapes cannot be edited; only the resulting static image is left.</td>
     132<tr class="trbgodd">
     141<p><b class="bterm">Example</b><br></p>
     144&lt;meta name=ProgID content=Word.Document&gt;
     147<td>Minimal impact on HTML document. ProgID identifies the application the file was created in. 
     149<p>You can also remove <b class="bterm">GENERATOR</b> and <b class="bterm">ORIGINATOR</b> <b class="bterm">META</b> tags, which contain the information about the HTML document's originating program (for example, Word or Excel) and the latest generating program (Office HTML Filter). To remove the <b class="bterm">GENERATOR</b> and <b class="bterm">ORIGINATOR</b> <b class="bterm">META</b> tags, use the -m switch at the command prompt.</p>
     153<tr class="trbgeven">
     154<td><b class="bterm">Link</b> elements with the <b class="bterm">rel</b> attribute set to any of the following:
     168<p><b class="bterm">Example</b></p>
     170<pre><code>&lt;link rel=File-List href="./mydoc_files/filelist.xml"&gt;</code></pre>
     172<td>The association with all the special extra files that contain Office-specific data, such as OLE object binaries, is lost.</td>
     175<tr class="trbgodd">
     176<td>The following XML namespace declarations - that is, the xmlns attribute setting:
     190<p><b class="bterm">Example</b></p>
     196<td>The ability to render WordArt and clip art as vector images in the browser is lost. Instead, they become static images.
     198<p>To keep VML in the file, use the -v switch at the command prompt.</p>
     200<p>If either -o or -v is used at the command prompt, the XML namespace declarations remain in the file.</p>
     204<tr class="trbgeven">
     205<td>Empty <b class="bterm">style</b> attributes, especially when they become empty as a result of processing their values
     207<p><b class="bterm">Example</b></p>
     213<td>Minimal impact on HTML document.</td>
     216<tr class="trbgodd">
     217<td>"mso-" prefix properties
     219<p><b class="bterm">Example</b></p>
     222mso-margin-top-alt: 12pt;
     225<td>Office-specific formatting that stores Office document settings, which are used when the HTML document is opened in Office. Some features, such as footnotes and customized bullets and numbering, are lost. Word legacy frames become tables, and some edit-time language and font-formatting information is lost.
     227<p>To keep mso- prefix properties and other Office-specific properties, use the -o switch at the command prompt.</p>
     231<tr class="trbgeven">
     232<td>Other non-standard properties such as: 
     270<p><b class="bterm">Example</b></p>
     273tab-interval: .5in;
     276<td>Tab settings are lost. All text underline styles become single underline. All underline colors become black. Engraved text and embossed text are lost.</td>
     279<tr class="trbgodd">
     280<td>Empty inline HTML elements: <b class="bterm">FONT</b>, <b class="bterm">EM</b>, <b class="bterm">STRONG</b>, <b class="bterm">SAMP</b>, <b class="bterm">ACRONYM</b>, <b class="bterm">CITE</b>, <b class="bterm">CODE</b>, <b class="bterm">DFN</b>, <b class="bterm">KBD</b>, <b class="bterm">TT</b>, <b class="bterm">B</b>, <b class="bterm">I</b>, <b class="bterm">U</b>, <b class="bterm">S</b>, <b class="bterm">SUB</b>, <b class="bterm">SUP</b>, <b class="bterm">INS</b>, <b class="bterm">DEL</b>, <b class="bterm">VAR</b>, <b class="bterm">SPAN</b>. An element is considered empty if it contains no displayable contents.
     282<p><b class="bterm">Example</b></p>
     285&lt;FONT COLOR=blue&gt;&lt;B&gt;&lt;/B&gt;&lt;/FONT&gt;
     288<td>No impact on the display of the HTML document.</td>
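A few of the rules in the table can be sketched with regular expressions. This is a simplified illustration only: the function name `strip_office_markup` is hypothetical, it covers just five of the rules (downlevel and uplevel conditional comments, Office namespace tags, "mso-" properties with empty style attributes, and empty inline elements), and it assumes quoted style attributes and declarations that contain no semicolons inside values. A real filter would work on a parsed document, not on raw strings.

```python
import re

# Inline elements the table lists as removable when empty.
INLINE = ("font|em|strong|samp|acronym|cite|code|dfn|kbd|tt|"
          "b|i|u|s|sub|sup|ins|del|var|span")
EMPTY_INLINE = re.compile(r"<(" + INLINE + r")\b[^>]*>\s*</\1>", re.IGNORECASE)


def _clean_style(match):
    """Drop "mso-" declarations; drop the attribute if nothing is left."""
    quote, body = match.group(1), match.group(2)
    kept = [d.strip() for d in body.split(";")
            if d.strip() and not d.strip().lower().startswith("mso-")]
    if not kept:
        return ""
    return " style=" + quote + "; ".join(kept) + quote


def strip_office_markup(html: str) -> str:
    # Downlevel conditional comments: removed together with their content.
    html = re.sub(r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    # Uplevel conditional comments: only the markers go, the content stays.
    html = re.sub(r"<!\[(?:if[^\]]*|endif)\]>", "", html, flags=re.IGNORECASE)
    # Tags in the "o", "v", "w", "x", and "p" namespaces.
    html = re.sub(r"</?[ovwxp]:[^>]*>", "", html, flags=re.IGNORECASE)
    # "mso-" properties and style attributes that end up empty.
    html = re.sub(r"\s+style=([\"'])(.*?)\1", _clean_style, html,
                  flags=re.DOTALL | re.IGNORECASE)
    # Empty inline elements; repeat so nested empties collapse too.
    count = 1
    while count:
        html, count = EMPTY_INLINE.subn("", html)
    return html
```

The empty-element pass runs in a loop because removing an inner empty element (such as the `B` in the table's `FONT` example) can leave its parent empty as well.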