Ticket #1355 (closed Bug: fixed)

Opened 7 years ago

Last modified 6 years ago

Converting <p> to <pre> may add additional line breaks

Reported by: fredck Owned by: martinkou
Priority: Normal Milestone: FCKeditor 2.6
Component: General Version: SVN (FCKeditor) - Retired
Keywords: Review+ Cc:

Description

Steps to Reproduce

  1. Load the following HTML:
<p>Line 1<br />
Line 2<br />
Line 3</p>
  1. Change the paragraph Format to "Formatted".

Note that each line has been separated by a double line break. It doesn't happen if we have the following <p> instead:

<p>Line 1<br />Line 2<br />Line 3</p>

Things get even worst if the initial <p> looks like this:

<p>
    Some


    sample
</p>

It is evident that the whitespace of the original <p> is being preserved on the transformation, but the expected result for the above case is:

<pre>Some sample</pre>

This is not a problem with IE, as it automatically eats whitespaces when parsing.

Attachments

1355.patch (2.4 KB) - added by martinkou 6 years ago.
Proposed patch for fixing #1355.
1355_2.patch (9.3 KB) - added by martinkou 6 years ago.
1355_3.patch (9.3 KB) - added by martinkou 6 years ago.
Fixes a minor bug in 1533_2.patch discovered when writing test cases.
1355_4.patch (9.4 KB) - added by martinkou 6 years ago.
Simplified the 1355_3.patch.
1355_5.patch (8.8 KB) - added by martinkou 6 years ago.
Furthur simplified 1355_4.patch from Fred's suggestions.
1355_6.patch (10.9 KB) - added by martinkou 6 years ago.

Change History

comment:1 Changed 6 years ago by martinkou

  • Owner set to martinkou
  • Status changed from new to assigned

Changed 6 years ago by martinkou

Proposed patch for fixing #1355.

comment:2 Changed 6 years ago by martinkou

  • Keywords Review? added

comment:3 Changed 6 years ago by fredck

  • Keywords Review- added; Review? removed

It seems a much simpler solution could be achieved by using a couple of regexes and innerHTML.

We must also check if the other way (<pre> => <p>) is properly handled.

comment:4 follow-up: ↓ 7 Changed 6 years ago by martinkou

The regex method sounds simple in theory, unfortunately it seems to trigger a nasty Firefox bug which causes JavaScript error in the range.SelectBookmark() function.

After we modified the <pre> node by assigning innerHTML instead of by DOM tree operations, the bookmark nodes inside the <pre> block can still be retrieved by document.getElementId() in range.SelectBookmark(), which is correct. However, the bookmark nodes retrieved this way would then have a null parentNode property, which does not make sense. That caused the SelectBookmark() function to throw JavaScript errors.

This is very unfortunate, I'm trying to see if a workaround can be created.

comment:5 follow-up: ↓ 6 Changed 6 years ago by martinkou

The regex method does not work <pre> -> regular block because converting simple spaces to &nbsp; with a regex would destroy the contents inside tags. Also for cases like

<pre>
hello world!
</pre>

Converting the space between "hello" and "world" to &nbsp; is not necessary.

comment:6 in reply to: ↑ 5 Changed 6 years ago by fredck

Replying to martinkou:

The regex method does not work <pre> -> regular block because converting simple spaces to &nbsp; with a regex would destroy the contents inside tags. Also for cases like

<pre>
hello world!
</pre>

Converting the space between "hello" and "world" to &nbsp; is not necessary.

Martin, I still think using innerHTML and Regex is the easier solution. It would also avoid problems with successive space text nodes.

One idea would be using the <string>.split, to split the text using a regex that finds tags, like /(<[^\s].*?>)/, capturing the regex matches, so they are included in the split array. Then, processing each entry in the array, replacing spaces in those that don't look like tags (don't start with <[^\s], basically), and re-joining the array.

To replace the spaces, first replace all tabs with four spaces, then a simple replace regex with / {2,}/ using a replace function would do the &nbsp; magic.

comment:7 in reply to: ↑ 4 Changed 6 years ago by fredck

Replying to martinkou:

After we modified the <pre> node by assigning innerHTML instead of by DOM tree operations, the bookmark nodes inside the <pre> block can still be retrieved by document.getElementId() in range.SelectBookmark(), which is correct. However, the bookmark nodes retrieved this way would then have a null parentNode property, which does not make sense. That caused the SelectBookmark() function to throw JavaScript errors.

Be sure that bookmark.StartNode and bookmark.EndNode are null before calling range.SelectBookmark(). The node could have been cached in the bookmark for performance.

comment:8 Changed 6 years ago by martinkou

I've written a new patch based on the regex approach, with some simple XHTML parsing logic to differentiate between XHTML tags and non-tag XHTML code.

It turns out there are quite a number of gotchas in converting between <pre> and regular blocks. e.g. when converting from <pre> to <p>... the following cases need to be handled correctly:

<pre>some sample</pre>

vs

<pre>some   sample</pre>

The first case can be converted to <p> without any problem. But the spaces in the second case must be converted to &nbsp;.

and

<pre> some sample</pre>

vs

<pre>some sample</pre>

Simply changing the tag to <p> in the first case would cause the leading space to disappear, so even though it's only a single space it still must be converted to &nbsp;.

And then there are gotchas when people put weird stuff inside tags... e.g. trying to convert the following to <pre>

<p>some <b attr="<<               >>" attr2='>     ><     <'> text </b> here</p>

Sure it's not standard but even if the user wrote &gt; and &lt; within those quotes, IE would still convert them back to the literal form for us, which causes trouble for regex substitutions since simple dumb regex does not differentiate between text inside tags and text outside tags.

I think I should write some test cases in the FCKtest SVN tree first, to record down all those gotchas I've discovered so far, before putting my patch on review.

Changed 6 years ago by martinkou

Changed 6 years ago by martinkou

Fixes a minor bug in 1533_2.patch discovered when writing test cases.

comment:9 Changed 6 years ago by martinkou

  • Keywords Review? added; Review- removed

I've uploaded a number of test cases for this ticket to the FCKtest tree in [1452].

An updated patch was attached which solves a minor bug in the previous patch. The bug was discovered while writing the test cases. In particular, the previous patch (1355_2.patch) could not properly handle the following case when converting to <P>:

<pre>
some text ehre
 </pre>

comment:10 Changed 6 years ago by martinkou

This ticket is related to #1229.

Changed 6 years ago by martinkou

Simplified the 1355_3.patch.

Changed 6 years ago by martinkou

Furthur simplified 1355_4.patch from Fred's suggestions.

comment:11 Changed 6 years ago by fredck

  • Keywords Review- added; Review? removed

Ok... the code seemed ok to me now, but then I had some failures on some TCs.

For example. Load the following code in the editor.

<pre>
    Line 1
    Line 2
    Line 3
</pre>

Note that each line has 4 spaces in the beginning.

Now, convert the above block to <p>. You will immediately note that Line 1 is not aligned with the others. Switching to Source we have:

<p>&nbsp;&nbsp;&nbsp;&nbsp;Line 1<br>
&nbsp;&nbsp;&nbsp;Line 2<br>
&nbsp;&nbsp;&nbsp;Line 3</p>

So, none of the lines have been converted properly. Also, looking at the way browses handle followed spaces, the real space is always left "after" the &nbsp; sequence (which makes sense for line wrapping), while we are making it come before of it (my fault). So, changing line 781 to the following will fix Line 2 and Line 3:

return new Array( match.length ).join( '&nbsp;' ) + ' ' ;

The "Line 1" problem still persists though.

The expected output is:

<p>&nbsp;&nbsp;&nbsp; Line 1<br>
&nbsp;&nbsp;&nbsp; Line 2<br>
&nbsp;&nbsp;&nbsp; Line 3</p>

Changed 6 years ago by martinkou

comment:12 Changed 6 years ago by martinkou

  • Keywords Review? added; Review- removed

I've updated the patch (yet again) to fix the newly discovered problem. New test cases will be added to FCKtest shortly.

comment:13 Changed 6 years ago by martinkou

Two new test cases has been added to the FCKtest source tree for this ticket.

comment:14 Changed 6 years ago by fredck

  • Keywords Review+ added; Review? removed

Please add the changelog entry for it when committing.

comment:15 Changed 6 years ago by martinkou

  • Status changed from assigned to closed
  • Resolution set to fixed

Fixed with [1559].

Click here for more info about our SVN system.

Note: See TracTickets for help on using tickets.
© 2003 – 2012 CKSource – Frederico Knabben. All rights reserved. | Terms of use | Privacy policy