Opened 17 years ago
Closed 17 years ago
#1355 closed Bug (fixed)
Converting <p> to <pre> may add additional line breaks
Reported by: | Frederico Caldeira Knabben | Owned by: | Martin Kou |
---|---|---|---|
Priority: | Normal | Milestone: | FCKeditor 2.6 |
Component: | General | Version: | SVN (FCKeditor) - Retired |
Keywords: | Review+ | Cc: |
Description
Steps to Reproduce
- Load the following HTML:
<p>Line 1<br /> Line 2<br /> Line 3</p>
- Change the paragraph Format to "Formatted".
Note that each line has been separated by a double line break. It doesn't happen if we have the following <p> instead:
<p>Line 1<br />Line 2<br />Line 3</p>
Things get even worst if the initial <p> looks like this:
<p> Some sample </p>
It is evident that the whitespace of the original <p> is being preserved on the transformation, but the expected result for the above case is:
<pre>Some sample</pre>
This is not a problem with IE, as it automatically eats whitespaces when parsing.
Attachments (6)
Change History (21)
comment:1 Changed 17 years ago by
Owner: | set to Martin Kou |
---|---|
Status: | new → assigned |
Changed 17 years ago by
Attachment: | 1355.patch added |
---|
comment:2 Changed 17 years ago by
Keywords: | Review? added |
---|
comment:3 Changed 17 years ago by
Keywords: | Review- added; Review? removed |
---|
It seems a much simpler solution could be achieved by using a couple of regexes and innerHTML.
We must also check if the other way (<pre> => <p>) is properly handled.
comment:4 follow-up: 7 Changed 17 years ago by
The regex method sounds simple in theory, unfortunately it seems to trigger a nasty Firefox bug which causes JavaScript error in the range.SelectBookmark() function.
After we modified the <pre> node by assigning innerHTML instead of by DOM tree operations, the bookmark nodes inside the <pre> block can still be retrieved by document.getElementId() in range.SelectBookmark(), which is correct. However, the bookmark nodes retrieved this way would then have a null parentNode property, which does not make sense. That caused the SelectBookmark() function to throw JavaScript errors.
This is very unfortunate, I'm trying to see if a workaround can be created.
comment:5 follow-up: 6 Changed 17 years ago by
The regex method does not work <pre> -> regular block because converting simple spaces to with a regex would destroy the contents inside tags. Also for cases like
<pre> hello world! </pre>
Converting the space between "hello" and "world" to is not necessary.
comment:6 Changed 17 years ago by
Replying to martinkou:
The regex method does not work <pre> -> regular block because converting simple spaces to with a regex would destroy the contents inside tags. Also for cases like
<pre> hello world! </pre>Converting the space between "hello" and "world" to is not necessary.
Martin, I still think using innerHTML and Regex is the easier solution. It would also avoid problems with successive space text nodes.
One idea would be using the <string>.split, to split the text using a regex that finds tags, like /(<[^\s].*?>)/
, capturing the regex matches, so they are included in the split array. Then, processing each entry in the array, replacing spaces in those that don't look like tags (don't start with <[^\s]
, basically), and re-joining the array.
To replace the spaces, first replace all tabs with four spaces, then a simple replace regex with / {2,}/
using a replace function would do the magic.
comment:7 Changed 17 years ago by
Replying to martinkou:
After we modified the <pre> node by assigning innerHTML instead of by DOM tree operations, the bookmark nodes inside the <pre> block can still be retrieved by document.getElementId() in range.SelectBookmark(), which is correct. However, the bookmark nodes retrieved this way would then have a null parentNode property, which does not make sense. That caused the SelectBookmark() function to throw JavaScript errors.
Be sure that bookmark.StartNode and bookmark.EndNode are null before calling range.SelectBookmark(). The node could have been cached in the bookmark for performance.
comment:8 Changed 17 years ago by
I've written a new patch based on the regex approach, with some simple XHTML parsing logic to differentiate between XHTML tags and non-tag XHTML code.
It turns out there are quite a number of gotchas in converting between <pre> and regular blocks. e.g. when converting from <pre> to <p>... the following cases need to be handled correctly:
<pre>some sample</pre>
vs
<pre>some sample</pre>
The first case can be converted to <p> without any problem. But the spaces in the second case must be converted to .
and
<pre> some sample</pre>
vs
<pre>some sample</pre>
Simply changing the tag to <p> in the first case would cause the leading space to disappear, so even though it's only a single space it still must be converted to .
And then there are gotchas when people put weird stuff inside tags... e.g. trying to convert the following to <pre>
<p>some <b attr="<< >>" attr2='> >< <'> text </b> here</p>
Sure it's not standard but even if the user wrote > and < within those quotes, IE would still convert them back to the literal form for us, which causes trouble for regex substitutions since simple dumb regex does not differentiate between text inside tags and text outside tags.
I think I should write some test cases in the FCKtest SVN tree first, to record down all those gotchas I've discovered so far, before putting my patch on review.
Changed 17 years ago by
Attachment: | 1355_2.patch added |
---|
Changed 17 years ago by
Attachment: | 1355_3.patch added |
---|
Fixes a minor bug in 1533_2.patch discovered when writing test cases.
comment:9 Changed 17 years ago by
Keywords: | Review? added; Review- removed |
---|
I've uploaded a number of test cases for this ticket to the FCKtest tree in [1452].
An updated patch was attached which solves a minor bug in the previous patch. The bug was discovered while writing the test cases. In particular, the previous patch (1355_2.patch) could not properly handle the following case when converting to <P>:
<pre> some text ehre </pre>
Changed 17 years ago by
Attachment: | 1355_5.patch added |
---|
Furthur simplified 1355_4.patch from Fred's suggestions.
comment:11 Changed 17 years ago by
Keywords: | Review- added; Review? removed |
---|
Ok... the code seemed ok to me now, but then I had some failures on some TCs.
For example. Load the following code in the editor.
<pre> Line 1 Line 2 Line 3 </pre>
Note that each line has 4 spaces in the beginning.
Now, convert the above block to <p>. You will immediately note that Line 1 is not aligned with the others. Switching to Source we have:
<p> Line 1<br> Line 2<br> Line 3</p>
So, none of the lines have been converted properly. Also, looking at the way browses handle followed spaces, the real space is always left "after" the sequence (which makes sense for line wrapping), while we are making it come before of it (my fault). So, changing line 781 to the following will fix Line 2 and Line 3:
return new Array( match.length ).join( ' ' ) + ' ' ;
The "Line 1" problem still persists though.
The expected output is:
<p> Line 1<br> Line 2<br> Line 3</p>
Changed 17 years ago by
Attachment: | 1355_6.patch added |
---|
comment:12 Changed 17 years ago by
Keywords: | Review? added; Review- removed |
---|
I've updated the patch (yet again) to fix the newly discovered problem. New test cases will be added to FCKtest shortly.
comment:13 Changed 17 years ago by
Two new test cases has been added to the FCKtest source tree for this ticket.
comment:14 Changed 17 years ago by
Keywords: | Review+ added; Review? removed |
---|
Please add the changelog entry for it when committing.
comment:15 Changed 17 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Fixed with [1559].
Click here for more info about our SVN system.
Proposed patch for fixing #1355.