Changes between Initial Version and Version 1 of Ticket #13174, comment 8
Timestamp: May 28, 2015, 5:31:36 PM
Wow, that's crazy. In fact, this is incorrect data that we get at the very beginning. I believe that it is Word that gives us wrong data, but it could be the browser too. These paragraphs are marked as list items (they have `MsoListParagraph*` classes), so it is very hard to tell that they are not list items.

Anyway, the fix is not good enough. It is enough to put anything after the text node (an image or a shape) and it stops working. Another example is to have bold at the beginning (a structure like this: `<b>Test</b> C`): the paragraph is also replaced with a list item ("Test" became a bullet, because it was not the last element).

The real issue here is losing data. We take the parent of the text node which should be a bullet and remove it, because we assume that it is a `<span>` with the bullet character, but in these cases it could be a whole bolded text or a whole paragraph. Because of custom bullet types we are not able to recognize the bullet, but we can assume that it is no longer than 5 characters. With this assumption, even if we incorrectly recognize something as a bullet, we will at most break a very small part of the document.
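
A minimal sketch of that length check, only to illustrate the idea. It does not use the real paste filter API: the plain-object node shape and the names `MAX_BULLET_LENGTH`, `getText`, `looksLikeBullet` and `stripBullet` are all hypothetical.

```javascript
// Hypothetical limit, taken from the assumption that a bullet is at most 5 characters.
const MAX_BULLET_LENGTH = 5;

// Collects the text of a simplified node, which is either
// { text: '...' } or { name: 'span', children: [...] }.
function getText(node) {
  if (typeof node.text === 'string') {
    return node.text;
  }
  return (node.children || []).map(getText).join('');
}

// True only when the candidate is non-empty and short enough to be a bullet marker.
// Array.from counts code points, so one astral-plane symbol counts as one character.
function looksLikeBullet(candidate) {
  const length = Array.from(getText(candidate).trim()).length;
  return length > 0 && length <= MAX_BULLET_LENGTH;
}

// Drops the first child of a paragraph only if it passes the length check.
// Returns a new object and leaves the input untouched.
function stripBullet(paragraph) {
  const [first, ...rest] = paragraph.children || [];
  if (first && looksLikeBullet(first)) {
    return { ...paragraph, children: rest };
  }
  return paragraph;
}

// A real list item: a <span> holding only the bullet character.
const listItem = {
  name: 'p',
  children: [{ name: 'span', children: [{ text: '·' }] }, { text: 'Item' }]
};

// A paragraph wrongly marked as a list item, starting with long bold text.
const boldParagraph = {
  name: 'p',
  children: [{ name: 'b', children: [{ text: 'Important notice' }] }, { text: ' C' }]
};

console.log(stripBullet(listItem).children.length);      // 1, the bullet span is removed
console.log(stripBullet(boldParagraph).children.length); // 2, the bold text is kept
```

Note that a short bold prefix such as `<b>Test</b>` (4 characters) would still pass the check and be removed. That is the "very small part of the document" which can still break under this assumption.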