Opened 12 years ago

Last modified 12 years ago

#9829 closed New Feature

Data and features activation based on configurations — at Version 23

Reported by: Frederico Caldeira Knabben Owned by: Piotrek Koszuliński
Priority: Normal Milestone: CKEditor 4.1 RC
Component: General Version:
Keywords: Drupal Cc: wim.leers@…

Description (last modified by Piotrek Koszuliński)

The basic idea is introducing filtering that acts on the HTML accepte by the editor and the feature available in it.

  1. An as easy as possible way to configure the elements, attributes and transformations (e.g. <b> to <strong>) that the editor should support. Note: transformations part were extracted to #9989.
  1. By default, this configuration is defined by the plugins available in the editor. E.g. remove the link plugin and <a> will not any more be accepted.
  1. Developers can define this configuration. At that point, plugins, or part of their features, get enabled/disabled based on the configuration. Note: dialogs integration with allowed content filter was extracted to #9990.
  1. All data input (especially paste) in the editor will be then filtered, based on the configuration. Transformations first and then elements/styles checks.
  1. Input data normalization will take place before filtering, bringing to an acceptable DOM structure.
  1. Special pasting plugins, like Paste from Word, will be reviewed to use the normalization+filtering system. Extracted to #9991.
  1. Selection context participates on filtering as well. A <h1> editable should restrict the editor features accordingly. Another example, a selection inside <a> should not permit <a> elements being inserted. Status: 50%

As a result only allowed HTML will reach the contents and only allowed plugins (or their features) will be enabled.

Change History (23)

comment:1 Changed 12 years ago by Frederico Caldeira Knabben

Description: modified (diff)

comment:2 Changed 12 years ago by Wim Leers

Cc: wim.leers@… added

I would +9000 if I could, but I'll just stick to a +1 :)


The first three bullets make total sense.

Can you clarify what: "filtering" in the fourth bullet and "input data normalization" in the fifth bullet entail precisely?
And why isn't "transformations" mentioned in the sixth bullet, just normalization & filtering? If a Word doc contains the equivalent of a <b> tag, it should also be possible to transform that to a <strong> tag.

Overall, it seems you have several concepts in CKEditor: "transformations" and "elements/styles checks" which together seem to form "filters", as well as "normalization".

I need to understand this before I can give useful feedback.

I apologize if this is to be found plain and clear somewhere in the CKEditor docs, but if you could reply with links, then I can be certain I'm reading the best docs. Thanks.

comment:3 Changed 12 years ago by Piotrek Koszuliński

Description: modified (diff)
Status: newconfirmed

comment:4 Changed 12 years ago by Piotrek Koszuliński

Hi Wim :) I would +9000 too. As pasting and data insertion are apples in my eye I was looking forward to finally add this feature to the editor. I'm really thrilled about it (omg... I'm a geek :D).

Answering to your question about filtering and data normalization. Filtering is what this ticket is all about. This means stripping elements (just tags, not content) and styles/attributes that are not allowed in content of the editor and transforming not allowed elements into accepted ones. For example - we've got small parts of this feature already implemented:

  • basing on DTD (CKEDITOR.dtd) we're stripping elements that can't be inserted into specific editable. We needed that because since 4.0 editor can be initialized on elements like headers, pre, address, etc (entire list - CKEDITOR.dtd.$editable). So e.g. if table is pasted into editor initialzed on header, then table and all tr, td, th tags are stripped and contents of their children are joined together. This part of filtering is done by editor.insertHtml() (in fact inside editable.insertHtml().
  • pastefromword is currently performing elements transformation basing on styles settings. E.g. it replaces <b> and <span style="font-weight:bold"> with <strong> if bold style is configured this way. We'll review Word filter in this ticket, because we'll be able to remove that parts (see 6.).

But before we're able to transform data, we need to normalize (fix) them. This is done differently depending on input source. The most important part of normalization is done by htmlDataProcessor. It can fix really, really bad HTML. However, Word generates such a mess (or rather browsers generate mess for messy Word data in clipboard) that we need some special pre-normalization, specific for this input - this is done by pastefromword plugin. Another example of specific input is normal pasting (especially in Webkit), for which we have some preprocessing in clipboard plugin.

comment:5 Changed 12 years ago by Frederico Caldeira Knabben

Reinmar gave a full overview of the things that we have now. Just to not make it confusing, this ticket is all about redesigning this entire system, in a simpler and more efficient model. We still have to put details on paper.

Btw, when I said "normalization+filtering", I've combined "transformation+filtering" into "filtering".

comment:6 Changed 12 years ago by Frederico Caldeira Knabben

Description: modified (diff)

Reinmar said something important: the system is also aware about the editable or selection context. I'm adding point "7" in the list for it.

comment:7 Changed 12 years ago by Wim Leers

Thanks for the explanation in comment:4, it sounds like that will be a very complete system. Can't wait to test the initial version!

Last edited 12 years ago by Wim Leers (previous) (diff)

comment:8 Changed 12 years ago by Jakub Ś

#588 and #8726 were marked as duplicates.

comment:9 Changed 12 years ago by Piotrek Koszuliński

Owner: set to Piotrek Koszuliński
Status: confirmedassigned

comment:10 Changed 12 years ago by Piotrek Koszuliński

Weekly update.

You can follow current status of implementation on t/9829 branch. There's a new datafilter.html sample which presents this new feature.

Note: This is still kind of prototype, so e.g. code lacks good documentation and some features work only partially.

What's already done:

  1. CKEDITOR.dataFilter - new class which contains entire filtering logic and rules parsing, etc.
  2. config.allowedContent - as I proposed on mailing list I implemented two formats - string and object based. Second and third editors in data filtering sample show examples of allowedContent configurations.
    • The string based configuration sample shows nearly all its features which are currently implemented. I haven't only used '*' in elements list. Asterix makes it possible to define rulles accepting defined classes, attributes and styles for all elements accepted by other rules. E.g. 'img[src]; p b; *.red' == 'img.red[src]; p b.red'.
    • As you can see I implemented simple classes validator (tokens after '.' are classes), because for the semantics they are very important and especially in inline editor they can ruin the content when pasted. That forced me to use semicolon rather than comma proposed by Fred as a separator of rules, because e.g. this would be unclear 'a.red,b' (b is a class or element in next rule?). I think that this is for good, because it made this string more readable.
    • When implementing classes parser I forgot that there are parentheses on my keyboard which I could use to wrap classes list instead of using dot format. Which format do you like more: 'a(red,blue)[href]' or 'a.red,blue[href]'?
    • Data filtering sample contains example of pretty simple object based configuration, but in fact it has more features. I needed to add them when I started defining content allowed by plugins. You can see more complicated rules in justify and fakeobject plugins.
    • I haven't implemented attributes and styles values checking yet (in fact it's possible in object based format, but it's not straightforward). And to KISS for now, I won't add values validators unless I'll really need to do that.
  3. Rules defined by editor and plugins (so called 'default rules') or by the user (then plugins' rules are ignored) are used to filter content that goes into the editor. I attached dataFilter to dataProcessor what means that all ways of setting/inserting content are influenced by allowed content rules - i.e. setData, source mode, pasting.
    • This part of feature isn't working well yet. E.g. block elements stripping creates a lot of mess. It's veeery nontrivial and I need more time to polish that.
    • Also, dataFilter is attached to dataProcessor in a little wrong way, because it doesn't really filter the 'data format', but 'internal HTML format' so it may break some things right now.
  4. If default allowed content configuration is used (because user hasn't defined his), then buttons and plugins add content allowed by them to the dataFilter. You can see on editor 3 (which has stripped few plugins and buttons) that e.g. strike through, anchors and image disappeared from the content. And when you compare to editor 1 you can see that they are present there, because buttons and plugins are enabled. Magic! :)
  5. If custom configuration for allowed content was defined (editors 2a and 2b), then features that aren't allowed are rejected by the editor. Currently only some plugins support data filter. These are:
    • basicstyles - see missing buttons in editors 2a and 2b,
    • ol ul lists - missing in editor 2a (content and buttons),
    • blockquote,
    • justify - here I had problem how to verify if these buttons are allowed - I'm checking only for 'p{text-align}' - justify is removed in editor 3, so content footer is aligned to left and as text-align isn't allowed in editor 2a button is removed and content is filtered too,
    • link and anchor - anchors and anchor button are removed in editors 2a, 2b and 3,
    • table - missing in editor 2b (content and buttons),
    • format combo - in editors 2a and 2b stripped down version of formats is available.
  6. This magic behaviour has to be spread to the other editor features. E.g. commands have to be disabled if they aren't allowed, so they can't be triggered by keystrokes. Dialogs' fields have to be removed and so on.

That would be it. I hope that you liked my child :). Feedback is greatly appreciated. Especially regarding API. I'm never sure about names, so I'm open for changes.

Last edited 12 years ago by Piotrek Koszuliński (previous) (diff)

comment:11 Changed 12 years ago by Wim Leers

CKEDITOR.dataFilter

I tried to review the code at core/datafilter.js, but it's hard for me to get into it because the code seems to have a lot of "CKEDITORisms", which is absolutely fine, it just means I doubt I'll be able to give deep code feedback soon.

Scanning the entire file though, it is in line with what I'd expect to be necessary for a feature like this. One thing I'm not sure about is all the mock*() functions — why are those necessary? If they're only for tests, then why do they need to live here? And in fact, why is there a test() function? Is this all just temporary? I can't imagine why you would want to have your testing code be part of the normal code, because then you'd be shipping pointless, precious bytes to every end user for no reason. On second thought and on grepping the code base, it seems like this is necessary to figure out whether a button should be allowed to appear at all in the toolbar, i.e. to check whether it doesn't conflict with the dataFilter configuration?

In any case, even the most sparse commenting would be most helpful in datafilter.js to be able to better review the code :)

Asterix makes it possible to define rulles accepting defined classes, attributes and styles for all elements accepted by other rules.

That's not what I'd expect the asterisk to do, but I think that makes sense :)

Which format do you like more: a(red,blue)[href] or a.red,blue[href]?

The latter IMO reads like this: "an a tag with a red class OR a blue tag with a href attribute". I think the former is therefor clearer. Even better would be if it would resemble a well-known syntax: inspired by regular expressions syntax, it could become this:

a(red|blue)[href]

Thoughts?

I won't add values validators unless I'll really need to do that.

+1

If default allowed content configuration is used (because user hasn't defined his), then buttons and plugins add content allowed by them to the dataFilter. You can see on 4th editors (which has stripped few plugins and buttons) that e.g. strike through, anchors and image disappeared. And when you compare to 1st editor you can see that they are present there. Magic! :)

When I read this, I was super excited, but when I went to look at the example, I wasn't anymore :( Unless I'm missing something, you're simply setting removeButtons on those things that have "disappeared as if by magic"? If they were disappearing due to the allowedContent setting, now that would be magic, but if you explicitly state that those buttons should be removed, I don't see how that is magic? :( Maybe you forgot to commit the right sample? :)

Also: in the sample, I only see editors 1, 2a, 2b and 3, not 4. So … I think you might indeed have not yet committed the right sample? If not, then the numbering you use in comment:10 is a mismatch with the numbering in the actual sample and thus very confusing.

justify - here I had problem how to verify if these buttons are allowed - I'm checking only for 'p{text-align}'

That makes sense, but I just wanted to make sure that one thing will become possible eventually: in Drupal, we don't ever want style attributes in our HTML content. Everything should be CSS-overridable. What we want instead for alignment, is the use of classes. IIRC, it's possible to have the alignment buttons set classes on elements, so that's a non-issue, but I want to make sure that it's also possible that we can also configure allowedContent accordingly.

Dialog's fields have to be removed and so on.

Indeed— do you think that will be hard or easy to achieve? (I don't know enough about CKE's internal structures or metadata to assess the complexity of this).

I hope that you liked my child :). Feedback is greatly appreciated. Especially regarding API.

I do! If you could add some high-level comments to your code, and also to the entry points (the points where the rest of CKE is using dataFilter), that would enable me to give more useful feedback.

My feedback is mostly superficial but I hope there's at least a few useful bits in there. Overall: please add more comments to enable us to give better feedback!

comment:12 in reply to:  11 Changed 12 years ago by Piotrek Koszuliński

Scanning the entire file though, it is in line with what I'd expect to be necessary for a feature like this. One thing I'm not sure about is all the mock*() functions — why are those necessary? If they're only for tests, then why do they need to live here? And in fact, why is there a test() function? Is this all just temporary? I can't imagine why you would want to have your testing code be part of the normal code, because then you'd be shipping pointless, precious bytes to every end user for no reason.

These methods aren't part of unit tests. DataFilter#test() is a public method which plugins use to check if some content is allowed. It accepts the string based format accepted by allowedContent configuration and basic CKEDITOR.style objects.


On second thought and on grepping the code base, it seems like this is necessary to figure out whether a button should be allowed to appear at all in the toolbar, i.e. to check whether it doesn't conflict with the dataFilter configuration?

Exactly :). BTW. git diff master...t/9892 is convenient way of checking what's changed since this branch diverged from master.


In any case, even the most sparse commenting would be most helpful in datafilter.js to be able to better review the code :)

Hmm. You're right. I asked for feedback, so at least basic comments for the most important stuff would be great. I'll add them asap.


Which format do you like more: a(red,blue)[href] or a.red,blue[href]?

The latter IMO reads like this: "an a tag with a red class OR a blue tag with a href attribute". I think the former is therefor clearer.

"blue" doesn't look like a tag for me, because semicolons are used for rules separation. But you may be right that grouping classes by "()" will be clearer for people who hasn't spent hours starring at allowedContent like me :D. Let's see what others think.


Even better would be if it would resemble a well-known syntax: inspired by regular expressions syntax, it could become this:

a(red|blue)[href]

Thoughts?

I omitted pipes leaving them for values separator. If we'll decide to implement simple values checking then it would be great to have: img[attr=val1|val2|val3, attr2=...]. Also in regexp like fashion. Comma also has one advantage over pipe for being an attributes, classes and styles separator - in natural language we used to put space after it, what makes text more readable. And this is now supported by dataFilter. However, such a space after pipe would change regexp meaning.


If default allowed content configuration is used (because user hasn't defined his), then buttons and plugins add content allowed by them to the dataFilter. You can see on 4th editors (which has stripped few plugins and buttons) that e.g. strike through, anchors and image disappeared. And when you compare to 1st editor you can see that they are present there. Magic! :)

When I read this, I was super excited, but when I went to look at the example, I wasn't anymore :( Unless I'm missing something, you're simply setting removeButtons on those things that have "disappeared as if by magic"? If they were disappearing due to the allowedContent setting, now that would be magic, but if you explicitly state that those buttons should be removed, I don't see how that is magic? :( Maybe you forgot to commit the right sample? :)

Ahh. Sorry for that - I forgot that I made a stupid decision to name editors 1, 2a, 2b and 3 :) I updated my previous comment to align it to this sample.

And there's a magic ;> Check editors 2a and 2b - buttons (and even options in format combo) are removed from toolbar basing on allowedContent settings.

Opposite way is presented by the 'editor 3' (4th one :). Buttons are explicitly removed from toolbar by 'removeButtons' and 'removePlugins' configs and their features are magically stripped out from content of the editor. See that there are no image, no anchors, no strike-through *in the content* of editor 3.


justify - here I had problem how to verify if these buttons are allowed - I'm checking only for 'p{text-align}'

That makes sense, but I just wanted to make sure that one thing will become possible eventually: in Drupal, we don't ever want style attributes in our HTML content. Everything should be CSS-overridable. What we want instead for alignment, is the use of classes. IIRC, it's possible to have the alignment buttons set classes on elements, so that's a non-issue, but I want to make sure that it's also possible that we can also configure allowedContent accordingly.

:) I was wondering if anyone will notice that I forgot about this. You're right that it is possible to force justify to apply classes rather than inline styles. Plugin should take that into consideration and basing on configuration allow styles or classes. It's of course doable, but I forgot about that :).


Dialog's fields have to be removed and so on.

Indeed— do you think that will be hard or easy to achieve? (I don't know enough about CKE's internal structures or metadata to assess the complexity of this).

I'm not sure what's possible now. Fred was opting for removing inputs, but this could lead to broken dialogs' layouts. I'm rather for disabling inputs, but only because it's now simpler for us, cause the UX may not be so great.

In any case - updating dialogs will require some amount of work, but it would be pretty easy work - each config dependent field should be displayed/enabled if it is validated by dataFilter.test(). We need to check if we can simplify this task even more by some magic.


I hope that you liked my child :). Feedback is greatly appreciated. Especially regarding API.

I do! If you could add some high-level comments to your code, and also to the entry points (the points where the rest of CKE is using dataFilter), that would enable me to give more useful feedback.

OK. I'll describe the architecture on Monday. It's quite simple and... it's not :).

Thanks for your very thorough review :).

Last edited 12 years ago by Piotrek Koszuliński (previous) (diff)

comment:13 Changed 12 years ago by Piotrek Koszuliński

Daily update.

  1. Renamed:
    • dataFilter -> filter
      • #test() -> #check()
      • #addRules() -> #allow()
  2. I replaced command's and button's content definition object (content.allowed and content.required) with simple #allows and #requires properties (more about them later in this comment).
  3. I wrote simple documentation for CKEDITOR.filter API.
  4. I fixed issue in justify plugin which was pointed out by Wim - now it allows classes instead of styles if defined in config.classesList.

Few words about architecture

In previous comments I haven't described how features (buttons, commands) are affected by filter. Let's take a closer look at this now.

The simplest way:

  1. Feature enables itself with filter#allow(). Note that executing allow() won't affect filter when custom config.allowedContent has been set by developer.
  2. Feature checks if it's allowed by executing filter#check(). If yes, then it somehow enables itself (e.g. adds button to toolbar).

When implementing buttons integration with filter, to make things even simpler, I added two properties to button's definition - allows and requires. 'Allows' contains content rules that should be enabled by this button. 'Requires' means content that has to be allowed by filter. If these properties aren't defined in button defition and this button has command associated with it (common case), I'm getting them from command definition.

Then, when adding button to toolbar, I'm calling button#registerContent( allows, requires ) which contains the logic described in previous paragraph and which calls similar method - filter#registerContent( allows, requires ) which contains logic described earlier in those two points. These methods return true if feature is allowed in editor, so e.g. I was able to do this in format plugin:

// Register content allowed by style.
if ( editor.filter.registerContent( style, style ) ) {
	styles[ tag ] = style; // Add this style to format combo.
}

One method to rule them all! Filter#registerContent contains also small improvement, which I haven't mentioned earlier. If filter has default configuration (config.allowedContent is undefined) it doesn't check if required content is allowed, because it assumes that feature allows itself.

The only problem with this method, which I find really useful, is that despite the fact that it's really short, it's very confusing. I was thinking about new name - 'addFeature', 'registerFeature', 'checkAndAllow' - does any of them sound better? Have you got other ideas? Would changing the name be enough to make it clear?

comment:14 Changed 12 years ago by Wiktor Walc

Dialog fields

Imho inputs should be simply removed if particular style/attribute is not allowed. The UI should not be broken, as long as:

  1. Empty tabs (all inputs are disallowed) will be hidden automatically
  2. When removing one of two input fields with width set to 50%, the only one that will remain will have width 100%
  3. When the whole line (hbox) with input elements will be empty, the rest of fields below, will go up.

(2) & (3) should be happening already iirc.

Syntax

Parentheses for classes

The second one looks strange to me:

a(red,blue)[href] or a.red,blue[href]?

the one proposed by Wim:

a(red|blue)[href]

looks more clear. Suppose I have classes named table and error for div elements:

div.error,table // ??

vs

div(error|table) // :-)

In any case I believe it's not issue with pipe vs comma, the thing that makes the difference are parentheses. I'm okay with pipe or comma.

Separate HTML elements with comma

This is what we usually have:

// config options
config.removePlugins = 'elementspath,save,font';

// in plugin...
requires: 'dialog,fakeobjects',
lang: 'af,ar,bg,

// how we used to list things...
buy: milk, beer, vodka...

so, for me, the following syntax would be more appropriate:

h1,h2,h3,p,blockquote,strong,em

instead of:

h1 h2 h3 p blockquote strong em

Eventually, if we go ahead with 7)

"Selection context participates on filtering as well. A <h1> editable should restrict the editor features accordingly. Another example, a selection inside <a> should not permit <a> elements being inserted."

We can use shorthand syntax like h1,h2,ul,ul li,table,table tr,... and use space to allow certain elements if only they are children of other elements.

What with attributes validation?

I omitted pipes leaving them for values separator. If we'll decide to implement simple values checking then it would be great to have: img[attr=val1|val2|val3, attr2=...]. Also in regexp like fashion. Comma also has one advantage over pipe for being an attributes, classes and styles separator - in natural language we used to put space after it, what makes text more readable. And this is now supported by dataFilter. However, such a space after pipe would change regexp meaning.

img[attr=val1|val2|val3, attr2=...]

I believe that attributes validation (if implemented) will never be that simple, we'll have rather things using regular expressions like:

widthPattern = /^(\d+(?:\.\d+)?)(px|%)$/,

which will enforce using more complex object based format. I don't think that shorthand notation should even support that, just to keep things simple.

Asterisk

The string based configuration sample shows nearly all its features which are currently implemented. I haven't only used '*' in elements list. Asterix makes it possible to define rulles accepting defined classes, attributes and styles for all elements accepted by other rules. E.g.

img[src]; p b; *.red' == 'img.red[src]; p b.red

I think we could achieve the same thing even without asterisk:

img[src],p,b,(red,blue)

This way asterisk could be used as a wildcard character: {font*,background*}. Also this way the rule would be a bit less confusing (will I allow "all" elements if I use it?).

In fact, I'm not convinced to the idea of being able to use wildcard characters in white-list approach, so I'm ok with not using this character at all...

this part did not have any sense because I totally forgot about the meaning of semicolons :-)

Last edited 12 years ago by Wiktor Walc (previous) (diff)

comment:15 Changed 12 years ago by Piotrek Koszuliński

Ok. So I see that parenthesis for classes are accepted.

Regarding using comma as a separator of elements - now I also find this better. BTW. vodka without juice? It doesn't work well with beer and milk :P.

As for attributes validation - that would indeed usually be done by regexp. I can only think of few examples of list of values. Anyway, not important now.

Summing up, if no one protest new format will look like this (based on editor 2a from sample):

'h1, h2, h3, p, blockquote, strong, em;' +
'a[href];' +
'img(left, right)[src, alt]{width, height};' +
'table, tr, th, td, caption;' +
'span{font-family, color}'

(I used more spaces but they are not required.)

comment:16 Changed 12 years ago by Wim Leers

comment:15: I think the format is not a truly important aspect at this point; we can still change this down the line.


comment:13:

The simplest way:

  1. Feature enables itself with filter#allow(). Note that executing allow() won't affect filter when custom config.allowedContent has been set by developer.
  2. Feature checks if it's allowed by executing filter#check(). If yes, then it somehow enables itself (e.g. adds button to toolbar).

When implementing buttons integration with filter, to make things even simpler, I added two properties to button's definition - allows and requires. 'Allows' contains content rules that should be enabled by this button. 'Requires' means content that has to be allowed by filter. If these properties aren't defined in button defition and this button has command associated with it (common case), I'm getting them from command definition.

Most of this (very important part!) is very hard to understand/parse.

If I put together what you wrote there, with what I see in plugins/link/plugin.js and https://github.com/cksource/ckeditor-dev/blob/5e297a293335cc58ccaa3c236783658c0258ae12/core/filter.js brings this thoughts to mind:

  • you frequently mix "allows", "allowed", "requires" in one sentence, which makes it all extremely hard to understand, e.g.:

    If {@link #customConfig} is true (custom {@link CKEDITOR.config#allowedContent} was defined) this method returns true (what means that tested feature is allowed) if requires is allowed or hasn't been defined. So it returns false only if required content isn't allowed.

    -> "If requires is allowed": this is obviously very confusing (though I do understand what you mean here).
  • Then, in another location, you have this:
    * @param [allows] Rules to be added as allowed.
    * @param [requires] Content to be checked by {@link #check}.
    
    -> In practice, both allows and requires are using similar syntax, but here they seem to be completely different things!
  • My understanding: each plugin's allows indicates which HTML it might generate, each requires indicates which is the minimal HTML that a plugin must be allowed to generate in order for it to make sense to be enabled at all. Correct?

I was only able to understand all of the above by reading the code.

It's very hard to come up with good names for these things of course. I also can't think of better ones right away, but this definitely needs to be improved if we want to make this understandable and thus maintainable.


My conclusions/proposals:

  • filter#registerContent() is only very hard to understand because it is has special handling for the case where the user doesn't have a custom configuration.
  • Looking at the code and taking the comments into account, I think I found a bug. Shouldn't this:
    	registerContent: function( allows, requires ) {
    			// If default configuration, then add allowed content rules.
    			this.allow( allows );
    			// If custom configuration, then check if required content is allowed.
    			if ( this.customConfig && requires )
    				return this.check( requires );
    
    be like this?
    	registerContent: function( allows, requires ) {
    			// If default configuration, then add allowed content rules.
                            if ( !this.customConfig )
      			       this.allow( allows );
    			// If custom configuration, then check if required content is allowed.
    			if ( this.customConfig && requires )
    				return this.check( requires );
    
  • The per-plugin allows setting is a bad name, because it implies that the plugin itself does filtering too, and it does not. A clear example is the table plugin: it's possible to insert e.g. an <img> or a <blockquote> in a table cell, but that's not listed in its allows setting. That appears to imply that a table does not allow for images or blockquotes, but that's of course wrong; instead allows simply indicates the entire range of HTML that might/could be generated.

I'm pretty certain you called it allows because you're essentially building a default allowedContent when no CKEDITOR.config#allowedContent is set. From that POV, it makes sense, but it just isn't the right nuance and that's why the name becomes unclear when used in other contexts. (*1)

  • Proposal for registerContent(), allows and requires in a single view so it hopefully makes sense. Note that "FOOBAR data structure" is how I refer to the "format" outlined in preceding comments and summarized in comment:15.[[br]]
    • Before:
                      /**
                       * Shorthand function that can be used during feature activation.
                       * …
                       * @param [allows] Rules to be added as allowed.
      		 * @param [requires] Content to be checked by {@link #check}.
      		 * @returns {Boolean} Whether feature is allowed.
      		 */
      		registerContent: function( allows, requires ) {
      
    • After:
                      /**
                       * Checks whether a feature can be enabled for the HTML restrictions in place
                       * for the current CKEditor instance, based on the HTML the feature might
                       * generate and the minimal HTML the feature needs to be able to generate.
                       * …
                       * @param [generatableHTML] HTML that can be generated by this feature. MUST
                       *                                            be a FOOBAR data structure.
      		 * @param [requiredHTML] Minimal HTML that this feature must be allowed to
                       *                                       generate for it to be able to function at all. MUST be
                       *                                       a FOOBAR data structure.
      		 * @returns {Boolean} Whether this feature can be enabled.
      		 */
      		featureIsCompatible: function( generatableHTML, requiredHTML ) {
      
      

I hope this was somewhat helpful.

Sidenotes

(*1): Do we actually even need to build a default allowedContent? Do we even want to do filtering when CKEDITOR.config#allowedContent has not been set? What if users paste in a Vimeo or YouTube HTML embed code, and that gets stripped away just because they don't have any plugin enabled that can insert the tags required for those embedded things? (I'm thinking about the general CKEditor user base here, this is mostly irrelevant for Drupal.)

comment:17 Changed 12 years ago by Piotrek Koszuliński

Most of this (very important part!) is very hard to understand/parse.

Sorry for that :). That's perfectly clear for me, but I see that I've started using my own language.


you frequently mix "allows", "allowed", "requires" in one sentence, which makes it all extremely hard to understand

That sentence would be a little bit clearer if we had better (visible) styles for code sample. In this sentence: "If requires is allowed" the word "requires" is a variable name, but this is invisible when read on Trac.

Anyway, I'll stop using pseudo-code when describing things :).


My understanding: each plugin's allows indicates which HTML it might generate, each requires indicates which is the minimal HTML that a plugin must be allowed to generate in order for it to make sense to be enabled at all. Correct?

Only one thing isn't correct: "that a plugin must be allowed to generate". Not "plugin" but "feature" - button, command, etc. Plugin is not a feature itself, it adds features. This may change in the future if we'll see that it makes sense to switch off entire plugins.


It's very hard to come up with good names for these things of course. I also can't think of better ones right away, but this definitely needs to be improved if we want to make this understandable and thus maintainable.

Exactly. And it is the biggest problem for me, because I implemented that :D. Anyway, at the end documentation will be proofread by Anna or final reviewer, so we'll end up with something understandable. But any help is of course greatly appreciated :).


Looking at the code and taking the comments into account, I think I found a bug.

It's ok. This check is done inside #allow().


The per-plugin allows setting is a bad name, because it implies that the plugin itself does filtering too, and it does not.

As I mentioned earlier - there's no per-plugin allows setting.


A clear example is the table plugin: it's possible to insert e.g. an <img> or a <blockquote> in a table cell, but that's not listed in its allows setting. That appears to imply that a table does not allow for images or blockquotes, but that's of course wrong; instead allows simply indicates the entire range of HTML that might/could be generated. I'm pretty certain you called it allows because you're essentially building a default allowedContent when no CKEDITOR.config#allowedContent is set. From that POV, it makes sense, but it just isn't the right nuance and that's why the name becomes unclear when used in other contexts.

Hm... I'll let others speak, because I've got distorted point of view :).


Do we actually even need to build a default allowedContent? Do we even want to do filtering when CKEDITOR.config#allowedContent has not been set? What if users paste in a Vimeo or YouTube HTML embed code, and that gets stripped away just because they don't have any plugin enabled that can insert the tags required for those embedded things? (I'm thinking about the general CKEditor user base here, this is mostly irrelevant for Drupal.)

I think that this feature is essential. That even if you use default editor's configuration you can't paste (or insert in any other way, including source mode) content that cannot be handled by editor. So if table cannot be edited without table plugin it should be stripped away if table plugin isn't loaded.

Of course, this is not backward compatible change, so I think that it should be possible to disable filtering at all. But I'm for making it enabled by default, because this is a feature that will improve editing a lot and if we'll leave it disabled by default less people will learn about it.

comment:18 Changed 12 years ago by Wim Leers

Yes, in the first half or so of my comment, I was saying "per-plugin" instead of "per-feature". Towards the end, I'm also saying "per-feature". I forgot to update the first half — sorry.

You didn't comment on my proposal though? :(

comment:19 Changed 12 years ago by Piotrek Koszuliński

You didn't comment on my proposal though? :(

I think that the best feedback will give people without so deep knowledge about implementation as mine. For me simple names like "allows", "requires", "addFeature( allows, requires )" are sufficient - they are clear and... simple. And we like simple things :).

Your proposal is verbose, so it should be clear for users, but it is also long. Especially that we should use these names ("generatableHTML" and "requiredHTML") also in commands' and buttons' definitions (so "allows" and "requires" properties would be entirely replaced).

Also, I'm not sure if I like word "generatable" and if "HTML" is a right word for what I called "content rules". And "featureIsCompatible" doesn't suggest that this method changes something and in current form it does.

So, I don't find it better than current API. But if others will like it I'll accept it too :).

BTW 1. Note that by following CKEditor coding style we would have "generatableHtml".

BTW 2. I've just thought about this:

 /**
 * Checks whether a feature can be enabled for the HTML restrictions in place
 * for the current CKEditor instance, based on the HTML the feature might
 * generate and the minimal HTML the feature needs to be able to generate.
 *
 * @param feature
 * @param feature.generates HTML that can be generated by this feature. MUST
 * be a FOOBAR data structure.
 * @param feature.requires Minimal HTML that this feature must be allowed to
 * generate for it to be able to function at all. MUST be a FOOBAR data structure.
 * @returns {Boolean} Whether this feature can be enabled.
 */
addFeature( feature ) {

The "feature" is a kind of interface that buttons and commands implement (because they may have "generates" and "requires" properties). In the future every object may become a feature, so e.g. if we'll decide that we want to disable entire plugins, then it will be enough to:

  • add "generates" and "requires" properties to chosen plugins,
  • check them by calling: addFeature( plugin ).

comment:20 in reply to:  19 Changed 12 years ago by Wim Leers

Replying to Reinmar:

Your proposal is verbose, so it should be clear for users, but it is also long. Especially that we should use these names ("generatableHTML" and "requiredHTML") also in commands' and buttons' definitions (so "allows" and "requires" properties would be entirely replaced).

I'm approaching this with a narrow POV, I don't have the holistic view of the CKEditor code base (or even the entire under-the-hood feature set), so it's of course very hard for me to take all these things into account.

I just hope that by providing feedback from another POV, the end result will have more understandable names, that require less knowledge about other CKEditor internals. That is all :)

I am of course also looking forward to feedback from others!

comment:21 Changed 12 years ago by Wiktor Walc

Keywords: Drupal added

comment:22 Changed 12 years ago by Piotrek Koszuliński

Interesting, related issue: #9964.

comment:23 Changed 12 years ago by Piotrek Koszuliński

Description: modified (diff)

I updated current status of works.

Note: See TracTickets for help on using tickets.
© 2003 – 2022, CKSource sp. z o.o. sp.k. All rights reserved. | Terms of use | Privacy policy