The marketing people want to be able to write inline HTML directly in the (XML-based) CMS. XHTML compliance and the like potentially go down the drain, but they're the boss(es). The CMS uses a regular XML/XSLT transformation pipeline. Currently we just use a single node with a CDATA section containing all the nastiness, created using some nasty concatenations.

Are there any other ways to do this?

Edit: I may be able to convince them that the HTML should be a well-formed HTML fragment of some sort, but I cannot in the known universe get them to agree to XHTML/strict compliance like the rest of the content. But from what I understand, being well-formed alone doesn't help me at all?

+4  A: 

CDATA is the only way to do this; there is simply no way invalid markup will go into an XML doc as any parsed structure.
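
As a rough illustration of the CDATA approach, here is a minimal Python sketch. The `wrap_cdata` helper, the `content` element name, and the sample markup are all hypothetical, not part of any particular CMS. The one catch it shows is that the sequence `]]>` inside the fragment would end the section early, so it has to be split across two adjacent CDATA sections.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(fragment: str) -> str:
    """Wrap arbitrary text in CDATA (hypothetical helper).

    A literal "]]>" would terminate the section early, so it is split
    across two adjacent CDATA sections.
    """
    return "<![CDATA[" + fragment.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# Tag soup that is not well-formed XML: unquoted attribute, bare "&",
# unclosed elements, and even a stray "]]>".
raw_html = "<p class=intro>Fish & chips<br>50% off ]]> today"

# "content" is an assumed element name for the node holding the HTML.
doc = "<content>" + wrap_cdata(raw_html) + "</content>"

# The document parses, and the parser hands back the original text.
root = ET.fromstring(doc)
assert root.text == raw_html
```

The parser treats the whole fragment as character data, so nothing inside it is ever checked for well-formedness.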

May I suggest an alternative solution though? Fix the problem markup as it's inserted into the XML - definitely not trivial, but frankly the task they're giving you is absurd.

Check out HTML Tidy or Beautiful Soup, which can take tag soup and turn it into valid, well-formed XHTML.

annakata
+2  A: 

One solution aside from using CDATA sections would be to encode all less-thans and ampersands that the marketers write, and decode them before display.
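
A minimal sketch of that encode/decode round trip in Python, using the standard library's `xml.sax.saxutils`. The `content` element name and the sample markup are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

# Tag soup as a marketer might type it.
raw_html = "<p class=intro>Fish & chips<br>"

# escape() replaces "&", "<" and ">" with entity references.
encoded = escape(raw_html)

# "content" is an assumed element name for the node holding the HTML.
doc = "<content>" + encoded + "</content>"

# An XML parser decodes the entities when it reads the text node ...
root = ET.fromstring(doc)
assert root.text == raw_html

# ... and unescape() does the same for a string handled outside a parser.
assert unescape(encoded) == raw_html
```

Note that in an XSLT pipeline the decoded text is normally escaped again on output, so the final step would likely need something like `disable-output-escaping="yes"`, which processors are not required to support.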

However, I do think that a solution involving something like HTML Tidy would probably be optimal.

Chris Marasti-Georg
+1  A: 

I am pretty sure you could filter the HTML the marketing people enter through an XHTML converter, such as SgmlReader.

Alex Baranosky
A: 

You can embed all their nastiness by using CDATA sections or by explicitly escaping the relevant characters (these two options are effectively equivalent). As has been noted, there are tools such as Tidy that will help, and of course, once you've got a well-formed document, you can transform it with XSLT into something less unpleasant (depending, of course, on the CMS you are using).

Having said all that, I'd suggest that now is the time to have the discussion about who is "the boss" in which areas. The marketing folks wouldn't take it too well if you started overruling them in discussions of branding or whatever. You have your area of expertise, and they have theirs, and theirs is definitely not HTML. Fight this fight now, or you will face a world of pain in the future.

Dominic Cronin