How would you store formatted blocks of text (line breaks, tabs, lists, etc.) in a database (no specific product) so that they can be displayed on the web (XHTML), while keeping enough abstraction that the data can be used in other applications or survive a future change to the structure of the website?

+4  A: 

I would store the structure of the document as XML, and always apply an XSLT transformation before showing it in the web browser. That way the information can be adapted to different browsers, or to other uses such as display in a normal (non-web) UI or export to a plain text document.

The structure would have to be something meaningful, not only formatting information. Ideally it would be a representation of some domain-specific data model.

Of course, if the meaningful information is the document structure itself, nothing stops you from defining something like:

<document>
  <title>SomeTitle</title>
  <paragraph>Some Long paragraph text</paragraph>
</document>
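
As a rough sketch of the idea, here is how one stored document could be rendered for two different targets. It uses Python's standard-library ElementTree as a stand-in for a real XSLT stylesheet (the standard library has no XSLT processor), and the mapping of `title` to `h1` and `paragraph` to `p` is only an assumption made for illustration:

```python
import xml.etree.ElementTree as ET
from html import escape

# The stored document, exactly as in the example above.
STORED = """<document>
  <title>SomeTitle</title>
  <paragraph>Some Long paragraph text</paragraph>
</document>"""


def to_xhtml(xml_text: str) -> str:
    """Render the stored structure as an XHTML fragment (assumed tag mapping)."""
    root = ET.fromstring(xml_text)
    parts = []
    for child in root:
        text = escape(child.text or "")
        if child.tag == "title":
            parts.append(f"<h1>{text}</h1>")
        elif child.tag == "paragraph":
            parts.append(f"<p>{text}</p>")
    return "\n".join(parts)


def to_plain_text(xml_text: str) -> str:
    """Render the same structure as plain text, one blank line between blocks."""
    root = ET.fromstring(xml_text)
    return "\n\n".join((child.text or "").strip() for child in root)


print(to_xhtml(STORED))
print(to_plain_text(STORED))
```

The stored XML never changes; only the rendering function does, which is the same separation an XSLT stylesheet per target would give you.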

Another advantage of using XML in this context is that if your database supports it (like Oracle), you could query the contents of the text.
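
Where the database has no native XML support, a similar effect can be approximated in application code. The following is only a sketch under that assumption: the `documents` table, its `body` column, and the `titles_matching` helper are hypothetical names, and SQLite is used purely because it ships with Python. A database with real XML support would do this filtering inside the query itself.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: one row per document, XML kept in a TEXT column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO documents (body) VALUES (?)",
    [
        ("<document><title>SomeTitle</title>"
         "<paragraph>Some Long paragraph text</paragraph></document>",),
        ("<document><title>Other</title>"
         "<paragraph>Short text</paragraph></document>",),
    ],
)


def titles_matching(conn, needle):
    """Return titles of documents that have a paragraph containing `needle`."""
    hits = []
    for (body,) in conn.execute("SELECT body FROM documents ORDER BY id"):
        root = ET.fromstring(body)
        if any(needle in (p.text or "") for p in root.iter("paragraph")):
            hits.append(root.findtext("title"))
    return hits


print(titles_matching(conn, "Long"))
```

Note that this parses every row on each search, which is acceptable only for small tables or rare queries.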

We are assuming that the text is something that doesn't need to be queried often, or that the content is really for display purposes only. Otherwise it might be better to normalize the database.

Mario Ortegón
+2  A: 

There are two ideas that clash slightly in your question: keeping the data separate from its presentation so it can be repurposed, and including formatting data.

Is the formatting data part of the data, or just metadata?

Haven't we seen this before? It basically seems to be the CSS / HTML conundrum.

If these blocks of text fit into a known data scheme (as Mario's answer assumes) then yes, I'd go with his answer. But re-reading your question, I'll assume you have bits of formatting within, say, the paragraph tag that Mario used.

Going with the idea that the formatting is basically part of the data, not just an added extra, I'd suggest adopting something like the CSS / HTML solution. Store the text with standard XHTML tags, ready for your CSS. When you want to use it in a standard UI (as in a non-web app?), you can parse it and just strip or replace the tags as necessary.
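
A minimal sketch of the "strip the tags and replace as necessary" step, using `html.parser` from the Python standard library. The `TextExtractor` class, the set of block-level tags, and the choice to render list items with a leading dash are all assumptions made for illustration, not a complete XHTML-to-text converter:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, turning a few block tags into line breaks."""

    # Closing one of these tags ends a line (assumed set, extend as needed).
    BLOCK_END = {"p", "li", "h1", "h2", "h3", "div"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.chunks.append("\n")
        elif tag == "li":
            self.chunks.append("- ")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_END:
            self.chunks.append("\n")

    def handle_data(self, data):
        self.chunks.append(data)


def xhtml_to_text(markup: str) -> str:
    """Return the text of an XHTML fragment with inline tags removed."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks).strip()


stored = ('<p>Hello <span class="myBitOfText">world</span></p>'
          "<ul><li>one</li><li>two</li></ul>")
print(xhtml_to_text(stored))
```

Inline tags such as `span` simply disappear, while paragraphs, list items, and `<br />` become line breaks, so the same stored XHTML can feed a non-web UI.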

Of course, you could make up your own markup ([myBitOfText #] instead of <span class="myBitOfText">) but you may as well have one return format from your database that requires no repurposing or string manipulation for the web.

Retne
If the document is really just a document, then this idea is better. After all, XHTML can also be transformed with XSLT if required, combining the best of both solutions.
Mario Ortegón
The text in this case would just be a document which doesn't conform to a data scheme. Storing it as XHTML does make sense now you mention it, especially seeing as the main way of displaying it will be on the web.
Tom