I'm building a frontend for an ECMS. The main purpose of the frontend is to render HTML pages based on content stored in a database.

I have one table per page; each table consists of an XML field with the page content and some metadata. (I chose to store XML in the database to leverage features offered by SQL Server 2008, such as XQuery and XSD validation.)

I'm thinking about how to store the links between pages so that the URL scheme can be changed in the future (I'm not entirely sure that this is required).

So far I have identified two types of internal links:

  • Normal links, like an <a> tag surrounded by text.
  • Lists of links with embedded content, like a list of related news items (for each item I need to display the title with the link and the abstract).

Currently I see the following solutions.

For a normal link:

  • Store it surrounded by a token like $PAGE_LINK[unique identifier of the content] and replace such tokens with the correct links during the rendering phase. (This approach is taken from Livesite, an ECMS that I have worked with.)
  • Store it as <a href="local://unique identifier of the content"> and replace the content of the href attribute with the proper link during rendering.
  • Assume that the URL scheme is static.
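To make the first two options concrete, here is a minimal sketch of the render-time substitution. It assumes the token stands in for the URL itself, and the `URLS` table, `resolve_url` and `render_links` are hypothetical names invented for illustration; in a real system the lookup would go to the database or the routing layer.

```python
import re

# Hypothetical lookup table: content id -> current public URL.
# In practice this would be a database or routing-layer query.
URLS = {"233": "/news/2010/launch", "231": "/news/2010/beta"}

# Matches $PAGE_LINK[<id>] tokens.
TOKEN = re.compile(r"\$PAGE_LINK\[([^\]]+)\]")
# Matches href="local://<id>" attributes.
LOCAL_HREF = re.compile(r'href="local://([^"]+)"')


def resolve_url(content_id):
    # Unknown ids fall back to "#" so a broken link is harmless but visible.
    return URLS.get(content_id, "#")


def render_links(html):
    # Replace both storage formats with the current URL of the target.
    html = TOKEN.sub(lambda m: resolve_url(m.group(1)), html)
    html = LOCAL_HREF.sub(lambda m: 'href="%s"' % resolve_url(m.group(1)), html)
    return html
```

Either way the stored content only ever holds the identifier, so changing the URL scheme means changing `resolve_url` and nothing else.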

And for a list of links with embedded content:

  • Store it as a many-to-many relation in the database (to leverage my ORM).
  • Store it as an XML list of unique content identifiers inside the page's XML content (this gives ordering for free and is a bit more flexible). Example:

    <related_articles>
      <resource type="news" content_id="233"/>
      <resource type="news" content_id="231"/>
      <resource type="blog_post" content_id="233"/>
    </related_articles>
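As a rough illustration of the XML option, the list could be resolved at render time along these lines. The `CONTENT` store and `related_items` function are hypothetical; the lookup is keyed by type and id together because the example above reuses id 233 for two different types.

```python
import xml.etree.ElementTree as ET

# Hypothetical content store keyed by (type, content_id).
# In practice each lookup would hit the table for that content type.
CONTENT = {
    ("news", "233"): {"title": "Launch", "abstract": "We launched.", "url": "/news/233"},
    ("news", "231"): {"title": "Beta", "abstract": "Beta is open.", "url": "/news/231"},
    ("blog_post", "233"): {"title": "Why XML", "abstract": "Some notes.", "url": "/blog/233"},
}


def related_items(related_xml):
    """Return title/abstract/url records in the order given in the XML."""
    root = ET.fromstring(related_xml)
    items = []
    for res in root.iter("resource"):
        entry = CONTENT.get((res.get("type"), res.get("content_id")))
        if entry is not None:  # silently skip references to missing content
            items.append(entry)
    return items
```

Document order in the XML becomes display order, which is the "ordering for free" mentioned above.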
    

The questions: Do you know of any other approaches to this problem? How is it done in leading CMS systems such as Drupal, Joomla, N2 and WordPress? Which of these solutions would you choose?

+1  A: 

I guess it depends for a large part on what functionality you expect from links. I've used a couple of CMSs in the past, and one of the things I loved about some of them is that they could block removing or deactivating pages that had links pointing to them. That way I dodged a lot of helpdesk calls. Another great feature was reporting on broken links if you did allow people to break links.

They had a default linking scheme you could configure, and they pattern-matched links in the content against it to find the linked content. This was done when content was saved or published. For links generated by the system, the internal code registered each link in the same index as it generated it.
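A minimal sketch of such a link index, assuming links follow a configurable pattern like `/content/<id>` (the pattern, the `LinkIndex` class and its method names are made up for illustration and are not taken from any particular CMS):

```python
import re
from collections import defaultdict

# Assumed linking scheme: internal links look like href="/content/<id>".
LINK_PATTERN = re.compile(r'href="/content/(\d+)"')


class LinkIndex:
    def __init__(self):
        self._outbound = defaultdict(set)  # source page -> target pages
        self._inbound = defaultdict(set)   # target page -> source pages

    def on_save(self, page_id, html):
        """Re-index the links of a page when it is saved or published."""
        # Drop the links recorded for the previous version of the page.
        for target in self._outbound.pop(page_id, set()):
            self._inbound[target].discard(page_id)
        targets = set(LINK_PATTERN.findall(html))
        self._outbound[page_id] = targets
        for target in targets:
            self._inbound[target].add(page_id)

    def linked_from(self, page_id):
        """Pages that currently link to page_id."""
        return sorted(self._inbound.get(page_id, set()))

    def can_delete(self, page_id):
        """Block removal while other pages still link here."""
        return not self._inbound.get(page_id)
```

The same index answers both questions: `can_delete` covers the blocking behaviour, and any target in the index that no longer exists is a broken link to report.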

A pitfall to avoid is caching. If you cache rendered pages, you must invalidate the cached copies whenever a link they contain changes in any way.
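One way to handle this, sketched under the assumption that the renderer knows which content ids each page links to (`PageCache` and its methods are hypothetical names):

```python
class PageCache:
    def __init__(self):
        self._html = {}   # page id -> rendered HTML
        self._links = {}  # page id -> ids of the content it links to

    def put(self, page_id, html, link_targets):
        # Remember the rendered page together with its link dependencies.
        self._html[page_id] = html
        self._links[page_id] = set(link_targets)

    def get(self, page_id):
        return self._html.get(page_id)

    def on_link_changed(self, target_id):
        """Evict every cached page that links to target_id."""
        stale = [p for p, targets in self._links.items() if target_id in targets]
        for page_id in stale:
            del self._html[page_id]
            del self._links[page_id]
        return sorted(stale)
```

Calling `on_link_changed` whenever a page is moved, renamed or removed keeps stale URLs from being served out of the cache.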

HTH.

Jonathan van de Veen