This question is partially related to an older question (Any CMS is Google App Engine compatible?), but is slightly more general. It seems that in most CMSs, the most fragile point of failure is the database. Traditional database implementations scale poorly and struggle to handle unforeseen spikes of traffic. Since Google App Engine was designed to help even small businesses overcome that problem, I had the same question that was asked earlier this year, which received less-than-satisfactory answers.

But more generally, where are the CMS projects that support NOSQL databases? Looking over Wikipedia's list of CMS platforms, I see without much effort that only traditional RDBMS are supported by every single vendor on the list. I would have expected to see at least one or two projects handling CouchDB or similar engines. I understand the complexities of implementing a NOSQL solution to a problem that is typically solved using the relations cleanly expressed in any RDBMS, but there seems to be a rather wide market gap.

Since databases are, today, easily outsourced to Google, Amazon, and others which use NOSQL models, I am amazed that there are not more projects actively pursuing this path. Am I simply not aware? Can someone please point me to projects that have real momentum that are developing on this path? I'm looking for two things:

  • a CMS that has as its backbone a NOSQL database enabling easy database outsourcing (hosted MySQL clusters and similar solutions are not what I'm looking for)
  • a project that is built to run on either a PaaS architecture like Google App Engine or an IaaS architecture like Amazon EC2

Any pointers in that direction would be most welcome.

A: 

App Engine Site Creator "is designed to be a highly extensible and light weight content management system. It features a user-friendly content editing interface, a high degree of flexibility and customization, a file sharing mechanism, full support for page hierarchies, and fine-grained mechanisms for user management and access controls. It is built to run on Google App Engine and to scale well with minimal engineering maintenance."

I haven't used it, but I think it at least claims to be what you want.

aem
I did see that project, and I like where they're heading, but they seem more like a minimal wiki/Google Sites kind of project based on the limited available documentation. That's fine, but I'm hoping that a more heavyweight application in the vein of Drupal/Joomla will enter this arena. Bookmarked it though.
Michael
A: 

What you will find is that whilst the database itself may be the reason for a slow-performing site, you need to think about the site as a whole.

CMS systems use a database to store the content of pages simply so that they are easily editable. In high-traffic scenarios, there is usually no appreciable change in content from one user to the next. As such, most CMS systems also provide caching mechanisms to reduce the load of interacting with the database. A typical flow of this in action is:

1. Is the page already cached in memory/disk?
2. If already cached, goto step 5.
3. If not, access the database and format the page.
4. Store the page to memory/disk.
5. Retrieve that page from memory/disk.
6. Serve the page.
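The steps above can be sketched roughly as follows. This is only a minimal in-memory illustration, not how any particular CMS does it: `PageCache`, `render_from_db` and `db_hits` are made-up names, and a real system would also need expiry and invalidation when content is edited.

```python
from typing import Callable, Dict


class PageCache:
    """Hypothetical page cache: only touch the database on a cache miss."""

    def __init__(self, render_from_db: Callable[[str], str]) -> None:
        # render_from_db stands in for "access the database and format the page".
        self._render_from_db = render_from_db
        self._store: Dict[str, str] = {}
        self.db_hits = 0  # counts how often the database was actually used

    def get_page(self, path: str) -> str:
        # Steps 1-2: is the page already cached? If so, skip to retrieval.
        if path not in self._store:
            # Step 3: not cached, so build the page from the database.
            self.db_hits += 1
            page = self._render_from_db(path)
            # Step 4: store the rendered page.
            self._store[path] = page
        # Steps 5-6: retrieve the cached copy and serve it.
        return self._store[path]


if __name__ == "__main__":
    cache = PageCache(lambda path: "<h1>" + path + "</h1>")
    cache.get_page("/about")
    cache.get_page("/about")
    print(cache.db_hits)
```

However many times the same path is requested, the database is only consulted on the first request for it.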

Obviously, things get a little tricky if you want to show custom login details on the page. However, by striking a judicious balance between reducing database load and caching all or some parts of the page, the effect of being slashdotted/dugg can be reduced significantly.

Don't forget that you can also specify cache headers (Cache-Control) in your returned pages so that the same user returning to the page can reuse the previously sent response. See this link for some information.
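As a rough idea of what that header might look like, here is a small sketch. The `cache_headers` helper and the 300-second lifetime are assumptions for illustration only; `public`, `private`, `max-age` and `no-cache` are standard Cache-Control directives.

```python
from typing import Dict


def cache_headers(max_age_seconds: int, personalized: bool = False) -> Dict[str, str]:
    """Hypothetical helper that picks a Cache-Control value for a page."""
    if personalized:
        # Pages with per-user details (e.g. login info) should not be shared
        # by intermediate caches and should be revalidated before reuse.
        return {"Cache-Control": "private, no-cache"}
    # Anonymous pages may be reused by browsers and proxies until they expire.
    return {"Cache-Control": "public, max-age=" + str(max_age_seconds)}


if __name__ == "__main__":
    print(cache_headers(300))
    print(cache_headers(300, personalized=True))
```

The right lifetime depends entirely on how often the content changes, so treat the value here as a placeholder.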

So, to answer your question: the best way to reduce database issues in high-traffic scenarios is not to use the database at all :)

cmroanirgo
I regret now differentiating between the brittleness of the database and the system as a whole, because having systems designed to be scalable from the start is really where my interests lie. Cache management is a great strategy that will certainly help with load, but it will not by itself solve scaling issues for you. And, as you yourself point out, there is almost always going to be some amount of dynamic content for an interactive site. What I hope for is a CMS that scales cleanly to arbitrary numbers of instances and whose database is not the part I worry about most.
Michael
+1  A: 

When you say NoSQL, I'm assuming you mean solutions such as CouchDB, MongoDB, Cassandra, etc. I personally don't know if there are any CMS solutions that support these, but that doesn't mean there aren't.

However, there are plenty of CMS systems using Apache Jackrabbit (an implementation of JCR - Java Content Repository), which is not a relational datastore. As indicated on their site:

A content repository is a hierarchical content store with support for structured and unstructured content, full text search, versioning, transactions, observation, and more.

There are many CMS solutions that use JCR/Jackrabbit as the datastore. I personally use Brix-CMS with my Apache Wicket projects. There is also the very capable Hippo CMS.

Perhaps this isn't the type of solution you are looking for (especially if you aren't a Java developer), but in many ways JCR fits the needs of a CMS better than most NoSQL solutions. Since you refer to GAE, I guess there's a 50% chance you would consider Java.

I have not used GAE myself, but I've read that others have Wicket applications running within it. You would need to check whether Hippo, Brix, or some other JCR implementation would run within it.

Good luck!

Tauren
+1  A: 

There's a MongoDB-based CMS called Harmony currently in private beta. It's being developed in part by John Nunemaker, the guy behind MongoMapper. There's a blog post up on RailsTips where he talks about the advantages of using MongoDB for a problem like this. I don't know anything about how it'll be hosted, etc. since I'm not in the private beta, but it's certainly a step in the NOSQL direction, and looks quite interesting.

Emily
A: 

Avinu Beyond the Cloud uses MongoDB and the Vork framework: http://www.Avinu.org

Mr. PHP
+1  A: 

Check out Drupal 7 with the MongoDB integration module.

Paul Strugger
Nothing here was exactly what I was looking for, but this very recent update wins out for informing me that some level of NoSQL integration is coming to my current favorite CMS. I still hold out high hopes for a fully integrated solution that is redeployable by anyone.
Michael