views:

92

answers:

3

I want to have dynamic fields in my database records.

For example: I want to build an application for users to create their own forms.

A user could create the following forms:

Personal profile:

  • Full Name
  • Street
  • Job
  • Phone
    • Home
    • Work
    • Mobile
  • Interests
    • Interest 1
    • Interest 2
    • Interest 3

Work:

  • First name
  • Last name
  • Work
    • Department
      • Specialty 1
      • Specialty 2
    • Department
      • Specialty 1
      • Specialty 2

Countries:

  • United states
    • States
      • New York
        • Cities
          • New York
          • Foo
      • Alabama
        • Cities
          • Bar
          • Baz

As you can see this is a very dynamical structure:

  • No predefined number of fields
  • No predefined field names
  • User creates the structure of the database

So I wonder, what is the best database for this: relational (mysql/postgresql) or non-relational like mongodb/couchdb/cassandra or even xml databases like xindice?

And even if I choose non-relational databases for this, would it be smart to store security-critical information on it like customer and billing information?

I have heard people say that if your information require uniqueness then use relational database. "We don't want to risk to bill our customers twice". What problems on non-relational databases do they actually mean? Can't you store unique data in non-relational databases?

Another thing I was thinking about: Won't saving data in non-relational databases mean that I will have duplicated entries?

Consider this example:

Categories:

  • Office

    • Applications
      • Textmate
        • Author : Foobar
        • Price : 120
      • Foo
        • Author : Foobar
        • Price : 120
  • Office

    • Applications
      • Textmate
        • Author : Foobar
        • Price : 120
      • Bar
        • Author : Foobar
        • Price : 120

As you see there are situations for identical entries. How do non-relational databases handle these? Im so used to relational databases.

I sum up my questions:

  • What type of database for user-created database structure?
  • Are non-realtional databases for storing security critical information?
  • How do non-realtional databases handle duplications?
A: 

The database to choose depends more on what and how you want to query something moreso than what you want to store. All the DBs will let you store pretty much whatever you want.

RDBMS are particularly good at querying based on the relational model, and doing so reasonably arbitrarily. Through ad hoc filters and joins, you can do all sorts of magic.

The NOSQL DBs tend to be less flexible on their queries, but do well at other tasks (such as working better on "unstructured" data for example).

Given what you've posted here, I'd just use a SQL database and define the tables as the user wants them defined. Set up the indexes, set up the queries. Sounds like a real no brainer to me. SQL DBs handle all of that "defining fields on the fly" stuff handily, because...that's what they do. So use that.

Will Hartung
Really? I wouldn't count that as a strength for relational DBs. The alternatives are much better at handling dynamic data. The only reason you'd want to go with a relational DB for something like this is support--there is a lot more reading and support for relational databases than there is for the NoSQL alternatives.
musicfreak
"SQL DBs handle all of that "defining fields on the fly" stuff handily". Could you show me link to that feature of defining fields/columns on the fly? "The NOSQL DBs tend to be less flexible on their queries, but do well at other tasks (such as working better on "unstructured" data for example)". Isn't this an example of just that, unstructured data?
never_had_a_name
You could issue `ADD COLUMN` statements in response to user actions to "define fields on the fly", but that's not a good idea. SQL database implementations don't necessarily handle hundreds of columns well.
Joey Adams
@joey adams. I agree! And then since each entry/row has it's own fields/columns I really shouldn't add new columns to a whole table!
never_had_a_name
@ajsie: Actually, I don't think that's the case. Adding columns (in PostgreSQL, at least) doesn't go through and fatten all of the table rows, so I doubt that many nulled columns in a new row will take up extra space. However, PostgreSQL has a [maximum columns per table of 2500 - 1600 depending on column types](http://www.postgresql.org/about/).
Joey Adams
+1  A: 

If your data fits the relational model pretty well, but you need to store some dynamically formatted data that isn't enormous, then you will probably be better off storing JSON, XML, or similar into a column. Although you lose some advantages of first-class SQL typing by doing this (indexing, foreign key constraint checking, type checking, etc.), it's good for storing dynamically-structured documents when your queries don't care much about their internals.

If you're interested in storing mostly relational data with a touch of JSON/XML/etc., I recommend looking to PostgreSQL. PostgreSQL has an XML data type, but I don't recommend using it since I hate XML :P . Nobody's stopping you from storing JSON in a TEXT field, but PostgreSQL will soon have a JSON data type with supporting functions. The hstore contrib module provides an efficient way to store key/value pairs, and also provides full-text index support.

Although shoving JSON or similar into a SQL database column flies in the face of the relational model, you're usually better off doing it anyway (when it makes sense!). Otherwise, you have to explain the entire schema of your application to the database, resulting in a lot of SQL and database mapping code that really doesn't do anything.

Joey Adams
isn't pg getting a JSON datatype soon? I thought I heard something about it in 9.0 or 9.1
xenoterracide
Storing the json as a text field seems pretty pointless from a data storage point of view. Then you have to serialize and deserialize to make sure the json is valid. I don't like this solution. For the requirements it seems a noSQL solution like MongoDB or CouchDB is far superior.
Amala
+1  A: 

I highly recommend you check out CouchDB for this.

  1. You communicate with CouchDB using a straightforward REST API. In other words, it is "Made of the Web" rather than simply being a backend db like MongoDB and others. CouchDB can actually serve the forms and receive submissions since has a built-in web server.
  2. Being a JSON document store it is well-suited for storing structured-yet-schemaless data. (Forms and their submissions are really documents and it makes more sense to model them this way, IMO.)
  3. You could easily store a JSON document that describes each web form in the same "bucket" as the form submissions. (CouchDB can even parse form POSTs and turn them into JSON docs however you see fit. Having it automatically timestamp form submissions, for example, is simple.)
  4. You could write what is known as a "_show" function to actually generate each form's html code within CouchDB. Also check out "_update" and validation functions.
  5. It has the security features you would need.
  6. Document conflicts can be identified easily. Even better, CouchDB automatically determines a "winning" version of the document but you will continue to have access to the "losing" document versions (until you tell CouchDB to compact the database, which removes old revisions.)
    • Regarding uniqueness: instead of having CouchDB generate unique doc _id's you'll want to assign an _id that truly represents a unique form submission. If each user is only allowed one submission per form then use something along these lines for each JSON document created from a form submission: submission:user:5:form:a3df2a712

Using CouchDB you can avoid the pain of dynamically creating unique tables for every form a user might create.

duluthian
I have looked into both MongoDB and CouchDB. They seem to be a good solution for this type of dynamical structure. Have you tried the former one also?
never_had_a_name
MongoDB is very different. For what you want to do--which involves serving and processing web form responses--couchdb seems a much better fit. MongoDB does not have a built-in web server and isn't able to internally handle forms. Of course, you can always use MongoDB for just storing JSON data. But CouchDB gives you a number of tools built-in that you otherwise have to write yourself.
duluthian
You may want to check out how CouchApps work. (http://www.couchapps.org) CouchApps are a perfect example of something MongoDB can't provide because it's "just" a database.
duluthian