When building objects that make use of data stored in an RDBMS, it's normally pretty clear what you're getting back, as dictated by the tables and columns being queried. However, when dealing with NoSQL, document-based systems, it's less clear what is being retrieved.

What are common methods of keeping track of the structure in which data is stored?

A: 

It depends on the driver. With the NORM driver you can "serialize" and "deserialize" an instance of an object in and out of the db. It will throw an error when there is an extra field in the db that isn't present in the class definition. This is the default behaviour of NORM, but they are adding the possibility to make it more flexible.

Read here: http://groups.google.com/group/norm-mongodb/browse_thread/thread/31102ec553a50e19
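
To make the strict behaviour concrete, here is a rough sketch of the same idea in Python. This is not NORM's API (NORM is a separate driver with its own interface); the `User` class, the `deserialize` helper and the `strict` flag are all made up for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class User:
    # Hypothetical model class; the field names are illustrative only.
    name: str
    email: str


def deserialize(cls, document, strict=True):
    """Build a cls instance from a dict-shaped document."""
    known = {f.name for f in fields(cls)}
    # "_id" is the key MongoDB adds to every document, so it is ignored here.
    extra = set(document) - known - {"_id"}
    if extra and strict:
        # Strict mode: refuse documents carrying fields the class does not declare.
        raise ValueError(f"unexpected fields for {cls.__name__}: {sorted(extra)}")
    # Non-strict mode: silently drop anything the class does not declare.
    return cls(**{k: v for k, v in document.items() if k in known})
```

Calling `deserialize(User, doc)` on a document with an undeclared field raises, while `strict=False` drops the extra field, which is roughly the kind of flexibility mentioned above.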

Theo
A: 

Not only does this depend on what database you're using, but it also depends on the language/framework you're coding with.

Most opinionated frameworks, Rails for example, expect an ODM of some sort where you define a schema that is enforced in your models. Other frameworks let you do whatever you want, which puts you at risk of having data in multiple formats and not knowing what to do with it.

For MongoDB I've toyed with the notion of a soft schema, where every collection (table) has a document with a title of "schema" and defines the different elements and their datatypes in an embedded array called "definition." This allows me to generate dynamic scaffolds based on each collection, and can come in very handy when integrating with non-ODM platforms - in my case, Joomla.
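
A minimal sketch of what such a soft schema could look like, using plain Python dicts so it runs without a database. The field names inside "definition", the type names and the `validate` helper are assumptions for illustration, not a fixed format.

```python
# Hypothetical "schema" document stored alongside the data in a collection.
SCHEMA_DOC = {
    "title": "schema",
    "definition": [
        {"name": "title", "type": "string"},
        {"name": "hits", "type": "int"},
    ],
}

# Mapping from the type names used above to Python types (an assumption).
TYPE_MAP = {"string": str, "int": int, "float": float, "bool": bool}


def validate(document, schema_doc):
    """Return a list of problems; an empty list means the document conforms."""
    problems = []
    for element in schema_doc["definition"]:
        name, expected = element["name"], TYPE_MAP[element["type"]]
        if name not in document:
            problems.append(f"missing field: {name}")
        elif not isinstance(document[name], expected):
            problems.append(f"{name}: expected {element['type']}")
    return problems
```

With a real MongoDB collection, the schema document could be fetched with something like pymongo's `collection.find_one({"title": "schema"})` and passed to `validate` before each insert.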

Another approach is to store those schema definitions in a separate collection called schemas or schemata or some such.

You most certainly want to lock down some sort of schema in your code to ensure your data is in a predictable format; this is also important to address whenever your schemas change, and they invariably will.
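
One way to handle schema changes in code, sketched under assumptions: each document carries a hypothetical `schema_version` field, and old documents are upgraded as they are read. The field name, the version numbers and the name-splitting change are invented for the example.

```python
CURRENT_VERSION = 2


def upgrade_v1_to_v2(doc):
    # Hypothetical change: version 1 stored a single "name"; version 2 splits it.
    first, _, last = doc.pop("name", "").partition(" ")
    doc.update(first_name=first, last_name=last, schema_version=2)
    return doc


# One upgrade function per old version, keyed by the version it upgrades from.
UPGRADES = {1: upgrade_v1_to_v2}


def load(doc):
    """Return a copy of doc upgraded to CURRENT_VERSION."""
    doc = dict(doc)  # work on a copy so the caller's dict is not mutated
    version = doc.get("schema_version", 1)  # assume unversioned documents are v1
    while version < CURRENT_VERSION:
        doc = UPGRADES[version](doc)
        version = doc["schema_version"]
    return doc
```

The rest of the application then only ever sees documents in the current shape, whatever format they were written in.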

spacemonkey