I would like to store very large sets of serialized Ruby objects in a database (MySQL).

1) What are the pros and cons?
2) Is there an alternative?
3) What technical difficulties arise if the objects are really big?
4) Will I face memory issues while serializing and deserializing if the objects are really big?

+2  A: 

As an alternative, you could use one of the many NoSQL databases. If you can serialize your objects to JSON, they should be easy to store in CouchDB.
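
A minimal sketch of the JSON round trip this relies on, using only Ruby's standard `json` library. The `Report` class and its `to_h`/`from_h` helpers are hypothetical names made up for illustration, and the step of actually sending the text to CouchDB is left out.

```ruby
require "json"

# Hypothetical example class; not part of the original question.
class Report
  attr_reader :title, :scores

  def initialize(title, scores)
    @title = title
    @scores = scores
  end

  # Convert to plain data types that JSON can represent.
  def to_h
    { "title" => title, "scores" => scores }
  end

  # Rebuild an object from a parsed JSON hash.
  def self.from_h(hash)
    new(hash["title"], hash["scores"])
  end
end

report = Report.new("Q1", [3, 5, 8])

# Text that could be stored as a document or in a text column.
json_text = JSON.generate(report.to_h)

# Reading it back yields plain hashes and arrays, so the object is rebuilt explicitly.
restored = Report.from_h(JSON.parse(json_text))
```

The explicit `to_h`/`from_h` pair keeps the stored format independent of Ruby internals, at the cost of writing the mapping by hand.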

Farrel
I agree: JSON serialization is better if you want to store human-readable data. However, if you want to store the objects as binary, the database can also be a good solution, and binary serialization is typically much faster than JSON serialization.
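
For comparison, a small sketch of binary serialization with Ruby's built-in `Marshal`. The `Point` struct is a hypothetical example type. The result is a binary string, so in MySQL it would belong in a BLOB-type column rather than a text column. Relative speed depends on the data, so it is worth measuring with your own objects.

```ruby
# Marshal is part of Ruby core, so no require is needed.
Point = Struct.new(:x, :y) # hypothetical example type

points = [Point.new(1, 2), Point.new(3, 4)]

# Produces a binary String (ASCII-8BIT encoding), not human-readable text.
blob = Marshal.dump(points)

# Restores full Ruby objects, including their classes.
# Only load Marshal data from sources you trust.
restored_points = Marshal.load(blob)
```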
Niklaos
+3  A: 

Pros

  • Allows you to store arbitrarily complex objects
  • Simplifies your DB schema (no need to represent those complex objects)

Cons

  • Complicates your models and data layer
  • You may need to handle multiple versions of serialized objects (the object definition changes over time)
  • Inability to directly query serialized columns
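
One way to cope with the versioning point above is to store a version number next to the payload and upgrade old rows when reading them. This is only a sketch: the version numbers, the field names, and the helpers `serialize_user` and `deserialize_user` are all hypothetical.

```ruby
require "json"

CURRENT_VERSION = 2 # hypothetical schema version

# Wrap the payload in an envelope that records which version wrote it.
def serialize_user(name:, emails:)
  JSON.generate({ "version" => CURRENT_VERSION,
                  "data" => { "name" => name, "emails" => emails } })
end

# Upgrade older payloads on read. Version 1 is assumed to have stored a
# single "email" string instead of an "emails" array.
def deserialize_user(text)
  envelope = JSON.parse(text)
  data = envelope.fetch("data")
  case envelope.fetch("version")
  when 1
    { "name" => data["name"], "emails" => [data["email"]] }
  when 2
    data
  else
    raise ArgumentError, "unknown version #{envelope['version']}"
  end
end

old_row = '{"version":1,"data":{"name":"Ann","email":"ann@example.com"}}'
new_row = serialize_user(name: "Bob", emails: ["bob@example.com"])
```

Calling `deserialize_user` on either row returns a hash in the current shape, so the rest of the application only ever sees one version.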

Alternatives

As the previous answer stated, an object database or document oriented database may meet your requirements.

Difficulties

If your objects are quite large you may run into difficulties when moving data between your DBMS and your program. You could minimize this by storing the object data separately from the metadata that describes it, so that listing or filtering objects only touches the small metadata rows.
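
A rough sketch of that separation, with two in-memory hashes standing in for two tables. In MySQL these might be a small metadata table and a second table holding only a BLOB column. All names here (`METADATA`, `PAYLOADS`, `store`, `ids_of_kind`, `load_object`) are hypothetical.

```ruby
METADATA = {} # id => small, queryable attributes
PAYLOADS = {} # id => serialized bytes

# Store the serialized object and its descriptive attributes separately.
def store(id, object, kind:)
  blob = Marshal.dump(object)
  PAYLOADS[id] = blob
  METADATA[id] = { kind: kind, bytes: blob.bytesize }
end

# Filtering reads only the metadata and never touches the large payloads.
def ids_of_kind(kind)
  METADATA.select { |_id, meta| meta[:kind] == kind }.keys
end

# The payload is fetched and deserialized only when it is actually needed.
def load_object(id)
  Marshal.load(PAYLOADS.fetch(id))
end

store(1, { "pixels" => Array.new(1_000, 0) }, kind: "image")
store(2, { "text" => "hello" }, kind: "note")
```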

Memory Issues

Running out of memory is definitely a possibility with large enough objects. It also depends on the type of serialization you use. To know how much memory you'd be using, you'd need to profile your app. I'd suggest ruby-prof, bleak_house or memprof.
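
If the large object is really a collection, one way to keep peak memory down is to serialize each element as its own frame instead of dumping the whole collection at once. The sketch below uses a temporary file and `Marshal.dump(obj, io)` / `Marshal.load(io)` from Ruby core. The record layout is made up for illustration.

```ruby
require "tempfile"

# Hypothetical records; in practice these would be produced one at a time.
records = Array.new(5) { |i| { "id" => i, "payload" => "x" * 10 } }

file = Tempfile.new("records")
file.binmode

# Write each record as a separate Marshal frame rather than one giant dump.
records.each { |record| Marshal.dump(record, file) }
file.rewind

# Read back one record at a time, so only one is deserialized per iteration.
loaded_ids = []
until file.eof?
  record = Marshal.load(file)
  loaded_ids << record["id"]
end

file.close
file.unlink
```

Whether this helps depends on whether your data can be split into independent pieces, so it is worth profiling before and after.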


I'd suggest using a non-binary serialization wherever possible. You don't have to use only one type of serialization for your entire database, but mixing several could get complex and messy.

If this is how you want to proceed, an object-oriented DBMS like ObjectStore or a document-oriented DBMS like CouchDB would probably be your best option. They're better designed for and targeted at object serialization.

TheClair
A: 

Bear in mind that generically serialized objects can take up far more disk space than the same data saved and loaded in a format of your own. Disk I/O is very slow, so even for complex objects that take a lot of processing power to build, it may actually be faster to load the original file(s) and process them on each startup, or to save the data in a form that is easy to load.
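
As a small illustration of the size difference, the sketch below compares `Marshal.dump` with a hand-rolled layout built with `Array#pack` for a list of floats. The sample data is hypothetical, and the result depends heavily on the data: for other kinds of values, such as small integers, `Marshal` can be the more compact of the two.

```ruby
# Hypothetical numeric data: 1,000 floating-point samples.
samples = (1..1_000).map { |i| Math.sqrt(i) }

# General-purpose serialization.
generic = Marshal.dump(samples)

# Custom layout: each value as an 8-byte little-endian double.
packed = samples.pack("E*")

# The custom format has to be decoded by hand as well.
unpacked = packed.unpack("E*")

puts "Marshal: #{generic.bytesize} bytes, packed: #{packed.bytesize} bytes"
```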

Harry