I'm currently trying to pick a database vendor.

I'm just seeking some personal opinions from fellow database developers out there.

My question is especially targeted towards people who:

1) have used a main-memory database (MMDB) that supports replicating to disk (hybrid) before (e.g. ExtremeDB)

or

2) have used Versant Object Database and/or Objectivity Database and/or Progress ObjectStore

And the question is really: based on your experience, could you recommend a database vendor that would suit my application?

My application is a commercial real-time (read: high-performance) object-oriented C++ GIS kind of app, where we need to do a lot of lat/lon searches (e.g. given an area, find all matching targets within that area, using an R-Tree index).
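To make the query shape concrete, here is a minimal sketch of the "find all targets within an area" search. It uses a plain linear scan over a lat/lon bounding box, not an R-Tree; an R-Tree index would replace the scan with an index lookup. The `Target` and `Area` types and the `find_in_area` function are hypothetical names, and the box is assumed not to cross the antimeridian.

```cpp
#include <vector>

// Hypothetical record type: an id plus a position in degrees.
struct Target {
    int id;
    double lat;
    double lon;
};

// Hypothetical query area: an axis-aligned lat/lon box (inclusive bounds).
// Assumption: min_lon <= max_lon, i.e. the box does not cross the antimeridian.
struct Area {
    double min_lat;
    double max_lat;
    double min_lon;
    double max_lon;
};

// True if the target lies inside the box, bounds included.
bool contains(const Area& a, const Target& t) {
    return t.lat >= a.min_lat && t.lat <= a.max_lat &&
           t.lon >= a.min_lon && t.lon <= a.max_lon;
}

// Linear scan: O(n) per query. A spatial index such as an R-Tree
// would avoid visiting targets that are far from the area.
std::vector<int> find_in_area(const std::vector<Target>& targets, const Area& area) {
    std::vector<int> ids;
    for (const Target& t : targets) {
        if (contains(area, t)) {
            ids.push_back(t.id);
        }
    }
    return ids;
}
```

The scan is only a baseline for checking results; at the data rates described below, the index is what makes the search affordable.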

The types of data that I would like to store in the database are all modeled as objects, and they make use of std::list and std::vector, so naturally an object database seems to make sense. I have read through enough articles to convince myself that a traditional RDBMS probably isn't what I'm really looking for in terms of

  1. performance (joins or multiple tables for dynamic-length data like list/vector)
  2. ease of programming (impedance mismatch)

However, in terms of performance,

  1. Input data is being fed into the system at about 40 MB/s.

  2. Hence, the system will also be inserting into the database at a rate of roughly 350 inserts per second (where each object varies from 64KB to 128KB).

  3. The database will constantly be searched and updated by multiple threads.
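As a sanity check on the figures above, the insert rate and object sizes can be multiplied out. This is only a sketch, and it assumes 1 KB = 1024 bytes; the helper name `bytes_per_second` is made up for illustration.

```cpp
#include <cstdint>

// Sustained write bandwidth, in bytes per second, for a given
// insert rate and object size.
constexpr std::uint64_t bytes_per_second(std::uint64_t inserts_per_second,
                                         std::uint64_t object_bytes) {
    return inserts_per_second * object_bytes;
}

// Assumption: "KB" in the question means 1024 bytes.
constexpr std::uint64_t KiB = 1024;

// 350 inserts/s at the smallest and largest object sizes.
constexpr std::uint64_t kMinRate = bytes_per_second(350, 64 * KiB);   // 22,937,600 B/s
constexpr std::uint64_t kMaxRate = bytes_per_second(350, 128 * KiB);  // 45,875,200 B/s
```

So 350 inserts per second corresponds to roughly 23 to 46 MB/s of object data, which brackets the stated 40 MB/s input rate.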

From my understanding, all of the object DBs I have listed here use a cache for storing database objects. ExtremeDB claims that since it's designed especially for memory, it can avoid the overhead of caching logic, etc. See more by googling: Main Memory vs. RAM-Disk Databases: A Linux-based Benchmark

So I'm just a bit confused. Can object DBs be used in a real-time system? Are they as "fast" as an MMDB?

+4  A: 

Fundamentally, the difference between an MMDB and an OODB is that the MMDB has the expectation that all of its data is based in RAM, but persisted to disk at some point. An OODB is more conventional, in that there's no expectation of the entire DB fitting into RAM.

The MMDB can leverage this by giving up on the requirement that the persisted data must "match" the in-RAM data at all times.

Anything with persistence has to write the data to disk on update in some fashion.

Almost all DBs use some kind of log for this. These logs are basically "raw" pages of data, or perhaps individual transactions, appended to a file. When the file gets "too big", a new file is started.

Once the logs are properly consolidated into the main store, the logs are discarded (or reused).

Now, a crude, in-RAM DB can exist simply by appending transactions to a log file, and when it's restarted, it just loads the log into RAM. So, in essence, the log file IS the database.
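A minimal sketch of that "the log file is the database" idea, under some loud assumptions: records are text lines of the form `key<TAB>value`, keys and values contain no tabs or newlines, and `flush()` only hands data to the OS (a real system would also force it to disk). The class name `LogBackedStore` is hypothetical and does not reflect how any particular MMDB product works.

```cpp
#include <fstream>
#include <map>
#include <string>

// Crude in-RAM key/value store whose only on-disk form is an append-only log.
class LogBackedStore {
public:
    // On startup, replay the whole log into RAM; later records win.
    explicit LogBackedStore(const std::string& path) {
        std::ifstream in(path);
        std::string line;
        while (std::getline(in, line)) {
            const std::string::size_type tab = line.find('\t');
            if (tab == std::string::npos) {
                continue;  // skip a malformed or partially written record
            }
            data_[line.substr(0, tab)] = line.substr(tab + 1);
        }
        log_.open(path, std::ios::app);  // all later writes are appends
    }

    // An update is one append to the log plus one in-RAM change.
    void put(const std::string& key, const std::string& value) {
        log_ << key << '\t' << value << '\n';
        log_.flush();  // hands the record to the OS; this is not an fsync
        data_[key] = value;
    }

    // Reads never touch the disk.
    bool get(const std::string& key, std::string& out) const {
        const std::map<std::string, std::string>::const_iterator it = data_.find(key);
        if (it == data_.end()) {
            return false;
        }
        out = it->second;
        return true;
    }

private:
    std::map<std::string, std::string> data_;
    std::ofstream log_;
};
```

Constructing a second `LogBackedStore` on the same path after the first one is destroyed recovers the same state, because the constructor replays every record in order.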

The downside of this technique is that the longer it runs and the more transactions you have, the bigger your log/DB gets, and thus the longer the DB startup time. Ideally, though, you can also "snapshot" the current state, which makes all of the logs up to that point redundant and effectively compresses them.

In this manner, all that the routine operations of the DB have to manage is appending pages to logs, rather than updating other disk pages, index pages, etc. Since, ideally, most systems don't need to start up that often, perhaps startup time is less of an issue.

So, in this way, an MMDB can be faster than an OODB, which has a different contract with the disk: maintaining both logs and disk pages. An OODB can be slower even if the entire DB fits into RAM and is properly cached, simply because it incurs disk operations beyond the log writes during normal operation, whereas in an MMDB those operations happen as a "maintenance" task, which can be scheduled during downtime and/or quiet time.

As to whether either of these systems can meet your actual performance needs, I can't say.
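The snapshot step can be sketched as folding the log down to one record per live key. This assumes a text log of `key<TAB>value` lines in which later records override earlier ones; the function names `replay` and `snapshot` are hypothetical. It also relies on `std::rename` replacing an existing file, which holds on POSIX systems but not on every platform.

```cpp
#include <cstdio>
#include <fstream>
#include <map>
#include <string>

// Replays "key<TAB>value" records; later records override earlier ones.
std::map<std::string, std::string> replay(const std::string& path) {
    std::map<std::string, std::string> state;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type tab = line.find('\t');
        if (tab == std::string::npos) {
            continue;  // skip a malformed record
        }
        state[line.substr(0, tab)] = line.substr(tab + 1);
    }
    return state;
}

// Rewrites the log so that it holds exactly one record per live key.
// Startup after a snapshot replays the compacted file, not the full history.
bool snapshot(const std::string& path) {
    const std::map<std::string, std::string> state = replay(path);
    const std::string tmp = path + ".tmp";
    {
        std::ofstream out(tmp, std::ios::trunc);
        for (const auto& kv : state) {
            out << kv.first << '\t' << kv.second << '\n';
        }
        out.flush();
        if (!out) {
            return false;  // leave the original log untouched on failure
        }
    }
    // Assumption: rename replaces the destination (true on POSIX).
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}
```

Writing to a temporary file first means a crash in the middle of the snapshot leaves the original log intact.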

Will Hartung
Thanks for a very good explanation of the 2 different technologies, Will. Did you ever use any OODBMS or MMDBMS? If so, which ones? How did you like them compared to a traditional RDBMS?
ShaChris23
No, I've not used either in any "legitimate" way, and even then, none of my projects have had your bandwidth requirements, so even if I had, the experience may not have been valid.
Will Hartung
+1  A: 

The back ends of databases (reader and writer processes, caching, lock managing, txn log files, ACID semantics) are the same, so RDBs and OODBs are actually very similar here. The difference is the interface to the application programmer. Is your data model complicated, consisting of lots of classes with real inheritance relationships? Then OO is good. Is it relatively flat and simple? Then go RDB. What is the nature of the relationships? Is it pointer-like and set-like? Then go RDB. Is it more complicated, like (ordered) list, array, map? Then you should go OO. Also, do you have a stand-alone application with no need to integrate with other apps? Then OO is ok. Do you have to share data with other apps (i.e. several apps access the same database)? Then that's a deal-breaker for OO, and you should stick with RDB. Is the schema of your database stable or do you expect it to evolve frequently? OODBs are bad at schema evolution, so if you expect frequent changes, stick with RDBs.

Carsten Kuckuk
Thanks for all the "questions". My project definitely fits under OO. My industry is in real-time digital signal processing + navigation. So the data structure is quite complicated.
ShaChris23