views:

2322

answers:

7

I've found databases typically come in two flavors, your traditional row-oriented RDBMS or an object oriented database (OODBMS). However, in the mid 90s I remember, a new breed of databases showing up that were column oriented. Some of these were given the term 4GL, but I don't think it was a term that stuck.

What I'd like to know is the following:

  • What column oriented databases still exist?
  • What are the performance characteristics of these databases?
  • Are there any open source column oriented databases?
  • What platforms do they interoperate with (.NET, Java, etc)
  • What's been your general experience with them?

The two column oriented databases that I remember working with are FAME and KDB.

+3  A: 

HBase is an open-source column-oriented database system modelled on Google's BigTable.

Philip Morton
+2  A: 

Here's the different column oriented DBMS wiki has Column-Oriented DBMS Implementations

dragonjujo
+2  A: 

Sybase IQ is one I have heard of.

Tony Andrews
+1  A: 

Infobright
It's a column oriented MySQL engine
You can use (almost) all MySQL api's/interfaces/tools but it's column oriented.

It's open-source and has a free version.
It's very good for warehousing. I had a 10Gig fact table in SQL server. Infobright compressed it to 15MB.

Julian de Wit
+1  A: 

Sybase IQ is column oriented . All columns are automatically indexed when you create a table and data is nicely compressed in the columns.

It's a nice OLAP database (...data warehouse) but I would not recommend it for any kind of transaction processing as it is designed for data warehouse operations.

As for performance characteristics, SELECTS are very fast for large volumes of data but INSERT / UPDATE / DELETEs are very slow compared to a standard OLTP DB such as Sybase ASE for example. Table locking is also very different to a OLTP database so expect exclusive table locks for write operations (INSERTS etc) when working in the MAIN data store.

Otherwise it supports T-SQL (Sybase version) and Watcom SQL.

Cheers,

Kevin

Kevin Horgan
+2  A: 

Also check out Michael Stonebraker's C-store: C-store (includes links to source code and research paper)

The paper contains an excellent viewpoint on column oriented databases, that should answer most of your questions.

To quote the paper,

"Most major DBMS vendors implement record-oriented 
storage systems, where the attributes of a record (or tuple) 
are placed contiguously in storage.  With this row store 
architecture, a single disk write suffices to push all of the 
fields of a single record out to disk.  Hence, high 
performance writes are achieved, and we call a DBMS 
with a row store architecture a write-optimized system.  

In contrast, systems oriented toward ad-hoc querying 
of large amounts of data should be read-optimized.  Data 
warehouses represent one class of read-optimized system,
in which periodically a bulk load of new data is 
performed, followed by a relatively long period of ad-hoc 
queries. Other read-mostly applications include customer 
relationship management (CRM) systems, electronic 
library card catalogs, and other ad-hoc inquiry systems.  In 
such environments, a column store architecture, in which 
the values for each single column (or attribute) are stored 
contiguously, should be more efficient.  This efficiency 
has been demonstrated in the warehouse marketplace by 
products like Sybase IQ [FREN95, SYBA04], Addamark  
[ADDA04], and KDB [KDB04]. In this paper, we discuss 
the design of a column store called C-Store that includes a 
number of novel features relative to existing systems."
Suvesh Pratapa
A: 

InfiniDB was recently released open source (GPLv2) by Calpont. It supports most of the MySQL API and stores data in a column-oriented fashion, and is optimized for large-scale analytic processing.

Wayne Young