views:

7171

answers:

26

I have been working with relational databases for sometime, but it only recently occurred to me that there must be other types of databases that are non-relational.

What are some examples of non-relational databases, and where/how are they used in the real world? Why would you choose to use a non-relational database over relational databases?

Edit: Two other similar questions have been mentioned in the answers:

+1  A: 

The PI historical database from OSIsoft is non-relational. It's only made to archive timestamped data. It's used a lot by industry, especially as the back-end database for all those 'dashboards'.

There's no need to be relational in it, since there are no joins.

Lance Roberts
+10  A: 
  • Flat file
    • CSV or other delimited data
    • spreadsheets
    • /etc/passwd
    • mbox mail files
  • Hierarchical
    • Windows Registry
    • Subversion using the file system, FSFS, instead of Berkley DB
Michael Burr
Are these really databases?
TomC
yes, just very limited. I was thinking Filemaker Pro was a version of a flat file system.
Lance Roberts
I wouldn't call a flat file a DB. DB should really be DBMS with an emphasis on MANAGEMENT SYSTEM. A DBMS should give you an interface to the data, it MANAGES the things it stores.
If a flat file is, then so is an piece of paper...
At least some people think so. The Microsoft Computer Dictionary, Fifth Edition, defines the registry as: A central hierarchical database used in Microsoft Windows... (http://support.microsoft.com/kb/256986). And http://en.wikipedia.org/wiki/Flat_file
Michael Burr
"If a flat file is, then so is an piece of paper..." like maybe a phone book or a card catalog index.
Michael Burr
Then perhaps these answers should be separated. A flat file is not while yes, the registry is a database.
TomC
I'd argue that there are possibly more flat file databases than any other type. Just because they're simple doesn't mean they aren't databases.
Michael Burr
I agree whit Mike B. Its been only last few years that people use rdbms's for everything.
Luka Marinko
A file is a database, a filesystem is a DBMS.
JacquesB
Guess every DBMS is eventually a flat file in back end. :o
Nrj
+3  A: 

I would think a flat-file database in Excel is non-relational and used by quite a few people.

It is really just a database table that can not be joined with other tables.

BoltBait
+12  A: 

A non-relational document oriented database we have been looking at is Apache CouchDB.

Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API. Among other features, it provides robust, incremental replication with bi-directional conflict detection and resolution, and is queryable and indexable using a table-oriented view engine with JavaScript acting as the default view definition language.

Our interest was in providing a distributed access user preferences store that would be immune to shape changes to which we could serialize preference objects from Java and access those just as easily with Javascript from a XULRunner based client application.

TomC
+2  A: 

Object-Oriented Databases are one interesting type of non-relational database.

The trading sector sometimes uses OO Databases since each deal/contract can look sort of like others in that category but have unique attributes as well. VERY difficult to represent it relationally.

+1  A: 

Any file or group of files that contains data but does not express relationships within that data is a non-relational database.

EBGreen
+5  A: 

Google App Engine Datastore :

The App Engine datastore is not a relational database. While the datastore interface has many of the same features of traditional databases, the datastore's unique characteristics imply a different way of designing and managing data to take advantage of the ability to scale automatically.

Guido
+1  A: 

dBase. Although it was marketed as such, it doesn't meet the requirements.

Chris Lively
+1  A: 

As an OO database, Intersystems Caché comes to mind. Some medical and library systems are built on this.

Tomalak
A: 

Also see this duplicate question

John Meagher
+1  A: 

RRDtool is designed to store and aggregate log data. You configure a sampling interval and feed data into it, then it returns time-based results. It's optimized for fixed-size storage, and starts aggregating past results after a time. For example, suppose you have a round-robin database with a 5-minute time interval. Even if you send it temperature data once per second, it still only stores the results in 5-minute increments. After a week, it averages those results into hourly values. After a month, hourly results are averaged into daily numbers, and so on.

RRDtool is commonly used as the backend for tools like Cricket and MRTG to track network and environmental data for months and years at a stretch.

Just Some Guy
+3  A: 

Dimensional Databases are great examples of non-relational databases. They are very commonly used for 'Business Dashboards'/'Business Intelligence' for KPIs and other types of aggregate or statistical data. They are usually populated from relational databases and can offer better performance in certain situations.

http://en.wikipedia.org/wiki/Dimensional_database

Mario
+5  A: 

All databases were originally non-relational, it was only with the arrival of DB2 and Oracle in the mid 1980's that they became common. Before that most databases where either flat files or hierarchical.

Flat files are inherently boring, but hierarchical database are much less so, particularly as DB2 was actually implemented on top of an hierarchical implementation (namely VSAM) in the first instance. VSAM is I believe still around on mainframe systems and is of some considerable importance.

DB/1 (so obscure now I can't even find a wikipedia link) was IBM's predecessor prime-time database to DB2 (hence the name). This was hierarchical - basically you had a file which consisted of any number or 'root' records, generally directly accessible by a key. Each root record could then have any number of child records off it, each of which could in turn have it's own children. The net effect is a index file or root records with each root being the top of a potential tree-like structure. Accessing the child records could be tricky - there were limitations of direct access so usually you ended up traversing the tree looking for the record you needed. A 'database' could have any number of these files in it, usually related by keys.

This had major disadvantages - not least that actually doing anything demanded a full program written - basically the equivalent of a days work for what we can now do in SQL in a few minutes. However it really did score on execution speed, in those days a mainframe had about the processing power of your iPhone (albeit optimized for data I/O) and poor DB2 queries could kill a multi-million dollar installation dead. This was never an issue with DB/1 and in a world where programmers were less expensive than CPU time it made sense.

Cruachan
I think mid-80s is a little late in the day - especially for Oracle. Late 70s or early 80s would be when they were first released.
Jonathan Leffler
Yes, but they had major restrictions which limited use until the mnid 80s. For instance Oracle didn't have relation integrity until about that time and DB2 didn't support outer joins. I once worked on a Personnel system on DB2 which jumped though all sorts of hoops to get around that one
Cruachan
+7  A: 

Any database that claims to be a "Berkley style Database" or "Key/Value" Database is not relational.

These databases are usually based off complex hashing algorithms and provide a very fast lookup O(1) based off a key, but leave any form of relational goodness to end user.

For example, in a relational database, you would normalize your structure and join many tables together to create a single result set.

In a key/value database, you would denormalize as much as possible and then use a unique key to lookup data.

If you need to pull data from two sources, you would have to join the resulting set together by hand.

FlySwat
One correction: BDBs (et al) generally use B-Trees, NOT hashing, by default. There are hash-based variants too, but that's not the only choice.
StaxMan
+4  A: 
  1. XML databases e.g. xindice
  2. Object databases e.g. db4o

Be aware that the concept of relational databases is highly contentious. Purists such as C. J. Date would argue that many databases in common use (such as Oracle and SQL Server) do not comply sufficiently with the relational model to be termed 'relational'.

johnstok
+1  A: 

For a graph based dbms you have neo4j

For a hierarichal dbms you have any standard filesystem or with "schema" support any LDAP implementation.

John Nilsson
A: 

You might like to look at this question concerning when you wouldn't use a relational database.

Rob Wells
+2  A: 

eXist-db is an xml database that has been around for a long time. It is particularly useful for xquery over tons of xml documents.

JarrettV
A: 
  1. In my company, www.smartsgroup.com, we have a proprietary database engine we call a "transaction log database". It is built on flat files, each file containing a sequence of "events" or "messages", in binary format, plus various indexes on this data and algorithms for reproducing the state of a stock exchange's orderbook. It is highly optimised for sequential updates and sequential access.

  2. In scientific applications, it is also common to use proprietary database engines rather than RDBMS's. I also worked for a company that has the world's largest database of EEG brain recordings: www.brainresource.com . There we use a flat file database, and it worked well for us.

  3. SmartsGroup also uses a temporal database, which is like a non-relational database table, except that we store a history of all changes to all fields so we can reproduce the state of a particular row on a particular date.

Tim Cooper
+11  A: 

An admittedly obscure but interesting alternative to the types of databases mentioned here is the associative database, such as Sentences, from LazySoft Technology. There is a free personal version you can download and try on your own. The Enterprise Edition is also free, but requires a request to the company.

Essentially, an associative database allows you to store information in much the same way as our brains do: as things and associations between those things. The name "Sentences" comes from the way this information can be represented in a subject-verb-object syntax:

  • Tom is brother to Laura
  • San Francisco is located in California
  • Mike has a credit limit of $10,000

A sentence may be the subject or object of another sentence:

  • (Bus 570 arrives at 8:15am) on Sundays
  • Mary says (the pie was baked by William)

So, everything can be boiled down to entities and associations.

There is, of course, much more to Sentences than what can be expressed here. I recommend that you take some time to read more about it in a white paper from LazySoft.

"The Associative Model of Data" is a book available in PDF format by Simon Williams, one of the creators of Sentences.

Jeff Jones
WAAAAY late +1 cause its a good answer and cause its a good excuse to say HI DAD!
Matthew Jones
A: 

The Wiki page for Dimensional Databases linked to above seems to have disappeared.

Some OLAP Systems have are backed by Multidimensional databases (MOLAP) these are used often in financial analysis. They afford interactive clients that allow one to navigate through the data at different levels of aggregation.

Tom
+1  A: 

There are many answers but they all end up being in one of two major categories:

  1. Navigational. Includes Tree/Hierarchy databases and Graph databases.

  2. Databases that break first normal form (multiple values). Includes Pick databases and Lotus Notes and its offspring like CouchDB.

EDIT: And of course key/value stores like BDB aren't relational, but that just goes without saying doesn't it? I mean, they're just key/value stores.

David Mathers
+3  A: 

Non-Relational databases just do not meet Codd’s requirements. Intersystems Caché seams a total re-write/re-design of the old Pick Operating system’s database. From the little I’ve read of Caché it appears to be a nicely done redesign. It permits .net programs to access the database just like it would SQL. Caché’s run’s the Pick OS programs without requiring any changes. By importing your Pick files into Caché you can still run your old green screen applications with it, but also write new programs using .net so you can migrate to Windows Applications without abandoning the years of data design you’ve already invested in. Here is some background on the Pick DB model . A Pick database uses totally variable length records, and fields. All table are keyed by a single unique key and are accessible without reading an index. Pick designed the system to use a Hashing algorithm that reads the item from disk on generally on the 1st physical read (assuming system maintenance was performed correctly). Fields in Pick are Non-Typed. All data is stored as string and Casting is up to the programmer. Nulls are stored as a empty string, thus a null does not take up disk space as it does in SQL. There is no need for Foreign Keys. In the ‘Relational world’ the DBA has to create and Order Header table, and an Order Line Item table. In the “Pick Model’ there is a single table. An example would be, ‘Order Date’ is a field that would store a # of days since ‘Dec 13, 1967’ (the data Pick OS was turned on for the first time). Pick programmers did not have Y2k problems. A 2nd column would be Customer Number. The big difference is when you get to the Product Number Column, it would be ‘Multi-Valued’ (the Codd Non-Conformance). In other words, the database can handle 1-32000 product#s in that column. Other columns like Quantity Ordered would be in a controlling/dependant relationship with the Product Number and would also be multi-valued. When you get to the Quantity Shipped, Pick would go to a third dimension and have Sub-Multi-Valued field. You would have a Shipment Number Column, and it would be Multi-Valued by Line Item and Sub-Multi-Valued containing The Shipment Quantity for that line for that shipment number. There are No Inner Joins needed. All data for that Order is stored in One Table and in a single record. No orphan rows ever! Secondly the data definition is a bit different. Our dictionaries can contain definitions for data that is not in this table or is being manipulated. A couple of examples are, Customer Name. It would be defined as ‘Use the Customer number column and return the Name field from the Customer Table. Another Example is Line Item Extension would be defined as a calculation of Quantity*Price/PricePer. I believe I read somewhere Caché claims to have over 100,000 installations.

CurtTampa
It would be more correct to say Pick dbs have no concept of null. Empty Strings are not NULL values.
ShaneD
+2  A: 

Other two types of databases that haven't come up yet:

  1. Content Repositories are databases designed for content (i.e. files, documents, images, etc). They typically have additioan constructs such as a hierarchical way to browse content, search, transformation between different formats, versioning, and many other things. Examples - Alfresco, Documentum, JackRabbit, Day, OpenText, many other ECM vendors.

  2. Directories, i.e. Active Directory, or LDAP Directories. These are databases designed for low-write / high read scenarios and highly distributed across high geographical distances / high latency connections. While mostly used for authentication / authorization, they don't have to be if your use case matches the requirements.

Jean Barmash
A: 

At my university there is a group that researches deductive databases.

mdm