views:

958

answers:

2

I've been diving into Erlang recently, and I decided to use Mnesia to do my database work given it can store any kind of Erlang data structure without a problem, scale with ease, be used with list comprehensions, etc.

Coming from standard SQL databases, most rows can and should be identified by a primary key, usually an auto-incrementing integer. By default Mnesia considers the first field of a row to be its key. It also gives no way to have an auto-incrementing integer key as far as I know.

Given I have these fictional records representing my tables:

-record(user, {name, salt, pass_hash, email}).
-record(entry, {title, body, slug}).
-record(user_entry, {user_name, entry_title}).

I figure using the username may be good enough for some purposes, as with the entry title, in order to identify the resource, but how do I go about maintaining integrity?

Say the user changes its name, or that the entry's title changes after an edit. How do I make sure my data is still correctly related? Updating every table using the username when it changes sounds like a terrible idea no matter how it's put.

What would be the best way to implement some kind of primary key system in Mnesia?

Also, how would an intermediary table like 'user_entry' do if the first field is usually the key? Otherwise, what would a better way be to represent a many-to-many relationship in Mnesia?

+6  A: 

I prefer using GUIDs instead of auto-incrementing ints as artificial foreign keys. There is an Erlang uuid module available at GitHub, or you can use {now(), node()}, given that now/0 doc says: "It is also guaranteed that subsequent calls to this BIF returns continuously increasing values."

Using something that can change as the primary key seems to me to be a bad idea independent of the database system.

Don't forget that you don't need to normalise data in Mnesia even to first normal form; in your example, I would consider the following structure:

-record(user, {id, name, salt, pass_hash, email, entries}).
-record(entry, {id, title, body, slug, users}).

where entries and users are lists of ids. Of course, this depends on the queries you want.

EDIT: fixed to be many-to-many instead of many-to-one.

Alexey Romanov
Is it really the good way? it seems make_ref/0 resets its count after restarting the shell and will have multiple similar values. Wouldn't that mean you could get bad keys after a server restart? Coupling it with now/0 could help, maybe.
I GIVE TERRIBLE ADVICE
It is probably better to combine now/0 with node/0, actually.
Alexey Romanov
Or for that matter, you can just use the uuid module: http://github.com/travis/erlang-uuid/tree/master
Alexey Romanov
Edited to remove a bad idea.
Alexey Romanov
+6  A: 

Mnesia does support sequences (auto-incrementing integers) in the form of mnesia:dirty_update_counter(Table, Key, Increment). To use it you need a table with two attributes Key and Count. Despite the name, dirty_update_counter is atomic even though it doesn't run inside a transaction.

Ulf Wiger did some work on providing typical RDBMS features on top of mnesia in his rdbms package. His code provides foreign key constraints, parametized indices, field value constraints and so on. Unfortunately this code has not been updated in two years and will probably be difficult to get running without quite a bit of Erlang experience.

When designing for and using mnesia you should remember that mnesia is not a relational database. It is a transactional Key/Value store and is much easier to use when you don't normalise.

If your usernames are unique, you could use the schema:

-record(user, {name, salt, pass_hash, email}).
-record(entry, {posted, title, body, slug, user_name}).

Where posted is the erlang:now() time when the article is uploaded. user_name might need a secondary index if you often need to retreive a list of all articles for a user. As this data is split over two tables, you will have to enforce any integrity constraints in your application code (for instance, not accepting entries without a valid user_name).

Each field value in mnesia can be any erlang term, so if you're at a loss for a unique key on any one particular field, you can often combine some fields to give you a value that will always be unique - perhaps {Username, DatePosted, TimePosted}. Mnesia allows you to search partial keys via mnesia:select(Table, MatchSpec). MatchSpecs are quite difficult to write by hand, so remember that ets:fun2ms/1 can convert a psuedo erlang function into a matchspec for you.

In this example, fun2ms generates us a matchspec for searching a blog entry table -record(entry, {key, title, slug, body}). where key is {Username, {Year, Month, Day}, {Hour, Minute, Second}} - the username of the author and the date and time the article was posted. The example below retrieves the titles of all the blog posts by TargetUsername during December 2008.

ets:fun2ms(fun (#entry{key={U, {Y,M,_D}, _Time}, title=T})
             when U=:=TargetUsername, Y=:=2008, M=:=12 ->
               T
           end).
archaelus
I think "where" should instead be "when." Thanks for the great explanations, too.
I GIVE TERRIBLE ADVICE