I am developing an Erlang application which requires a LOT of DB writes. My schema has, in addition to the primary key, another attribute with a unique constraint enforced.

Say I have an ID, a unique_constraint_field, and some other fields. I now need to update the row corresponding to a given ID, provided that no other row already has the unique_constraint_field value I am about to write.

Due to the high volume of updates I need to perform (each update affects only one row, and low latency is required), I rely on the primary key and the unique constraint on that attribute to catch duplicates, rather than on an update statement with a subquery. This lets me perform the update in a single query, which succeeds about 95% of the time; in the remaining 5% I catch the exception and take the necessary action for the primary key or unique attribute violation.

I am currently using the ODBC MySQL driver. However, the driver returns a very generic error message for ANY error. My prototype works well right now because I simply treat every error as a key violation, but that approach is obviously flawed. I cannot find any other decent driver or way to connect to MySQL from Erlang.
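For context, here is a minimal sketch of the single-query approach using the stock odbc application; the table and column names are made up, and the point is that any failure comes back as the same opaque {error, Reason} term:

    %% Sketch only: assumes an ODBC connection Ref (from odbc:connect/2)
    %% and a table mytable(id PRIMARY KEY, unique_field UNIQUE, ...).
    update_row(Ref, Id, UniqueVal) ->
        case odbc:param_query(Ref,
                              "UPDATE mytable SET unique_field = ? WHERE id = ?",
                              [{{sql_varchar, 64}, [UniqueVal]},
                               {sql_integer, [Id]}]) of
            {updated, _NRows} ->
                ok;
            {error, Reason} ->
                %% Reason is a generic string; a duplicate-key violation is
                %% indistinguishable here from any other failure.
                {error, Reason}
        end.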

I am thinking of switching to Mnesia (memory-only mode for my speed requirements) as Erlang and Mnesia meld so seamlessly. However, I see that Mnesia does not have any unique key constraints which I can use to perform my DB update in a single query.

I need suggestions on how best to implement this requirement from within Erlang. Is there any way of performing a conditional update in Mnesia? Or is there another high-speed DB alternative I should be looking at? Any help/insight is greatly appreciated.

+1  A: 

I don't know what the best solution is, but what I'd try is two tables: one for the records and one as an index keyed on unique_constraint_field, and then handle each create/update/delete operation in a transaction that checks and updates the index. The reason is that in mnesia you can't choose the index type; a secondary index always behaves like a duplicate bag. Because your index values will be unique anyway, this should not introduce any additional performance penalty. If you used mnesia's index feature you would still have to write your own create/update/delete operations, and the result would be almost the same as using two tables. Fortunately mnesia handles nested transactions with minimal developer effort, so this is relatively easy compared to classical RDBMSs.
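A minimal sketch of that two-table scheme, with made-up record and table names:

    %% Sketch only. Table creation (run once):
    %%   mnesia:create_table(item,       [{attributes, record_info(fields, item)}]),
    %%   mnesia:create_table(unique_idx, [{attributes, record_info(fields, unique_idx)}]).
    -record(item,       {id, unique_field, other}).
    -record(unique_idx, {unique_field, id}).

    update_unique(Id, NewUnique, Other) ->
        F = fun() ->
                    case mnesia:read(unique_idx, NewUnique, write) of
                        [#unique_idx{id = OwnerId}] when OwnerId =/= Id ->
                            %% Another row already owns this value: fail the
                            %% transaction, mimicking a unique-key violation.
                            mnesia:abort({unique_violation, NewUnique});
                        _ ->
                            %% Remove this row's old index entry, if any.
                            case mnesia:read(item, Id, write) of
                                [#item{unique_field = OldUnique}] ->
                                    mnesia:delete({unique_idx, OldUnique});
                                [] ->
                                    ok
                            end,
                            mnesia:write(#unique_idx{unique_field = NewUnique, id = Id}),
                            mnesia:write(#item{id = Id, unique_field = NewUnique, other = Other})
                    end
            end,
        mnesia:transaction(F).

A successful call returns {atomic, ok}; a clash returns {aborted, {unique_violation, Value}}, which plays the same role as the duplicate-key exception in SQL.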

Hynek -Pichi- Vychodil
+1  A: 

Ulf Wiger has released a library that lets you use mnesia as a relational database. It's called 'rdbms'; it's a few years old and hasn't been updated in a long time, but you can probably use it as it is, or at least base your own work on it. Grab the source if you want.

His description of it:

I dust off my standard response that the 'rdbms' contrib offered a solution for this, by providing support for compound attributes and user-defined indexing, including the option to specify that an index value must be unique.

Rdbms /has/ been used commercially, but I don't consider it ready for commercial use in general. I've not done anything on it for quite a while, since I don't perceive any user pressure, but anyone who wants that to change is of course welcome to contact me and argue their case.

http://ulf.wiger.net/rdbms/doc/rdbms.html (The docs leave a lot to be desired, I know - see above.)

The doc mentioning the 'unique' constraint can be found here. There may be performance hits; mnesia is meant to be a key-value store. I can't remember exactly, but it's possible that checking a 'unique' index involves a full-table traversal.

All in all, since it's old you'll probably have trouble running it. See the trapexit thread about it. Studying how it was done might be a better idea than using it directly.

I GIVE TERRIBLE ADVICE
+2  A: 

mnesia can crash under a high volume of writes unless you tune it. But you can use compound primary keys that look like {ID, UniqueConstraint}, which could make your updates much simpler. There's also a new Erlang library called osmos, which provides on-disk ordered_set tables created specifically to handle high write volumes.
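A compound key in mnesia is just a tuple stored in the key attribute; a minimal sketch, with made-up names:

    %% Sketch only: a mnesia table keyed on the tuple {Id, UniqueVal}.
    -record(entry, {key, other}).              % key = {Id, UniqueVal}

    create_table() ->
        mnesia:create_table(entry, [{type, set},
                                    {attributes, record_info(fields, entry)}]).

    write_entry(Id, UniqueVal, Other) ->
        mnesia:dirty_write(#entry{key = {Id, UniqueVal}, other = Other}).

    read_entry(Id, UniqueVal) ->
        mnesia:dirty_read(entry, {Id, UniqueVal}).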

Jacob
Thanks, that sounds interesting, will take a look.
jeffreyveon