tags:

views:

32

answers:

1

I just started playing with Berkeley DB a few days ago so I'm trying to see if there's something I've been missing when it comes to storing data as fast as possible.

Here's some info about the data: - it comes in 512 byte chunks - chunks come in order - chunks will be deleted in FIFO order - if i lose some data off the end because of power failure that's ok as long as the whole db isn't broken

After reading the a bunch of the documentation it seemed like a Queue db was exactly what I wanted.

However, after trying some test code my fastest results were about 1MByte per second just looping through a DB->put with DB_APPEND set. I also tried using transactions and bulk puts but both of these slowed things down considerably so I didn't pursue them for much time. I was inserting into a fresh db created on a NANDFlash chip on my Freescale i.MX35 dev board.

Since we're looking to get at least 2MBytes per second write speeds, I was wondering if there's something I missed that can improve my speeds since I know that my hardware can write faster than this.

+1  A: 

Try putting this into your DB_CONFIG:

set_flags DB_TXN_WRITE_NOSYNC
set_flags DB_TXN_NOSYNC

From my experience, these increase write performance a lot.


DB_TXN_NOSYNC If set, Berkeley DB will not write or synchronously flush the log on transaction commit or prepare. This means that transactions exhibit the ACI (atomicity, consistency, and isolation) properties, but not D (durability); that is, database integrity will be maintained, but if the application or system fails, it is possible some number of the most recently committed transactions may be undone during recovery. The number of transactions at risk is governed by how many log updates can fit into the log buffer, how often the operating system flushes dirty buffers to disk, and how often the log is checkpointed Calling DB_ENV->set_flags with the DB_TXN_NOSYNC flag only affects the specified DB_ENV handle (and any other Berkeley DB handles opened within the scope of that handle). For consistent behavior across the environment, all DB_ENV handles opened in the environment must either set the DB_TXN_NOSYNC flag or the flag should be specified in the DB_CONFIG configuration file.

The DB_TXN_NOSYNC flag may be used to configure Berkeley DB at any time during the life of the application.


DB_TXN_WRITE_NOSYNC If set, Berkeley DB will write, but will not synchronously flush, the log on transaction commit or prepare. This means that transactions exhibit the ACI (atomicity, consistency, and isolation) properties, but not D (durability); that is, database integrity will be maintained, but if the system fails, it is possible some number of the most recently committed transactions may be undone during recovery. The number of transactions at risk is governed by how often the system flushes dirty buffers to disk and how often the log is checkpointed. Calling DB_ENV->set_flags with the DB_TXN_WRITE_NOSYNC flag only affects the specified DB_ENV handle (and any other Berkeley DB handles opened within the scope of that handle). For consistent behavior across the environment, all DB_ENV handles opened in the environment must either set the DB_TXN_WRITE_NOSYNC flag or the flag should be specified in the DB_CONFIG configuration file.

The DB_TXN_WRITE_NOSYNC flag may be used to configure Berkeley DB at any time during the life of the application.

See http://www.mathematik.uni-ulm.de/help/BerkeleyDB/api_c/env_set_flags.html for more details.

Vlad Lazarenko
Thanks for the comment. However, I have found out that simply enabling the environment produces too much of a performance degradation compared to not using one. I think it has to do with the WAL so these flags would help me, but even without an environment everything is too slow.
jjfine
@jjfine: I believe that environment is used implicitly with anonymous (auto-commit) transactions if you don't do that explicitly. So not using environment won't help.
Vlad Lazarenko