Hi,

I'm trying to build a hash with Berkeley DB that will contain many tuples (approximately 18 GB of key/value pairs), but in all my tests the performance of the insert operations degrades drastically over time. I've written this script to test the performance:

#include <iostream>
#include <db_cxx.h>
#include <ctime>

#define MILLION 1000000

int main () {
    long long a = 0;
    long long b = 0;

    int passes = 0;
    int i = 0;
    u_int32_t flags = DB_CREATE;

    Db* dbp = new Db(NULL, 0);
    // 1 GB cache in a single region
    dbp->set_cachesize(0, 1024 * 1024 * 1024, 1);

    int ret = dbp->open(
            NULL,       // no transaction
            "test.db",  // on-disk file
            NULL,       // default (single) database in the file
            DB_HASH,
            flags,
            0);
    if (ret != 0)
        return ret;

    time_t time1 = time(NULL);

    while (passes < 100) {
        while (i < MILLION) {

            Dbt key(&a, sizeof(long long));
            Dbt data(&b, sizeof(long long));

            dbp->put(NULL, &key, &data, 0);
            a++; b++; i++;
        }

        // Flush some dirty pages from the cache after every million inserts
        DbEnv* dbep = dbp->get_env();
        int tmp;
        dbep->memp_trickle(50, &tmp);

        i = 0;
        passes++;
        std::cout << "Inserted one million --> pass: " << passes
                  << " took: " << time(NULL) - time1 << "sec" << std::endl;
        time1 = time(NULL);
    }

    dbp->close(0);
    return 0;
}

Perhaps you can tell me why the "put" operation takes increasingly longer after some time, and maybe how to fix this.

Thanks for your help, Andreas

A: 

Your performance degradation can have several causes that aren't actually connected with your code. I may be mistaken, but I think this is all about the database's internal structure (and the data structure it uses).

Think of a situation where the database uses some approach other than a hash table, for example a red-black tree. Inserting into that tree takes O(log N) time in the Big-O sense, and every inserted element increases the time needed for the next insert.

Unfortunately, the same can happen with a plain hash table, so that the initial O(1) insertion time degrades to something worse. This can have several causes, but it comes down to hash collisions, which can happen due to a poor hash function, data that happens to be a bad fit for the hash function in use, or even the phase of the moon.

If I were you, I would try to dig into your database's internal structure. I also think that testing your keys with something other than your database (e.g. boost::unordered_map, sketched below) could benefit your testing and profiling.
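
A minimal sketch of that idea: the same key/value pattern as the Berkeley DB test, but inserted into an in-memory boost::unordered_map, so any slowdown here would point at hashing/collisions rather than at the database. This keeps everything in RAM, so scale the pass count down if memory is tight.

#include <boost/unordered_map.hpp>
#include <iostream>
#include <ctime>

#define MILLION 1000000

int main() {
    boost::unordered_map<long long, long long> m;
    time_t time1 = time(NULL);

    // Same sequential keys and values as the Berkeley DB test program.
    for (int passes = 1; passes <= 100; ++passes) {
        for (int i = 0; i < MILLION; ++i) {
            long long a = (long long)(passes - 1) * MILLION + i;
            m[a] = a;
        }
        std::cout << "Inserted one million --> pass: " << passes
                  << " took: " << time(NULL) - time1 << "sec" << std::endl;
        time1 = time(NULL);
    }
    return 0;
}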

Edit: Also, did you try changing the cache size in your sample? Or are there other performance-related parameters that could be modified?

Kotti
Yep, I tried manipulating the cache size, but it does not seem to have any influence on the performance of the writes.
Kungi
A: 

You may want to look at the information provided by the db_stat utility and at the HASH-specific tuning functions that are available. Please see the BDB Reference Guide section on configuring a HASH database.
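
For illustration, a hedged sketch of where those HASH-specific knobs would go in the test program above; the values are placeholders, not recommendations, and the reference guide explains how to size them for ~18 GB of pairs.

Db* dbp = new Db(NULL, 0);

// Both calls must happen before Db::open() creates the database.
dbp->set_h_ffactor(40);           // desired key/data pairs per bucket (placeholder value)
dbp->set_h_nelem(1100000000u);    // rough estimate of the final number of keys, so the
                                  // hash table is sized up front instead of growing later

dbp->open(NULL, "test.db", NULL, DB_HASH, DB_CREATE, 0);

Running db_stat -d test.db afterwards should show bucket fill and overflow-page counts, which is where degradation from table growth or collisions would show up.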

I would expect you to get tens of thousands of inserts per second on commodity hardware. What are you experiencing, and what is your performance target?

Regards,

Dave

dsegleau
A: 

Hi Andreas.

I would recommend trying the bulk insert API; you can read about it in the documentation here: http://www.oracle.com/technology/documentation/berkeley-db/db/api_reference/CXX/dbput.html#put_DB_MULTIPLE_KEY
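
A rough sketch of what that might look like, assuming a release that ships the C++ bulk-buffer helper DbMultipleKeyDataBuilder (Berkeley DB 4.8 or later). The buffer size and batch size are arbitrary, and the exact requirements for the second Dbt in DB_MULTIPLE_KEY mode are spelled out on the linked page.

// A user-owned bulk buffer: DB_DBT_USERMEM with ulen set to its capacity.
char buf[1024 * 1024];
Dbt bulk;
bulk.set_data(buf);
bulk.set_ulen(sizeof(buf));
bulk.set_flags(DB_DBT_USERMEM);

// Pack many key/data pairs into the single buffer.
DbMultipleKeyDataBuilder builder(bulk);
for (long long k = 0; k < 100000; ++k) {
    long long v = k;
    if (!builder.append(&k, sizeof(k), &v, sizeof(v)))
        break;                       // buffer full -- flush it and start a new one
}

// With DB_MULTIPLE_KEY the keys and data all travel in the first Dbt;
// the second Dbt is not used as input here (see the linked docs).
Dbt unused;
dbp->put(NULL, &bulk, &unused, DB_MULTIPLE_KEY);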

Also, I would guess that your call to memp_trickle is responsible for most of the slowdown. As the cache becomes dirtier, finding pages to trickle becomes more expensive. In fact, since you are only writing, having a large cache only hurts: once you've written the data, you don't use it again, so you don't want it to hang around in the cache. I would recommend testing different (smaller) cache sizes.

Finally, if your sole concern is insert performance, using a larger page size will help. You'll be able to fit more data on each page and that will result in fewer disk writes.
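
For example (a hedged sketch; Berkeley DB page sizes are powers of two between 512 bytes and 64 KB, and the call must come before the database is created):

Db* dbp = new Db(NULL, 0);
dbp->set_pagesize(64 * 1024);    // 64 KB pages: more key/value pairs per page, fewer disk writes
dbp->open(NULL, "test.db", NULL, DB_HASH, DB_CREATE, 0);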

-Ben

Ben Schmeckpeper
A: 

Hello Andreas,

The memp_trickle is almost certainly slowing things down. It's often good to use trickle, but it belongs in its own thread to be effective. BDB (except when you get into the higher-level replication APIs) creates no threads for you -- nothing happens behind the scenes (thread-wise). Trickle is effective when you have dirty pages being forced from the cache (look at a stat output to see if that's what's happening).
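
A minimal sketch of the "trickle in its own thread" idea, using C++11 std::thread and the same memp_trickle call as in the test program. The 50% target and one-second interval are arbitrary placeholders, and for real multi-threaded use the environment/database handles would also need to be opened with the DB_THREAD flag.

#include <db_cxx.h>
#include <atomic>
#include <chrono>
#include <thread>

std::atomic<bool> keep_trickling(true);

// Runs alongside the inserting thread so trickling never stalls the puts.
void trickle_loop(DbEnv* env) {
    while (keep_trickling) {
        int wrote = 0;
        env->memp_trickle(50, &wrote);   // try to keep ~50% of the cache clean
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
}

// In main(), after opening the database:
//   std::thread trickler(trickle_loop, dbp->get_env());
//   ... run the insert loop without calling memp_trickle inline ...
//   keep_trickling = false;
//   trickler.join();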

You might also consider using a BTREE instead of a HASH. Yeah, I know you specifically said hash, but why? If you are looking to maximize performance, why add that restriction?

You may be able to take advantage of locality of reference to reduce your cache footprint. There is often much more locality than you believe, or perhaps you can create some -- if you generate keys that are random digits, for example, prepend the date and time. That usually introduces locality into a perceived 'random' system. If you do use BTREE, you'll need to pay some attention to byte ordering of your keys on your system (see http://en.wikipedia.org/wiki/Endianness); if you are using a little-endian system, you'll need to swap the bytes.

Using BTREE with the right ordering and introduced locality means your key/value pairs will be stored in 'key-generation-time' order, so if you see the most action on the recent keys, you'll tend to hit the same pages over and over (see your cache hit rate in your stats). So you'll need less cache. Another way to think of it: with the same amount of cache, your solution will scale by a larger multiple.
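
To make the byte-ordering point concrete, here is a hedged sketch that stores the 64-bit key in big-endian byte order, so that BTREE's default byte-wise (lexicographic) comparison matches numeric key order on a little-endian machine; a custom comparison callback via Db::set_bt_compare would be the alternative.

// Write a 64-bit key as 8 big-endian bytes so that the default
// lexicographic BTREE comparison sorts keys in numeric order.
void pack_key_big_endian(long long k, unsigned char out[8]) {
    for (int i = 7; i >= 0; --i) {
        out[i] = (unsigned char)(k & 0xff);
        k >>= 8;
    }
}

// In the insert loop (with the database opened as DB_BTREE instead of DB_HASH):
//   unsigned char kbuf[8];
//   pack_key_big_endian(a, kbuf);
//   Dbt key(kbuf, sizeof(kbuf));
//   Dbt data(&b, sizeof(long long));
//   dbp->put(NULL, &key, &data, 0);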

I expect your actual app really doesn't insert integer keys in order (if it does, you'll be lucky). So you should write a benchmark that closely simulates your access patterns, at least with respect to: size of key, size of data, access pattern, number of items in the database, and mix of reads and writes. Once you have that, look at the stats -- pay close attention to anything that implies I/O or contention.

Don Anderson