Tokyo Cabinet - Slower inserts after hitting 1 million

tags:

tokyocabinet

+3 Q:

Tokyo Cabinet - Slower inserts after hitting 1 million

I am evaluating the Tokyo Cabinet Table engine. The insert rate slows down considerably after hitting 1 million records. Each batch is 100,000 records and is inserted within a transaction. I tried setting xmsiz, but it made no difference. Has anyone faced this problem with Tokyo Cabinet?

Details

Tokyo cabinet - 1.4.3
Perl bindings - 1.23
OS : Ubuntu 7.10 (VMWare Player on top of Windows XP)

+1 A:

I just set the cache option and it is now significantly faster.

Bharani 2009-03-04 06:53:21

A:

I think modifying the bnum parameter in the dbtune function will also give a significant speed improvement.
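As a rough illustration of picking a value: the Tokyo Cabinet documentation suggests a bucket array (bnum) of about 0.5 to 4 times the number of records to be stored. The helper below is a minimal sketch of that arithmetic only. `suggest_bnum` and its `factor` parameter are hypothetical names and not part of any Tokyo Cabinet binding; the result would be passed to the tune call (tctdbtune in the C API) before the database is opened.

```python
def suggest_bnum(expected_records, factor=2.0):
    """Return a bucket count for an expected number of records.

    Hypothetical helper: `factor` must stay within the 0.5x to 4x
    range suggested by the Tokyo Cabinet documentation.
    """
    if expected_records <= 0:
        raise ValueError("expected_records must be positive")
    if not 0.5 <= factor <= 4.0:
        raise ValueError("factor should be between 0.5 and 4.0")
    return int(expected_records * factor)


# Example: sizing for 10 million rows at twice the record count.
bnum = suggest_bnum(10_000_000)
print(bnum)
```

A larger bnum costs more disk space and memory for the bucket array, so the factor is a trade-off rather than a value to maximise.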

2009-04-29 14:08:35

+2 A:

I hit a brick wall around 1 million records per shard as well (sharding on the client side, nothing fancy). I tried various ttserver options and they seemed to make no difference, so I looked at the kernel side and found that

echo 80 > /proc/sys/vm/dirty_ratio

(previous value was 10) gave a big improvement - the following is the total size of the data (on 8 shards, each on its own node) printed every minute:

total:  14238792  records,  27.5881 GB size
total:  14263546  records,  27.6415 GB size
total:  14288997  records,  27.6824 GB size
total:  14309739  records,  27.7144 GB size
total:  14323563  records,  27.7438 GB size
(here I changed the dirty_ratio setting for all shards)
total:  14394007  records,  27.8996 GB size
total:  14486489  records,  28.0758 GB size
total:  14571409  records,  28.2898 GB size
total:  14663636  records,  28.4929 GB size
total:  14802109  records,  28.7366 GB size

So you can see that the improvement was on the order of 7-8 times. The database size was around 4.5GB per node at that point (including indexes) and the nodes have 8GB RAM, so a dirty_ratio of 10 meant that the kernel tried to keep less than ca. 800MB dirty, while 80 raises that limit to roughly 6.4GB, more than the whole per-node database.
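For anyone who wants to check the numbers, here is a minimal sketch that computes the average growth per minute before and after the change from the totals in the log above. It assumes the dirty_ratio change took effect right after the fifth sample; `SAMPLES`, `CHANGE_INDEX` and `mean_rate` are names made up for this illustration.

```python
# Samples copied from the log above: (records, size in GB), one per minute.
SAMPLES = [
    (14238792, 27.5881),
    (14263546, 27.6415),
    (14288997, 27.6824),
    (14309739, 27.7144),
    (14323563, 27.7438),
    (14394007, 27.8996),
    (14486489, 28.0758),
    (14571409, 28.2898),
    (14663636, 28.4929),
    (14802109, 28.7366),
]

# Assumption: index of the last sample taken before dirty_ratio was raised.
CHANGE_INDEX = 4


def mean_rate(samples):
    """Average growth per one-minute interval as (records/min, GB/min)."""
    intervals = len(samples) - 1
    records = (samples[-1][0] - samples[0][0]) / intervals
    gigabytes = (samples[-1][1] - samples[0][1]) / intervals
    return records, gigabytes


before = mean_rate(SAMPLES[:CHANGE_INDEX + 1])
after = mean_rate(SAMPLES[CHANGE_INDEX:])

rec_ratio = after[0] / before[0]
gb_ratio = after[1] / before[1]

print(f"before: {before[0]:.0f} records/min, {before[1]:.4f} GB/min")
print(f"after:  {after[0]:.0f} records/min, {after[1]:.4f} GB/min")
print(f"speedup: {rec_ratio:.1f}x records, {gb_ratio:.1f}x bytes")
```

Averaged over all the samples shown, the speedup comes out at roughly 4.5x in records and 5x in bytes. The 7-8x figure matches a comparison of the last minute before the change (about 0.03 GB) with the later minutes after it (0.20 to 0.24 GB).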

Next thing I'll try is ext2 (currently: ext3) and noatime and also keeping everything on a ramdisk (that would probably waste twice the amount of memory, but might be worth it).

mjy 2009-12-11 00:27:31
