I strongly agree with Daan's point: create a test program, and make sure the way in which it accesses data mimics as closely as possible the patterns you expect your application to have. This is extremely important with BDB because different access patterns yield very different throughput.
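As a rough illustration of such a test program, here is a minimal Python sketch of a workload driver. It is not tied to the BDB API: `run_workload`, its parameters, and the "1% of keys are hot" rule are assumptions made for this example, and the store can be any dict-like object (a plain `dict` here, which you would swap for your real database handle).

```python
import random
import time


def run_workload(store, n_ops, read_fraction, key_space, hot_fraction=0.0, seed=0):
    """Drive a synthetic read/write mix against a dict-like store.

    read_fraction: share of operations that are reads (0.0 to 1.0).
    hot_fraction:  share of operations aimed at a small "hot" key range,
                   to imitate skewed access (hypothetical knob).
    """
    rng = random.Random(seed)
    hot_keys = max(1, key_space // 100)  # assumption: 1% of keys are "hot"
    reads = writes = hits = 0
    payload = b"x" * 100  # assumption: fixed 100-byte values

    start = time.perf_counter()
    for _ in range(n_ops):
        # Pick a key from the hot range or from the whole key space.
        if rng.random() < hot_fraction:
            key = rng.randrange(hot_keys)
        else:
            key = rng.randrange(key_space)
        k = str(key).encode()

        if rng.random() < read_fraction:
            if store.get(k) is not None:
                hits += 1
            reads += 1
        else:
            store[k] = payload
            writes += 1
    elapsed = time.perf_counter() - start

    ops_per_sec = n_ops / elapsed if elapsed > 0 else float("inf")
    return {"reads": reads, "writes": writes, "hits": hits, "ops_per_sec": ops_per_sec}


if __name__ == "__main__":
    # 80% reads, 20% of traffic aimed at the hot keys.
    print(run_workload({}, n_ops=100_000, read_fraction=0.8,
                       key_space=50_000, hot_fraction=0.2))
```

Run it with the read/write mix and key skew you expect in production, then change one knob at a time to see which one moves throughput the most.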
Other than that, these are the general factors I found to have a major impact on throughput:
1. Access method (which in your case I guess is BTREE).
2. Level of durability with which you configured BDB (for example, in my case the 'DB_TXN_WRITE_NOSYNC' environment flag improved write performance by an order of magnitude, but it compromises durability).
3. Does the working set fit in cache?
4. Ratio of reads to writes.
5. How spread out your access is (remember that BTREE uses page-level locking, so having different threads access different pages is a big advantage).
6. Access pattern, meaning how likely threads are to block one another, or even deadlock, and what your deadlock resolution policy is (this one may be a killer).
7. Hardware (disk & memory for cache).
This all comes down to the following point:
There are two key ways to scale a BDB-based solution for greater concurrency: either minimize the number of locks in your design, or add more hardware.
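To get a feel for why spreading access across pages matters (item 5), here is a back-of-the-envelope Python sketch. It rests on a deliberately simplified model that is my assumption, not a description of BDB internals: each thread touches exactly one page at a given moment, chosen uniformly at random, and any two threads on the same page are counted as a conflict. The name `collision_probability` is made up for this example.

```python
def collision_probability(n_threads, n_pages):
    """Chance that at least two threads pick the same page.

    Simplified model (assumption): every thread picks one page
    uniformly at random and independently of the others.
    """
    if n_threads > n_pages:
        return 1.0  # more threads than pages: a shared page is unavoidable

    p_no_collision = 1.0
    for i in range(n_threads):
        # The i-th thread must avoid the i pages already taken.
        p_no_collision *= (n_pages - i) / n_pages
    return 1.0 - p_no_collision


if __name__ == "__main__":
    for pages in (10, 100, 10_000):
        p = collision_probability(8, pages)
        print(f"8 threads over {pages} pages: {p:.3f}")
```

Real workloads are rarely uniform, and read locks do not conflict with each other, so treat the numbers only as a way to see the trend: the fewer pages your threads share, the less they wait on one another.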