Read-only, rather small database - ways to optimise?

Hi all, I have the following design. There's a pool of identical worker processes (max 64 of them, on average 15) that uses a shared database for reading only. The database is about 25 MB. Currently, it's implemented as a MySQL database, and all the workers connect to it. This works for now, but I'd like to:

1. eliminate cross-process data transfer, i.e. execute SQL in-process
2. keep the data completely in memory at all times (I mean, 25 MB!)
3. not load said 25 MB separately into each process (i.e. keep it in shared memory somehow)

Since it's all reading, concurrent access issues are nonexistent, and locking is not necessary. Data refreshes happen from time to time, but these are infrequent and I'm willing to shut down the whole shebang for those.

Access is performed via pretty vanilla SQL SELECTs. No subqueries, no joins. LIKE conditions are the fanciest feature ever used. Indices, however, are very much needed.

Question - can anyone think of a database library that would meet the goals outlined above?

But that creates separate, per-process instances. See requirement #3. AFAIK, you cannot create a SQLite in-memory database over a given chunk of memory.

Seva Alekseyev 2010-09-03 14:05:58

The OS's buffer cache will keep the entire database in memory, and that is shared between all processes.

MarkR 2010-09-03 18:11:55
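
A minimal sketch of the buffer-cache approach described above, assuming SQLite as the in-process engine and Python's standard `sqlite3` module. The `items` table, its index and the helper names are hypothetical. Each worker opens the same file read-only; the file's pages then sit in the OS cache once rather than once per process.

```python
import os
import sqlite3
import tempfile
from pathlib import Path

def build_database(path):
    """One-off build step (hypothetical schema): create a table, an index, some rows."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
    conn.execute("CREATE INDEX idx_items_name ON items (name)")
    conn.executemany(
        "INSERT INTO items (name, price) VALUES (?, ?)",
        [("apple", 1.0), ("apricot", 2.5), ("banana", 0.5)],
    )
    conn.commit()
    conn.close()

def open_read_only(path):
    """What each worker would do: open the shared file in-process, read-only."""
    # mode=ro refuses writes; immutable=1 tells SQLite the file will not change,
    # so it skips locking. Only safe while no refresh is in progress.
    uri = Path(path).as_uri() + "?mode=ro&immutable=1"
    return sqlite3.connect(uri, uri=True)

def names_starting_with(conn, prefix):
    """Plain SELECT with a LIKE condition, as in the question."""
    rows = conn.execute(
        "SELECT name FROM items WHERE name LIKE ? ORDER BY name", (prefix + "%",)
    )
    return [name for (name,) in rows]

db_path = os.path.join(tempfile.mkdtemp(), "readonly.db")
build_database(db_path)
worker_conn = open_read_only(db_path)
print(names_starting_with(worker_conn, "ap"))  # ['apple', 'apricot']
```

For a refresh, the workers would be stopped, the file replaced, and the workers restarted, which matches the "shut down the whole shebang" policy in the question.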

When you fork, a copy of the in-memory database is created, containing whatever was in it when you did the fork. If you use threads, this doesn't happen: all threads will use the same database, and writes from another thread are visible as well.

Ivo 2010-09-06 08:36:26

Threads are, unfortunately, not an option on this project - a 3rd party library that's absolutely essential to the project hates them with a passion. However, I do vaguely recall that fork() on Linux does not copy memory, it creates shared memory with copy on write. Gotta investigate...

Seva Alekseyev 2010-09-08 14:19:08
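
A rough sketch of the fork idea discussed above, assuming Linux (or another POSIX system with `os.fork`) and Python's `sqlite3` module; the schema and the `query_in_child` helper are hypothetical. The parent builds an in-memory database once, and each forked child reads the copy it inherited. The SQLite documentation advises against using a connection opened before `fork()` in the child, so this is an experiment to investigate with, not a recommended setup.

```python
import os
import sqlite3

# Parent: build the in-memory database once, before forking (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_items_name ON items (name)")
conn.executemany("INSERT INTO items (name) VALUES (?)", [("apple",), ("banana",)])
conn.commit()

def query_in_child(prefix):
    """Fork a worker, run one SELECT in it, and send the rows back over a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: reads the database pages inherited from the parent at fork time.
        os.close(read_fd)
        rows = conn.execute(
            "SELECT name FROM items WHERE name LIKE ? ORDER BY name", (prefix + "%",)
        ).fetchall()
        os.write(write_fd, ",".join(name for (name,) in rows).encode())
        os.close(write_fd)
        os._exit(0)  # leave immediately, skipping the parent's cleanup handlers
    # Parent: collect the child's answer and reap it.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        result = pipe.read()
    os.waitpid(pid, 0)
    return result

print(query_in_child("a"))  # apple
```

Because nothing writes to the database after the fork, the kernel has little reason to duplicate those pages; how much memory actually stays shared is something to measure rather than assume.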

I know it's not in-process, but the speed improvement might be enough?

Simon Thompson 2010-09-03 14:08:54

Memcached does not do SQL, does it? I'll take a look at MongoDB.

Seva Alekseyev 2010-09-03 14:24:36

Memcached does not do SQL, but depending on what you're looking to do, denormalising your data and storing it as formatted records/answers against keys might be the answer. Otherwise the whole NoSQL lot is worth considering: MongoDB, CouchDB, etc. They all have SQL-like query abilities.

Simon Thompson 2010-09-03 14:59:39
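
To illustrate what "storing formatted records against keys" could look like, here is a small sketch in which a plain Python dict stands in for the cache. The key scheme (`item:<id>`, `category:<name>`) and the sample rows are assumptions made for the example, not part of any particular cache product.

```python
import json

# Hypothetical source rows, as they might come out of the existing MySQL tables.
rows = [
    {"id": 1, "name": "apple", "category": "fruit"},
    {"id": 2, "name": "banana", "category": "fruit"},
    {"id": 3, "name": "carrot", "category": "vegetable"},
]

def build_cache(rows):
    """Precompute one entry per record and one entry per anticipated query."""
    cache = {}
    by_category = {}
    for row in rows:
        # The full record, serialised, stored under its primary key.
        cache[f"item:{row['id']}"] = json.dumps(row)
        by_category.setdefault(row["category"], []).append(row["id"])
    for category, ids in by_category.items():
        # The answer to "which items are in this category?", stored ready-made.
        cache[f"category:{category}"] = json.dumps(ids)
    return cache

cache = build_cache(rows)
print(json.loads(cache["category:fruit"]))  # [1, 2]
```

The trade-off is that every query shape has to be anticipated when the cache is built; ad hoc LIKE conditions do not map onto key lookups.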

Do they have more than one index per table? What about composite indices?

Seva Alekseyev 2010-09-03 16:45:47

Memcached doesn't cache the database, it consumes extra memory to cache things which are "hard" to make. Typically it's the wrong solution unless you have an infrastructure with lots of machines.

MarkR 2010-09-03 18:12:36

MarkR, my suggestion was to think about using cache tech instead of direct DB access. It does not cache a DB, but it could be preloaded with data. We do this for web clusters to speed things up.

Simon Thompson 2010-09-03 20:54:45

NoSQL databases do support multiple indexes, but they require you to think about the problem from a slightly different angle.

Simon Thompson 2010-09-03 20:55:42
