I'm coding a Merb application that uses a combination of SimpleDB and Tokyo Tyrant for storage. For both of these data stores I'm implementing IN (list)-type functionality by spinning up a thread for each value in the list and then merging the result sets. Bearing in mind that this is a web application, is there a limit to the number of threads I should be creating? This is Ruby 1.8.7, so they're not kernel threads.
If you are using MRI, threads won't be of much help in such cases, as MRI 1.8 uses green threads, which don't help with computational work. I believe using JRuby (native threads) would help. I keep hearing that for native threads it's best to use (number of cores + 1) threads to make use of the available cores.
Threads seem like a bad approach for what you're trying to do here, and if you can't use JRuby, I'd just drop the threads altogether. However, you could write a Ruby file that loads the database and use the benchmark library to measure which number of threads is fastest. You probably want to look at the memory used too.
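A minimal sketch of such a benchmark, under stated assumptions: `fake_lookup` is a hypothetical stand-in that sleeps for 10 ms instead of querying SimpleDB or Tokyo Tyrant, and `lookup_all` and `benchmark_thread_counts` are illustrative names, not part of any library. Swap in a real query before trusting the numbers.

```ruby
require 'benchmark'

# Hypothetical stand-in for one SimpleDB / Tokyo Tyrant lookup.
# The sleep only simulates network wait; replace it with a real query.
def fake_lookup(value)
  sleep 0.01
  [value]
end

# Look up every value using at most `thread_count` threads at a time,
# then merge the per-value result sets into one array.
def lookup_all(values, thread_count)
  results = []
  values.each_slice(thread_count) do |batch|
    threads = batch.map { |v| Thread.new { fake_lookup(v) } }
    # Thread#value joins the thread and returns the block's result.
    threads.each { |t| results.concat(t.value) }
  end
  results
end

# Return [thread_count, elapsed_seconds] pairs for each candidate count.
def benchmark_thread_counts(values, counts)
  counts.map do |n|
    [n, Benchmark.realtime { lookup_all(values, n) }]
  end
end

if __FILE__ == $0
  values = (1..40).to_a
  benchmark_thread_counts(values, [1, 2, 4, 8, 16]).each do |n, secs|
    puts format('%2d threads: %.3fs', n, secs)
  end
end
```

This measures wall-clock time only. Memory use would need to be checked separately, for example by watching the process from outside while the script runs.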
To me your problem sounds I/O-bound, so multithreading on a single core may help.
Most of the time in your main Ruby loop you will probably be waiting on Tokyo Tyrant and SimpleDB, which run in separate multi-threaded processes.
So how many threads? Who knows? You are going to have to benchmark and measure.
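One way to make the thread count a single tunable number while you measure is a fixed-size worker pool instead of one thread per value. The following is only a sketch: `in_list_lookup` is a hypothetical helper, and the block passed to it stands in for whatever call actually queries the data store.

```ruby
require 'thread' # Queue and Mutex live here on Ruby 1.8; harmless on later versions

# Run the given block once per value using `pool_size` worker threads,
# and return the merged result sets. The block must return an array.
def in_list_lookup(values, pool_size)
  queue = Queue.new
  values.each { |v| queue << v }
  merged = []
  lock = Mutex.new

  workers = (1..pool_size).map do
    Thread.new do
      loop do
        value = begin
          queue.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break           # no work left, so this worker exits
        end
        rows = yield(value)
        lock.synchronize { merged.concat(rows) }
      end
    end
  end

  workers.each { |t| t.join }
  merged
end

# Example: five values, at most two lookups in flight at once.
rows = in_list_lookup(%w[a b c d e], 2) { |v| sleep 0.01; ["row-#{v}"] }
puts rows.sort.inspect
```

Because workers finish in no fixed order, the merged array is unordered. Sort it afterwards if the caller depends on ordering.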