How do I build, design, or program a terabyte- or petabyte-scale queue in memory? (Imagine a Twitter-like service with a huge number of users.)
A:
Why not use Twitter's queuing service? It's called Kestrel, and it's open source.
Sean Reilly
2010-04-17 16:07:18
Do you have any idea of the queue size Twitter has to deal with?
2010-04-17 16:10:20
As their README file says, Kestrel is scalable "to infinity and beyond".
echo
2010-04-17 16:13:11
That is a good reference!
2010-04-17 16:54:57
A:
It's going to depend a lot on what file system you're using and what elements are stored in the queue. The elements of the queue will need to be addressable somehow, perhaps as filenames or disk block addresses, and the queue itself will need to store those addresses. Depending on how many elements you're working with, you may have to break it down further and divide the queue into blocks, where each block of elements is treated as a single element of the outer queue and is organized as its own sub-queue.
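To make the block idea concrete, here is a minimal sketch, not a production design. It assumes each block is a pickled list in its own file, so the file path serves as the block's address. The class name `BlockedDiskQueue`, the `block_size` parameter, and the file naming scheme are all made up for illustration.

```python
import os
import pickle
from collections import deque


class BlockedDiskQueue:
    """FIFO queue: oldest and newest items stay in memory, the middle lives on disk.

    Hypothetical sketch. Each full block is written to its own file, and the
    in-memory queue only stores the addresses (paths) of those blocks.
    """

    def __init__(self, directory, block_size=1000):
        self.directory = directory
        self.block_size = block_size
        self.head = deque()         # oldest items, ready to be dequeued
        self.tail = []              # newest items, not yet a full block
        self.block_paths = deque()  # paths of full blocks on disk, oldest first
        self.next_block_id = 0

    def put(self, item):
        self.tail.append(item)
        if len(self.tail) >= self.block_size:
            self._spill_tail()

    def _spill_tail(self):
        # Write the full tail block to disk and remember only its address.
        name = f"block-{self.next_block_id:012d}.pkl"
        path = os.path.join(self.directory, name)
        self.next_block_id += 1
        with open(path, "wb") as f:
            pickle.dump(self.tail, f)
        self.block_paths.append(path)
        self.tail = []

    def get(self):
        if not self.head:
            self._refill_head()
        if not self.head:
            raise IndexError("get from empty queue")
        return self.head.popleft()

    def _refill_head(self):
        if self.block_paths:
            # Load the oldest on-disk block back into memory, then delete it.
            path = self.block_paths.popleft()
            with open(path, "rb") as f:
                self.head.extend(pickle.load(f))
            os.remove(path)
        elif self.tail:
            # Nothing on disk: the partially filled tail holds the oldest items.
            self.head.extend(self.tail)
            self.tail = []

    def __len__(self):
        # Every block on disk holds exactly block_size items.
        on_disk = self.block_size * len(self.block_paths)
        return len(self.head) + len(self.tail) + on_disk
```

Memory use is bounded by roughly two blocks of items plus one path per block on disk. This sketch ignores crash recovery and concurrent access, both of which a real service would need.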
echo
2010-04-17 16:07:28
Well, as long as you do adequate read-ahead. You do have some amount of memory to work with, so as long as you keep enough queued data in main memory to serve clients, it should work fine. If your clients consume the data faster than you can retrieve it from disk, you'll have a bottleneck. The only way around that is to do what Robert Davis suggested in a previous comment: spread it out over multiple servers.
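One way to picture the read-ahead is a background thread that keeps a bounded in-memory buffer topped up from the slow source. The sketch below assumes the disk-backed data can be exposed as any Python iterable. `ReadAheadBuffer` and its `depth` parameter are hypothetical names, not part of any library.

```python
import queue
import threading


class ReadAheadBuffer:
    """Iterator that prefetches up to `depth` items from a slow source."""

    def __init__(self, source, depth=1024):
        self._buffer = queue.Queue(maxsize=depth)
        self._done = object()  # sentinel marking the end of the source
        self._thread = threading.Thread(
            target=self._fill, args=(iter(source),), daemon=True
        )
        self._thread.start()

    def _fill(self, items):
        for item in items:
            self._buffer.put(item)  # blocks while the buffer is full
        self._buffer.put(self._done)

    def __iter__(self):
        return self

    def __next__(self):
        # Blocks only when the consumer has outrun the producer,
        # which is the bottleneck case described above.
        item = self._buffer.get()
        if item is self._done:
            self._buffer.put(self._done)  # keep raising on later calls
            raise StopIteration
        return item
```

For example, `for item in ReadAheadBuffer(read_blocks_from_disk()): ...` would serve items from memory while the next ones load, where `read_blocks_from_disk` stands in for whatever generator reads the stored queue.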
echo
2010-04-17 16:25:18