I was looking into Ruby's parallel/asynchronous processing capabilities and read many articles and blog posts. I looked through EventMachine, Fibers, Revactor, Reia, etc. Unfortunately, I wasn't able to find a simple, effective (and non-IO-blocking) solution for this very simple use case:
    File.open('somelogfile.txt') do |file|
      while (line = file.gets)      # (R) Read from IO
        line = process_line(line)   # (P) Process the line
        write_to_db(line)           # (W) Write the output to some IO (DB or file)
      end
    end
As you can see, my little script performs three operations: read (R), process (P) and write (W). Let's assume, for simplicity, that each operation takes exactly 1 unit of time (e.g. 10ms). For 5 lines, the current code would therefore do something like this:
    Time:       123456789012345 (15 units in total)
    Operations: RPWRPWRPWRPWRPW
But I would like it to do something like this:
    Time:       1234567 (7 units in total)
    Operations: RRRRR
                 PPPPP
                  WWWWW
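Just to spell out where the 15 and 7 come from: run one after another, n lines through 3 one-unit stages cost 3n units. With the stages fully overlapped, the cost is n + 3 - 1 units, since the last line is read at unit n and needs two more units to be processed and written. A throwaway sketch of that arithmetic (the method names are mine and purely illustrative):

```ruby
# Idealised cost model: every stage takes exactly 1 unit per line.

# One line is fully handled before the next one is read.
def sequential_units(lines, stages)
  lines * stages
end

# All stages run concurrently, each one step behind the previous stage.
def pipelined_units(lines, stages)
  return 0 if lines.zero?
  lines + stages - 1
end

puts sequential_units(5, 3)  # prints 15
puts pipelined_units(5, 3)   # prints 7
```

So the gain approaches a factor of 3 as the file grows, but only under the assumption that the three operations can genuinely overlap.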
Obviously, I could run three processes (reader, processor and writer), pass the lines read by the reader into a processor queue, and then pass the processed lines into a writer queue (all coordinated via e.g. RabbitMQ). But the use case is so simple that this just doesn't feel right.
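For reference, the same reader/processor/writer layout can be sketched inside a single process with plain threads and `SizedQueue` from Ruby's standard library instead of separate processes and a message broker. This is only a rough sketch under assumptions: `process_line` and `write_to_db` below are dummy stand-ins for my real methods, the extra `sink` argument and the `:done` sentinel are made up for the example, and a `StringIO` replaces the log file so the snippet runs on its own. On MRI the threads only overlap while one of them is blocked on IO, so a CPU-heavy (P) step would not speed up this way.

```ruby
require 'thread'     # needed for Queue/SizedQueue on old Rubies, harmless on new ones
require 'stringio'

# Dummy stand-ins for the real methods (assumptions for this sketch).
def process_line(line)
  line.strip.upcase
end

def write_to_db(line, sink)
  sink << line
end

# Runs read -> process -> write as three threads joined by bounded queues.
def run_pipeline(input, sink, queue_size: 10)
  to_process = SizedQueue.new(queue_size)
  to_write   = SizedQueue.new(queue_size)

  reader = Thread.new do
    while (line = input.gets)                 # (R)
      to_process << line
    end
    to_process << :done                       # tell the processor there is no more input
  end

  processor = Thread.new do
    while (line = to_process.pop) != :done
      to_write << process_line(line)          # (P)
    end
    to_write << :done                         # pass the end marker on to the writer
  end

  writer = Thread.new do
    while (line = to_write.pop) != :done
      write_to_db(line, sink)                 # (W)
    end
  end

  [reader, processor, writer].each(&:join)
  sink
end

rows = run_pipeline(StringIO.new("a\nb\nc\n"), [])
p rows  # prints ["A", "B", "C"]
```

The bounded queues keep a fast reader from pulling the whole file into memory while the writer lags behind, and with one thread per stage the output order matches the input order.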
Any clues on how this could be done (without switching from Ruby to Erlang, Clojure or Scala)?