ansaurus

Question

Ruby concurrency/asynchronous processing (with simple use case)

Answer 1

A:

Check out peach (http://peach.rubyforge.org/). Doing a parallel "each" couldn't be simpler. However, as the documentation says, you'll need to run under JRuby in order to use the JVM's native threading.

See Jorg Mittag's response to this SO question for a lot of detail on the multithreading capabilities of the various Ruby interpreters.

Mark Thomas 2010-10-25 13:06:04

Hmm, peach isn't really what I am looking for. I don't want to run the RPW in parallel; I want to detach the 3 tasks from each other and run them asynchronously. Jorg Mittag's response gives a great introduction. I am well aware of the options offered, but none of them seems to have an answer for my problem.

Dim 2010-10-27 08:09:17

Answer 2

+1 A:

If you need it to be truly parallel (from a single process) I believe you'll have to use JRuby to get true native threads and no GIL.

You could use something like DRb to distribute the processing across multiple processes / cores, but for your use case this is a bit much. Instead, you could try having multiple processes communicate using pipes:

$ cat somelogfile.txt | ruby ./proc-process | ruby ./proc-store

In this scenario each piece is its own process, so the pieces can run in parallel while communicating through STDIN / STDOUT. This is probably the easiest (and quickest) approach to your problem.

# proc-process
while line = $stdin.gets do
  # do cpu intensive stuff here
  $stdout.puts "data to be stored in DB"
  $stdout.flush # this is important: the next process sees each line right away instead of when the buffer fills
end

# proc-store
while line = $stdin.gets do
  write_to_db(line)
end
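
If the stages have to be driven from inside a larger Ruby program rather than from the shell, one option is to start a stage as a child process with IO.popen and talk to it over its STDIN / STDOUT. The following is only a rough sketch of that idea: STAGE_SOURCE and process_in_child are made-up names, and upcasing each line is a stand-in for the CPU-intensive step.

```ruby
require "rbconfig"

# Hypothetical stand-in for the proc-process script: upcases each input line.
STAGE_SOURCE = <<~'RUBY'
  $stdout.sync = true
  while (line = $stdin.gets)
    puts line.chomp.upcase
  end
RUBY

# Runs the stage in a separate Ruby process and returns its output lines.
def process_in_child(lines)
  IO.popen([RbConfig.ruby, "-e", STAGE_SOURCE], "r+") do |child|
    lines.each { |line| child.puts(line) }
    child.close_write                 # signal end of input to the child
    child.readlines.map(&:chomp)      # collect everything the child wrote
  end
end

p process_in_child(["first line", "second line"])
```

This version writes all input before reading any output, which is fine for small inputs but can block once the pipe buffers fill; a real pipeline would read and write concurrently.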

JEH 2010-10-25 17:41:47

I thought that Ruby 1.9's GIL allows you to do CPU stuff in one thread while another thread does I/O - that is, it only prohibits two threads doing CPU stuff.
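
To make that concrete, here is a minimal sketch of the three tasks decoupled with plain threads and queues inside one process. The names (run_pipeline, the :done marker) are invented for illustration, and upcasing a string stands in for the CPU work. Under the GIL only one thread runs Ruby code at a time, so this helps mainly when the read and write stages spend their time waiting on I/O.

```ruby
require "thread" # Queue lives here on older Rubies; harmless on newer ones

# Hypothetical three-stage pipeline: read -> process -> store.
def run_pipeline(lines)
  to_process = Queue.new
  to_store   = Queue.new
  stored     = []

  reader = Thread.new do
    lines.each { |line| to_process << line }
    to_process << :done               # tell the next stage there is no more input
  end

  processor = Thread.new do
    while (line = to_process.pop) != :done
      to_store << line.upcase         # stand-in for the CPU-intensive step
    end
    to_store << :done
  end

  writer = Thread.new do
    while (record = to_store.pop) != :done
      stored << record                # stand-in for the database write
    end
  end

  [reader, processor, writer].each(&:join)
  stored
end

p run_pipeline(["a", "b", "c"])
```

Each stage blocks on Queue#pop until the previous stage has produced something, so no stage needs to know how fast the others run.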

Andrew Grimm 2010-10-25 22:47:53

Are you talking about Fibers? My limited understanding of Fibers is that, instead of threads that each get a share of CPU time, your code explicitly hands off processing to the Fiber, which can handle the blocking IO operation and immediately return to the calling code. This reduces the amount of time that you spend waiting, but I don't think it will allow you to use more than one CPU per process. I think the GIL means only one thread of execution can run at any point in time. http://www.igvita.com/2009/05/13/fibers-cooperative-scheduling-in-ruby/
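
A small sketch of the explicit hand-off described here, using only the core Fiber API (Fiber.new, resume, Fiber.yield). The method name fiber_trace is made up, and the "steps" are just strings appended to an array to show the order in which control moves.

```ruby
# Records the order in which the caller and the fiber run.
def fiber_trace
  trace = []

  worker = Fiber.new do
    trace << "worker: step 1"
    Fiber.yield                # hand control back to the caller
    trace << "worker: step 2"
  end

  trace << "main: start"
  worker.resume                # runs the fiber until it yields
  trace << "main: between"
  worker.resume                # continues the fiber after the yield
  trace << "main: done"
  trace
end

puts fiber_trace
```

Nothing here runs in parallel: the fiber only runs while the caller is inside resume, which is why Fibers alone do not spread work across CPUs.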

JEH 2010-10-25 23:57:07

Using pipes is a good solution to split the problem into 3 separate processes, but it is not asynchronous. It is in fact a "Ruby workaround", therefore quite difficult to implement within the scope of a bigger application. The "problem" I have outlined above is a simple example of IO driven processing. I am trying to understand what Ruby is capable of in this area and what it might be lacking.

Dim 2010-10-27 08:00:19
