I have a small numerical simulation in C (I had to do it in C to share it with my advisor), but I want to use a "Haskell script"-like thing to organize the simulation. The program accepts some command-line arguments and spits out some output I'd like to redirect to a file, so I did something like this:

 import Control.Monad
 import System.Process

I have a function to create the name of the output file:

filename :: Int -> String  
filename n = some stuff here...

and the command I wanna run:

command :: Int -> String
command n = "./mycutesimulation " ++ show n ++ " >" ++ filename n

and finally I produce a list of the runs I wanna make and run them with runCommand:

commands = map command [1,2..1000]

main = do
   sequence_ $ map runCommand commands

The problem is that after I run this "script", my computer almost freezes with the load. The program that is being executed is very light in memory use and runs in a fraction of a second. This shouldn't happen.

So, my questions are:

1) Did I just throw 1000 processes to be executed at the same time? How can I execute them in a rational order: sequentially, or just a few processes at a time?

2) I'm running this on a quad core, and it'd be nice to use that in my favour. Is there a way I can compile this with the -threaded flag and get the processes to be executed concurrently, but in an organized way?

+4  A: 

You need a waitForProcess =<< runCommand. runCommand returns as soon as the process has been spawned, so mapping it over 1000 commands launches all 1000 processes at once.

import System.Process

main = sequence $ map runCommand commands
 where commands = map (\x -> "echo " ++ show x) [1, 2..1000]

has similar symptoms to yours, but

import System.Process

main = sequence $ map (\x -> waitForProcess =<< runCommand x) commands
 where commands = map (\x -> "echo " ++ show x) [1, 2..1000]

works.

Aidan Cully
+4  A: 

First of all, you should check top or Task Manager to see if you are indeed creating 1000 processes in quick succession, and then look for a solution based on that.

An easy way to slow down process creation is to wait for each process to finish before creating the next one. So instead of mapping runCommand over your commands, you should map your own function, which first calls runCommand and then calls waitForProcess on the returned ProcessHandle, i.e. each invocation of your helper function will block until the spawned process has finished.
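
For example, a minimal sketch of that sequential version (runAndWait is a name I made up, and echo stands in for the real simulation command):

import System.Process

-- Run one shell command and block until it exits.
runAndWait :: String -> IO ()
runAndWait cmd = do
    h <- runCommand cmd
    _ <- waitForProcess h
    return ()

main :: IO ()
main = mapM_ runAndWait commands
  -- "echo" is a stand-in for the actual simulation command
  where commands = map (\n -> "echo " ++ show n) [1..1000 :: Int]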

The downside of the above solution is that it will only use one of your four cores. So what you could do to make use of all four cores is to partition commands into four (or as many cores as you want to use) sublists, then spawn a worker thread with forkIO for each sublist, where each worker runs the map over its own sublist.
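
A rough sketch of that multi-worker idea (the round-robin chunking and the MVar bookkeeping are just one arbitrary way to do it, and echo again stands in for the real command):

import Control.Concurrent
import System.Process

main :: IO ()
main = do
    let nWorkers = 4
        -- "echo" is a stand-in for the actual simulation command
        commands = map (\n -> "echo " ++ show n) [1..1000 :: Int]
        -- Deal the commands out round-robin into nWorkers sublists.
        chunk w  = [c | (i, c) <- zip [0..] commands, i `mod` nWorkers == w]
        chunks   = map chunk [0 .. nWorkers - 1]
    -- One MVar per worker so main can wait for all of them to finish.
    dones <- mapM (const newEmptyMVar) chunks
    mapM_ (\(done, cs) -> forkIO $ do
              -- Each worker runs its own sublist sequentially.
              mapM_ (\c -> waitForProcess =<< runCommand c) cs
              putMVar done ())
          (zip dones chunks)
    mapM_ takeMVar dones

Compiling with -threaded should keep one worker's waitForProcess from blocking the other threads; the external processes themselves run in parallel at the OS level either way.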

Btw. mapM_ f == sequence_ . map f

liwp
Thanks for pointing to `forkIO`. I managed to make it run concurrently using your suggestion. Very nice! My first multicore program! Hahahah...
Rafael S. Calsaverini
BTW, this page helped: http://haskell.org/haskellwiki/Haskell_for_multicores
Rafael S. Calsaverini
Your final aside doesn't type; `mapM_ = (.) sequence_ . map` would be more accurate.
ephemient
@ephemient: fixed
liwp
+1  A: 

Here's a quick and dirty "run a few at a time", if it helps:

import System.Process

commands = replicate 16 "sleep 2"

-- Wait for the process at the front of the queue, then start the
-- next command and put its handle at the back of the queue.
runSome handles cmd = do
    (h:hs) <- handles
    waitForProcess h
    h' <- runCommand cmd
    return $ hs ++ [h']

-- Start the first n commands, feed the rest through the queue, and
-- finally wait for the stragglers still left in the queue.
test n = do
    hs <- foldl runSome (mapM runCommand $ take n commands)
                        (drop n commands)
    mapM_ waitForProcess hs

This just (mis)uses a list as a simple queue: it starts as many commands as you tell it to, then waits on the one at the front of the queue and, when that's done, adds a new command. Note that this won't behave ideally if a few long-running commands are mixed in, but it might be sufficient for you. Please don't think this is at all a "correct" way to do it, though.
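
Using it (with the definitions above) to keep at most four processes running at once would then just be:

main :: IO ()
main = test 4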

camccann