Long-Winded Background

I'm working on parallelising some code for cardiac electrophysiology simulations. Since users can specify their own simulations using a built-in scripting language, I have no way of knowing in advance how to manage the trade-off between communication and computation. To address this, I'm building a sort of runtime profiler, which will decide how to handle the domain decomposition once it has seen the simulation to be run and the hardware environment it has to work with.

My question is this:

How is MPI I/O implemented behind the scenes? Is each process actually writing to a single file on some other node, or is each process writing to some sparse file, which will get spliced back together when the file is closed?

Knowing this will help me decide whether to consider I/O operations as communication or computation, and adjust the balance accordingly…

Thanks in advance for any insight you can offer.

Ross

+3  A: 

The mechanism for I/O is implementation dependent. In addition, there is no single style of I/O. Some I/O is cached by the remote ranks and collected by the mpirun process at the end of the run. Some I/O is written to local scratch space as required. Some I/O is written to a NAS/SAN-style high-performance shared file system.

Some MPI implementations use third-party libraries to support I/O to parallel file systems, and those details may be proprietary. Some file systems are local disks; others are SANs over Fibre Channel or InfiniBand.
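For reference, the user-facing side looks the same regardless of which backend is underneath. A minimal sketch of a collective MPI-IO write (the filename and buffer size here are just illustrative):

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, count = 1024;
        double buf[1024];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (int i = 0; i < count; ++i) buf[i] = rank;

        /* Every rank opens the same logical file... */
        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "out.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* ...and writes its block at a rank-dependent offset. Whether this
           lands on a shared parallel file system, local scratch, or a cache
           flushed later is entirely up to the implementation. */
        MPI_Offset offset = (MPI_Offset)rank * count * sizeof(double);
        MPI_File_write_at_all(fh, offset, buf, count, MPI_DOUBLE,
                              MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }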

How are you planning to measure the time spent in I/O? Are you planning to use the PMPI profiling interface to intercept all the calls into the library?
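If you go the PMPI route, the basic pattern is small: redefine the MPI function you care about and forward to the PMPI_-prefixed entry point that every conforming implementation provides. A sketch that times one I/O call (the accumulator is illustrative, and the const on buf is the MPI-3 form; older headers declare it without):

    #include <mpi.h>

    static double io_seconds = 0.0;   /* accumulated time inside I/O calls */

    /* The linker resolves the application's call to this wrapper; the real
       implementation stays reachable under its PMPI_ name. */
    int MPI_File_write_at_all(MPI_File fh, MPI_Offset offset, const void *buf,
                              int count, MPI_Datatype datatype,
                              MPI_Status *status)
    {
        double t0 = MPI_Wtime();
        int rc = PMPI_File_write_at_all(fh, offset, buf, count, datatype,
                                        status);
        io_seconds += MPI_Wtime() - t0;
        return rc;
    }

Link the wrapper object ahead of the MPI library (or preload it as a shared object) and the interception is transparent to the rest of the code, including the user's scripted simulations.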

semiuseless
I was planning some kind of long-winded function-pointer swap-out, but this PMPI interface sounds very interesting. I shall investigate. Thanks!
rossmcf
semiuseless
Schuweeeeeet. It's a beautiful thing. Thank you.
rossmcf