I'm developing a Python project for dealing with computer simulations, and I'm also developing a GUI for it. (The core logic itself does not require a GUI.) The GUI toolkit I use for it is wxPython, but I think my question is general enough not to depend on it.

The way the GUI currently works is that it starts the core logic package (called garlicsim) in the same process and on the same thread as the GUI. This works, but I understand it's a problematic approach, because if the core logic needs to do some hard computation, the GUI will hang, which I consider unacceptable.

What should I do?

I heard about the option of launching the core logic on a separate process from the GUI. This sounds interesting, but I have a lot of questions about this.

  1. Do I use the multiprocessing package or the subprocess package to launch the new process?
  2. How do I have easy access to the simulation data from the GUI process? After all, it will be stored in the other process. The user should be able to browse through the timeline of the simulation easily and smoothly. How can this be done?
+6  A: 

You might find some inspiration at http://wiki.wxpython.org/LongRunningTasks, although it covers multithreading, not multiprocessing.

The basic idea

  • for multithreading: use an event queue to communicate between the GUI and the processing thread.
  • for multiprocessing: maybe use the subprocess package, and use the stdin/stdout of the child process to communicate with it. For this you need a command-line API, but it would come in handy eventually, because it lets you do GUI-independent unit testing.
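A minimal sketch of the subprocess variant, written for Python 3. It assumes a hypothetical line-based protocol: the child reads one command per line on stdin and answers each with one line on stdout. The child script, the command names (`step`, `quit`) and `run_commands` are made up for illustration; they are not part of garlicsim.

```python
import subprocess
import sys

# Hypothetical stand-in for a command-line simulation: it reads one
# command per line and answers each command with one line on stdout.
CHILD_SOURCE = r"""
import sys
state = 0
for line in sys.stdin:
    command = line.strip()
    if command == "step":
        state += 1
        print("state %d" % state, flush=True)
    elif command == "quit":
        print("bye", flush=True)
        break
    else:
        print("error unknown command", flush=True)
"""


def run_commands(commands):
    """Start the child, send each command, and collect one reply per command."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    replies = []
    try:
        for command in commands:
            child.stdin.write(command + "\n")
            child.stdin.flush()  # make sure the child sees the command now
            replies.append(child.stdout.readline().strip())
    finally:
        child.stdin.close()  # EOF ends the child's loop if it is still running
        child.wait()
    return replies


if __name__ == "__main__":
    print(run_commands(["step", "step", "quit"]))
```

Note that `readline()` blocks until the child answers, so in a real GUI this exchange would probably live in a helper thread rather than in an event handler.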

You may even drive the I/O communication through a socket; this would allow easy management of the simulation over a network.

Edit: I just saw the multiprocessing package you mentioned, which is new in Python 2.6. It seems a nice pick; you could then use queues to communicate between the processes. This is a tighter coupling, so choose based on your needs.
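A rough sketch of the queue idea with multiprocessing, written for Python 3. The functions `simulate` and `poll` are hypothetical stand-ins: the first plays the role of the core logic, the second is what a GUI timer might call to pick up whatever has arrived so far without blocking. The `None` sentinel is just a convention assumed here.

```python
import multiprocessing
import queue
import time


def simulate(steps, results):
    """Hypothetical core logic: put one state per step, then a None sentinel."""
    state = 0
    for _ in range(steps):
        state += 1  # stand-in for an expensive computation
        results.put(state)
    results.put(None)  # tells the consumer that the run is finished


def poll(results):
    """Return every item available right now, without blocking."""
    items = []
    while True:
        try:
            items.append(results.get_nowait())
        except queue.Empty:
            return items


if __name__ == "__main__":
    results = multiprocessing.Queue()
    worker = multiprocessing.Process(target=simulate, args=(5, results))
    worker.start()
    finished = False
    deadline = time.monotonic() + 10  # give up after 10 s instead of hanging
    while not finished and time.monotonic() < deadline:
        for state in poll(results):
            if state is None:
                finished = True
            else:
                print("new state:", state)
        time.sleep(0.05)  # a GUI would return to its event loop here
    worker.join()
```

Everything put on the queue is pickled and copied to the other process, so this fits best when each state is reasonably small.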

ron
By 'subprocessing' package I think you mean 'subprocess', right?
Kylotan
@Kylotan: yes, corrected, thanks
ron
A: 

Unfortunately, although you're right that the choice of GUI doesn't affect the answer, the best approach to this problem will depend a lot on what exactly your simulation data is doing.

For example, if it generates sequential data then it can feed it to your GUI via a thread-safe or process-safe queue. But if it mutates the whole data and your GUI needs to be able to see a snapshot at any given time, that might be too expensive to solve by sending the whole state along the queue and might require a mutex-style approach instead to share access to the data structure. So the nature of the work done on your data is paramount here.
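To make the "mutex-style" case concrete, here is a small sketch with threads, written for Python 3. `SharedState`, `mutate`, `snapshot` and `run` are hypothetical names; the point is only that the simulation and the GUI take the same lock, so the GUI never sees a half-updated state.

```python
import threading


class SharedState:
    """Hypothetical holder for simulation state guarded by a lock."""

    def __init__(self, cells):
        self._lock = threading.Lock()
        self._cells = list(cells)

    def mutate(self):
        """Called by the simulation thread: update the whole state at once."""
        with self._lock:
            self._cells = [value + 1 for value in self._cells]

    def snapshot(self):
        """Called by the GUI thread: return a consistent copy of the state."""
        with self._lock:
            return list(self._cells)


def run(state, steps):
    """Stand-in for the simulation loop."""
    for _ in range(steps):
        state.mutate()


if __name__ == "__main__":
    state = SharedState([0, 10])
    worker = threading.Thread(target=run, args=(state, 1000))
    worker.start()
    print(state.snapshot())  # some consistent intermediate state
    worker.join()
    print(state.snapshot())  # the final state
```

This works as written only within one process. Across processes the data itself would also have to live somewhere both sides can reach, which is where the cost mentioned above comes in.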

As for whether to use multiprocessing or subprocess, that depends on whether the data is handled by a completely separate program. multiprocessing is for multiprocessing in the style of multithreading: different parts of the same program running in multiple processes. subprocess is for when one program wants to run another (which could be a copy of the same program, but usually is not). Again, it's hard to know which is the best approach for your specific situation, although it does sound like you could have the core logic as a command-line application and communicate via pipes, sockets, etc.

Kylotan
+1  A: 

To answer the specific questions.

"Do I use the multiprocessing package or the subprocess package to launch the new process?"

Use multiprocessing.

"How do I have easy access to the simulation data from the GUI process?"

You don't have access to the simulation process's objects, if that's what you're asking. The simulation is a separate process. You can start it, stop it, and -- most importantly -- make requests via a queue of commands that go to the simulator.

"The user should be able to browse through the timeline of the simulation easily and smoothly. How can this be done?"

This is just design. Single process, multiple processes, multiple threads don't have any impact on this question at all.

Each simulation must have some parameters, it must start, it must produce a log (or timeline). That has to be done no matter what library you use to start and stop the simulation.

The output from the simulation -- which is input to your GUI -- can be done a million ways.

  • Database. The simulation timeline could be inserted into a SQLite database and queried by the GUI. This doesn't work out terribly well because SQLite doesn't have really clever locking. But it does work.

  • File. The simulation timeline is written to a file. The GUI reads the file. This works out really, really well.

  • Request/Reply. The simulation has multiple threads, one of which is dequeueing commands and responding by -- for example -- sending back the timeline up to the moment, or stopping the simulation or changing parameters and restarting it.
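The request/reply option could look roughly like the sketch below, written for Python 3. It is simplified: a single loop both advances the simulation and answers commands, and it uses a thread with `queue.Queue` so it stays short. The command names (`step`, `timeline`, `stop`) and the functions `serve` and `request` are assumptions made up for this example.

```python
import queue
import threading


def serve(commands, replies):
    """Hypothetical simulator loop: advance the timeline and answer commands."""
    timeline = []
    while True:
        command = commands.get()
        if command == "step":
            timeline.append(len(timeline))  # stand-in for a computed state
            replies.put(("ok", len(timeline)))
        elif command == "timeline":
            replies.put(("timeline", list(timeline)))  # a copy, not the live list
        elif command == "stop":
            replies.put(("stopped", None))
            return
        else:
            replies.put(("error", command))


def request(commands, replies, command):
    """Send one command and wait for its reply."""
    commands.put(command)
    return replies.get(timeout=5)


if __name__ == "__main__":
    commands, replies = queue.Queue(), queue.Queue()
    server = threading.Thread(target=serve, args=(commands, replies))
    server.start()
    request(commands, replies, "step")
    request(commands, replies, "step")
    print(request(commands, replies, "timeline"))
    request(commands, replies, "stop")
    server.join()
```

The same shape should carry over to two processes by swapping in `multiprocessing.Queue` and `multiprocessing.Process`, since the commands and replies here are plain picklable values.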

S.Lott
+2  A: 

The simplest approach that can work for you here is to launch the computation in a separate thread, and communicate data between this thread and the GUI using Queue objects. These are thread-safe and very convenient for inter-thread communication.
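Something along these lines, as a sketch for Python 3 with no GUI toolkit involved. `compute` and `start_worker` are hypothetical names, the squared numbers stand in for real results, and the `threading.Event` is an extra assumed here so the GUI can ask the computation to stop early.

```python
import queue
import threading


def compute(results, stop, limit):
    """Hypothetical worker: push one result per step until done or told to stop."""
    for step in range(limit):
        if stop.is_set():
            break
        results.put(step * step)  # stand-in for a real computation
    results.put(None)  # sentinel: no more results will follow


def start_worker(limit):
    """Start the computation in a background thread and return its handles."""
    results = queue.Queue()
    stop = threading.Event()
    worker = threading.Thread(
        target=compute, args=(results, stop, limit), daemon=True
    )
    worker.start()
    return worker, results, stop


if __name__ == "__main__":
    worker, results, stop = start_worker(5)
    while True:
        item = results.get()  # a GUI would poll with get_nowait() from a timer
        if item is None:
            break
        print("result:", item)
    worker.join()
```

Calling `stop.set()` from the GUI side makes the worker exit at its next step.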

Other solutions are more complex: you may end up running the simulation in a completely separate "server" process and communicating with the main GUI over sockets.

Eli Bendersky
A: 

Multiprocessing or Pyro with distributed data objects.

http://pyro.sourceforge.net/

Your simulation supplies distributed objects to the GUI, the GUI manipulates them and reads their attributes.

Both libraries will provide expandability over a network with no hassle, but can run locally. When your simulation starts crunching too many numbers, add more simulation servers that provide more distributed objects.

manifest