views: 841
answers: 8

I use Python 2.5.4. My computer: AMD Phenom X3 720BE CPU, 780G mainboard, 4 GB RAM, Windows 7 32-bit.

I use Python threading but cannot make every python.exe process consume 100% CPU. Why do they use only about 33-34% on average?

I want to direct all available computing resources toward these large calculations so that they finish as quickly as possible.

EDIT: Thanks, everybody. I'm now using Parallel Python and everything works well. My CPU is now always at 100%. Thanks all!

+2  A: 

Your bottleneck is probably somewhere else, such as the hard drive (paging) or memory access.

nicholaides
Just FYI, memory (RAM) accesses will count towards the CPU usage %, because the CPU is busy-waiting. So that cannot be the cause.
intgr
This is *most probably* a GIL related issue on a 3-core machine, not a memory or disk access issue.
ΤΖΩΤΖΙΟΥ
+17  A: 

Try the multiprocessing module: Python has real, native threads, but the GIL prevents more than one of them from executing Python bytecode at a time. Another alternative, and something you should look at if you need real speed, is writing a C extension module and calling its functions from Python; you can release the GIL inside those C functions.
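
As a minimal sketch of the multiprocessing approach (note: the module ships with Python 2.6 and later, so it would need a backport on 2.5, and the cpu_heavy function below is just a stand-in for your own calculation):

from multiprocessing import Pool

def cpu_heavy(n):
    # Stand-in for your real calculation; pure CPU-bound work.
    total = 0
    for i in xrange(n):
        total += i * i
    return total

if __name__ == '__main__':
    # The __main__ guard is required on Windows, where child processes re-import this module.
    pool = Pool()  # one worker process per CPU core by default
    results = pool.map(cpu_heavy, [10 ** 7] * 3)  # three independent jobs, one per core
    print results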

Also see David Beazley's Mindblowing GIL.

Roger Pate
Thanks for that link. I had no idea it was that bad.
Just Some Guy
+1  A: 

From the CPU usage it looks like you're still running on a single core. Try running a trivial calculation with 3 or more threads, using the same threading code, and see whether it utilizes all cores. If it doesn't, something might be wrong with your threading code.
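
For instance, a rough sketch of such a test (the busy_loop function is only illustrative): start three pure-Python CPU-bound threads and watch Task Manager; under CPython's GIL you will typically still see only about one core's worth of load.

import threading

def busy_loop():
    # Pure-Python CPU-bound work; all threads compete for the GIL.
    x = 0
    for i in xrange(10 ** 8):
        x += i

threads = [threading.Thread(target=busy_loop) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()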

frgtn
The problem is not his threading code; due to the GIL, Python cannot perform true concurrent operations, even though it has multithreading. Thus, it can only use one core at a time.
qid
qid: Not true! Code in a C extension module can release the GIL and do some work while other threads are running concurrently.
Roger Pate
My CPU uses 3 cores with the same load.
AloneRoad
+16  A: 

It appears that you have a 3-core CPU. If you want to use more than one CPU core in native Python code, you have to spawn multiple processes. (Two or more Python threads cannot run concurrently on different CPUs)

As R. Pate said, Python's multiprocessing module is one way. However, I would suggest looking at Parallel Python instead. It takes care of distributing tasks and message-passing. You can even run tasks on many separate computers with little change to your code.

Using it is quite simple:

import pp

def parallel_function(arg):
    return arg

job_server = pp.Server() 

# Define your jobs
job1 = job_server.submit(parallel_function, ("foo",))
job2 = job_server.submit(parallel_function, ("bar",))

# Compute and retrieve answers for the jobs.
print job1()
print job2()
intgr
Not accurate: you can have threads that run without holding the GIL.
Roger Pate
For practical purposes, all native Python code *will* hold the GIL at all times.
intgr
Using extension modules is very practical, at least for me. (I still upvoted for the PP reference though. :)
Roger Pate
A: 

You should perform some operating system and Python monitoring to determine where the bottleneck is.

Here is some info for Windows 7:

Performance Monitor: You can use Windows Performance Monitor to examine how programs you run affect your computer's performance, both in real time and by collecting log data for later analysis. (Control Panel -> All Control Panel Items -> Performance Information and Tools -> Advanced Tools -> View Performance Monitor)

Resource Monitor: Windows Resource Monitor is a system tool that allows you to view information about the use of hardware (CPU, memory, disk, and network) and software (file handles and modules) resources in real time. You can use Resource Monitor to start, stop, suspend, and resume processes and services. (Control Panel -> All Control Panel Items -> Performance Information and Tools -> Advanced Tools -> View Resource Monitor)

Chad
+4  A: 

Global Interpreter Lock

The reasons for employing such a lock include:

* increased speed of single-threaded programs (no need to acquire or release locks on all data structures separately)
* easy integration of C libraries that are usually not thread-safe.

Applications written in languages with a GIL have to use separate processes (i.e. interpreters) to achieve full concurrency, as each interpreter has its own GIL.
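
As a rough illustration of the separate-interpreter approach (the worker.py script named here is hypothetical), a small driver can launch several independent python.exe processes, each with its own GIL:

import subprocess
import sys

# worker.py would be your own script that performs one chunk of the calculation.
procs = [subprocess.Popen([sys.executable, 'worker.py', str(i)]) for i in range(3)]
for p in procs:
    p.wait()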

Sivvy
Why did I get downvoted? This gives a correct answer to "Why are they using only about 33-34% on average?", and it also gives a solution underneath... Was a downvote really necessary? Even the selected answer's first sentence matched this solution.
Sivvy
When hovering over the up/down vote arrow, the alt text is clear: "this answer is useful/not useful". Idiotically, it seems that many SO users read "I like this answer/I don't like this answer and this person should die a horrible death" instead, and vote accordingly. If I had to downvote an answer, I'd downvote nicholaides' answer, but I can't, for the minuscule chance he's right (although I strongly believe he's not).
ΤΖΩΤΖΙΟΥ
A: 

What about Stackless Python?

Silvanus
Stackless Python has the same limitation: not using traditional threads doesn't mean that the GIL is gone.
Mattias Nilsson
A: 

Thanks, everybody. I'm now using Parallel Python and everything works well. My CPU is now always at 100%. Thanks all!

AloneRoad