I am running a program that does simple data processing:

  • parses text
  • populates dictionaries
  • calculates some functions over the resulting data

The program only uses CPU, RAM, and HDD:

  • run from Windows command line
  • input/output to the local hard drive
  • nothing displayed on or printed to screen
  • no networking

The same program is run on:

  • desktop: Windows 7, Intel Core i7-930 CPU overclocked @ 3.6 GHz (with matching memory speed), Intel X25-M SSD
  • laptop: Windows XP, Intel Core 2 Duo T9300 @ 2.5 GHz, 7200 rpm HDD

The desktop's CPU clock frequency is 1.44 times higher, and its disk scores 4 times higher on a benchmark (Passmark - Disk Mark). I found that the program runs only about 1.66 times faster on the desktop, so apparently the CPU is the bottleneck.

It seems there is only about a 15% benefit from the Core i7 architecture over the Core 2 Duo (most of the performance boost comes from the raw CPU frequency). Is there anything I can do in the code to increase the benefit of the newer architecture?
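
For reference, the 15% figure comes from dividing the measured speedup by the clock ratio. A quick sketch of that arithmetic, using only the numbers quoted above (the variable names are just for illustration):

```python
# Numbers taken from the question; variable names are illustrative only.
desktop_ghz = 3.6
laptop_ghz = 2.5
measured_speedup = 1.66  # desktop vs laptop, wall-clock

clock_ratio = desktop_ghz / laptop_ghz          # ratio of clock frequencies
residual_gain = measured_speedup / clock_ratio  # speedup not explained by frequency

print("clock ratio: %.2f" % clock_ratio)
print("gain beyond frequency: %.0f%%" % ((residual_gain - 1) * 100))
```

This treats the run as scaling linearly with clock speed, which is only a rough approximation.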


EDIT: forgot to mention that I use ActivePython 3.1.2 if that matters.

+5  A: 

In most cases, increases in hardware performance automatically benefit user applications. However, the much maligned "GIL" means that you may not be able to take advantage of multiple cores with CPython unless you design your program to do so, using the multiprocessing module or similar libraries.

SO discussion on the same topic: http://stackoverflow.com/questions/203912/does-python-support-multiprocessor-multicore-programming

A related collection of solutions on the Python wiki: http://wiki.python.org/moin/ParallelProcessing
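
As a minimal sketch of that approach, assuming the input can be split into independent chunks of lines: the standard-library `multiprocessing.Pool` runs a worker function in separate processes, each with its own interpreter and its own GIL. `count_words`, `merge_counts`, and the sample data are hypothetical stand-ins for the real parsing and dictionary-building code.

```python
import multiprocessing


def count_words(lines):
    """Hypothetical worker: build a word -> count dictionary for one chunk."""
    counts = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def merge_counts(partials):
    """Combine the per-chunk dictionaries into one."""
    total = {}
    for partial in partials:
        for word, n in partial.items():
            total[word] = total.get(word, 0) + n
    return total


if __name__ == "__main__":  # required on Windows, where workers re-import this module
    chunks = [["a b a", "c"], ["b b", "a c"]]  # made-up sample data
    pool = multiprocessing.Pool()  # by default, sized from the CPU count the OS reports
    try:
        partials = pool.map(count_words, chunks)
    finally:
        pool.close()
        pool.join()
    print(merge_counts(partials))
```

Whether this pays off depends on each chunk being large enough to outweigh the cost of pickling data between processes.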

pyfunc
+1  A: 

Split your processing into multiple threads. Your particular i7 (4 cores with Hyper-Threading) should be able to run up to 8 threads in parallel.
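
To check how many hardware threads Python sees on a given machine, `multiprocessing.cpu_count()` returns the number of logical CPUs. A quick sketch; keep in mind that under CPython's GIL the threads of one process do not execute Python bytecode in parallel, so for CPU-bound work this number is more useful as a process count:

```python
import multiprocessing

# Logical CPUs (cores x hardware threads per core) as reported by the OS.
logical_cpus = multiprocessing.cpu_count()
print("logical CPUs:", logical_cpus)
```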

Roy Tang
A: 

Consider repeating the test with regular HDDs on both machines: the SSD could well account for a substantial performance difference, depending on caching and the nature of the data.
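
One way to see how much of the run is disk-bound without swapping drives is to time the read and the processing separately. A rough sketch, where `timed`, `read_all`, and `process` are hypothetical helpers standing in for the real program's stages:

```python
import os
import tempfile
import time


def timed(func, *args):
    """Return (result, elapsed wall-clock seconds) for one call."""
    start = time.time()
    result = func(*args)
    return result, time.time() - start


def read_all(path):
    """I/O stage: load the whole file into memory."""
    with open(path) as f:
        return f.read()


def process(text):
    """CPU stage (placeholder): count occurrences of each word."""
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


if __name__ == "__main__":
    # Made-up input so the sketch runs on its own; use the real data file instead.
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        f.write("alpha beta alpha\n" * 100000)
    text, io_seconds = timed(read_all, path)
    counts, cpu_seconds = timed(process, text)
    os.remove(path)
    print("read: %.3fs  process: %.3fs" % (io_seconds, cpu_seconds))
```

Repeated runs will mostly hit the OS file cache, so the first run after a reboot is the one that reflects the drive itself.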

Arafangion