My first thought was that the laptop I have at hand right now, a Macbook Pro, must be comparable to but slightly better than your machine -- I don't have enough of your surrounding code to try your C example (what's a vec_t, etc, etc), but running the Python you coded gives me:
$ python -mtimeit -s'import inso' 'inso.insort(inso.li)'
10 loops, best of 3: 7.21 msec per loop
vs your 8.1 seconds. That's with your code put in inso.py, preceded by:
import random
li = [random.randrange(10000) for _ in xrange(10000)]
array doesn't help -- it actually slows things down a bit. Then I installed psyco, the Python JIT helper (x86-only, 32-bit only), and further added:
import psyco
psyco.full()
and got:
$ python -mtimeit -s'import inso' 'inso.insort(inso.li)'
10 loops, best of 3: 207 usec per loop
so an apparent speedup of about 7.21 / 0.207 = 35 times (both figures in msec) -- vs the 8.04 / 0.13 = 62 times that surprised you so much;-).
Of course, the problem is that after the first time, the list is already sorted, so insort becomes much faster. You didn't give us enough of the surrounding test harness to know exactly what you measured. A more realistic example (where the actual list isn't touched so it stays disordered, only a copy is sorted...), without psyco:
$ python -mtimeit -s'import inso' 'inso.insort(list(inso.li))'
10 loops, best of 3: 13.8 sec per loop
Oops -- so your machine's WAY faster than a Macbook Pro (remember, cores don't count: we're using only one here;-) -- wow... or else, you're mismeasuring. Anyway, WITH psyco:
$ python -mtimeit -s'import inso' 'inso.insort(list(inso.li))'
10 loops, best of 3: 456 msec per loop
So psyco's speedup is only 13.8 / 0.456, or about 30 times -- about half as much as the 60+ times you get with pure-C coding. IOW, you'd expect Python + psyco to be twice as slow as pure C. That's a more realistic and typical assessment.
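To make the "already sorted after the first time" trap concrete, here's a minimal sketch using the timeit module from inside Python. The insort body below is my own hypothetical stand-in (a plain in-place insertion sort), since I don't have your actual function; I also use 1000 items rather than 10000 to keep it quick, a fixed seed, and Python 3 spelling (range, not xrange).

```python
import random
import timeit


def insort(seq):
    """Hypothetical stand-in for the function under test: in-place insertion sort."""
    for i in range(1, len(seq)):
        item = seq[i]
        j = i - 1
        # Shift larger elements one slot right until item's position is found.
        while j >= 0 and seq[j] > item:
            seq[j + 1] = seq[j]
            j -= 1
        seq[j + 1] = item


random.seed(0)  # assumption: fixed seed, only so that runs are repeatable
li = [random.randrange(1000) for _ in range(1000)]
pristine = list(li)  # untouched copy of the disordered data

# Flawed: the first call sorts li in place, so later calls see sorted input,
# and taking the best run reports the sorted-input time.
flawed = min(timeit.repeat(lambda: insort(li), number=1, repeat=5))

# Fairer: every call sorts a fresh copy, so the input is disordered each time
# (the copy itself is timed too, but it is cheap next to the sort).
fair = min(timeit.repeat(lambda: insort(list(pristine)), number=1, repeat=5))

print("flawed: %.6f s per call" % flawed)
print("fair:   %.6f s per call" % fair)
```

The "flawed" figure should come out far smaller than the "fair" one, which is the same effect as the msec-vs-sec gap in my command-line runs above.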
If you were writing reasonably high-level code, psyco's speedup of it would degrade from (say) 30 times down to much less -- but so would C's advantage over Python. For example,
$ python -mtimeit -s'import inso' 'sorted(inso.li)'
100 loops, best of 3: 8.72 msec per loop
without psyco (in this case, psyco actually -- marginally -- slows down the execution;-), so that's another factor of 52 over psyco, 1582 overall over non-psyco insort.
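For anybody who wants to check the ratios, here's the arithmetic spelled out, using only the timings quoted in this thread and converting everything to seconds first (the variable names are just mine, for illustration):

```python
# Timings quoted in this thread, all in seconds per loop.
c_before, c_after = 8.04, 0.13   # the asker's Python vs pure-C figures
insort_plain = 13.8              # insort(list(li)), no psyco
insort_psyco = 0.456             # insort(list(li)), with psyco
builtin_sorted = 0.00872         # sorted(li), no psyco

c_speedup = c_before / c_after                   # roughly 62
psyco_speedup = insort_plain / insort_psyco      # roughly 30
sorted_vs_psyco = insort_psyco / builtin_sorted  # roughly 52
sorted_vs_plain = insort_plain / builtin_sorted  # roughly 1582

print("C over Python:            %.0fx" % c_speedup)
print("psyco over plain insort:  %.0fx" % psyco_speedup)
print("sorted over psyco insort: %.0fx" % sorted_vs_psyco)
print("sorted over plain insort: %.0fx" % sorted_vs_plain)
print("C vs psyco:               %.1fx" % (c_speedup / psyco_speedup))
```

The last line is where "twice as slow as pure C" comes from: the two speedups differ by a factor of about 2.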
But, when for some reason or other you have to write extremely low-level algorithms in Python, rather than using the wealth of support from the builtins and stdlib, psyco can help reduce the pain.
Another point is, when you benchmark, please post ALL code so others can see exactly what you're doing (and possibly spot gotchas) -- your "scaffolding" is as tricky and likely to hide traps, as the code you think you're measuring!-)