views: 4998
answers: 12

I've been wondering this for some time. As the title says, which is faster: the actual function or simply raising to the half power?

UPDATE

This is not a matter of premature optimization. It is simply a question of how the underlying code actually works: what is the theory behind how Python evaluates each of these?

I sent Guido van Rossum an email because I really wanted to know the differences between these methods.

My email:

There are at least 3 ways to do a square root in Python: math.sqrt, the '**' operator and pow(x,.5). I'm just curious as to the differences in the implementation of each of these. When it comes to efficiency, which is better?

His response:

pow and ** are equivalent; math.sqrt doesn't work for complex numbers, and links to the C sqrt() function. As to which one is faster, I have no idea...
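
For reference, here's a quick check (just a sketch) that the three spellings agree numerically:

import math

x = 12345.6789
print math.sqrt(x), x**.5, pow(x, .5)   # all three should agree (to within floating-point rounding)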

+3  A: 

Most likely math.sqrt(x), because it's optimized for square rooting.

Benchmarks will provide you the answer you are looking for.

strager
+12  A: 

How many square roots are you really performing? Are you trying to write some 3D graphics engine in Python? If not, then why go with code which is cryptic over code that is easy to read? The time difference would be less than anybody could notice in just about any application I could foresee. I really don't mean to put down your question, but it seems that you're going a little too far with premature optimization.

Kibbee
I don't really feel I'm doing premature optimization. It's more a simple question of deciding, between two different methods, which will be faster on average.
Casey
Kibbee: it's definitely a valid question, but I share your dismay at the number of questions on Stack Overflow that imply that the asker is performing all kinds of premature optimization. It's definitely a large percentage of the questions being asked for every language.
Eli Courtwright
But what does it really matter? It will make absolutely no impact on any program.
DGM
Is math.sqrt(x) any easier to read than x ** 0.5? I think they are both pretty obviously square root... at least if you are familiar with Python, anyway. Don't call standard Python operators like ** "cryptic" just because you aren't familiar with Python.
TM
I don't think the ** operator is cryptic. I think that raising something to the exponent 0.5 as a way of getting the square root is a little cryptic to those who don't keep up on their math.
Kibbee
+17  A: 

As per comments, I've updated the code:

import time
import math

def timeit1():
    s = time.time()
    for i in xrange(750000):
        z = i**.5                      # exponent operator, no name lookup
    print "Took %f seconds" % (time.time() - s)

def timeit2(arg=math.sqrt):
    # The default argument binds math.sqrt to a local name once, at
    # definition time, so the loop pays no module/attribute lookup.
    s = time.time()
    for i in xrange(750000):
        z = arg(i)
    print "Took %f seconds" % (time.time() - s)

timeit1()
timeit2()

Now the math.sqrt function is directly in a local argument, meaning it has the fastest lookup possible.

UPDATE: The Python version seems to matter here. I used to think that timeit1 would be faster, since when Python parses "i**.5" it knows, syntactically, which method to call (__pow__ or some variant), so it doesn't have to go through the name-lookup overhead that the math.sqrt variant does. But I might be wrong:

Python 2.5: 0.191000 vs. 0.224000

Python 2.6: 0.195000 vs. 0.139000

Also psyco seems to deal with math.sqrt better:

Python 2.5 + Psyco 2.0: 0.109000 vs. 0.043000

Python 2.6 + Psyco 2.0: 0.128000 vs. 0.067000
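
One way to see the lookup difference is to compare the bytecode for the two expressions (a rough sketch; the exact output varies by Python version):

import dis

# the ** form needs only LOAD opcodes plus BINARY_POWER...
dis.dis(compile("i**.5", "<string>", "eval"))
# ...while the math.sqrt form adds a LOAD_ATTR and a CALL_FUNCTION
dis.dis(compile("math.sqrt(i)", "<string>", "eval"))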

Claudiu
codepadded it: http://codepad.org/fyZqDLPO
yjerem
I've now run it 3 times on codepad.org and all three times a() was much faster than b().
yjerem
On my machine locally, it takes ~0.12 seconds for *a* and ~0.16 seconds for *b*. Interesting.
strager
Try moving the sqrt function into the local scope, so it doesn't need to look up math in the global dictionary then sqrt in that dictionary.
Mr Fooz
Ah yes - I forgot that syntax-level expressions (such as * and **) are faster since they require no lookup... moving math.sqrt into a local argument is still slower than i**.5
Claudiu
The standard timeit module is your friend. It avoids common pitfalls when it comes to measuring execution time!
EOL
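For example, a minimal sketch of the same comparison with the timeit module (the iteration count and setup strings here are just illustrative):

import timeit

print timeit.Timer("x**.5", "x = 123").timeit(1000000)
print timeit.Timer("sqrt(x)", "from math import sqrt; x = 123").timeit(1000000)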
Here are the results of your script:
zoltan@host:~$ python2.5 p.py
Took 0.183226 seconds
Took 0.155829 seconds
zoltan@host:~$ python2.4 p.py
Took 0.181142 seconds
Took 0.153742 seconds
zoltan@host:~$ python2.6 p.py
Took 0.157436 seconds
Took 0.093905 seconds
Target system: Ubuntu Linux. CPU: Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz.
As you can see, I got different results. According to this, your answer is not generic.
zoli2k
Like zoli2k, I've found that timeit2() gives a lower time on my machine with Python 2.6.4. I also tried using the timeit module and found the same result.
Justin Peel
Codepad is a great service, but it's horrible for timing performance - who knows how busy the server will be at a given moment? Each run could potentially give very different results.
Infinity
Hmm, I just ran it on a local machine with Python 2.5 and timeit1 is faster... although when running it with Psyco, timeit2 is WAY faster =). Probably depends on a variety of factors...
Claudiu
A: 

You might want to benchmark the fast Newton-Raphson square root as well. Shouldn't take much to convert to Python.
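
For comparison, a plain-Python Newton-Raphson iteration for the square root (a sketch of the basic method, not the Quake bit-trick version) might look like:

def newton_sqrt(n, iterations=10):
    # Newton's method on f(x) = x*x - n; converges quickly
    # from a crude starting guess.
    if n == 0:
        return 0.0
    x = float(n)
    for _ in xrange(iterations):
        x = 0.5 * (x + n / x)
    return x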

Mitch Wheat
I tried this (answered with the result) and my implementation is way slower than the native sqrt functions :P
lunixbochs
+5  A: 

In these micro-benchmarks, math.sqrt will be slower, because of the slight time it takes to look up sqrt in the math namespace. You can improve it slightly with

 from math import sqrt

Even then, though, running a few variations through timeit shows a slight (4-5%) performance advantage for "x**.5".

Interestingly, doing

 import math
 sqrt = math.sqrt

sped it up even more, to within 1% difference in speed, with very little statistical significance.

I will repeat Kibbee and say that this is probably premature optimization.

JimB
+10  A: 
  • first rule of optimization: don't do it
  • second rule: don't do it, yet

Here are some timings (Python 2.5.2, Windows):

$ python -mtimeit -s"from math import sqrt; x = 123" "x**.5"
1000000 loops, best of 3: 0.445 usec per loop

$ python -mtimeit -s"from math import sqrt; x = 123" "sqrt(x)"
1000000 loops, best of 3: 0.574 usec per loop

$ python -mtimeit -s"import math; x = 123" "math.sqrt(x)"
1000000 loops, best of 3: 0.727 usec per loop

This test shows that x**.5 is slightly faster than sqrt(x).

For Python 3.0 the result is the opposite:

$ \Python30\python -mtimeit -s"from math import sqrt; x = 123" "x**.5"
1000000 loops, best of 3: 0.803 usec per loop

$ \Python30\python -mtimeit -s"from math import sqrt; x = 123" "sqrt(x)"
1000000 loops, best of 3: 0.695 usec per loop

$ \Python30\python -mtimeit -s"import math; x = 123" "math.sqrt(x)"
1000000 loops, best of 3: 0.761 usec per loop

On another machine (Ubuntu, Python 2.6 and 3.1), math.sqrt(x) is consistently faster than x**.5:

$ python -mtimeit -s"from math import sqrt; x = 123" "x**.5"
10000000 loops, best of 3: 0.173 usec per loop
$ python -mtimeit -s"from math import sqrt; x = 123" "sqrt(x)"
10000000 loops, best of 3: 0.115 usec per loop
$ python -mtimeit -s"import math; x = 123" "math.sqrt(x)"
10000000 loops, best of 3: 0.158 usec per loop
$ python3.1 -mtimeit -s"from math import sqrt; x = 123" "x**.5"
10000000 loops, best of 3: 0.194 usec per loop
$ python3.1 -mtimeit -s"from math import sqrt; x = 123" "sqrt(x)"
10000000 loops, best of 3: 0.123 usec per loop
$ python3.1 -mtimeit -s"import math; x = 123" "math.sqrt(x)"
10000000 loops, best of 3: 0.157 usec per loop
J.F. Sebastian
+3  A: 

For what it's worth (see Jim's answer), on my machine running Python 2.5:

PS C:\> python -m timeit -n 100000 10000**.5
100000 loops, best of 3: 0.0543 usec per loop
PS C:\> python -m timeit -n 100000 -s "import math" math.sqrt(10000)
100000 loops, best of 3: 0.162 usec per loop
PS C:\> python -m timeit -n 100000 -s "from math import sqrt" sqrt(10000)
100000 loops, best of 3: 0.0541 usec per loop
zdan
+4  A: 

Using Claudiu's code, on my machine even with "from math import sqrt", x**.5 is faster; but using psyco.full(), sqrt(x) becomes much faster, by at least 200%.

Casey
A: 

Claudiu's results differ from mine. I'm using Python 2.6 on Ubuntu on an old P4 2.4GHz machine... Here are my results:

>>> timeit1()
Took 0.564911 seconds
>>> timeit2()
Took 0.403087 seconds
>>> timeit1()
Took 0.604713 seconds
>>> timeit2()
Took 0.387749 seconds
>>> timeit1()
Took 0.587829 seconds
>>> timeit2()
Took 0.379381 seconds

sqrt is consistently faster for me... Even Codepad.org NOW seems to agree that sqrt, in the local context, is faster (http://codepad.org/6trzcM3j). Codepad seems to be running Python 2.5 presently. Perhaps they were using 2.4 or older when Claudiu first answered?

In fact, even using math.sqrt(i) in place of arg(i), I still get better times for sqrt. In this case timeit2() took between 0.53 and 0.55 seconds on my machine, which is still better than the 0.56-0.60 figures from timeit1.

I'd say, on modern Python, use math.sqrt and definitely bring it into local context, either with somevar = math.sqrt or with from math import sqrt.
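
For example, both bindings pull sqrt out of the math namespace so the lookup happens once rather than on every call (a minimal sketch; the distances function is just made up for illustration):

import math

def distances(points, sqrt=math.sqrt):   # default argument binds sqrt locally
    return [sqrt(x*x + y*y) for x, y in points]

# or simply, at module level:
from math import sqrt                     # one lookup, at import time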

bobpaul
+1  A: 

In Python 2.6, the float.__pow__() function uses the C pow() function and the math.sqrt() function uses the C sqrt() function.

In glibc, the implementation of pow(x,y) is quite complex and well optimized for various special cases. For example, calling C pow(x,0.5) simply calls the sqrt() function.

The difference in speed between ** and math.sqrt is caused by the wrappers used around the C functions, and the speed strongly depends on the optimization flags and C compiler used on the system.
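
As a rough way to poke at the underlying C functions without the Python-level wrappers, one could call libm directly via ctypes (a sketch that assumes the C math library can be located with ctypes.util.find_library; details differ by platform):

import ctypes, ctypes.util

libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]
libm.pow.restype = ctypes.c_double
libm.pow.argtypes = [ctypes.c_double, ctypes.c_double]

print libm.sqrt(2.0), libm.pow(2.0, 0.5)   # C sqrt() and C pow(x, 0.5)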

Edit:

Here are the results of Claudiu's benchmark on my machine; I got different results:

zoltan@host:~$ python2.4 p.py 
Took 0.173994 seconds
Took 0.158991 seconds
zoltan@host:~$ python2.5 p.py 
Took 0.182321 seconds
Took 0.155394 seconds
zoltan@host:~$ python2.6 p.py 
Took 0.166766 seconds
Took 0.097018 seconds
zoli2k
+1  A: 

Someone commented about the "fast Newton-Raphson square root" from Quake 3... I implemented it with ctypes, but it's super slow in comparison to the native versions. I'm going to try a few optimizations and alternate implementations.

from ctypes import c_float, c_int32, byref, POINTER, cast

def sqrt(num):
    # Quake-style fast inverse square root, then multiply by num to get sqrt.
    xhalf = 0.5*num
    x = c_float(num)
    # Reinterpret the float's bits as a 32-bit integer (c_int32 so the size
    # matches c_float even on 64-bit platforms).
    i = cast(byref(x), POINTER(c_int32)).contents.value
    i = c_int32(0x5f375a86 - (i>>1))    # magic-constant initial guess
    x = cast(byref(i), POINTER(c_float)).contents.value

    x = x*(1.5-xhalf*x*x)               # two Newton-Raphson refinement steps
    x = x*(1.5-xhalf*x*x)
    return x * num                      # sqrt(num) = num * (1/sqrt(num))

Here's another version using struct; it comes out about 3.6x faster than the ctypes version, but still 1/10 the speed of C.

from struct import pack, unpack

def sqrt_struct(num):
    xhalf = 0.5*num
    # Reinterpret the float's bits as a 32-bit unsigned int
    # ('I' is 4 bytes, matching 'f').
    i = unpack('I', pack('f', num))[0]
    i = 0x5f375a86 - (i>>1)             # magic-constant initial guess
    x = unpack('f', pack('I', i))[0]

    x = x*(1.5-xhalf*x*x)               # two Newton-Raphson refinement steps
    x = x*(1.5-xhalf*x*x)
    return x * num
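
A quick accuracy check against math.sqrt (the two Newton steps leave a small relative error, so don't expect an exact match):

import math
print sqrt_struct(1234.5), math.sqrt(1234.5)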
lunixbochs