As a result of the comments in my answer on this thread, I wanted to know what the speed difference is between the += operator and ''.join().
So what is the speed comparison between the two?
This is what silly programs are designed to test :)
Use plus
import time
if __name__ == '__main__':
    start = time.clock()
    for x in range(1, 10000000):
        dog = "a" + "b"
    end = time.clock()
    print "Time to run Plusser = ", end - start, "seconds"
Output of:
Time to run Plusser = 1.16350010965 seconds
Now with join....
import time
if __name__ == '__main__':
    start = time.clock()
    for x in range(1, 10000000):
        dog = "a".join("b")
    end = time.clock()
    print "Time to run Joiner = ", end - start, "seconds"
Output of:
Time to run Joiner = 21.3877386651 seconds
So on Python 2.6 on Windows, I would say + is about 18 times faster than join :)
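One thing to keep in mind when reading the two loops: str.join takes an iterable of strings and places the separator between its items, so "a".join("b") does not produce "ab". A small sketch of what each expression evaluates to (written for Python 3; the variable names are mine, not from the post):

```python
# What each expression from the two timing loops evaluates to.
plus_result = "a" + "b"
join_on_str = "a".join("b")          # joins the characters of "b", using "a" as the separator
join_on_list = "".join(["a", "b"])   # joins the list items with an empty separator

print(repr(plus_result))       # 'ab'
print(repr(join_on_str))       # 'b'
print(repr(join_on_list))      # 'ab'
print(repr("-".join("abc")))   # 'a-b-c'
```

So the second loop appears to time a method call that returns "b" rather than a concatenation equivalent to the first loop, which is worth weighing when reading the 18x figure.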
From: Efficient String Concatenation
Method 1:
def method1():
    out_str = ''
    for num in xrange(loop_count):
        out_str += `num`
    return out_str
Method 4:
def method4():
    str_list = []
    for num in xrange(loop_count):
        str_list.append(`num`)
    return ''.join(str_list)
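The backticks in these two methods are Python 2 shorthand for repr(), and xrange exists only in Python 2. A rough Python 3 rendering, useful for checking that both methods build the same string, might look like this. The value of loop_count is an assumption, since the quoted snippet does not show how it is set:

```python
loop_count = 1000  # hypothetical value; not given in the quoted snippet

def method1():
    # Repeated += on a string.
    out_str = ''
    for num in range(loop_count):
        out_str += repr(num)
    return out_str

def method4():
    # Collect the pieces in a list and join once at the end.
    str_list = []
    for num in range(loop_count):
        str_list.append(repr(num))
    return ''.join(str_list)

same_output = method1() == method4()
print(same_output)
```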
Now I realise they are not strictly representative, and the 4th method appends to a list before iterating through and joining each item, but it's a fair indication.
String join is significantly faster than concatenation.
Why? Strings are immutable and can't be changed in place. To alter one, a new representation needs to be created (a concatenation of the two).
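A minimal illustration of that point, in Python 3 (the names s and alias are only for the example): item assignment on a string fails, and += leaves the original object untouched while rebinding the name to a new string.

```python
s = "hello"
alias = s            # second reference to the same string object

try:
    s[0] = "J"       # strings do not support item assignment
    mutated = True
except TypeError:
    mutated = False

s += " world"        # builds a new string and rebinds the name s

print(mutated)       # False
print(alias)         # hello  (the original object is unchanged)
print(s)             # hello world
```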
It looks like for strings shorter than about 40 characters, += is faster, while longer strings quickly approach the worst-case O(N^2) behaviour.
The times are as follows:
Iterations: 1,000,000

String length    += (seconds)    ''.join() (seconds)
1                0.953990        1.3280
4                1.233990        1.8140
6                1.516000        2.2810
12               2.250000        3.2500
80               15.530900       12.3750
222              101.797000      30.5160
443              238.063990      57.2030
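The O(N squared) worst case can be seen with a simplified cost model: if every += has to write out the whole accumulated string, the total number of characters copied grows with the square of the number of pieces, while a single join writes each piece once. The sketch below only counts characters under that assumption; it is not a measurement, and CPython can sometimes extend a string in place, so real timings may be better than this worst case. The function names are mine.

```python
def chars_copied_by_repeated_concat(piece_len, pieces):
    """Characters copied if every += copies the whole accumulated string (worst case)."""
    total = 0
    current_len = 0
    for _ in range(pieces):
        current_len += piece_len   # length of the new string after this +=
        total += current_len       # the whole new string is written out
    return total

def chars_copied_by_join(piece_len, pieces):
    """Characters copied when join writes each piece into the result once."""
    return piece_len * pieces

for n in (10, 100, 1000):
    print(n, chars_copied_by_repeated_concat(1, n), chars_copied_by_join(1, n))
# 10 55 10
# 100 5050 100
# 1000 500500 1000
```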
And here is the code:
import time

def strcat(string):
    newstr = ''
    for char in string:
        # Append one character per pass; appending the whole string here
        # would build a result of len(string)**2 characters.
        newstr += char
    return newstr

def listcat(string):
    chars = []
    for char in string:
        chars.append(char)
    return ''.join(chars)

def test(fn, times, *args):
    # Total wall-clock seconds for `times` calls of fn(*args).
    start = time.time()
    for x in xrange(times):
        fn(*args)
    return time.time() - start

def testall():
    strings = ['a', 'long', 'longer', 'a bit longer',
               '''adjkrsn widn fskejwoskemwkoskdfisdfasdfjiz oijewf sdkjjka dsf sdk siasjk dfwijs''',
               '''this is a really long string that's so long
it had to be triple quoted and contains lots of
superflous characters for kicks and gigles
@!#(*_#)(*$(*!#@&)(*E\xc4\x32\xff\x92\x23\xDF\xDFk^%#$!)%#^(*#''',
               '''I needed another long string but this one won't have any new lines or crazy characters in it, I'm just going to type normal characters that I would usually write blah blah blah blah this is some more text hey cool what's crazy is that it looks that the str += is really close to the O(n^2) worst case performance, but it looks more like the other method increases in a perhaps linear scale? I don't know but I think this is enough text I hope.''']

    for string in strings:
        print "String of len:", len(string), "took:", test(listcat, 1000000, string), "seconds"
    for string in strings:
        print "String of len:", len(string), "took:", test(strcat, 1000000, string), "seconds"

testall()
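For anyone rerunning this on Python 3, where xrange and the print statement are gone, a rough port using the timeit module might look like the following. It assumes the intent of strcat is to append one character per iteration. The helper names time_calls and run_all, the input strings, and the reduced iteration count are my own choices, so the numbers it prints are not comparable to the table above.

```python
import timeit

def strcat(string):
    # Build the result with repeated +=, one character per iteration.
    newstr = ''
    for char in string:
        newstr += char
    return newstr

def listcat(string):
    # Collect the characters in a list, then join them once at the end.
    chars = []
    for char in string:
        chars.append(char)
    return ''.join(chars)

def time_calls(fn, times, *args):
    # Total seconds for `times` calls of fn(*args), measured by timeit.
    return timeit.timeit(lambda: fn(*args), number=times)

def run_all(times=1000):
    # Hypothetical inputs chosen only to cover a range of lengths.
    strings = ['a', 'long', 'a bit longer', 'x' * 80, 'x' * 443]
    rows = []
    for string in strings:
        rows.append((len(string),
                     time_calls(strcat, times, string),
                     time_calls(listcat, times, string)))
    return rows

for length, plus_time, join_time in run_all():
    print("len %4d   +=: %.6f s   ''.join(): %.6f s" % (length, plus_time, join_time))
```

Results will vary by interpreter version and platform, so it is best to run it on the setup you actually care about.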