ansaurus

Question

Answer 1

+14 A:

text[0] fails if text is an empty string:

IronPython 2.6 Alpha (2.6.0.1) on .NET 4.0.20506.1
Type "help", "copyright", "credits" or "license" for more information.
>>> text = ""
>>> print(text.startswith("a"))
False
>>> print(text[0]=='a')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: index out of range: 0

EDIT: You say you "know" that text is not empty... how confident are you of that, and what would you want to happen if it is empty in reality? If a failure is appropriate (e.g. it would indicate a bug in your code), that argues for using text[0]=='a'.

Other questions:

How concerned are you about the performance of this? If this is performance critical, then benchmark it on your particular Python runtime. I wouldn't be entirely surprised to find that (say) one form was faster on IronPython and a different one faster on CPython.
Which do you (and your team) find more readable?
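A minimal sketch of such a benchmark, using the standard-library timeit module. The sample string, the loop counts, and the names SETUP, CANDIDATES and measure are assumptions for illustration; the numbers will differ between runtimes, which is the point of measuring on your own.

```python
import timeit

# Hypothetical sample input; replace with strings typical of your workload.
SETUP = 'text = "foo"'

# The three spellings discussed in this thread.
CANDIDATES = {
    "startswith": 'text.startswith("a")',
    "index": 'text[0] == "a"',    # raises IndexError on an empty string
    "slice": 'text[:1] == "a"',   # safe on an empty string
}


def measure(number=200_000, repeat=3):
    """Return the best per-call time in microseconds for each candidate."""
    results = {}
    for name, stmt in CANDIDATES.items():
        best = min(timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat))
        results[name] = best / number * 1e6
    return results


if __name__ == "__main__":
    for name, usec in measure().items():
        print(f"{name:<10} {usec:.3f} usec per call")
```

Run it under each interpreter you care about (CPython, IronPython, ...) rather than assuming one result carries over.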

Jon Skeet 2009-08-22 09:04:25

Sorry I didn't say that I know that text is not empty.

dugres 2009-08-22 09:17:04

@Jon Skeet: True, but that can be circumvented by writing `text[:1]` instead.

balpha 2009-08-22 09:20:29

@balpha: But that's a different question then :)

Jon Skeet 2009-08-22 09:22:29

True. Just felt it should be mentioned, because (at least in a naive test I just ran), slicing in fact *is* a lot faster than `startswith`; so if performance is important, it might help to know this.

balpha 2009-08-22 09:27:06

@balpha: A lot faster on what runtime? Not that I'm an expert on Python performance (by a long way!) but I'd expect the performance characteristics to change immensely between implementations.

Jon Skeet 2009-08-22 09:28:14

Good point. I used CPython. But as I said, it was just a very naive test. If performance is an issue, the particular case should be measured; if not, readability counts, which is definitely a point for `startswith`. Anyhow, +1 on your edit.

balpha 2009-08-22 09:35:37

Answer 2

+7 A:

Personally I would say startswith is more readable.

Also, from Python 2.5 onwards, startswith can take a tuple of prefixes to look for:

>>> "hello world".startswith(("hello","goodbye"))
True

Dave Webb 2009-08-22 09:05:48

Answer 3

A:

Instead of asking you can just write a quick test and see which one's faster.
And then let us know.
Of course, for text[0] you'll have to add a check that the string isn't empty.

shoosh 2009-08-22 09:06:42

The question is not so much about speed, but rather which one you use, and why.

dugres 2009-08-22 09:25:48

Answer 4

+8 A:

Yes: it’s easier to use and easier to read. When you test for more than one letter with slicing, you have to know how long the target text is:

haystack = 'Hello, World!'
needle = 'Hello'

# The easy way
result = haystack.startswith(needle)

# The slightly harder way
result = haystack[:len(needle)] == needle

Edit: The question seems to have changed. It now says, “knowing text is not empty and we are only interested in the first character of it.” That turns it into a fairly meaningless hypothetical situation.

I suspect the questioner is trying to “optimize” his/her code for execution speed. If that is the case, my answer is: don’t. Use whichever form is more readable and, therefore, more maintainable when you have to come back and work on it a year from now. Only optimize if profiling shows that line of code to be the bottleneck. This is not some O(n²) algorithm. It’s a string comparison.

Nate 2009-08-22 09:07:49

The question is really about testing the first letter.

dugres 2009-08-22 09:28:05

I've hit this exact issue in a non-hypothetical use case; effectively I had a mapping that proxied to one dictionary if you requested a key starting with ~, and to another dictionary for every other character. It was called tens of thousands to millions of times, and in particular was called for redrawing dirty parts of a window when a mouse moved over it (a GTK+ virtual list model column), so it needed to be extremely efficient. Since I didn't know the string was non-empty, I used [:1]; as suggested below, it was still twice as fast as startswith.
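A rough sketch of the kind of proxy mapping described here, not the original code: the class name TildeRouter, its attributes, and the sample data are all hypothetical. It uses key[:1] so that an empty key falls through to the regular dictionary instead of raising.

```python
class TildeRouter:
    """Hypothetical mapping: keys starting with '~' go to one dict, all others to another."""

    def __init__(self, special, regular):
        self.special = special
        self.regular = regular

    def __getitem__(self, key):
        # key[:1] is '' for an empty key, so this never raises IndexError.
        target = self.special if key[:1] == "~" else self.regular
        return target[key]


# Illustrative data only.
router = TildeRouter({"~home": 1}, {"": 0, "name": 2})
```

Here router["~home"] is served from the first dictionary, while router["name"] and router[""] are served from the second.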

Joe 2009-08-22 15:51:02

Answer 5

+7 A:

The stock phrase for the question is: "Premature optimization is the root of all evil".

Zed 2009-08-22 09:09:46

So, which one would you use?

dugres 2009-08-22 09:20:45

text.startswith('a'). It clearly tells what you want to achieve. I can only think of text[0] == 'a' as an alternative, as a way of saving a few CPU cycles.

Zed 2009-08-22 09:49:50

Answer 6

A:

text[0]=='a' is good for performance, but when you use it you also need to check that the string is not empty.

Niger 2009-08-22 09:20:26

Answer 7

+3 A:

text[0] can fail but the equivalent text[:1] is safe if the string is empty.

If you want to compare more than one character, I believe .startswith() is better.
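A small sketch of the difference between indexing and slicing on an empty string. The helper names starts_with_char and index_raises are made up for this illustration.

```python
def starts_with_char(text, ch):
    """Hypothetical helper: True if text begins with the single character ch."""
    # Slicing never raises: "" [:1] is simply "".
    return text[:1] == ch


def index_raises(text):
    """Hypothetical helper: True if text[0] raises IndexError."""
    try:
        text[0]
    except IndexError:
        return True
    return False
```

With these, starts_with_char("", "a") quietly evaluates to False, whereas index_raises("") reports that text[0] would have raised.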

Nick D 2009-08-22 09:24:06

Answer 8

+6 A:

I'd agree with the others that startswith is more readable, and you should use that. That said, if performance is a big issue for such a special case, benchmark it:

$ python -m timeit -s 'text="foo"' 'text.startswith("a")'
1000000 loops, best of 3: 0.537 usec per loop

$ python -m timeit -s 'text="foo"' 'text[0]=="a"'
1000000 loops, best of 3: 0.22 usec per loop

So text[0] is almost 2.5 times as fast, but it's a pretty quick operation either way; you'd save ~0.3 microseconds per compare, depending on the system. Unless you're doing millions of comparisons in a time-critical situation, I'd still go with the more readable startswith.

Brian 2009-08-22 15:44:09

Answer 9

A:

PEP 8 explicitly says to use startswith, for readability:

- Use ''.startswith() and ''.endswith() instead of string
  slicing to check for prefixes or suffixes.

  startswith() and endswith() are cleaner and less error prone.  For
  example:

    Yes: if foo.startswith('bar'):

    No:  if foo[:3] == 'bar':
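One way slicing can be "error prone", sketched with made-up values: if the prefix changes but a hard-coded slice length does not, the comparison silently becomes false. The variable names below are assumptions for illustration.

```python
foo = "barnacle"
prefix = "barn"  # the prefix grew from 'bar' to 'barn'...

# ...but the hard-coded 3 was left behind: this compares 'bar' with 'barn'.
stale_slice = foo[:3] == prefix

# Deriving the length from the prefix is correct, but noisier.
len_slice = foo[:len(prefix)] == prefix

# startswith needs no length at all.
method = foo.startswith(prefix)
```

Only the stale slice gives the wrong answer here; the other two agree that foo starts with the prefix.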

Roberto Liffredo 2009-08-27 17:58:00


How good is startswith?
