ansaurus

Question

Can ( s is "" ) and ( s == "" ) ever give different results in Python 2.6.2?

Answer 1

+10 A:

Python's `is` tests object identity, not equality. Here is an example where `is` and `==` give different results:

>>> s=u""
>>> print s is ""
False
>>> print s==""
True

zoli2k 2010-07-02 11:50:47

This however tests two different kinds of string objects, `str` and `unicode`. The question still remains when s is of the same type, or if you use Python 3 in which there is no explicit `unicode` object any more.

poke 2010-07-02 11:53:55

@poke, the question is explicitly about Python 2.6.2, not Python 3.

zoli2k 2010-07-02 11:56:37

pycruft 2010-07-02 11:57:27

This answer is good. Use `is` for *identity* and `==` for *equality*. Don't ever assume they do the same thing. Use `is` to test if two variables point to the same instance of an object (but use `==` if you want to see if the two instances have an equal value, in case they implement `__eq__`). A common case when you want to use `is` is to test whether something `is None`, since `None` is a singleton.

Blixt 2010-07-02 12:08:14
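
A minimal sketch of the distinction Blixt describes, using lists so that no string caching is involved. The `describe` function is a hypothetical helper added for illustration, not something from the thread.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal value, but a separate object
c = a           # a second name for the same object

print(a == b)   # True: the values are equal
print(a is b)   # False: two distinct list objects
print(a is c)   # True: both names refer to one object


def describe(value=None):
    # None is a singleton, so an identity test is the idiomatic check.
    if value is None:
        return "no value"
    return "value: %r" % (value,)


print(describe())    # no value
print(describe(0))   # value: 0
```

Note that `describe(0)` is not mistaken for a missing value, which is what a plain truthiness test would do.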

Answer 2

+2 A:

It seems to work for anything that actually is a string, but something that just looks like a string (e.g. a unicode object, a subclass of str, or something similar) will fail.

>>> class mysub(str):
    def __init__(self, *args, **kwargs):
        super(mysub, self).__init__(*args, **kwargs)

>>> 
>>> q = mysub("")
>>> q is ""
False
>>> q == ""
True

edit:

For the purposes of code review and feedback, I would call it bad practice because it implements an unexpected test (even if we ignore the uncertainty over whether it will always behave the same when the types match).

if x is ""

Implies that x has both the correct value and the correct type, but without an explicit type test that would alert future maintainers or API users.

if x == ""

Implies only that x has the correct value.

pycruft 2010-07-02 11:54:14
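
If the type really does matter, the explicit type test mentioned in this answer could be spelled out roughly as follows. This is only a sketch; `is_plain_empty_str` and `MySub` are hypothetical names chosen for illustration.

```python
class MySub(str):
    """Stand-in for any subclass of str."""
    pass


def is_plain_empty_str(x):
    # State both requirements explicitly: exact type str, and empty value.
    return type(x) is str and x == ""


print(is_plain_empty_str(""))          # True
print(is_plain_empty_str(MySub("")))   # False: right value, wrong type
print(MySub("") == "")                 # True: == only compares values
```

Written this way, a maintainer can see that the type check is intentional rather than a side effect of `is`.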

And of course they will be different for an object of a class unrelated to str but which implements `__eq__` in such a way as to compare equal to a string.

jchl 2010-07-02 11:58:22
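
The case jchl describes might look like the sketch below. `AlwaysEmpty` is a hypothetical class invented for this example; it is unrelated to str but defines `__eq__` so that it compares equal to the empty string.

```python
class AlwaysEmpty(object):
    """Hypothetical class, unrelated to str, that equals the empty string."""

    def __eq__(self, other):
        return other == ""

    def __ne__(self, other):
        # Python 2 does not derive != from ==, so define it explicitly.
        return not self.__eq__(other)


empty = ""
x = AlwaysEmpty()

print(x == empty)            # True: __eq__ says so
print(x is empty)            # False: x is not a str object at all
print(isinstance(x, str))    # False
```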

Answer 3

+7 A:

You shouldn't care. Unlike None, which is defined to be a singleton, there is no rule that says there is only one empty string object. So the result of `s is ""` is implementation-dependent, and using `is` is a NO-NO whether you can find an example or not.

John Machin 2010-07-02 11:56:19

I understand that. But for the purpose of giving code review feedback to a junior programmer it's better to say "don't do that because it doesn't work in this situation" than to say "don't do that because it's wrong". Of course, if "because it's wrong" is the only reason, then that's the reason I shall have to give...

jchl 2010-07-02 11:59:57

Well, you should give the big picture to your junior programmer. It's better to know that `is` is testing for something else than `==` and therefore should be used for its correct purpose - the incidental fact that you may get away with it in a few circumstances is rather irrelevant - trying to learn the circumstances where an antipattern *does* work is a wasted effort and a sure-fire recipe for future bugs.

Tim Pietzcker 2010-07-02 12:07:00

You should say: "Don't do that because it's wrong. It's wrong because `is` should not be used for checking if two values are equal, only to check if they are the exact same instance."

Blixt 2010-07-02 12:12:04

@jchl: I told you what to say: The result of `s is ""` is implementation-dependent. So don't do it.

John Machin 2010-07-02 12:23:13

Answer 4

A:

Probably not: CPython seems to optimize spurious instances of "" away in all cases. But as the others say, don't rely on that.

Philipp 2010-07-02 11:59:32

Answer 5

+1 A:

You can't find an example because some things are unique and immutable, so Python keeps exactly one copy of each and therefore `is` works. These include (), '', u'', True, False, and None. CPython even keeps a few frequently used numbers, e.g. 0, 0.0, 1, 1.0.

THC4k 2010-07-02 12:02:43

Experimentally, it seems CPython (version 2.6.2) actually interns all integers from -5 through 256.

jchl 2010-07-02 12:06:42
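
An experiment along the lines jchl mentions could be sketched like this. The helper name `same_object` is hypothetical, and the results are an implementation detail: the script only reports what the running interpreter does, and other versions or implementations may print something different.

```python
def same_object(n):
    # Build both integers at run time from text, so the compiler cannot
    # simply reuse one shared constant for the two names.
    a = int(str(n))
    b = int(str(n))
    return a is b


for n in (-6, -5, 0, 256, 257):
    # The values are always equal; whether they are one object is up to
    # the implementation's small-integer cache.
    print("%4d  equal=%s  identical=%s" % (n, int(str(n)) == n, same_object(n)))
```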

`0.0` and `1.0` shouldn't be in the list above: floats aren't interned by current CPython.

Mark Dickinson 2010-07-02 13:17:12

Answer 6

A:

ython:

>>> import re
>>> a = ""
>>> b = "abc"[ 2:2 ]
>>> c = ''.join( [] )
>>> d = re.match( '()', 'abc' ).group( 1 )
>>> e = a + b + c + d 
>>> a is b is c is d is e
0
>>> a == b == c == d == e
1

Tomasz Wysocki 2010-07-02 12:10:20
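
A script version of the experiment above, reporting each construction separately so it is visible which ones (if any) produce a distinct empty string object. It is a sketch only: the labels and the `candidates` dictionary are made up for illustration, and the identity column depends on the interpreter.

```python
import re

empty = ""

# Several ways of producing an empty string at run time.
candidates = {
    "slice": "abc"[2:2],
    "join": "".join([]),
    "regex group": re.match("()", "abc").group(1),
    "concatenation": empty + empty,
}

for label, value in sorted(candidates.items()):
    # Equality always holds; identity is whatever the implementation does.
    print("%-14s equal=%s identical=%s" % (label, value == empty, value is empty))
```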

Answer 7

A:

Undefined behavior is a murky issue. There are things the Python specification defines, which conforming implementations must follow, and there are things left to the implementation's choice. By looking into Python's source code, you may become convinced that this behavior can never happen for actual string objects (unlike unicode vs. non-unicode and the other close-but-irrelevant examples shown). Happy, you will leave such a test in your code.

But the Python specification doesn't guarantee it will always work. A future implementation may change it, and you'll have a painful incompatibility.

So the rule of thumb here is simple: don't do it. Use operators only for their intended and well-documented purpose. Don't rely on implementation artifacts that may very well change in the future.

Eli Bendersky 2010-07-02 12:28:13

Answer 8

+8 A:

As everyone else has said, don't rely on undefined behaviour. However, since you asked for a specific counterexample for Python 2.6, here it is:

>>> s = u"\xff".encode('ascii', 'ignore')
>>> s
''
>>> id(s)
10667744
>>> id("")
10666064
>>> s == ""
True
>>> s is ""
False
>>> type(s) is type("")
True

The only time that Python 2.6 can end up with an empty string that is not the normal empty string is when it performs a string operation without knowing in advance how long the result will be. When you encode a string, the error handler can end up stripping characters, and the buffer size is only fixed up after encoding has completed. Of course, that's an oversight and could easily change in Python 2.7.

Duncan 2010-07-02 13:59:04
