I'm attempting to compare two strings with is. One string is returned by a function, and the other is just declared in the comparison. is tests for object identity, but according to this page, it also works with two identical strings because of Python's memory optimization. But, the following doesn't work:

def uSplit(ustring):
    # return user minus host
    return ustring.split('!', 1)[0]

user = uSplit('theuser!host')
print type(user)
print user
if user is 'theuser':
    print 'ok'
else:
    print 'failed'

user = 'theuser'

if user is 'theuser':
    print 'ok'

The output:

<type 'str'>
theuser
failed
ok

I'm guessing the reason for this is that a string returned by a function is a different "type" of string than a string literal. Is there any way to get a function to return a string literal? I know I could use ==, but I'm just curious.

+3  A: 

That page you quoted says "If two string literals are equal, they have been put to same memory location" (emphasis on "literals" mine). Python interns literal strings, but strings that are returned from some arbitrary function are separate objects. The is operator can be thought of as a pointer comparison, so two different objects will not compare as identical (even if they contain the same characters, i.e. they are equal).
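To make the difference between equality and identity concrete, here is a minimal sketch. The name `u_split` is just a renamed copy of the function from the question, and `print()` is written with parentheses so the snippet runs on both Python 2 and 3. The result of the identity check is a CPython implementation detail, so treat it as an observation rather than a guarantee.

```python
def u_split(ustring):
    # Return the part before the first '!'. The result is built at runtime,
    # so it is a new string object rather than a literal.
    return ustring.split('!', 1)[0]

user = u_split('theuser!host')
literal = 'theuser'

# == compares the characters, so this is always True here.
print(user == literal)

# `is` compares object identity, which is what comparing id() values does.
# In CPython this is typically False for a string produced by split().
print(user is literal)
print(id(user) == id(literal))
```

The last two lines always print the same value, because `is` is exactly an identity comparison.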

Greg Hewgill
In the larger scheme of things, it's rare that ANY language makes strong guarantees about memory on a managed-memory system (garbage collected or reference-counting with cycle detection). Even when they do, it's generally not a good idea to rely on it, because the guarantee might disappear in future versions.
Imagist
+2  A: 

The site you quote says this:

If two string literals are equal, they have been put to same memory location.

But

uSplit('theuser!host')

is not a string literal -- it's the result of an operation on the literal 'theuser!host'.

Anyway, you usually shouldn't check for string equality using is, because this memory optimization is just an implementation detail that you shouldn't rely on.


Also, you should use is for things like is None. Use it for checking to see if two objects -- of classes that you designed -- are the same instance. You can't easily use it for strings or numbers because the rules for creation of those built-in classes are complex. Some strings are interned. Some numbers, similarly, are interned.
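A small sketch of the cases where is does fit: checking for None, and checking whether two names refer to the same instance of your own class. The `Account` class and the `find` helper are hypothetical names made up for this illustration; they do not come from the question.

```python
class Account(object):
    # Hypothetical class, used only to illustrate identity vs. equality.
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Two accounts are "equal" when their names match.
        return isinstance(other, Account) and self.name == other.name


def find(name, accounts):
    # Return the first account with a matching name, or None if there is none.
    for account in accounts:
        if account.name == name:
            return account
    return None


a = Account('theuser')
b = Account('theuser')
alias = a

print(a == b)       # equal by value: same name
print(a is b)       # two separate instances
print(alias is a)   # the same instance under another name

missing = find('nobody', [a, b])
if missing is None:  # None is a singleton, so `is` is the usual check
    print('not found')
```

Here the class controls when instances are created, so identity is predictable. For strings and numbers the interpreter decides, which is why == is the right tool there.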

balpha
The rules for interning are indeed strange. If uSplit returned a string with two or fewer characters, the 'is' comparison would work. I guess it's safest to stick to using ==.
pique oil
It's not a matter of "safest"; it's a matter of definition. Strings are not defined as always interned. Some are, but there's no explicit guarantee, so you simply can't rely on `is` with strings.
S.Lott
Exactly. "is" has its purpose, "==" has its purpose. Different purposes.
balpha
A: 

What you have run into is the fact that Python does not always intern all of its strings. More detail here:

http://mail.python.org/pipermail/tutor/2009-July/070157.html
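If you are curious whether a function can be made to return the interned copy, one option is to intern the result explicitly. This is a minimal sketch in Python 3, where the function lives in the `sys` module; in Python 2 the equivalent is the built-in `intern()`. The name `u_split` is a renamed copy of the function from the question.

```python
import sys


def u_split(ustring):
    # Return the part before the first '!', explicitly interned so that
    # equal results share a single string object.
    return sys.intern(ustring.split('!', 1)[0])


user = u_split('theuser!host')
other = sys.intern('theuser')

# Both names now refer to the one interned copy of 'theuser'.
print(user is other)
```

Even so, == remains the right way to compare string contents. Explicit interning is mainly a memory and lookup-speed optimization, not a replacement for equality.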

Amber