I need to know if a variable in Python is a string or a dict. Is there anything wrong with the following code?

if type(x) == type(str()):
    do_something_with_a_string(x)
elif type(x) == type(dict()):
    do_something_with_a_dict(x)
else:
    raise ValueError

Update: I accepted avisser's answer (though I will change my mind if someone explains why isinstance is preferred over "type(x) is").

But thanks to nakedfanatic for reminding me that it's often cleaner to use a dict (as a "case statement") than an if/elif/else series.

Let me elaborate on my use case. If a variable is a string, I need to put it in a list. If it's a dict, I need a list of the unique values. Here's what I came up with:

def value_list(x):
    cases = {str: lambda t: [t],
             dict: lambda t: list(set(t.values()))}
    try:
        return cases[type(x)](x)
    except KeyError:
        return None

If isinstance is preferred, how would you write this value_list() function?

A: 

That should work - so no, there is nothing wrong with your code. However, it could also be done with a dict:

{type(str()): do_something_with_a_string,
 type(dict()): do_something_with_a_dict}.get(type(x), errorhandler)(x)

A bit more concise and pythonic, wouldn't you say?


Edit: Heeding avisser's advice, the code also works like this, and looks nicer:

{str: do_something_with_a_string,
 dict: do_something_with_a_dict}.get(type(x), errorhandler)(x)
nakedfanatic
No, it's not more pythonic, because you are supposed to use the isinstance builtin function.
David Locke
Heh, I knew the 'P' word would be inflammatory. I stand by my answer, however, as an alternative that avoids the if-elif-else structure.
nakedfanatic
I agree. See my edited question.
Daryl Spitzer
+7  A: 

Built-in types in Python have built-in names:

>>> s = "hallo"
>>> type(s) is str
True
>>> s = {}
>>> type(s) is dict
True

By the way, note the "is" operator. However, type checking (if you want to call it that) is usually done by wrapping a type-specific test in a try-except clause, because what matters is not so much the type of the variable as whether you can do a certain thing with it.
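
As a rough sketch of that try-except style applied to the question's value_list() (Python 3; treating anything without a .values() method as a single value is an assumption made here for illustration, not something the question specifies):

```python
def value_list(x):
    """Return the unique values of a dict-like object, or [x] otherwise."""
    try:
        # Anything that offers .values() is treated as dict-like.
        values = x.values()
    except AttributeError:
        # No .values() method: assume a single value such as a string.
        return [x]
    return list(set(values))


print(value_list("hallo"))
print(sorted(value_list({"a": 1, "b": 1, "c": 2})))
```

Only the attribute lookup sits inside the try block, so an AttributeError raised by unrelated code is not silently swallowed.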

Albert Visser
The preferred way, as others have mentioned, is to use the isinstance builtin function.
David Locke
Why is isinstance preferred?
Daryl Spitzer
isinstance works with any class or type, including the ones you define yourself, whereas there is only a limited number of builtin type names.
Albert Visser
+10  A: 

"type(dict())" says "make a new dict, and then find out what its type is". It's quicker to say just "dict".

But if you want to just check type, a more idiomatic way is isinstance(x, dict).

Is "isinstance(x, dict)" better than "type(x) is dict"? Why?
Daryl Spitzer
@Daryl http://codepad.org/WS6BWUa5
Dustin
+2  A: 

I think it might be preferred to actually do

if isinstance(x, str):
    do_something_with_a_string(x)
elif isinstance(x, dict):
    do_something_with_a_dict(x)
else:
    raise ValueError

There are two alternative forms; depending on your code, one or the other is probably considered even better than that. One is to not look before you leap:

try:
    one, two = tupleOrValue
except TypeError:
    one = tupleOrValue
    two = None

The other approach is from Guido and is a form of function overloading, which leaves your code more open-ended.

http://www.artima.com/weblogs/viewpost.jsp?thread=155514
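
In current Python (3.4 and later), the standard library's functools.singledispatch provides a related kind of type-based overloading. The sketch below applies it to the question's value_list(); it is not necessarily the mechanism described in the linked post, and the ValueError fallback is an assumption borrowed from the question's first snippet.

```python
from functools import singledispatch


@singledispatch
def value_list(x):
    # Fallback for types with no registered implementation.
    raise ValueError("unsupported type: %s" % type(x).__name__)


@value_list.register(str)
def _(x):
    # A string is wrapped in a one-element list.
    return [x]


@value_list.register(dict)
def _(x):
    # A dict yields a list of its unique values.
    return list(set(x.values()))


print(value_list("hallo"))
print(sorted(value_list({"a": 1, "b": 1, "c": 2})))
```

Dispatch follows the class hierarchy, so subclasses of str or dict reach the registered implementations, and new types can be registered later without editing the existing functions.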

Ed
+21  A: 

What happens if somebody passes a unicode string to your function? Or a class derived from dict? Or a class implementing a dict-like interface? The following code covers the first two cases. If you are using Python 2.6, you might want to use collections.Mapping instead of dict, as per the ABC PEP.

def value_list(x):
    if isinstance(x, dict):
        return list(set(x.values()))
    elif isinstance(x, basestring):
        return [x]
    else:
        return None
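
For readers on Python 3, where basestring no longer exists and the ABCs live in collections.abc, the same function might look like the following sketch. Checking against Mapping also covers the third case (dict-like classes that do not derive from dict); collections.UserDict is used below only as a convenient example of such a class.

```python
from collections import UserDict
from collections.abc import Mapping


def value_list(x):
    if isinstance(x, Mapping):
        # Any mapping, whether or not it subclasses dict.
        return list(set(x.values()))
    elif isinstance(x, str):
        # str is the only text type in Python 3.
        return [x]
    else:
        return None


print(value_list("hallo"))
print(value_list(UserDict({"a": 1, "b": 1})))
```
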
Suraj Barkale
+1 for mentioning ABCs
nakedfanatic
+4  A: 

isinstance is preferable to type because it also evaluates to True when you check an instance against a superclass of its class, which basically means you will never have to special-case your old code to use it with dict or str subclasses.

For example:

 >>> class a_dict(dict):
 ...     pass
 ... 
 >>> type(a_dict()) == type(dict())
 False
 >>> isinstance(a_dict(), dict)
 True
 >>>

Of course, there might be situations where you wouldn't want this behavior, but those are (hopefully) a lot less common than situations where you do want it.
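
One commonly cited case where the subclass-friendly behavior can surprise is bool, which is a subclass of int. The function names below are hypothetical and exist only to contrast the two checks:

```python
def describe(x):
    # isinstance() accepts subclasses, so a bool passes an int check.
    if isinstance(x, int):
        return "int"
    return "other"


def describe_exact(x):
    # An exact type comparison rejects subclasses, including bool.
    if type(x) is int:
        return "int"
    return "other"


print(describe(True), describe_exact(True))
```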

Dirk Stoop
A: 

I've been using a different approach:

from inspect import getmro
if type([]) in getmro(obj.__class__):
    pass  # This is a list, or a subclass of list
elif type({}) in getmro(obj.__class__):
    pass  # This one is a dict, or a subclass of dict

I can't remember why I used this instead of isinstance, though...
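
One observable difference between the two checks, which may or may not be the reason, shows up with abstract base classes. The sketch below (Python 3, with a hypothetical MyDict class) shows both checks agreeing for ordinary inheritance and disagreeing for an ABC that dict is registered with rather than derived from:

```python
from collections.abc import MutableMapping
from inspect import getmro


class MyDict(dict):
    pass


obj = MyDict()

# For ordinary inheritance both checks agree.
in_mro = dict in getmro(type(obj))
is_inst = isinstance(obj, dict)

# dict is registered with the MutableMapping ABC instead of inheriting
# from it, so the ABC never appears in the MRO.
abc_in_mro = MutableMapping in getmro(type(obj))
abc_is_inst = isinstance(obj, MutableMapping)

print(in_mro, is_inst, abc_in_mro, abc_is_inst)
```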

Matthew Schinckel
+1  A: 

I think I would go for the duck typing approach: "if it walks like a duck and quacks like a duck, it's a duck". This way you need not worry about whether the string is unicode or ASCII.

Here is what I would do:

In [53]: s='somestring'

In [54]: u=u'someunicodestring'

In [55]: d={}

In [56]: for each in s,u,d:
    if hasattr(each, 'keys'):
        print list(set(each.values()))
    elif hasattr(each, 'lower'):
        print [each]
    else:
        print "error"
   ....:         
   ....:         
['somestring']
[u'someunicodestring']
[]

The experts here are welcome to comment on this use of duck typing. I have been using it for a while but was only recently introduced to the exact concept behind it, and I am very excited about it. So I would like to know whether it is overkill here.

JV
It seems likely that this could yield false positives, if we're worried about that kind of thing; i.e., my 'Piano' class also has 'keys'.
nakedfanatic
It depends on the dataset. If I know I only have dictionaries and strings (unicode or ASCII), then it will work flawlessly. Yes, in the grand scheme of things, you are correct in saying that it might lead to false positives.
JV
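
To make the false-positive concern concrete, here is a small sketch in Python 3 syntax. The Piano class is hypothetical, following the comment above, and value_list() simply wraps the hasattr checks from the answer in a function:

```python
class Piano:
    """Hypothetical class: it has 'keys' but is not a mapping."""

    def __init__(self):
        self.keys = 88


def value_list(x):
    if hasattr(x, "keys"):
        # Looks dict-like, but nothing guarantees .values() exists.
        return list(set(x.values()))
    elif hasattr(x, "lower"):
        return [x]
    return None


try:
    value_list(Piano())
except AttributeError as exc:
    print("false positive:", exc)
```

Testing for the attribute that is actually used ('values' instead of 'keys') narrows the gap but does not remove it.
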
+6  A: 

*sigh*

No, typechecking arguments in python is not necessary. It is never necessary.

If your code accepts either a string or a dict object, your design is broken.

That comes from the fact that if you don't already know the type of an object in your own program, you're already doing something wrong.

Typechecking hurts code reuse and reduces performance. A function that does different things depending on the type of the object passed is bug-prone, and its behavior is harder to understand and maintain.

You have the following saner options:

1) Make a function unique_values that converts dicts into lists of unique values:

def unique_values(some_dict):
    return list(set(some_dict.values()))

Make your function assume the argument passed is always a list. That way, if you need to pass a string to the function, you just do:

myfunction([some_string])

If you need to pass it a dict, you do:

myfunction(unique_values(some_dict))

That's your best option: it is clean and easy to understand and maintain. Anyone reading the code immediately understands what is happening, and you don't have to typecheck.

2) Make two functions, one that accepts lists of strings and one that accepts dicts. You can make one call the other internally, in the most convenient way (myfunction_dict can create a list of strings and call myfunction_list).
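
A minimal sketch of the second option, using the names mentioned above. The body of myfunction_list is a placeholder assumption (it just upper-cases the strings), since the real work is not specified:

```python
def unique_values(some_dict):
    return list(set(some_dict.values()))


def myfunction_list(strings):
    # Placeholder body (an assumption): the real work on the list goes here.
    return [s.upper() for s in strings]


def myfunction_dict(some_dict):
    # Convert once, then delegate to the list version.
    return myfunction_list(unique_values(some_dict))


print(myfunction_list(["some_string"]))
print(sorted(myfunction_dict({"x": "a", "y": "a", "z": "b"})))
```

The caller picks the function that matches the data it already has, so neither function needs to inspect its argument's type.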

In any case, don't typecheck. It is completely unnecessary and has only downsides. Refactor your code instead so that you don't need to typecheck. You only get benefits from doing so, in both the short and the long run.

nosklo
+1 for a solid comment. Personally I stay away from all kinds of typechecking. If I need to, I prefer "It's easier to ask for forgiveness than permission". I `try` an operation and `except` the error. I never do `if` this, then that, `else` something else.
jeffjose
A: 

You may want to check out typecheck. http://oakwinter.com/code/typecheck/

Type-checking module for Python

This package provides powerful run-time typechecking facilities for Python functions, methods and generators. Without requiring a custom preprocessor or alterations to the language, the typecheck package allows programmers and quality assurance engineers to make precise assertions about the input to, and output from, their code.

Paul Hildebrandt