views:

81

answers:

2

Suppose I have the following object:

class Foo(object):
  def __init__(self, name=None):
    self.name = name

  def __repr__(self):
    return self.name

And a list containing multiple instances, such as:

list = [Foo(name='alice'), Foo(name='bob'), Foo(name='charlie')]

If I want to find an object with a given name, I could use the following:

def get_by_name(name, list):
  return [foo for foo in list if foo.name == name][-1]

Which would obviously mean:

print get_by_name('alice', list)
>> alice

However, is there a more efficient data structure or method for retrieving such objects? In reality, the object names are only known at runtime and could in theory change throughout the lifecycle of the object.

Any advice?

UPDATE:

Thanks to Matt Joiners answer, I have updated it to support multiple Foo's with the same name:

class Foo(object):
    _all_names = {}    
    def __init__(self, name=None):
        self._name = None
        self.name = name        
    @property
    def name(self):
        return self._name        
    @name.setter
    def name(self, name):
        if self._name is not None:
            self._all_names[self._name].remove(self)
        self._name = name
        if name is not None:
            self._all_names.setdefault(name, []).append(self)
    @classmethod
    def get_by_name(cls, name):
        return cls._all_names[name]        
    def __repr__(self):
        return "{0}".format(self.name)

l = [Foo("alice"), Foo("bob"), Foo('alice'), Foo('charlie')]
print Foo.get_by_name("alice")
print Foo.get_by_name("charlie")

Any comments on this approach?

+7  A: 

Try this for size:

class Foo(object):
    _all_names = {}
    def __init__(self, name=None):
        self.name = name
    @property
    def name(self):
        return self._name
    @name.setter
    def name(self, name):
        self._name = name
        self._all_names[name] = self
    @classmethod
    def get_by_name(cls, name):
        return cls._all_names[name]
    def __str__(self):
        return "Foo({0})".format(self.name)

a = Foo("alice")
b = Foo("bob")
print Foo.get_by_name("alice")

Keep in mind it's kept minimal to convey the idea, there are many tweaks and checks you can make here and there.

Matt Joiner
Good idea. You should add `del self._all_names[self._name]` to the setter
THC4k
@THC4k: I did consider it, but it would require `hasattr`, and key checks to avoid exceptions, the example is enough to get started.
Matt Joiner
What would be the best way to update _all_names when an instance of Foo is deleted?
epoch
There it gets a bit more complicated epoch, as references are held to each Foo by at least the function that instiated it, and by the `_all_names` dict. It's not enough to hook `__del__` as this isn't invoked until the last reference disappears. One method I tried was to add a `del_by_name` `classmethod` to `Foo`, if you like I can add it here.
Matt Joiner
This makes sense. Many thanks for your answer.
epoch
I adjusted the OPs inspired version.
Matt Joiner
A: 

The situation is a little confusing.

You're going to be searching for this object in a list which is a linear data structure. The only way to look for something is by going through it one by one. This makes the time grow linearly with the length of the list. So that is where your speed issue is.

The way to get some speed gains is to use some kind of hashing scheme to keep your objects directly indexable by the key you're interested in (in your case, name). This will allow you to look up an object in a way similar to dictionary key lookup. It's a constant time operation. This is basically what Matt's answer does. He's keeping an "index" as a class level structure so that you can look things up using hashing rather than by walking a list.

Noufal Ibrahim