How do you efficiently implement the observer pattern if the subject is a huge container?

views:

380

answers:

+3 Q:

How do you efficiently implement the observer pattern if the subject is a huge container?

We all know the observer pattern: You have a subject which is able to notify and update a list of observers of its state changes. Now suppose that the subject you would like to observe is a container, and you would like to observe the container itself, i.e. element addition and deletion of elements, and also the contained elements, i.e. state updates of the container elements.

How would you implement the update mechanism so that it is fast with respect to element insertion and deletions when you store massive amounts of objects in your container? In particular,

would you use the same type of container in the local copy of the observers?
is there a smart choice of container which the observers should use? (For instance, would it be faster to, say, always use balanced trees, even if you are observing a linked list?)
how do you quickly translate an iterator into the observed container into an iterator into the observer's container? (Trivial for arrays, hard for linked lists?)

If your container is a linked list for instance, then you can insert elements in constant time. If m observers have to iterate through the list containing n elements, then updating takes O(n * m) expected time.

If your container is an array, then changing an element takes constant time, and updating m observers takes O(m) if you pass the element's index, O(n * m) if observers have to iterate through the array.

If it helps, consider the following examples:

Example 1. You are writing an operating system. The subject you would like to observe is the filesystem and its files. Your views are a file explorer, an indexer, and other applications. You would like to update the observers when files are added, deleted or modified.

Example 2. You are writing an address book application which should be able to handle a city of the size of New York. The subject you would like to observe is the container of your records (a person with its address, phone numbers, email...). Your observers are several views, which should update automatically when you add, delete or modify a record. (One could image one view containing a list of people who live on 53rd and another drawing dots on a map for each person whose last name is Doe).

How do you handle the case that a complete directory-subtree is deleted or that "53rd St" is renamed to "Dijkstra St"?

+1 A:

Why not observer pattern itself ?

The subject needs to inform the observer about the interesting events. Then the observer shall dispatch it to interested parties (subscribers).

The nature of the subject is not of any significance here. (Unless I understood your question wrong).

Sathya 2009-06-05 10:16:12

Yes, it will be some incarnation of the observer pattern. However, one can think of many ways to implement the pattern in this context:- Do observers observe the container or also the contained elements?- Working with huge containers means that there is a non-negligible cost associated to each operation (insertion, deletion, searching, possibly sorting). If you have an iterator into a linked list, it is cheap to insert something there. But observers usually don't have a corresponding iterator into their copy of the list, making the same insertion expensive if it's their copy is also a list.

Tobias 2009-06-05 12:24:31

Somehow, you must turn the container into a subject.

The main problem here is to find an efficient way to notice changes. Most of the time when you run into this problem, it's because the thing you want to observe doesn't offer an efficient notification mechanism (probably because the observer design pattern wasn't invented when they thing was written).

[EDIT] Since you ask for an efficient way, the general answer is "it depends". Design patterns don't have a "one-size-fits-all" solution. They are general rules how to approach a problem. How you need to implement the rules in a specific situation is something that you solve when you're in the situation.

Generally, if your observers need to identify small changes (i.e. an attribute change or adding an element), the notification message should contain enough information that they can do this efficiently. So if you have a big list and an insert, send the list and the index of the new element plus "item as inserted".

As for attribute changes, there are two solutions. One is to add an observer to every element in the list. This can be slow and need a lot of RAM but it means you can add several types into the same list.

Alternatively, you can have a "modify item in list service". This means that it's forbidden to change items directly, you must always use the service. The service can then work as a subject and send notifications with the item, the old and changed value and possibly with the index in the list.

[EDIT2] The general rule is to collect as much information about the change as possible and pass that to the observers. But it really depends on your specific problem. Let's say the observer is sitting on a remote machine. In this case, there is no efficient way to send it the whole list. You can only send it "item X was inserted" and hope that's enough. If the container has no way to notice changes (for example, new web pages on a web site), the container has to traverse the whole site again and again to find changes which it can then tell the observers in an efficient manner.

Again, the details really depend on the specific situation. Google runs thousand of web spiders which visit millions of web pages every hour. For a long time, this was "efficient" (as in "the only way"). A while ago, the "sitemap" protocol was implemented which allows admins to turn their web sites into subjects that can tell the Google observer about changes.

So unless you can give a more specific example what you need to do, I can't give you a more specific answer. With design patterns, there is a point where you need to sit down, take a real problem and turn on your brain.

[EDIT3] Here are a couple of examples for uses of the observer pattern:

Many UI frameworks use this pattern to spread events to interested parties. In Qt, you have a central spot where all subjects can register their signals (notifications they will send) and where observers can attach to subjects. This means there is a single spot where all connections are managed. The advantage is that you don't need to add this data structure to every object. Also, objects from outside (non-Qt objects) can send and receive messages. Since everything is in a single place, this data structure can be optimized easily. The drawback is that this structure can become very big, so sending a message will take more time when there are more parties involved (even ones which are completely unrelated).
Google uses the sitemap protocol to turn web sites into subjects since that's much more efficient than traversing the whole site again and again, even if you only request the last modification time of a URL (HTTP HEAD instead of HTTP GET).
Filesystems in Windows and Linux offer notifications to tell applications about new or deleted files. The main problem here is what should happen when files change while an application doesn't run. Say you have an app which maintains checksums of files in a directory. Obviously, you'd like to know about changes when the app was down but that would mean the notification service would have to keep track of the last change it sent. So here, the app has to read the whole tree in at startup to see anything it might have missed and it needs to use the observer pattern for changes happening while it runs.
A mail client is an observer. It will tell the mail server the ID of the last email it has seen and the server will tell it about any new ones.
When you have lots of attribute changes in a complex model, it's usually the only way to centralize all changes (make them run through a single place) and attack the observers there (instead of attaching N observers to M individual objects). In this implementation, the observers can say "I'm interested in any change anywhere" or "a change of the field X in any subject" or "any change in subject Y" (the last one usually doubles as a "change of field X in subject Y" - the observer will simply ignore changes to fields != X).

Aaron Digulla 2009-06-05 10:33:53

I am precisely interested in how you make the update mechanism efficient when the subject is a container.

Tobias 2009-06-05 11:05:53

@GrGr: Tobias fixed the text.

Aaron Digulla 2009-06-05 11:24:15

Sure it depends. That's why I'd like to know what others have done, and if there are any general rules how to approach this problem. In fact, I'd be happy to get references to specific examples where this problems is solved efficiently.

Tobias 2009-06-05 11:55:51

See edit#3 for examples. But I'm really not sure you understand what you ask. Efficient is only ever meaningful in a certain context. You can't generalize this.

Aaron Digulla 2009-06-05 13:30:19

I am not interested in plain usage of the observer pattern as in your examples. Also, I don't have a concrete problem I need to solve. I am merely interested in learning how others have solved problems of this category efficiently, as stated in my question. I gave to specific examples if you feel that there is no common denominator.

Tobias 2009-06-07 17:21:09

I don't get it. I answered all your specific questions, gave lots of examples... okay, I didn't post code and I didn't enumerate your questions and answered them one by one but maybe you should try to understand what I say. If you think for five minutes, you'll see that as soon as you have to solve one of the issues, it will be obvious what you can do.

Aaron Digulla 2009-06-08 14:26:04

ansaurus

tags:

views:

answers:

How do you efficiently implement the observer pattern if the subject is a huge container?

related questions