I'm new to Cocoa and have a small question before I get carried away with using categories.

Say you add a new method to NSString. Does that affect performance of normal NSString messages, or are category methods only checked when a method call does not match the standard method set?

+5  A: 

If you are new to Cocoa, this shouldn't be something you worry about, apart from the basic rule of not making assumptions about performance without profiling.

If you need to provide functionality, you add it where you can. If it makes sense to extend a class then do so. If you don't add the functionality to NSString, you would have to provide it elsewhere. If you do it in another class, then that adds a different level of complexity.

Personally, I would worry about your application's design. Use a category if it makes sense, and if you are really concerned about performance, profile your app once it works.

And although I am making an assumption here, I would say that there are likely to be bigger hits to your application's performance than the speed of calls to NSString. Unless you are making a lot of NSString calls, in which case we're back to profiling again.

Abizern
You're right, I am new to Objective-C, but I have to worry about it because I'm making hundreds of millions of calls to NSString objects. I have already had to convert some code to use C structs instead. (:
In cases like this, the overhead associated with using objects is probably the culprit, not any slowdowns that you're guessing might be caused by adding methods via categories.
Quinn Taylor
Agreed. NSStrings are not performance demons. For hundreds of millions of calls I would definitely take the trouble to switch to C strings (and probably C arrays if you're using NSArray). Profile as Abizern says, but experience says that on that order of magnitude, you need C. Just wrap it up inside an object so that the C data structures don't spill into your whole program, and you can optimize their internal performance without rewriting your program.
Rob Napier
+6  A: 

All messages are sent using dynamic dispatch, so messages to category methods don't interfere with "normal" messages.

From a performance aspect, the runtime handles associating the methods with the class in question, so there is a one-time cost for that, but there is no change to each individual object. I wouldn't be concerned about performance with categories, but instead be cautious about making sure that methods you add via categories don't collide with the class's existing methods or with those specified in other categories. That's where the problems generally begin.
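To make the collision risk concrete, here is a toy model in plain C. It is only a sketch: `ToyClass`, `toy_attach_method`, and `toy_lookup` are hypothetical names, not runtime API, and the "newest attachment wins" rule is an assumption chosen for illustration. Apple's documentation treats clashing category methods as undefined behavior, so don't rely on any particular winner in real code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: a class is a short list of (selector name, function) pairs.
   Every name here is hypothetical and is not part of the Objective-C runtime. */
typedef const char *(*ToyIMP)(void);

typedef struct {
    const char *selector;
    ToyIMP imp;
} ToyMethod;

enum { TOY_MAX_METHODS = 16 };

typedef struct {
    ToyMethod methods[TOY_MAX_METHODS];
    size_t count;
} ToyClass;

/* Attach a method in front of the existing ones (assumed rule for this model).
   Returns 0 on success, -1 if the list is full. */
int toy_attach_method(ToyClass *cls, const char *selector, ToyIMP imp) {
    assert(cls != NULL && selector != NULL && imp != NULL);
    if (cls->count == TOY_MAX_METHODS) return -1;
    memmove(&cls->methods[1], &cls->methods[0], cls->count * sizeof(ToyMethod));
    cls->methods[0].selector = selector;
    cls->methods[0].imp = imp;
    cls->count++;
    return 0;
}

/* Front-to-back search: the most recently attached match is found first,
   so an older method with the same name is silently shadowed. */
ToyIMP toy_lookup(const ToyClass *cls, const char *selector) {
    assert(cls != NULL && selector != NULL);
    for (size_t i = 0; i < cls->count; i++) {
        if (strcmp(cls->methods[i].selector, selector) == 0) {
            return cls->methods[i].imp;
        }
    }
    return NULL;
}

/* Two implementations that happen to share one selector name. */
const char *framework_pop(void) { return "framework pop"; }
const char *category_pop(void)  { return "category pop"; }
```

If you attach `framework_pop` under the name "pop" and then attach `category_pop` under the same name, every later lookup of "pop" returns the category version, including lookups made by code that expected the framework's behavior. Nothing warns you, which is why the naming question matters more than the speed question.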

Quinn Taylor
Do these make sense for NSString categories? '- (NSString *)firstStringBetween:start and:end' and '- (NSArray *)stringsBetween:start and:end', i.e. [string stringsBetween:@"(" and:@")"]
Those seem like okay names, although they're not as detailed as they could be, and I'd expect the documentation to make up for that. As long as you know what they mean, though... In general, I'd examine the names of existing Cocoa methods and use the pattern they do.
Quinn Taylor
The easiest way to avoid method name collisions is to add a prefix to all the methods of the category. XYZFirstStringBetween:...
Benedict Cohen
That's true, but it violates good design of method names. I'm of the opinion that knowing about the potential for conflict can help avoid such problems. Programmers who add categories to widely-used classes should be especially wary of the potential pitfalls. That said, whatever works is better than something that doesn't work. :-)
Quinn Taylor
The trouble with well-named category methods is that they're the ones most likely to collide with Apple's. I just had that happen when I added -pop to NSMutableArray and UINavigationController exploded. Lesson: good names are good, and I hate prefixes on methods, but just a *little* bit weird can be better for categories on Apple objects. I personally think it's a bug that Apple adds private methods without prefixing with underscore. I should radar that. But they do it a *lot*.
Rob Napier
+4  A: 

In general, no.

objc_msgSend() keeps a pseudo-Least Recently Used cache of the most recent SEL to IMP lookups on a per class basis. As always, the specifics are 'implementation private details', but it's reasonable to say that the lookup time is ~O(1) on average, regardless of the number of selectors. The most common way this is done is with a small hash table: if the selector is in the cache, then the dispatch is essentially instant. If the selector isn't in the cache, then it needs to perform the expensive 'slow path' lookup.
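A minimal sketch of that idea in plain C, to show the shape of it rather than what libobjc actually does. Everything here is an assumption made for illustration: the names (`SketchClass`, `sketch_lookup`, and so on), the direct-mapped 8-slot table, the pointer-based hash, and the linear scan standing in for the slow path. Selectors are modeled as unique string pointers, so comparing them is a pointer comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for SEL and IMP; not the real runtime types. */
typedef void (*SketchIMP)(void);

typedef struct {
    const char *name;   /* unique pointer per selector, compared by address */
    SketchIMP imp;
} SketchMethod;

enum { SKETCH_CACHE_SIZE = 8 };  /* power of two, so a mask picks the slot */

typedef struct {
    const SketchMethod *methods;            /* full method list (slow path) */
    size_t method_count;
    const char *cache_sel[SKETCH_CACHE_SIZE];
    SketchIMP cache_imp[SKETCH_CACHE_SIZE];
    unsigned long slow_lookups;             /* counts cache misses */
} SketchClass;

/* Hash a selector pointer down to a cache slot. */
size_t sketch_slot(const char *sel) {
    return (size_t)(((uintptr_t)sel >> 3) & (SKETCH_CACHE_SIZE - 1));
}

SketchIMP sketch_lookup(SketchClass *cls, const char *sel) {
    assert(cls != NULL && sel != NULL);
    size_t slot = sketch_slot(sel);

    /* Fast path: one hash, one compare, independent of method_count. */
    if (cls->cache_sel[slot] == sel) {
        return cls->cache_imp[slot];
    }

    /* Slow path: search the full list, then remember the answer. */
    cls->slow_lookups++;
    for (size_t i = 0; i < cls->method_count; i++) {
        if (cls->methods[i].name == sel) {
            cls->cache_sel[slot] = sel;
            cls->cache_imp[slot] = cls->methods[i].imp;
            return cls->methods[i].imp;
        }
    }
    return NULL;  /* selector not implemented by this class */
}

/* Sample data so the sketch can be exercised. */
const char SEL_LENGTH[] = "length";
const char SEL_UPPERCASE[] = "uppercaseString";
void imp_length(void) {}
void imp_uppercase(void) {}
const SketchMethod SAMPLE_METHODS[] = {
    { SEL_LENGTH, imp_length },
    { SEL_UPPERCASE, imp_uppercase },
};
```

Looking up `SEL_LENGTH` twice on a zero-initialized `SketchClass` pointing at `SAMPLE_METHODS` takes the slow path once and the fast path the second time. Adding more entries to the method list, which is all a category does in this model, makes a miss more expensive but leaves a hit untouched.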

However, even the 'slow path' can be reasonably fast. There are any number of data structures one can turn to, such as Red Black Trees, that offer excellent, sublinear lookup times that scale well regardless of the number of selectors, generally in the O(log2(selectorCount)) range. Again, how libobjc deals with these kinds of details is private, but there are so many data structures that easily scale regardless of the number of items to search that there's no reason that this kind of thing should even be on your radar.

A quick check via nm turns up 7771 selectors in Foundation, and 27510 selectors in AppKit, for a total of 35281 selectors just between the two. Throw in QuickTime, CoreData, WebKit, Quartz, and you're up to 50K selectors easily. With a log2 lookup time growth rate, doubling the number of selectors will increase the worst case time by less than 10%.
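The "less than 10%" figure can be checked with a few lines of C. This is a sketch under the same assumption as the paragraph above, namely that the slow path behaves like a balanced search costing about log2(n) steps; `worst_case_steps` is a hypothetical helper, not anything from libobjc.

```c
#include <assert.h>

/* Number of halvings needed to narrow n candidates down to one,
   i.e. ceil(log2(n)). Integer arithmetic only. */
unsigned worst_case_steps(unsigned long n) {
    assert(n > 0);
    unsigned steps = 0;
    while (n > 1) {
        n = n / 2 + n % 2;  /* round up so no candidate is dropped */
        steps++;
    }
    return steps;
}
```

With 50,000 selectors this gives 16 steps, and with 100,000 it gives 17. One extra step on top of 16 is a 6.25% increase, which is consistent with the estimate above.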

In summary: objc_msgSend() uses a small hash based cache to provide O(1) lookup times for the most recently used selectors... and there is a very high degree of temporal locality, so the vast majority of dispatches are completed in O(1) time regardless of the number of selectors present in the system. The natural effect of the cache is to 'tune' itself to your particular usage patterns. Even on a cache miss it's probably a reasonable guess that the worst case lookup time is ~O(log2(selectorCount)) bound, which is pretty good, and it's probably better than that in practice.

For what it's worth, I've spent a lot of time tweaking code for speed. Even on multi-threading stuff where I peg all the CPUs doing huge amounts of analysis -> NSView / OpenGL heavy result rendering, all coded in Objective-C, I'll only see objc_msgSend() take up 1-4 percent of the CPU when profiled with Shark.app... and that's the worst case, doing heavy Objective-C message dispatching. It's never been an issue for me, and whatever minor speed penalty there is, it's made up for in 100X programming productivity, easily.

See also:

Mulle kybernetiK- Obj-C Optimization: The faster objc_msgSend
Objective-C 2.0 Runtime Programming Guide - Messaging
Apple Objective-C Runtime - objc4-437.tar.gz

EDIT: How weird is this: Patent 5960197 - Compiler dispatch function for object-oriented C. Can't say I knew the whole Obj-C message dispatching system was patented.... I guess you really can get a patent on anything. I'm going to go patent the alphabet and charge big, baby!

johne
+1 Very nice explanation of the likely (if undocumented) implementation details.
Rob Napier