Hi,

In Objective-C, why is it [object doSomething]? Wouldn't it be [*object doSomething], since you're calling a method on the object, which means you should dereference the pointer?

Thanks!

+8  A: 
  1. It's not a pointer, it's a reference to an object.
  2. It's not a method, it's a message.
Matthew Flaschen
Question rephrased: Why do you call methods on references to objects and not the object itself? :-)
chrisgoyal
@unknown: You missed the second point. You're not calling a method, you're sending a message through a reference.
Chuck
The [object message] syntax sends a message to an object, and the object executes the suitable method for that message.
Eonil
+2  A: 

The Objective-C runtime may need to bounce the object around to a couple different functions, so it wants the object reference, and not the object itself.

Dave DeLong
Thanks. Is this fact documented anywhere? None of the books I've read so far cover why this is the case.
chrisgoyal
The source is available. If I have a moment, I'll have to answer this question in detail. It is an interesting issue and there is subtlety involved.
bbum
+6  A: 

Because objc_msgSend() is declared like this:

id objc_msgSend(id theReceiver, SEL theSelector, ...)
Darren
+4  A: 

Part of the reason is that you would get null pointer exceptions left and right. Sending a message to nil is allowed and often perfectly legitimate (it does nothing and does not generate an error).

But you can think of it as analogous to C++'s -> notation: It executes the method and dereferences the pointer in one piece of syntactic sugar.
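To make the nil case concrete, here is a minimal sketch in plain C, not the real runtime. ToyString and toy_send_length are hypothetical names invented for illustration. Because the "send" function receives a pointer, it can check for NULL before touching the object; if the caller had to write *object first, the dereference would happen before any check could run.

```c
#include <stddef.h>

/* Hypothetical toy type; not part of any real Objective-C runtime. */
typedef struct ToyString {
    size_t length;
} ToyString;

/* Toy stand-in for sending the "length" message.
   A NULL receiver is never dereferenced; the call just yields 0. */
size_t toy_send_length(const ToyString *receiver) {
    if (receiver == NULL) {
        return 0;   /* message to nil: do nothing, return zero */
    }
    return receiver->length;
}
```

With this shape, a test like if (toy_send_length(s)) works whether s is NULL or points at an empty string, which is the same convenience as if ([myString length]).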

Frank Schmitt
Sending messages to `nil` is one of the reasons I like Objective-C so much. It's nicer to be able to test for a non-empty string with `if([myString length])` or for a non-empty array with `if([myArray count])` than to have to explicitly make sure the object is valid before querying it.
dreamlax
+3  A: 

You never dereference object pointers, period. The fact that they're typed as pointers rather than just "object types" is an artifact of the language's C heritage. It's exactly equivalent to Java's type system, where objects are always accessed through references. You never dereference an object in Java — in fact, you can't. You should not think of them as pointers, because semantically, they aren't. They're just object references.

Chuck
+2  A: 

I'd phrase it this way: the meaning a language assigns to a sequence of characters is just a convention. The people who designed Objective-C defined

[x doSomething];

to mean "send the doSomething message to the object pointed to by x". They defined it that way, so you follow the rule :) One peculiarity of Objective-C, compared to e.g. C++, is that it has no syntax for a variable that holds an object itself rather than a pointer to an object. So,

NSString* string;

is OK, but

NSString string;

is illegal. If the latter were possible, there would have to be a way to "send the message capitalizedString to the string string" as opposed to "send the message capitalizedString to the string pointed to by string". But in reality, you always send a message to an object pointed to by a variable in your source code.

So, if the designers of Objective-C had followed your logic, you would have to write

[*x doSomething];

every time you send a message... You see, * would always have to appear right after the leading bracket [, forming the combination [*. At that point, I believe you'd agree that it's better to redesign the language so that you only have to write [ instead of [*, by changing the meaning of the sequence of characters [x doSomething].

Yuji
Thanks for the clarification. So it seems like the people who made Objective-C decided that you send messages to pointers to objects, not to the objects themselves. Still curious as to *why* they chose this way... :-)
chrisgoyal
+2  A: 
dreamlax
+3  A: 

Objective-C was first proposed and discussed in this book: Object-Oriented Programming: An Evolutionary Approach. It's not immensely practical for modern Cocoa programmers, but the motivations for the language are in there.

Note that in the book all objects are given type id. You don't see the more specific Object *s in the book at all; they are just a leak in the abstraction when we're talking about the "why." Here's what the book says:

Object identifiers must uniquely identify as many objects as may ever coexist in the system at any one time. They are stored in local variables, passed as arguments in message expressions and in function calls, held in instance variables (fields inside objects), and in other kinds of memory structures. In other words, they can be used as fluidly as the built-in types of the base language.

How an object identifier actually identifies the object is an implementation detail for which many choices are plausible. A reasonable choice, certainly one of the simplest, and the one that is used in Objective-C, is to use the physical address of the object in memory as its identifier. Objective-C makes this decision known to C by generating a typedef statement into each file. This defines a new type, id, in terms of another type that C understands already, namely pointers to structures. [...]

An id consumes a fixed amount of space. [...] This space is not the same as the space occupied by the private data in the object itself.

(pp. 58-59, 2nd ed.)

So the answer to your question is twofold:

  1. The language design specifies that the identifier of an object is not the same as an object itself, and the identifier is the thing that you send messages to, not the object itself.
  2. The design doesn't dictate, but suggests, the implementation that we have now, where pointers to objects are used as identifiers.

The strictly-typed syntax where you say "an object specifically of type NSString" and thus use NSString * is a more modern change, and is basically an implementation choice, equivalent to id.

If this seems like a high-minded response to a question about pointer dereferencing, it's important to keep in mind that objects in Objective-C are "special" per the definition of the language. They are implemented as structures and passed around as pointers to structures, but they are conceptually different.
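As a rough illustration of the "identifier is not the object" point, here is a small C sketch. The names toy_object and toy_id are hypothetical and only mimic the idea of id being a pointer to a structure; the 64-byte payload is an arbitrary assumption.

```c
#include <stddef.h>

/* Hypothetical object layout, for illustration only. */
struct toy_object {
    void *isa;          /* stand-in for the class pointer */
    char payload[64];   /* stand-in for private instance data */
};

/* The identifier is simply the object's address. */
typedef struct toy_object *toy_id;

/* Size of an identifier: fixed, however large the object is. */
size_t toy_id_size(void) { return sizeof(toy_id); }

/* Size of the storage the identifier refers to. */
size_t toy_object_size(void) { return sizeof(struct toy_object); }
```

The identifier can be copied into variables, arguments, and fields cheaply, while the object's own storage stays in one place.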

quixoto
+25  A: 

The answer harkens back to the C roots of Objective-C. Objective-C was originally written as a compiler pre-processor for C. That is, Objective-C wasn't compiled so much as it was transformed into straight C and then compiled.

Start with the definition of the type id. It is declared as:

typedef struct objc_object {
    Class isa;
} *id;

That is, an id is a pointer to a structure whose first field is of type Class (which, itself, is a pointer to a structure that defines a class). Now, consider NSObject:

@interface NSObject <NSObject> {
    Class   isa;
}

Note that the layout of NSObject and the layout of the type pointed to by id are identical. That is because, in reality, an instance of an Objective-C class is really just a structure whose first field -- always a pointer -- points to the class that contains the methods for that instance (along with some other metadata), and what you hold in a variable is a pointer to that structure.

When you subclass NSObject and add some instance variables you are, for all intents and purposes, simply creating a new C structure that contains your instance variables as slots appended to the slots for the instance variables of all superclasses. (The modern runtime works slightly differently, so that a superclass can have ivars added without requiring all subclasses to be recompiled.)
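The layout idea can be sketched in plain C. Everything below (toy_class, ToyRoot, ToyPoint, ToyPoint3D, toy_class_of) is a hypothetical model of the older, fixed-layout scheme described above, not what the compiler literally emits.

```c
#include <stddef.h>

/* Hypothetical class record; the real one holds method lists and more. */
struct toy_class { const char *name; };
typedef struct toy_class *ToyClass;

/* Root "class": only the isa pointer. */
struct ToyRoot {
    ToyClass isa;
};

/* "Subclass" adding one ivar: the superclass slots come first. */
struct ToyPoint {
    ToyClass isa;
    int x;
};

/* "Sub-subclass": superclass slots first, then its own ivar. */
struct ToyPoint3D {
    ToyClass isa;
    int x;
    int z;
};

/* Because isa is always the first field, a pointer to any instance
   can be read as a pointer to its isa slot to find the class. */
ToyClass toy_class_of(void *instance) {
    return *(ToyClass *)instance;
}
```

This is why a generic pointer is enough to find an object's class: the first slot is in the same place for every instance, whatever else follows it.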

Now, consider the difference between these two variables:

NSRect foo;
NSRect *bar;

(NSRect being a simple C structure -- no ObjC involved). foo is created with the storage on the stack. It will not survive once the stack frame is closed, but you also don't have to free any memory. bar is a reference to an NSRect structure that was, most likely, created on the heap using malloc().
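A small sketch of the two storage styles, using a hypothetical ToyRect as a stand-in for NSRect (the field names and helper functions are assumptions made for this example):

```c
#include <stdlib.h>

/* Hypothetical stand-in for NSRect: a plain C structure. */
typedef struct ToyRect {
    double x, y, width, height;
} ToyRect;

/* foo lives in this function's stack frame and is gone on return;
   nothing needs to be freed. */
double toy_stack_area(void) {
    ToyRect foo = { 0.0, 0.0, 3.0, 4.0 };
    return foo.width * foo.height;
}

/* bar points at storage obtained from malloc(); it outlives this call,
   and the caller must free() it. Returns NULL if allocation fails. */
ToyRect *toy_heap_rect(double width, double height) {
    ToyRect *bar = malloc(sizeof *bar);
    if (bar != NULL) {
        bar->x = 0.0;
        bar->y = 0.0;
        bar->width = width;
        bar->height = height;
    }
    return bar;
}
```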

If you try to say:

NSArray foo;
NSArray *bar;

The compiler will complain about the first, saying something along the lines of "stack-based objects are not allowed in Objective-C". In other words, all Objective-C objects must be allocated from the heap (more or less -- there are one or two exceptions, but they are too esoteric for this discussion) and, as a result, you always refer to an object through the address of said object on the heap; you are always working with pointers to objects (and the id type really is just a pointer to any old object).

Getting back to the C preprocessor roots of the language, you can translate every method call to an equivalent line of C. For example, the following two lines of code are equivalent:

[myArray objectAtIndex: 42];
objc_msgSend(myArray, @selector(objectAtIndex:), 42);

Similarly, a method declared like this:

- (id) objectAtIndex: (NSUInteger) a;

is equivalent to a C function declared like this:

id object_at_index(id self, SEL _cmd, NSUInteger a);

And, looking at objc_msgSend(), the first argument is declared to be of type id:

OBJC_EXPORT id objc_msgSend(id self, SEL op, ...);

And that is exactly why you don't use *foo as the target of a method call. Work through the translation above: the call to [myArray objectAtIndex: 42] becomes the objc_msgSend() call shown, which in turn must call a function with the equivalent C declaration (merely dressed up in method syntax).

The object reference is carried through for two reasons: it gives the messenger -- objc_msgSend() -- access to the class so it can find the method implementation, and that same reference then becomes the first parameter -- self -- of the method that is eventually executed.
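For a feel of how that flow fits together, here is a deliberately tiny model in plain C. It is only a sketch: every name (toy_msg_send, ToySel, ToyImp, toy_counter_class, and so on) is hypothetical, selectors are modeled as C strings, and the real objc_msgSend() does far more (caching, superclass lookup, forwarding).

```c
#include <stddef.h>
#include <string.h>

/* All names are hypothetical; this is a toy, not the real runtime. */
typedef const char *ToySel;                 /* selector modeled as a C string */
typedef struct toy_object *toy_id;
typedef long (*ToyImp)(toy_id self, ToySel _cmd, long arg);

struct toy_method { ToySel name; ToyImp imp; };
struct toy_class  { const struct toy_method *methods; size_t count; };
struct toy_object { const struct toy_class *isa; long base; };

/* Toy messenger: follow self->isa to the class, look up the selector,
   then call the implementation with self and _cmd as leading arguments. */
long toy_msg_send(toy_id self, ToySel op, long arg) {
    if (self == NULL) {
        return 0;                           /* message to nil */
    }
    for (size_t i = 0; i < self->isa->count; i++) {
        if (strcmp(self->isa->methods[i].name, op) == 0) {
            return self->isa->methods[i].imp(self, op, arg);
        }
    }
    return 0;  /* unknown selector: this toy simply returns 0 */
}

/* A "method" body: an ordinary C function receiving self and _cmd. */
static long toy_add_to_base(toy_id self, ToySel _cmd, long arg) {
    (void)_cmd;                             /* unused in this example */
    return self->base + arg;
}

static const struct toy_method toy_methods[] = {
    { "addToBase:", toy_add_to_base }
};
const struct toy_class toy_counter_class = { toy_methods, 1 };
```

The messenger needs the pointer twice: once to reach isa for the lookup, and once to hand on as self. Passing the structure by value would give the method a copy rather than the object itself.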

If you really want to go deep, start here. But don't bother until you have fully grokked this.

bbum
Great Answer. Thank You, I stumbled across your blog entry about a year ago, and was not able to grok it then, but after some more time with obj-c, I'm starting to go deeper down the rabbit hole and now really appreciate it.
Brad Smith