Hi,
In Objective-C, why [object doSomething]? Wouldn't it be [*object doSomething] since you're calling a method on the object, which means you should dereference the pointer?
Thanks!
The Objective-C runtime may need to bounce the object around to a couple different functions, so it wants the object reference, and not the object itself.
Because objc_msgSend() is declared like this:
id objc_msgSend(id theReceiver, SEL theSelector, ...)
Part of the reason is that you would get null pointer exceptions left and right. Sending a message to nil is allowed and often perfectly legitimate (it does nothing and does not generate an error).
But you can think of it as analogous to C++'s -> notation: it executes the method and dereferences the pointer in one piece of syntactic sugar.
You never dereference object pointers, period. The fact that they're typed as pointers rather than just "object types" is an artifact of the language's C heritage. It's closely analogous to Java's type system, where objects are always accessed through references. You never dereference an object in Java; in fact, you can't. You should not think of them as pointers, because semantically, they aren't. They're just object references.
I'd phrase it this way: the meaning a language assigns to a sequence of characters is just a convention. The people who designed Objective-C decided that
[x doSomething];
means "send the doSomething message to the object pointed to by x". They defined it that way, and you follow the rule :)
One peculiarity of Objective-C, compared to e.g. C++, is that it has no syntax for a variable that holds an object itself rather than a pointer to an object. So,
NSString* string;
is OK, but
NSString string;
is illegal. If the latter were possible, there would have to be a way to say "send the message capitalizedString to the string string" as distinct from "send the message capitalizedString to the string pointed to by string". But in reality, you always send a message to an object pointed to by a variable in your source code.
So, if the designers of Objective-C had followed your logic, you would have to write
[*x doSomething];
every time you send a message... You see, * would always have to appear right after the leading bracket [, forming the combination [*. At that point, I believe you'd agree that it's better to redesign the language so that you only have to write [ instead of [*, by changing the meaning of the character sequence [x doSomething].
Objective-C was first proposed and discussed in this book: Object-Oriented Programming: An Evolutionary Approach. It's not immensely practical for modern Cocoa programmers, but the motivations for the language are in there.
Note that in the book all objects are given type id. You don't see the more specific Object *s in the book at all; they are just a leak in the abstraction when we're talking about the "why." Here's what the book says:
Object identifiers must uniquely identify as many objects as may ever coexist in the system at any one time. They are stored in local variables, passed as arguments in message expressions and in function calls, held in instance variables (fields inside objects), and in other kinds of memory structures. In other words, they can be used as fluidly as the built-in types of the base language.
How an object identifier actually identifies the object is an implementation detail for which many choices are plausible. A reasonable choice, certainly one of the simplest, and the one that is used in Objective-C, is to use the physical address of the object in memory as its identifier. Objective-C makes this decision known to C by generating a typedef statement into each file. This defines a new type, id, in terms of another type that C understands already, namely pointers to structures. [...]
An id consumes a fixed amount of space. [...] This space is not the same as the space occupied by the private data in the object itself.
(pp. 58-59, 2nd ed.)
So the answer to your question is twofold:
The strictly-typed syntax where you say "an object specifically of type NSString" and thus use NSString * is a more modern change, and is basically an implementation choice, equivalent to id.
If this seems like a high-minded response to a question about pointer dereferencing, it's important to keep in mind that objects in Objective-C are "special" per the definition of the language. They are implemented as structures and passed around as pointers to structures, but they are conceptually different.
The answer harkens back to the C roots of Objective-C. Objective-C was originally written as a compiler pre-processor for C. That is, Objective-C wasn't compiled so much as it was transformed into straight C and then compiled.
Start with the definition of the type id. It is declared as:
typedef struct objc_object {
Class isa;
} *id;
That is, an id is a pointer to a structure whose first field is of type Class (which, itself, is a pointer to a structure that defines a class). Now, consider NSObject:
@interface NSObject <NSObject> {
Class isa;
}
Note that the layout of NSObject and the layout of the type pointed to by id are identical. That is because, in reality, what you hold for an instance of an Objective-C class is really just a pointer to a structure whose first field -- always a pointer -- points to the class that contains the methods for that instance (along with some other metadata).
When you subclass NSObject and add some instance variables you are, for all intents and purposes, simply creating a new C structure whose slots for your instance variables are appended to the slots for the instance variables of all superclasses. (The modern runtime works slightly differently so that a superclass can have ivars appended without requiring all subclasses to be recompiled.)
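To make the layout idea concrete, here is a minimal sketch in plain C. The toy_ names are made up for illustration and are not the real runtime types; the only assumption is the one described above, that the class pointer sits first and subclass fields are appended after the superclass fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real runtime types. */
typedef struct toy_class { const char *name; } toy_class;

/* Root "object": the class pointer comes first, like isa. */
typedef struct toy_object {
    toy_class *isa;
} toy_object;

/* A "subclass" instance: the superclass slots first, then the new ivars. */
typedef struct toy_point {
    toy_object super;   /* inherited slots, starting at offset 0 */
    int x;              /* ivars added by the subclass */
    int y;
} toy_point;

/* Read the class through a generic reference, whatever the concrete type. */
const char *toy_class_name(toy_object *obj) {
    return obj->isa->name;
}

/* Returns 1 if a toy_point can be used through a generic toy_object pointer. */
int toy_layout_demo(void) {
    static toy_class point_class = { "Point" };
    toy_point p = { { &point_class }, 3, 4 };

    /* The inherited part sits at the very start of the structure... */
    assert(offsetof(toy_point, super) == 0);

    /* ...so a pointer to the whole thing is also a pointer to the root part. */
    toy_object *generic = (toy_object *)&p;
    return toy_class_name(generic)[0] == 'P' && p.x + p.y == 7;
}
```

Because the root part always comes first, any code that only knows about toy_object can still find the class of a toy_point, which is roughly the role id plays.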
Now, consider the difference between these two variables:
NSRect foo;
NSRect *bar;
(NSRect being a simple C structure -- no ObjC involved). foo is created with the storage on the stack. It will not survive once the stack frame is closed, but you also don't have to free any memory. bar is a reference to an NSRect structure that was, most likely, created on the heap using malloc().
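As a rough illustration of the stack versus heap distinction, here is a small C sketch. toy_rect is a hypothetical stand-in for NSRect so the example compiles without any Apple headers.

```c
#include <stdlib.h>

/* Hypothetical stand-in for NSRect: a plain C struct. */
typedef struct toy_rect { double x, y, width, height; } toy_rect;

double toy_area(const toy_rect *r) {
    return r->width * r->height;
}

/* Returns the combined area of one stack rect and one heap rect,
   or -1.0 if the allocation fails. */
double toy_storage_demo(void) {
    toy_rect foo = { 0, 0, 2, 3 };         /* storage lives in this stack frame */
    toy_rect *bar = malloc(sizeof *bar);   /* storage lives on the heap */
    if (bar == NULL) {
        return -1.0;
    }

    *bar = foo;        /* copy the value into the heap storage */
    bar->width = 4;    /* changes the heap copy only; foo is untouched */

    double total = toy_area(&foo) + toy_area(bar);

    free(bar);         /* heap storage must be released explicitly */
    return total;      /* foo's storage goes away when the frame is closed */
}
```

Note that foo needs & to be passed by reference, while bar already is a reference.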
If you try to say:
NSArray foo;
NSArray *bar;
The compiler will complain about the first, saying something along the lines of stack based objects are not allowed in Objective-C. In other words, all Objective-C objects must be allocated from the heap (more or less -- there are one or two exceptions, but they are comparatively esoteric to this discussion) and, as a result, you always refer to an object through the address of said object on the heap; you are always working with pointers to objects (and the id type really is just a pointer to any old object).
Getting back to the C preprocessor roots of the language, you can translate every method call to an equivalent line of C. For example, the following two lines of code are equivalent:
[myArray objectAtIndex: 42];
objc_msgSend(myArray, @selector(objectAtIndex:), 42);
Similarly, a method declared like this:
- (id) objectAtIndex: (NSUInteger) a;
is equivalent to a C function declared like this:
id object_at_index(id self, SEL _cmd, NSUInteger a);
And, looking at objc_msgSend(), the first argument is declared to be of type id:
OBJC_EXPORT id objc_msgSend(id self, SEL op, ...);
And that is exactly why you don't use *foo as the target of a method call. Do the translation through the above forms -- the call to [myArray objectAtIndex: 42] is translated to the above C function call which then must call something with the equivalent C function call declaration (all dressed up in method syntax).
The object reference is carried through because it gives the messenger, objc_msgSend(), access to the class so it can find the method implementation, and because that same reference then becomes the first parameter, the self, of the method that is eventually executed.
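If it helps, the idea can be sketched as a toy messenger in plain C. This is not how objc_msgSend() is actually implemented; every toy_ name is invented for illustration, selectors are plain strings, and methods take a single long argument to keep the sketch short.

```c
#include <stddef.h>
#include <string.h>

typedef struct toy_object toy_object;

/* Every "method" receives self and the selector first, then its argument. */
typedef long (*toy_imp)(toy_object *self, const char *cmd, long arg);

typedef struct toy_method { const char *selector; toy_imp imp; } toy_method;
typedef struct toy_class  { const toy_method *methods; size_t count; } toy_class;

struct toy_object {
    const toy_class *isa;   /* class pointer first, as described above */
    long base;              /* one instance variable */
};

/* A method implementation: works on the object through self. */
long toy_add(toy_object *self, const char *cmd, long arg) {
    (void)cmd;              /* selector is unused in this method */
    return self->base + arg;
}

/* Toy messenger: needs the reference, not a copy of the object. */
long toy_msg_send(toy_object *receiver, const char *selector, long arg) {
    if (receiver == NULL) {
        return 0;           /* messaging nil does nothing */
    }
    const toy_class *cls = receiver->isa;   /* reference leads to the class */
    for (size_t i = 0; i < cls->count; i++) {
        if (strcmp(cls->methods[i].selector, selector) == 0) {
            /* the same reference becomes the method's self */
            return cls->methods[i].imp(receiver, selector, arg);
        }
    }
    return 0;   /* simplification: the real runtime does not ignore unknown selectors */
}

static const toy_method toy_methods[] = { { "add:", toy_add } };
static const toy_class toy_adder_class = { toy_methods, 1 };

/* Sends "add:" to a real object and to a nil receiver; returns the sum. */
long toy_dispatch_demo(void) {
    toy_object obj = { &toy_adder_class, 40 };
    return toy_msg_send(&obj, "add:", 2) + toy_msg_send(NULL, "add:", 2);
}
```

The messenger uses the pointer twice, once to reach the class and once as self, and a NULL receiver can be checked before anything is dereferenced. Neither would work if the call site handed over *receiver instead.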
If you really want to go deep, start here. But don't bother until you have fully grokked this.