The following declaration in C:

int* a, b;

will declare a as type int* and b as type int. I'm well aware of this trap, but what I want to know is why it works this way. Why doesn't it also declare b as int*, as most people would intuitively expect? In other words, why does * apply to the variable name, rather than the type?

Sure you could write it this way to be more consistent with how it actually works:

int *a, b;

However, I and everyone I've spoken to think of a as being of type "pointer to int", rather than thinking of a as a pointer to some data whose type is "int".

Was this simply a bad decision by the designers of C or is there some good reason why it's parsed this way? I'm sure the question has been answered before, but I can't seem to find it using the search.

+8  A: 

There may be an additional historical reason, but I've always understood it this way:

One declaration, one type.

If a, b, c, and d must be the same type here:

int a, b, c, d;

Then everything on the line must be an integer as well.

int a, *b, **c, ***d;

The 4 integers:

  1. a
  2. *b
  3. **c
  4. ***d

It may be related to operator precedence, as well, or it may have been at some point in the past.
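As a rough illustration of the "one declaration, one type" reading, here is a minimal sketch. The function name `all_four_are_ints` is made up for this example, and `_Generic` and `static_assert` assume a C11 compiler. It only checks that each declarator, written as an expression, has type int.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 when a, *b, **c and ***d all read the same int. */
int all_four_are_ints(void)
{
    int a = 42, *b = &a, **c = &b, ***d = &c;

    /* Each declarator, used as an expression, has type int. */
    static_assert(_Generic(a,    int: 1, default: 0), "a is an int");
    static_assert(_Generic(*b,   int: 1, default: 0), "*b is an int");
    static_assert(_Generic(**c,  int: 1, default: 0), "**c is an int");
    static_assert(_Generic(***d, int: 1, default: 0), "***d is an int");

    /* All four expressions reach the same object, so they compare equal. */
    return a == *b && *b == **c && **c == ***d;
}
```

The compile-time checks would fail to build if any of the four expressions were not an int.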

eruciform
The question wasn't how you understood it.
Georg Fritzsche
@georg: it's just a different wording of the above two. how did you understand it?
eruciform
@Georg, eruciform is correct. K&R says that the expression `*ip` is an `int`. The syntax of the declaration for a variable mimics the syntax of expressions in which the variable might appear. This *is* the reason for the syntax.
Matthew Flaschen
eruciform, @matthew, the question asks e.g. *"why it's parsed this way"*, i.e. why was the language designed that way. I only see *"here is how i understand the syntax"* in this answer :)
Georg Fritzsche
@Georg, he never said "here is how i understand the syntax" The understanding he gave of the language's design *is* the real reason for the language's design.
Matthew Flaschen
@matt: thanks :-)
eruciform
@mat: Maybe it's too late here and I am missing something, but I see only *"here is how they decided the language will work"*, not the *"why"*. caf instead says *"so that they mirror use"*.
Georg Fritzsche
@georg: yes, and i meant that it's "so that everything on the line is the same type"
eruciform
+2  A: 

The * modifies the variable name, not the type specifier. This is mostly because of the way the * is parsed. Take these statements:

char*  x;
char  *x;

Those statements are equivalent. The * operator needs to be between the type specifier and the variable name (it is treated like an infix operator), but it can go on either side of the space. Given this, the declaration

int*  a, b;

would not make b a pointer, because there is no * adjacent to it. The * only operates on the objects on either side of it.

Also, think about it this way: when you write the declaration int x;, you are indicating that x is an integer. If y is a pointer to an integer, then *y is an integer. When you write int *y;, you are indicating that *y is an integer (which is what you want). In the statement char a, *b, ***c;, you are indicating that the variable a, the dereferenced value of b, and the triply-dereferenced value of c are all of type char. Declaring variables in this way makes the usage of the star operator (nearly) consistent with dereferencing.

I agree that it would almost make more sense for it to be the other way around. To avoid this trap, I made myself a rule always to declare pointers on a line by themselves.
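To make the "* belongs to the name, not the type specifier" point concrete, here is a small sketch, assuming a C11 compiler for `_Generic` and `static_assert`. The names `int_ptr` and `star_binds_to_declarator` are invented for illustration. It also shows that when the pointer is folded into the type specifier through a typedef, every name on the line does become a pointer.

```c
#include <assert.h>

typedef int *int_ptr;   /* hypothetical typedef: the pointer is now part of the type specifier */

int star_binds_to_declarator(void)
{
    int* a, b;      /* a is int*, b is a plain int */
    int_ptr p, q;   /* p and q are both int* */

    static_assert(_Generic(a, int *: 1, default: 0), "a is a pointer to int");
    static_assert(_Generic(b, int:   1, default: 0), "b is a plain int");
    static_assert(_Generic(p, int *: 1, default: 0), "p is a pointer to int");
    static_assert(_Generic(q, int *: 1, default: 0), "q is a pointer to int");

    b = 7;
    a = &b;
    p = a;
    q = p;
    return *q;      /* reads b through the chain of pointers */
}
```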

bta
That is the status-quo, but *why* is it that way?
Georg Fritzsche
+16  A: 

There's a web page on The Development of the C Language that says, "The syntax of these declarations reflects the observation that i, *pi, and **ppi all yield an int type when used in an expression." Search for that sentence on the page to find the relevant section that talks about this question.

cape1232
+27  A: 

C declarations were written this way so that "declaration mirrors use". This is why you declare arrays like this:

int a[10];

Were you to instead have the rule you propose, where it is always

type identifier, identifier, identifier, ... ;

...then arrays would logically have to be declared like this:

int[10] a;

which is fine, but doesn't mirror how you use a. Note that this holds for functions, too - we declare functions like this:

void foo(int a, char *b);

rather than

void(int a, char* b) foo;

In general, the "declaration mirrors use" rule means that you only have to remember one set of precedence and associativity rules, which apply both to operators like *, [] and () when you're using the value, and to the corresponding tokens *, [] and () in declarators.


After some further thought, I think it's also worth pointing out that spelling "pointer to int" as "int*" is only a consequence of "declaration mirrors use" anyway. If you were going to use another style of declaration, it would probably make more sense to spell "pointer to int" as "&int", or something completely different like "@int".
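A short sketch of "declaration mirrors use", assuming a C11 compiler. The names `twice`, `fp` and `mirror_demo` are hypothetical. Each declaration is paired with the expression that has the same shape, and that expression has the declared base type int.

```c
#include <assert.h>

int a[10];          /* used as a[3]     */
int *p;             /* used as *p       */
int twice(int x);   /* used as twice(4) */
int (*fp)(int);     /* used as (*fp)(4) */

int twice(int x) { return 2 * x; }

int mirror_demo(void)
{
    /* Writing each declarator as an expression yields an int. */
    static_assert(_Generic(a[3],     int: 1, default: 0), "a[3] is an int");
    static_assert(_Generic(*p,       int: 1, default: 0), "*p is an int");
    static_assert(_Generic(twice(4), int: 1, default: 0), "twice(4) is an int");
    static_assert(_Generic((*fp)(4), int: 1, default: 0), "(*fp)(4) is an int");

    a[3] = 5;
    p = &a[3];
    fp = twice;
    return a[3] + *p + twice(4) + (*fp)(4);   /* 5 + 5 + 8 + 8 */
}
```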

caf
Hmm, this doesn't sound like a very good reason to me. I *do* want to declare arrays as `int[10]`, which is how Java and C# do it. Are there other examples in the C syntax of declaration mirroring use?
Evgeny
Matthew Flaschen
@Evgeny: The most obvious other example is function declarations, which are written to look like the function call.
caf
+1  A: 

Because if the statement

int* a, b;

were to declare b as a pointer too, then you would have no way to declare

int* a;
int  b;

on a single line.

On the other hand, you can do

int*a, *b;

to get what you want.

Think about it like that: the way it is now it is still the most concise and yet unique way to do it. That's what C is mostly about :)

lorenzog
Yes, well, similarly you can't declare an int and a float on the same line - so what? Actually, you can't declare them in the same *statement*, but you could have a *line* that said `int* a; int b;` - no problem.
Evgeny
+2  A: 

I assume it is related to the full declaration syntax for type modifiers:

int x[20], y;
int (*fp)(), z;

In these examples, it feels much more obvious that the modifiers are only affecting one of the declarations. One guess is that once K&R decided to design modifiers this way, it felt "correct" to have modifiers only affect one declaration.
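A sketch of the same two declarations with compile-time checks, assuming a C11 compiler. The helper `seven` and the function name `modifier_scope_demo` are invented, and the function pointer uses a `(void)` prototype rather than empty parentheses. It only confirms that `[20]` and `(*...)()` attach to one name each.

```c
#include <assert.h>

static int seven(void) { return 7; }   /* hypothetical helper to point fp at */

int modifier_scope_demo(void)
{
    int x[20], y;                 /* x is an array of 20 ints, y is a single int */
    int (*fp)(void) = seven, z;   /* fp is a function pointer, z is a single int */

    static_assert(sizeof x == 20 * sizeof y, "only x is an array");
    static_assert(_Generic(z, int: 1, default: 0), "z is a plain int");

    y = fp();
    z = y + 1;
    x[0] = z;
    return x[0];
}
```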

On a side note, I would recommend just limiting yourself to one variable per declaration:

int *x;
int y;
R Samuel Klatchko
Yeah, it *is* more obvious here, but really, who the hell declares a function pointer and an int in the same statement? :)
Evgeny
@Evgeny - you are missing the point. For function pointer syntax it's easy to understand why the modifiers only affect one variable. If the language were inconsistent, with some modifiers affecting one variable and other modifiers affecting all variables, that would be bad design.
R Samuel Klatchko
A: 

I can only guess why one would have to repeat the * for every variable name mentioned on a line to make them all pointers. Maybe this is due to a decision for language consistency? Let me illustrate this with an example:


Let's assume you have a function foo declared as follows:

int foo() { ... }

Let's also assume that you want to declare two function pointers to this function:

int (*fooptr1, fooptr2)();
// this doesn't work; and even if it did, what would the syntax possibly
// look like for initializing them to function foo?
// int (*fooptr1 = foo, fooptr2 = foo)() ?

int (*fooptr1)() = foo, (*fooptr2)() = foo;
// this works.

In this case, you simply have to repeat the whole type declaration for both variables, and you can't go wrong because there's no other way to do this (given the C declaration syntax as it is).


Now, maybe the thinking was: if there are cases where the type declaration has to be repeated for every variable anyway, then this should just be the general rule.

(Don't forget that I'm just guessing here.)

stakx
+1  A: 

Consider the declaration:

int *a[10];
int (*b)[10];

The first is an array of ten pointers to integers, the second is a pointer to an array of ten integers.

Now, if the * were attached to the type declaration, it wouldn't be syntactically valid to put a parenthesis between them. So you'd have to find another way to differentiate between the two forms.
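To see the difference between the two forms in practice, here is a minimal sketch, assuming a C11 compiler. The names `ROWS`, `COLS`, `grid` and `ints_per_step` are hypothetical. The checks compare sizes, and the function measures how far a pointer-to-array moves when it is incremented.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 3, COLS = 10 };   /* hypothetical dimensions */

int *a[10];      /* array of ten pointers to int */
int (*b)[10];    /* pointer to an array of ten ints */

static_assert(sizeof a == 10 * sizeof(int *), "a holds ten pointers");
static_assert(sizeof *b == 10 * sizeof(int), "*b is a whole array of ten ints");

/* Returns how many ints a pointer-to-array advances per increment. */
size_t ints_per_step(void)
{
    static int grid[ROWS][COLS];

    int (*row)[COLS]  = grid;      /* grid decays to a pointer to its first row */
    int (*next)[COLS] = row + 1;   /* one step moves a full row forward */

    return (size_t)((char *)next - (char *)row) / sizeof(int);
}
```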

Fabio Ceconello
"It should be noted that the declaration of a pointer to an array is not that useful" What? With `int (*b)[10]`, you can assign `b` to point to e.g. an arbitrary element of `int ar[ROW_COUNT][10]`, and pointer arithmetic will operate by row (e.g. after `b += 1`, `b` is incremented by one row). Please explain how to do that with a "simple pointer" (not sure what "simple" actually means).
Matthew Flaschen
You're right, I was thinking more about the use of that pointer as a parameter and failed to consider the case you mentioned. Correcting now.
Fabio Ceconello