ansaurus

Question

Why does C have a distinction between -> and . ?

Answer 1

+25 A:

Well there clearly isn't any ambiguity or the proposal couldn't be made. The only issue is that if you see:

p->x = 3;

you know p is a pointer but if you allow:

p.x = 3;

in that circumstance then you don't actually know, which could potentially create problems, particularly if you later cast that pointer and use the wrong number of levels of indirection.

cletus 2009-11-28 21:51:11

Indeed. It makes code easier to read/understand if you can immediately recognize a variable as a pointer based on the indirection operator.

Mike Weller 2009-11-29 12:37:14

Answer 2

+2 A:

If anything, the current syntax lets readers of the code know whether or not the code is working with a pointer or the actual object. Someone who does not know the code beforehand understands it better.

Noctis Skytower 2009-11-28 21:52:13

Answer 3

+27 A:

I don't think there's anything crazy about what you've said. Using . for pointers to structs would work.

However, I like the fact that pointers to structs and structs are treated differently.

It gives some context about operations and clues as to what might be expensive.

Consider this snippet, imagine that it's in the middle of a reasonably large function.

s.c = 99;
f(s);

assert(s.c == 99);

Currently I can tell that s is a struct. I know that it's going to be copied in its entirety for the call to f. I also know that that assert can't fire.

If using . with pointers to struct were allowed, I wouldn't know any of that and the assert might fire, f might set s.c (err s->c) to something else.
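A minimal sketch of the point above, assuming a hypothetical struct S with a single int member and two hypothetical functions, one taking the struct by value and one taking a pointer. None of these names come from the original snippet.

```c
#include <assert.h>

struct S { int c; };

/* Takes the struct by value: the function works on its own copy. */
void f_value(struct S s) { s.c = 0; }

/* Takes a pointer: writes through -> reach the caller's object. */
void f_pointer(struct S *s) { s->c = 0; }

/* Returns s.c as the caller sees it after both calls. */
int value_then_pointer(void)
{
    struct S s;

    s.c = 99;
    f_value(s);          /* s is copied in its entirety */
    assert(s.c == 99);   /* only the copy was modified */

    f_pointer(&s);       /* the & at the call site signals that s may change */
    return s.c;          /* f_pointer stored 0 through the pointer */
}
```

With today's syntax the two cases look different both at the call site (s versus &s) and inside the callee (. versus ->), which is the context clue being described.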

The other downside is that it would reduce compatibility with C++. C++ allows -> to be overloaded by classes so that classes can be 'like' pointers. It's important that . and -> behave differently. "New" C code that used . with pointers to structs would probably no longer be acceptable as C++ code.

Charles Bailey 2009-11-28 21:56:27

+1 for C++ problems.

Douglas Leeder 2009-11-28 22:29:28

Why is the difference between "->" and "." important in C++? Couldn't you just overload operator "." in C++ instead of ->?

Edan Maor 2009-11-28 22:39:48

Currently you can't overload the `.` operator, but that's not important right now. What can be important is that a class can act like a pointer because you can intercept the `->` operator to return a pointer to the object being 'proxied', but then you can also call things on the object itself (e.g. `.reset()` to set it to null or something). Losing the distinction between `.` and `->` would prevent this from working.

Charles Bailey 2009-11-28 23:06:38

With this example, you don't actually know that the assertion will hold. The struct may contain a pointer to itself, giving f the ability to modify s.c

William Pursell 2009-11-29 09:02:17
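A minimal sketch of that corner case, assuming a hypothetical member named self that points back at the original object. The struct is still passed by value, but the copied pointer refers to the caller's instance.

```c
#include <assert.h>

struct S {
    int c;
    struct S *self;   /* hypothetical member: points at the original object */
};

/* f receives a copy, yet the copied pointer still leads to the caller's struct. */
void f(struct S s) { s.self->c = 0; }

/* Returns s.c as the caller sees it after the by-value call. */
int c_after_call(void)
{
    struct S s;

    s.c = 99;
    s.self = &s;
    f(s);
    assert(s.c == 0);   /* so assert(s.c == 99) would have fired here */
    return s.c;
}
```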

@William Pursell: Touché, I knew when I wrote this that there was probably a corner case where it wouldn't hold. I was actually wondering if there was a type that you could assign an int (99) to and which then wouldn't compare equal to 99 when promoted back to an int, but a self-referential struct (or a global/static instance!) would also work.

Charles Bailey 2009-11-29 17:54:56

Answer 4

+5 A:

Well, there could definitely be cases where you have something complex like:

(*item)->elem

(which I have had happen in some programs), and if you wrote something like

item.elem

meaning the above, it could be confusing whether elem is an element of struct item, or an element of a struct that item points to, or an element of a struct that is pointed to by an element in a list that is pointed to by an iterator item, and so on and so forth.

So yeah, it does make things somewhat clearer when using pointers to pointers to structs, &c.
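A minimal sketch of the three situations, assuming a hypothetical struct node with an int member elem. With the current syntax each level of indirection is spelled out differently.

```c
#include <assert.h>

struct node { int elem; };

/* item is a pointer to a pointer to a struct: one explicit * plus one ->. */
int read_elem(struct node **item)
{
    return (*item)->elem;
}

/* Returns 1 if all three access paths reach the same member. */
int indirection_levels_agree(void)
{
    struct node n = { 42 };
    struct node *p = &n;     /* pointer to struct */
    struct node **pp = &p;   /* pointer to pointer to struct */

    assert(n.elem == 42);          /* struct itself: .         */
    assert(p->elem == 42);         /* one level:     ->        */
    assert(read_elem(pp) == 42);   /* two levels:    (*pp)->   */
    return n.elem == p->elem && p->elem == (*pp)->elem;
}
```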

Keand64 2009-11-28 21:59:14

Answer 5

+3 A:

Well, if you really wanted to introduce that kind of functionality into the specification of C language, then in order to make it "blend" with the rest of the language the logical thing to do would be to extend the concept of "decay to pointer" to struct types. You yourself made an example with a function and a function pointer. The reason it works that way is because function type in C decays to pointer type in all contexts, except for sizeof and unary & operators. (The same thing happens to arrays, BTW.)
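A minimal sketch of the two existing decay rules, using a hypothetical function twice and a hypothetical array a. It shows that a function name and an array name convert to pointers in ordinary expressions, while sizeof and unary & still see the original type.

```c
#include <assert.h>

int twice(int x) { return 2 * x; }

/* A function designator decays to a pointer, so every call form is equivalent. */
int call_forms_agree(void)
{
    int (*fp)(int) = twice;    /* decay: no & needed */
    int (*fq)(int) = &twice;   /* unary & suppresses decay, yields the same pointer */

    assert(fp == fq);
    return twice(3) == 6 && fp(3) == 6 && (*fp)(3) == 6 && (*fq)(3) == 6;
}

/* An array decays to a pointer to its first element, except under sizeof and &. */
int array_decay_holds(void)
{
    int a[4] = {10, 20, 30, 40};
    int *p = a;             /* decay to &a[0] */
    int (*whole)[4] = &a;   /* no decay: pointer to the whole array */

    assert(p == &a[0]);
    assert(sizeof a == 4 * sizeof(int));   /* sizeof sees the array type */
    return p[2] == 30 && (*whole)[2] == 30;
}
```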

So, in order to implement something similar to what you suggest, we could introduce the concept of "struct-to-pointer decay", which would work in exactly the same way as all other "decays" in C (namely, array-to-pointer decay and function-to-pointer decay) work: when a struct object of type T is used in an expression, its type immediately decays to type T* - pointer to the beginning of the struct object - except when it's an operand of sizeof or unary &. Once such a decay rule is introduced for structs, you could use -> operator to access struct elements regardless of whether you have a pointer to struct or the struct itself on the left-hand side. Operator . would become completely unnecessary in this case (unless I'm missing something), you'd always use -> and only ->.

The above, once again, is what this feature would look like, in my opinion, if it was implemented in the spirit of C language.

But I'd say (agreeing with what Charles said) that the loss of visual distinction between the code that works with pointers to structs and the code that works with structs themselves is not exactly desirable.

P.S. An obvious negative consequence of such a decay rule for structs would be that besides the current army of newbies selflessly believing that "arrays are just constant pointers", we'd have an army of newbies selflessly believing that "struct objects are just constant pointers". And Chris Torek's array FAQ would have to be about 1.5-2x larger to cover structs as well :)

AndreyT 2009-11-28 22:16:46

If it really was redundant, then . could be maintained as a synonym for ->. People who wanted to maintain the distinction could use -> with pointers and . with structs, in the same way that some C++ programmers (try to) declare POD classes with `struct` and non-POD classes with `class`. Then the compiler wouldn't help, so people would make mistakes, and ask for a compiler option to enforce the difference, and be back where they started ;-)

Steve Jessop 2009-11-28 22:39:14

Answer 6

+4 A:

Yes, that's OK, but it is not what C really needs at all

Not only is it OK, but it is the modern style. Java and Go both just use the . operator. Since everything that doesn't fit in a register is at some level a reference, the distinction between thing and pointer to thing is definitely a bit arbitrary, at least until you get to function calls.

The first evolutionary step was to make the dereference operator postfix, something dmr once implied he at some point preferred. Pascal does this, so it has p^.field. The only reason there even is a -> operator is because it's goofy to have to type (*p).field or p[0].field.
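A minimal sketch of the three spellings mentioned above, assuming a hypothetical struct point with an int member field. All three expressions designate the same member of the pointed-to object.

```c
#include <assert.h>

struct point { int field; };

/* Returns 1 if the three spellings read the same member. */
int spellings_agree(void)
{
    struct point pt = { 7 };
    struct point *p = &pt;

    assert((*p).field == 7);   /* prefix dereference, then . */
    assert(p[0].field == 7);   /* subscript 0, then .        */
    assert(p->field == 7);     /* the arrow shorthand        */

    p->field = 8;              /* a write through one form is seen by the others */
    return (*p).field == 8 && p[0].field == 8 && pt.field == 8;
}
```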

So yes, it would work. It would even be better as it works at a higher level of abstraction. One really should be able to make as many changes as possible without requiring downstream code to change, that's in a sense the entire point of higher level languages.

I have argued that using () for function calls and [] for array subscripting is wrong. Why not allow different implementations to export different abstractions?

But there isn't much reason to make the change. C programmers are unlikely to revolt over the lack of a syntactic sugar extension that saves one character in an expression, and it would be hard to use anyway because it would not be immediately, if ever, universally adopted. Remember that when standards committees go rogue they end up preaching to empty rooms. They require the willing cooperation and agreement of the world's compiler developers.

What C really needs isn't ever-so-slightly faster ways to write unsafe code. I don't mind working in C, but project managers don't like having their reliability determined by their worst guy, and it's possible that what C really needs is a safe dialect, something like Cyclone, or perhaps something just like Go.

DigitalRoss 2009-11-29 00:05:59

Have you *used* Cyclone? It's great research, but the type system is from hell. And it was always damned difficult to keep things in statically typed regions as opposed to letting everything work its way up to the garbage-collected heap. It's great work, but let's not oversell it, shall we?

Norman Ramsey 2009-11-29 00:59:25

Hmm, not like you have, apparently. I will revise...

DigitalRoss 2009-11-29 01:01:46

+1, really good summary.

Konrad Rudolph 2009-11-29 01:17:11

Answer 7

+15 A:

A distinguishing feature of the C programming language (as opposed to its relative C++) is that the cost model is very explicit. The dot is distinguished from the arrow because the arrow requires an additional memory reference, and C is very careful to make the number of memory references evident from the source code.

Norman Ramsey 2009-11-29 00:57:37

Good point. And that memory reference may be very expensive on modern architectures, perhaps costing 1000x as much as accessing a register, assuming that the data needs to be fetched from main memory.

emk 2009-11-29 01:18:59

@Norman Ramsey: The implicit memory reference suggested by the OP has the same nature as the potential implicit memory reference in `[]` and `()` operators, as I noted in my answer. When you use the `[]` operator, you can't see from the syntax whether you are working with a "real" array or with a pointer object (the latter requiring an extra memory reference). So no, this part of the cost model is not normally explicit in C. And what the OP is proposing does not cross the traditional boundaries of the "implicitness" of the C cost model at all.

AndreyT 2009-11-29 17:46:06

@AndreyT: There's nothing implicit about `[]`; `a[i]` is always and forever syntactic sugar for `*(a+i)`, just as `p->x` is syntactic sugar for `(*p).x`. I used to love to blow people's minds writing `i[a] = a[i] + k` and similar wackiness.

Norman Ramsey 2009-11-30 00:08:19
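A minimal sketch of that identity, using a hypothetical array a and index i. Because a[i] is defined as *(a+i) and addition commutes, i[a] names the same element.

```c
#include <assert.h>

/* Returns the value stored by the i[a] = a[i] + k trick. */
int subscript_forms(void)
{
    int a[3] = {5, 6, 7};
    int i = 1;
    int k = 10;

    assert(a[i] == *(a + i));   /* subscripting is sugar for pointer arithmetic */
    assert(i[a] == a[i]);       /* *(i + a) is the same element */

    i[a] = a[i] + k;            /* legal, if unkind to readers */
    return a[1];
}
```

Here a[1] starts at 6, so the function returns 16.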

@Norman : Yes, there is. You are missing the fact that there's a significant difference between 'a' as a name of array object and 'a' as a pointer object. The expression `*(a + i)` looks the same in both cases, but in fact its actual semantics are considerably different. In case of an array object, the act of converting array type to pointer type is purely conceptual, meaning that the resultant pointer is essentially a compile-time constant (or a compile-time offset in case of automatic array). There's no memory access in the process of obtaining the pointer value.

AndreyT 2009-12-01 23:26:37

But in the case when `a` is a pointer, retrieving the actual value of `a` requires a memory access. So, you are wrong, there's an inherent implicitness in `[]`. Stating that `[]` is just a syntactic sugar for `*(a + i)` doesn't change anything - the same implicitness is still present in `*(a + i)` as well. Moreover, the nature of that implicitness is *exactly* the same as in what the OP proposes. I actually illustrate it in a very straightforward way in my answer.

AndreyT 2009-12-01 23:30:08
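A minimal sketch of the case being argued here, assuming a hypothetical file-scope array arr and a pointer ptr that refers to it. The two functions use identical subscript syntax, but in the abstract machine the second must first read the pointer object. Whether a real compiler emits an extra load depends on optimization, so the comments describe the conceptual cost only.

```c
#include <assert.h>

int arr[4] = {1, 2, 3, 4};   /* array object: its address is a link-time constant */
int *ptr = arr;              /* pointer object: its value is stored in memory */

/* Element address is constant base + i; no pointer object has to be read. */
int read_via_array(int i)   { return arr[i]; }

/* Conceptually: load the value of ptr, then add i, then load the element. */
int read_via_pointer(int i) { return ptr[i]; }

/* Returns 1 if both spellings yield the same element. */
int same_element(int i)
{
    assert(i >= 0 && i < 4);
    return read_via_array(i) == read_via_pointer(i);
}
```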

@AndreyT: I didn't say there was no difference between a pointer and an array. Remind me how many compilers you've written again?

Norman Ramsey 2009-12-02 04:48:39

Huh? What does this have to do with the number of compilers I have written? You made a claim that C has an explicit cost model. I have demonstrated that the C cost model is not absolutely explicit and, moreover, that exactly the same "implicitness" as the one proposed by the OP is *already present* in the language. Now you seem to be trying to switch subject. Why? And what's the point? This is a very minor issue, but you are acting as if my being right is somehow insulting to you. Sheesh... :)

AndreyT 2009-12-02 05:39:45
