In The Ten Commandments for C Programmers, what is your interpretation of the 9th commandment?

The 9th commandment:

Thy external identifiers shall be unique in the first six characters, though this harsh discipline be irksome and the years of its necessity stretch before thee seemingly without end, lest thou tear thy hair out and go mad on that fateful day when thou desirest to make thy program run on an old system.

Exactly what is this all about?

+17  A: 

Old linkers only used a limited number of characters of the symbol - I seem to recall that the old IBM mainframes I started programming on only used 8 characters. The C standards people settled on 6 characters as a "lowest common denominator", but would allow a linker to resolve longer names if it wanted to.

If you really hit one of these lowest common denominator linkers, external symbols (function names, external variables, etc.) ABCDEFG and ABCDEFH would appear the same to it. Unless you're programming on really old hardware, you can safely ignore this "commandment".
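To make the ABCDEFG / ABCDEFH case concrete, here is a minimal sketch of how such a linker would compare names. `SIGNIFICANT_CHARS` and `names_collide` are made-up names for illustration, not part of any real linker, and a real C90-era linker might also ignore case.

```c
#include <string.h>

/* Hypothetical limit: how many leading characters the linker keeps. */
#define SIGNIFICANT_CHARS 6

/* Returns 1 if two external names would look like the same symbol to a
   linker that compares only the first SIGNIFICANT_CHARS characters,
   and 0 otherwise. */
int names_collide(const char *a, const char *b)
{
    return strncmp(a, b, SIGNIFICANT_CHARS) == 0;
}
```

With this, `names_collide("ABCDEFG", "ABCDEFH")` reports a collision, while two names that already differ somewhere in their first six characters do not.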

Note that any linker that can't handle more than 6 characters can't do C++ either because of the name mangling.

Paul Tomblin
Trivia: The linker on the PDP-11 running RT-11 stores symbols in Radix-50. Three characters fit in one 16-bit word. A six character symbol name fits in two words. That was enough symbols for the whole OS, and to write MACRO-11, TECO, and even a C compiler.
RBerteig
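A rough sketch of the Radix-50 packing described in the comment above: each character becomes a code in 0..39, and three codes fit in 16 bits because 40 * 40 * 40 = 64000 is less than 65536. The character table below is my assumption of the PDP-11 variant (in particular the character at position 29), and the function names are invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* Assumed PDP-11 Radix-50 table: the index is the character code.
   The '%' at position 29 is an assumption; that slot is often unused. */
static const char RAD50_CHARS[] = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789";

/* Code of one character, or -1 if it is not in the table. */
static int rad50_code(char c)
{
    const char *p = (c != '\0') ? strchr(RAD50_CHARS, c) : NULL;
    return p ? (int)(p - RAD50_CHARS) : -1;
}

/* Pack up to three characters into one 16-bit word, padding short
   input with spaces. Returns 0 on success, -1 on an invalid character. */
int rad50_pack3(const char *s, uint16_t *out)
{
    unsigned value = 0;
    int i;
    for (i = 0; i < 3; i++) {
        char c = *s ? *s++ : ' ';
        int code = rad50_code(c);
        if (code < 0)
            return -1;
        value = value * 40u + (unsigned)code;
    }
    *out = (uint16_t)value;   /* at most 63999, so it always fits */
    return 0;
}

/* Pack a symbol name into two words. Anything past the sixth
   character is silently dropped, as a six-character linker would do. */
int rad50_pack6(const char *name, uint16_t out[2])
{
    size_t len = strlen(name);
    if (rad50_pack3(name, &out[0]) != 0)
        return -1;
    return rad50_pack3(len > 3 ? name + 3 : "", &out[1]);
}
```

Under the assumed table, "ABCDEF" packs to the two words 1683 and 6606, and "ABCDEFG" and "ABCDEFH" pack to exactly the same pair, which is the collision the commandment warns about.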
@RBerteig - I thought 6 had to do with the 36-bit machines that used 6 bits for their native character encoding. A "WORD" held six 6-bit characters and that made symbol lookup logic "fast". Can't remember which system it was, though, that used 36-bit WORDS.
jmucchiello
@jmucchiello, as a wild guess, you are thinking of the PDP-10 or PDP-20, both of which had 36-bit words and a variety of ways to pack characters for storage. I'm reasonably confident that RAD50 was a supported case, as was 5 7-bit ASCII characters per word with a leftover bit.
RBerteig
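For the 36-bit case, the two packings mentioned above can be sketched like this, with a `uint64_t` standing in for the 36-bit word. The 6-bit code used here (ASCII value minus 32, covering space through underscore) is my assumption of the DEC SIXBIT convention, and both function names are hypothetical.

```c
#include <stdint.h>

/* Pack six characters as 6-bit codes into the low 36 bits of *word.
   Assumes a SIXBIT-style code: ASCII value minus 32, for ' ' .. '_'.
   Short input is padded with spaces (code 0).
   Returns 0 on success, -1 if a character is outside that range. */
int pack_sixbit(const char *s, uint64_t *word)
{
    uint64_t w = 0;
    int i;
    for (i = 0; i < 6; i++) {
        unsigned char c = *s ? (unsigned char)*s++ : ' ';
        if (c < 32 || c > 95)
            return -1;
        w = (w << 6) | (uint64_t)(c - 32);
    }
    *word = w;
    return 0;
}

/* Pack five 7-bit ASCII characters into the upper 35 bits of a 36-bit
   word, leaving the lowest bit as the leftover (always 0 here).
   Short input is padded with NUL. Returns -1 on a non-ASCII byte. */
int pack_ascii7(const char *s, uint64_t *word)
{
    uint64_t w = 0;
    int i;
    for (i = 0; i < 5; i++) {
        unsigned char c = *s ? (unsigned char)*s++ : 0;
        if (c > 127)
            return -1;
        w = (w << 7) | c;
    }
    *word = w << 1;
    return 0;
}
```

Either way a whole symbol fits in one machine word, so comparing two symbols is a single integer comparison, which is presumably what made the lookup "fast".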
+1  A: 

According to this site:

What does this mean? Your globals should be "Unique to the first six letters", not "limited to six letters". This is in ANSI, I hear, because of utterly painful "obsolescence" of some linkers. Hopefully ANSI will some day say "linkers will have to do longer names, or they'll cry and do longer names". (And all rational people will just ignore this commandment and tell people to upgrade their 2-penny linker - this may not get you friends, or make warnings happy...)

Eric Petroelje
+10  A: 

External identifier = something that might have to be called from another system

The reason for the first six characters being unique is that an ancient system may have a six-character limitation on its identifiers. If, one day, such a system tries to call your code, it needs to be able to tell the difference between all of your functions.

These days, this seems overly conservative to me, unless you are working with a lot of legacy systems.

JosephStyons
External identifier == identifier visible outside of the current translation unit. Whether or not that identifier is visible from another system has no necessary bearing on its linkage and vice-versa, they are orthogonal qualities. Could you adjust the answer to reflect this?
Mihai Limbășan
+4  A: 

A lot of old compilers and linkers had limitations on how long an identifier could be. Six characters was a common limit. Actually, they could be longer than that, but the compiler or linker would throw away everything after the sixth character.

This was usually done to conserve symbol table memory.

Ferruccio
+1  A: 

It means you're looking at a piece of ancient history. Those commandments are mostly true, but that 9th one may as well actually be carved into a stone tablet, it's so old.

The remaining mystery is: creat. What was wrong with create? It's only six letters!

Daniel Earwicker
Dennis Ritchie said at a talk I was at that "creat" was named that way because they'd already used too many letters when they named "lseek". :-)
Paul Tomblin
+4  A: 

Here is the minimum number of significant characters in an external identifier that must be supported by C/C++ compilers under various standards:

  • C90 - 6 characters
  • C99 - 31 characters
  • C++98 - 1024 characters (a recommended minimum from Annex B rather than a strict requirement)

Here's an example of the kinds of problems that you can run into if your toolset skimps on these limits (from http://www.comeaucomputing.com/libcomo/):

Note to users with Borland and MetroWerks CodeWarrior as backend C:

==================================================================

Note that the Borland compiler and linker, and the Metrowerks compiler, seem to have a maximum external id length of 250 characters. It turns out that some of the generated mangled template names are unable to fit within that space. Therefore, when Borland or Metrowerks is used as the backend C compiler, we have remapped some of the names libcomo uses to shorter names. So short in fact we could not get away with names beginning with underscores. In fact, it was necessary to map most to 2 character id names.

Michael Burr
+2  A: 

In response to:

  • C++98 - 1024 characters

'begin humor'

Addendum to 9th commandment:

If thy external identifiers approach'th to be anywhere near as long as one-thousand-and-twenty-four thou shouldst surely be quickly brought outside and shot.

'/end humor'

Trevor Boyd Smith
thou canst get enough of thee ye olde english!
dotjoe
Heed well the template instantiator, for ye must know it spawns the most foul external identifiers.
David Thornley
1024 is "one" thousand twenty-four, not ten thousand. :)
Andrew Coleson