I know there is a standard behind all C compiler implementations, so there should be no hidden features. Despite that, I am sure all C developers have hidden/secret tricks they use all the time.

+9  A: 

Using INT(3) to set a breakpoint in the code is my all-time favorite.

Dror Helper
I don't think it's portable. It will work on x86, but what about other platforms?
Cristian Ciupitu
I have no idea - You should post a question about it
Dror Helper
@Dror Helper: No, this type of thing should be "you should try it yourself"...
Ape-inago
@ape-inago you're right - thanks
Dror Helper
It's a good technique and it is X86 specific (although there are probably similar techniques on other platforms). However, this is not a feature of C. It depends on non-standard C extensions or library calls.
Ferruccio
In GCC there is __builtin_trap and for MSVC __debugbreak which will work on any supported architecture.
Axel Gneiting
+17  A: 

Well... I think that one of the strong points of C language is its portability and standardness, so whenever I find some "hidden trick" in the implementation I am currently using, I try not to use it because I try to keep my C code as standard and portable as possible.

Giacomo Degli Esposti
And the code is more stable if you code like that ;)
Johan
But in reality, how often do you have to compile your code with another compiler?
Joe D
+31  A: 

Interlacing structures like Duff's Device:

strncpy(to, from, count)
char *to, *from;
int count;
{
    int n = (count + 7) / 8;
    switch (count % 8) {
    case 0: do { *to = *from++;
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
               } while (--n > 0);
    }
}
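
For comparison, a memory-to-memory variant in standard C might look like the sketch below. It is not the original device: both pointers advance, the prototype is ANSI style, and a guard for count == 0 is added (without it the loop body would run eight times). The name duff_copy is made up for illustration.

```c
#include <stddef.h>

/* Copy count bytes from src to dst with an 8x unrolled loop.
   duff_copy is a hypothetical name, not a library function. */
void duff_copy(char *dst, const char *src, size_t count)
{
    if (count == 0)
        return;                      /* the do-while would otherwise copy 8 bytes */

    size_t n = (count + 7) / 8;      /* number of passes through the loop */
    switch (count % 8) {             /* jump into the middle of the loop body */
    case 0: do { *dst++ = *src++;
    case 7:      *dst++ = *src++;
    case 6:      *dst++ = *src++;
    case 5:      *dst++ = *src++;
    case 4:      *dst++ = *src++;
    case 3:      *dst++ = *src++;
    case 2:      *dst++ = *src++;
    case 1:      *dst++ = *src++;
            } while (--n > 0);
    }
}
```

Whether this beats a plain loop or memcpy() depends entirely on the compiler and target, so it is worth measuring before using it.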
ComSubVie
Please add the link: http://en.wikipedia.org/wiki/Duffs_device
jan.vdbergh
I don't understand this. I've added the link, it is displayed in the preview, but not in the final post...
ComSubVie
This is not a proper strcpy function - this implementation assumes that "to" is a memory-mapped register.
Adam Rosenfield
@ComSubVie: you created the link, but haven't instantiated it. You need to add something like this to the answer: "[Wikipedia article][1] on Duff's device." Personally I dislike the Markdown notation for links, I tend to use HTML anchor tags directly.
DGentry
@ComSubVie, anyone who uses Duff's Device is a script kiddy who saw Duff's Device and thought their code would look 1337 if they used Duff's Device. (1.) Duff's Device doesn't offer any performance increases on modern processors because modern processors have zero-overhead looping. In other words it is an obsolete piece of code. (2.) Even if your processor doesn't offer zero-overhead looping, it will probably have something like SSE/altivec/vector-processing which will put your Duff's Device to shame when you use memcpy(). (3.) Did I mention that other than doing memcpy() Duff's is not useful?
Trevor Boyd Smith
@ComSubVie, please meet my Fist-of-death (http://en.wikipedia.org/wiki/Alice_(Dilbert_character)#Alice.27s_violent_nature)
Trevor Boyd Smith
That's ... that's HORRIBLE. why would anyone want to do that to poor defenseless code?
Brian Postow
@Trevor: so only script kiddies program 8051 and PIC microcontrollers, right?
SF.
@Trevor Boyd Smith : While the Duff's Device appears outdated, it's still an historical curiosity, which validates ComSubVie's answer. Anyway, quoting Wikipedia : *"When numerous instances of Duff's device were removed from the XFree86 Server in version 4.0, there was a notable improvement in performance."*...
paercebal
On Symbian, we once evaluated various loops for fast pixel coding; the duff's device, in assembler, was the fastest. So it still had relevance on the mainstream ARM cores on your smartphones today.
Will
+3  A: 

Early versions of gcc attempted to run a game whenever it encountered "#pragma" in the source code. See also here.

Sec
#pragma GCC poison identifiers: This directive bans usage of the identifiers within the program. Poisoned identifiers cannot be #ifdef'd or #undef'd, and attempting to use them for anything will produce an error. Identifiers is a list, separated by spaces.
squadette
+20  A: 

C has a standard but not all C compilers are fully compliant (I've not seen any fully compliant C99 compiler yet!).

That said, the tricks I prefer are those that are non-obvious and portable across platforms, as they rely on C semantics. They are usually about macros or bit arithmetic.

For example, swapping two unsigned integers without using a temporary variable:

...
a ^= b ; b ^= a; a ^=b;
...

or "extending C" to represent finite state machines like:

FSM {
  STATE(x) {
    ...
    NEXTSTATE(y);
  }

  STATE(y) {
    ...
    if (x == 0) 
      NEXTSTATE(y);
    else 
      NEXTSTATE(x);
  }
}

that can be achieved with the following macros:

#define FSM
#define STATE(x)      s_##x :
#define NEXTSTATE(x)  goto s_##x

In general, though, I don't like tricks that are clever but make the code unnecessarily complicated to read (like the swap example), and I love the ones that make the code clearer and convey the intention directly (like the FSM example).
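
To see the FSM macros in a complete function, here is a small sketch that counts space-separated words with two states. The function count_words and the state names are made up for illustration; only the three macros come from the text above.

```c
#define FSM
#define STATE(x)      s_##x :
#define NEXTSTATE(x)  goto s_##x

/* Count the space-separated words in a string with a two-state machine. */
int count_words(const char *p)
{
    int words = 0;

    FSM {
      STATE(between) {               /* currently between words */
        if (*p == '\0') return words;
        if (*p++ == ' ') NEXTSTATE(between);
        words++;                     /* first character of a new word */
        NEXTSTATE(inside);
      }

      STATE(inside) {                /* currently inside a word */
        if (*p == '\0') return words;
        if (*p++ == ' ') NEXTSTATE(between);
        NEXTSTATE(inside);
      }
    }
}
```

Every state ends in a return or a NEXTSTATE, so control never falls through from one state into the next by accident.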

Remo.D
C supports chaining, so you can do a ^= b ^= a ^= b;
OJ
Strictly speaking, the state example is a trick of the preprocessor, and not the C language - it is possible to use the former without the latter.
Greg Whitfield
OJ: actually what you suggest is undefined behavior because of sequence point rules. It may work on most compilers, but is not correct or portable.
Evan Teran
Using an XOR swap is generally a bad idea, mostly because it works only for integers and aliasing can prove to be a big problem. I'm pretty sure that most compilers optimize it when memory becomes an issue.
Xor swap could actually be less efficient in the case of a free register. Any decent optimizer would make the temp variable be a register. Depending on implementation (and need for parallelism support) the swap might actually use real memory instead of a register (which would be the same).
Paul de Vrieze
please don't ever actually do this: http://en.wikipedia.org/wiki/Xor_swap#Reasons_for_avoidance_in_practice
Christian Oudard
Why would you do this? other than to raise a host of serious static analysis warnings. Use a switch case statement wrapped in a while loop.
Oliver
@OJ: No, this is undefined behavior.
AndreyT
XOR swapping fails if 'a' is alias to 'b'
Adrian Panasiuk
OJ: I used to do it with GCC... until I tried to compile a small example for MIPS and lost a lot of time debugging because it was compiled somehow differently. As others said here: it is undefined.
dbarbosa
+6  A: 

C compilers implement one of several standards. However, having a standard does not mean that all aspects of the language are defined. Duff's device, for example, is a favorite 'hidden' feature that has become so popular that modern compilers have special purpose recognition code to ensure that optimization techniques do not clobber the desired effect of this often used pattern.

In general hidden features or language tricks are discouraged as you are running on the razor edge of whichever C standard(s) your compiler uses. Many such tricks do not work from one compiler to another, and often these kinds of features will fail from one version of a compiler suite by a given manufacturer to another version.

Various tricks that have broken C code include:

  1. Relying on how the compiler lays out structs in memory.
  2. Assumptions on endianness of integers/floats.
  3. Assumptions on function ABIs.
  4. Assumptions on the direction that stack frames grow.
  5. Assumptions about order of execution within statements.
  6. Assumptions about order of execution of statements in function arguments.
  7. Assumptions on the bit size or precision of short, int, long, float and double types.

Other problems arise whenever programmers make assumptions about execution models that most C standards specify as 'compiler dependent' behavior.
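
One way to keep such assumptions honest is to state them as compile-time checks, so the build fails on a platform where they do not hold. The sketch below uses the old negative-array-size trick, which works in C89; STATIC_ASSERT, the tag names and low_byte are hypothetical, and the particular assumptions checked are only examples.

```c
#include <limits.h>

/* Fails to compile when cond is false, because the array size becomes -1.
   C11 offers _Static_assert for the same purpose. */
#define STATIC_ASSERT(cond, tag) typedef char static_assert_##tag[(cond) ? 1 : -1]

STATIC_ASSERT(CHAR_BIT == 8, char_is_8_bits);
STATIC_ASSERT(sizeof(int) >= 4, int_is_at_least_32_bits);
STATIC_ASSERT(sizeof(long) >= sizeof(int), long_not_narrower_than_int);

/* Code below this point may rely on the documented assumptions. */
unsigned low_byte(unsigned value)
{
    return value & 0xFFu;
}
```

Things like evaluation order cannot be checked this way; those simply have to be avoided.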

Kevin S.
To solve most of those, make those assumptions dependent on the characteristics of your platform, and describe each platform in its own header. Order of execution is an exception - never rely on that; for the other ideas, each platform needs a reliable decision.
Blaisorblade
@Blaisorblade, Even better, use compile-time assertions to document your assumptions in a way that will make the compile fail on a platform where they are violated.
RBerteig
I think one should combine both, so that your code works on multiple platforms (that was the original intention), and if the feature macros are set the wrong way, compile-time assertions will catch it. I'm not sure if, say, assumption on function ABIs are checkable as compile-time assertions, but it should be possible for most of the other (valid) ones (except order of execution ;-)).
Blaisorblade
+6  A: 

Strange vector indexing:

int v[100]; int index = 10; 
/* v[index] it's the same thing as index[v] */
Iulian Şerbănoiu
It's even better... char c = 2["Hello"]; (c == 'l' after this).
yrp
Not so strange when you consider that v[index] == *(v + index) and index[v] == *(index + v)
Ferruccio
Please tell me you don't actually use this "all the time", like the question asks!
Tryke
+17  A: 

Anonymous structures and arrays are my favourite. (cf. http://www.run.montefiore.ulg.ac.be/~martin/resources/kung-f00.html)

setsockopt(yourSocket, SOL_SOCKET, SO_REUSEADDR, (int[]){1}, sizeof(int));

or

void myFunction(type* values) {
    while(*values) x=*values++;
}
myFunction((type[]){val1,val2,val3,val4,0});

It can even be used to instantiate linked lists...
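
A linked list built this way could look like the following sketch (C99 or later). The struct node type and both function names are made up for illustration. Note that the literals have automatic storage, so the list must not outlive the enclosing block.

```c
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Sum a NULL-terminated list. */
int sum_list(const struct node *n)
{
    int total = 0;
    for (; n != NULL; n = n->next)
        total += n->value;
    return total;
}

/* Build a three-element list from nested literals and sum it. */
int sum_of_literal_list(void)
{
    struct node *head =
        &(struct node){ 1,
        &(struct node){ 2,
        &(struct node){ 3, NULL } } };

    return sum_list(head);
}
```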

sylvainulg
This feature is usually called "compound literals". Anonymous (or unnamed) structures designate nested structures that have no member names.
calandoa
according to my GCC, "ISO C90 forbids compound literals".
jmtd
+37  A: 

Function pointers. You can use a table of function pointers to implement, e.g., fast indirect-threaded code interpreters (FORTH) or byte-code dispatchers, or to simulate OO-like virtual methods.

Then there are hidden gems in the standard library, such as qsort(), bsearch(), strpbrk(), strcspn() [the latter two being useful for implementing a strtok() replacement].

A misfeature of C is that signed arithmetic overflow is undefined behavior (UB). So whenever you see an expression such as x+y, both being signed ints, it might potentially overflow and cause UB.
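
Since the standard library has no such check, the usual workaround is to test the operands before adding. A minimal sketch, with safe_add being a hypothetical name:

```c
#include <limits.h>
#include <stdbool.h>

/* Store a + b in *result and return true, or return false if the sum
   would not fit in an int. The comparisons themselves cannot overflow. */
bool safe_add(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return false;
    *result = a + b;
    return true;
}
```

The same pattern extends to subtraction and multiplication, though the conditions get longer.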

zvrba
But if they had specified behaviour on overflow, it would have made it very slow on architectures where that was not the normal behaviour. Very low runtime overhead has always been a design goal of C, and that has meant that a lot of things like this are undefined.
Mark Baker
I'm very well aware of _why_ overflow is UB. It is still a misfeature, because the standard should have at least provided library routines that can test for arithmetic overflow (of all basic operations) w/o causing UB.
zvrba
@zvrba, "library routines that can test for arithmetic overflow (of all basic operations)" if you had added this then you would have incurred significant performance hit for any integer arithmetic operations. ===== Case study Matlab specifically ADDS the feature of controlling integer overflow behavior to wrapping or saturate. And it also throws an exception whenever overflow occurs ==> Performance of Matlab integer operations: VERY SLOW. My own conclusion: I think Matlab is a compelling case study that shows why you don't want integer overflow checking.
Trevor Boyd Smith
@zvrba, In my opinion, C was designed ASSUMING that whenever you are doing integer arithmetic you the programmer are doing rigorous analysis to ENSURE that you have bounded-input-bounded-output (fancy way of saying "make sure your input and output stay within a range")!! If you are not doing that rigorous analysis then it's not the language's fault it is the programmer's fault.
Trevor Boyd Smith
I said that the standard should have provided *library* support for checking for arithmetic overflow. Now, how can a library routine incur a performance hit if you never use it?
zvrba
A big negative is that GCC does not have a flag to catch signed integer overflows and throw a runtime exception. While there are x86 flags for detecting such cases, GCC does not utilize them. Having such a flag would allow non-performance-critical (especially legacy) applications the benefit of security with minimal to no code review and refactoring.
Andrew Keeton
None of this is at all 'hidden'. The standard library for example is well advertised; if people choose not to read the documentation they are fools. Function pointers are merely 'advanced' not hidden in any way whatsoever - most texts on the language deal with them.
Clifford
Hidden is relative, documentation is relatively absolute.
Anonymous Type
+26  A: 

I never used bit fields but they sound cool for ultra-low-level stuff.

struct cat {
    unsigned int legs:3;  // 3 bits for legs (0-4 fit in 3 bits)
    unsigned int lives:4; // 4 bits for lives (0-9 fit in 4 bits)
    // ...
};

struct cat make_cat(void)
{
    struct cat kitty;
    kitty.legs = 4;
    kitty.lives = 9;
    return kitty;
}

This means that sizeof(struct cat) can be as small as sizeof(char).


Incorporated comments by Aaron and leppie, thanks guys.

Motti
The combination of structs and unions is even more interesting - on embedded systems or low level driver code. An example is when you like to parse the registers of an SD card, you can read it in using union (1) and read it out using union (2) which is a struct of bitfields.
ComSubVie
Bitfields are not portable -- the compiler can choose freely whether, in your example, legs will be allocated the most significant 3 bits, or the least significant 3 bits.
zvrba
Bitfields are an example of where the standard gives implementations so much freedom in how they're implemented, that in practice, they're nearly useless. If you care how many bits a value takes up, and how it's stored, you're better off using bitmasks.
Mark Bessey
Bitfields are indeed portable as long as you treat them as the structure elements they are, and not "pieces of integers." Size, not location, matters in an embedded system with limited memory, as each bit is precious ... but most of today's coders are too young to remember that. :-)
Adam Liss
Yay for bitfields! I use them all the time.
c0m4
@Adam: location may well matter in an embedded system (or elsewhere), if you are depending on the position of the bitfield within its byte. Using masks removes any ambiguity. Similarly for unions.
Steve Melnikoff
@ComSubVie: from this point of view, nothing is ever portable :)
AndreyT
Yes, I actually used this for an assignment I had in my CS class last year. We implemented the LZW algorithm to run under a specific memory constraint. The only option was to specify number of bits to use for each field in the struct because the default made each field too large.
dougvk
+27  A: 

Multi-character constants:

int x = 'ABCD';

This sets x to 0x41424344 (or 0x44434241 depending on architecture).

EDIT: This technique is not portable, especially if you serialize the int. However, it can be extremely useful to create self-documenting enums. e.g.

enum state {
    stopped = 'STOP',
    running = 'RUN!',
    waiting = 'WAIT',
};

This makes it much simpler if you're looking at a raw memory dump and need to determine the value of an enum without having to look it up.

Ferruccio
I'm pretty sure this is not a portable construct. The result of creating a multi-character constant is implementation-defined.
Mark Bessey
I dont like this.
Tim Matthews
remove the comma after 'WAIT' in case someone tries this.
blak3r
@blak3r - thanks! that reminds me of another hidden feature. see here: http://stackoverflow.com/questions/132241/hidden-features-of-c/980530#980530
Ferruccio
@blak3r - comma removed.
Matthew Murdoch
A problem is that it cannot be compared with real strings, since depending on endianness it can be stored in memory as "STOP" or "POTS".
calandoa
It is not intended to be compared to real strings. The point is that you can give enums unique values that are easy to read in a memory dump.
Ferruccio
The comma was intentional and syntactically valid.
Ferruccio
@Ferruccio - I believe the comma is valid in C99, but not C89. Which means GCC will probably swallow it, but some older compilers (or older GCC versions) won't.
Chris Lutz
The "not portable" comments miss the point entirely. It is like criticizing a program for using INT_MAX just because INT_MAX is "not portable" :) This feature is as portable as it needs to be. A multi-char constant is an extremely useful feature that provides a readable way of generating unique integer IDs.
AndreyT
Ferruccio
@Ferruccio: You must be thinking about the trailing comma in the aggregate initializer lists. As for the trailing comma in enum declarations - it's a recent addition, C99.
AndreyT
@AndreyT - You're right. I was.
Ferruccio
You forgot 'HANG' or 'BSOD' :-)
JBRWilkinson
Why not just use macros with defined numerical flags?
Vince
@Vince: Macros, even for constant, pollute the namespace of the program. On similar lines, Why not use assembly instead of C? ;)
Joe D
+3  A: 

I got shown this in a bit of code once, and asked what it did:


hexDigit = "0123456789abcdef"[someNybble];

Another favorite is:


unsigned char bar[100];
unsigned char *foo = bar;
unsigned char blah = 42[foo];
Andrew Edgecombe
First one's too easy. I think you meant someNybble["0123456789abcdef"]. Second one doesn't compile until you add a *.
Windows programmer
Thanks for the "*"
Andrew Edgecombe
I think the first one's right as-is: it converts the integer someNybble in the range 0-15 to its hex equivalent.
Adam Liss
Both forms are correct for the 1st, but the one shown is not really tricky, just unusual to see maybe but obvious to understand.
Blaisorblade
+2  A: 

Not really a hidden feature, but it looked to me like voodoo, the first time I saw something like this:


void callback(const char *msg, void *data)
{
    // do something with msg, e.g.
    printf("%s\n", msg);

    return;
    data = NULL;
}

The reason for this construction is that if you compile this with -Wextra and without the "data = NULL;" line, gcc will spit out a warning about unused parameters. But with this useless line you don't get a warning.

EDIT: I know there are other (better) ways to prevent those warnings. It just looked strange to me, the first time I saw this.
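
One of those other ways, in plain standard C, is to cast the parameter to void. A small sketch follows; the UNUSED macro name is made up, and the callback returns the printf() result here only so there is something to check.

```c
#include <stdio.h>

#define UNUSED(x) ((void)(x))   /* hypothetical helper macro */

/* Callback with a fixed signature that has no use for data. */
int callback(const char *msg, void *data)
{
    UNUSED(data);               /* evaluates data and discards the value */
    return printf("%s\n", msg);
}
```

Unlike the assignment after return, this leaves no unreachable code behind.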

quinmars
Don't you get a warning about unreachable code instead? Why not just comment out 'data' - that also removes the unused param warning.
Greg Whitfield
Nope, I didn't get that the last time I checked, but I don't actually use that trick; I prefer the unused attribute. Removing data isn't always possible when you are stuck with the signature, for example because you are writing a callback.
quinmars
No, the signature does not change. You just do this: void callback(const char *msg, void * /* data*/ ) Or this: void callback(const char *msg, void *)
Greg Whitfield
With gcc you could add an unused attribute to parameters: void callback(const char *msg, void *data __attribute__((unused)))
DGentry
As opposed to using the non-portable __attribute__ syntax, you can just put: (void)data; in the function. I usually put it directly after any locals (as they must be first in C89). I also tend to just make a macro like this: #define UNUSED(x) (void)x so I can just write: UNUSED(data).
Evan Teran
You can use '(void)data' anywhere until 'return'. (seems Evan already said that)
akauppi
@Greg Whitfield: void callback(const char *msg, void *) {...} doesn't compile here. @all: I know that there are many ways to suppress this kind of warning.
quinmars
@quinmars - What compiler are you using? What I told you is simply standard C. What error do you get?
Greg Whitfield
I'm using gcc, I retried it now with g++ and there indeed it works. So it seems to be a C++ feature.
quinmars
Can't you also just use #unused data ?
Brian Postow
+85  A: 

More of a trick of the GCC compiler, but you can give branch indication hints to the compiler (common in the Linux kernel)

#define likely(x)       __builtin_expect((x),1)
#define unlikely(x)     __builtin_expect((x),0)

see: http://kerneltrap.org/node/4705

What I like about this is that it also adds some expressiveness to some functions.

void foo(int arg)
{
     if (unlikely(arg == 0)) {
           do_this();
           return;
     }
     do_that();
     ...
}
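
If the code also has to build with other compilers, the macros can be wrapped so they fall back to the plain condition. A sketch of that, where length_or_zero is a made-up example function; the !!(x) form, as used in the Linux kernel, normalizes any non-zero value to 1.

```c
#include <stddef.h>
#include <string.h>

#if defined(__GNUC__)
#  define likely(x)   __builtin_expect(!!(x), 1)
#  define unlikely(x) __builtin_expect(!!(x), 0)
#else
#  define likely(x)   (x)      /* no hint available: plain condition */
#  define unlikely(x) (x)
#endif

/* The NULL case is marked as the rare path. */
size_t length_or_zero(const char *s)
{
    if (unlikely(s == NULL))
        return 0;
    return strlen(s);
}
```

The hint only affects code layout and branch prediction, never the result.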
tonylo
This trick is cool... :) Especially with the macros you define. :)
sundar
+19  A: 

I'm very fond of designated initializers, added in C99 (and supported in gcc for a long time):

#define FOO 16
#define BAR 3

myStructType_t myStuff[] = {
    [FOO] = { foo1, foo2, foo3 },
    [BAR] = { bar1, bar2, bar3 },
    ...

The array initialization is no longer position dependent. If you change the values of FOO or BAR, the array initialization will automatically correspond to their new value.
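
A complete example in standard C99 syntax might look like this. The enum, the struct and all names are made up for illustration.

```c
enum color { RED, GREEN, BLUE, COLOR_COUNT };

/* Entries are keyed by index, so their order in the source does not
   matter; any entry left out would be zero-initialized (NULL here). */
static const char *const color_names[COLOR_COUNT] = {
    [BLUE]  = "blue",
    [RED]   = "red",
    [GREEN] = "green",
};

struct point { int x, y, z; };

/* Struct members can be named the same way; z is implicitly 0. */
static const struct point sample_point = { .y = 2, .x = 1 };

const char *color_name(enum color c)
{
    return color_names[c];
}
```

Reordering the enum later does not break the table, which is the main attraction.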

DGentry
The syntax gcc has supported for a long time is not the same as the standard C99 syntax.
Mark Baker
+5  A: 

Variable size automatic variables are also useful in some cases. These were added in C99 and have been supported in gcc for a long time.

void foo(uint32_t extraPadding) {
    uint8_t commBuffer[sizeof(myProtocol_t) + extraPadding];

You end up with a buffer on the stack with room for the fixed-size protocol header plus variable size data. You can get the same effect with alloca(), but this syntax is more compact.

You have to make sure extraPadding is a reasonable value before calling this routine, or you end up blowing the stack. You'd have to sanity check the arguments before calling malloc or any other memory allocation technique, so this isn't really unusual.
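
Here is a self-contained sketch of the same idea with the sanity check done inside the function. MAX_NAME and is_palindrome are hypothetical, and the bound of 256 is arbitrary. Note that C11 made variable-length arrays an optional feature, so not every compiler accepts this.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAME 256   /* arbitrary upper bound that keeps stack use small */

/* Return 1 if name reads the same forwards and backwards, 0 if not,
   and -1 if it is too long to copy onto the stack. */
int is_palindrome(const char *name)
{
    size_t len = strlen(name);
    if (len > MAX_NAME)
        return -1;

    char reversed[len + 1];          /* variable-length array, sized at run time */
    for (size_t i = 0; i < len; i++)
        reversed[i] = name[len - 1 - i];
    reversed[len] = '\0';

    return strcmp(name, reversed) == 0;
}
```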

DGentry
Will this also work correctly if a byte/char is not exactly 8 bits wide on the target platform? I know, those cases are rare, but still... :)
Stephan202
+59  A: 
int8_t
int16_t
int32_t
uint8_t
uint16_t
uint32_t

These are an optional item in the standard, but it must be a hidden feature, because people are constantly redefining them. One code base I've worked on (and still do, for now) has multiple redefinitions, all with different identifiers. Most of the time it's with preprocessor macros:

#define INT16 short
#define INT32  long

And so on. It makes me want to pull my hair out. Just use the freaking standard integer typedefs!

Ben Collins
I think they are C99 or so. I haven't found a portable way to ensure these would be around.
akauppi
They are an optional part of C99, but I know of no compiler vendors that don't implement this.
Ben Collins
stdint.h isn't optional in C99, but following the C99 standard apparently is for some vendors (*cough* Microsoft).
Ben Combee
Microsoft Visual C++ doesn't follow the Ada95 standard either. It's not a C99 compiler. It's a C++ 97 compiler. (It doesn't always follow that standard, but it's not fair to complain about it not being something it doesn't claim to be)
Pete Kirkham
@Pete, if you want to be anal: (1) This thread has nothing to do with any Microsoft product. (2) This thread never had anything to do with C++ at all. (3) There is no such thing as C++ 97.
Ben Collins
Have a look at http://www.azillionmonkeys.com/qed/pstdint.h -- a close-to-portable stdint.h
gnud
@gnud: thanks for the tip, but my whole gripe is that it isn't necessary - most compilers implement the standard typedefs. The only compiler I've ever used that didn't was an old version of GCC adapted for embedded VxWorks development (old, like, GCC 2.7).
Ben Collins
@Ben Collins: He's pointing out that it almost implements C++98, but falls short of several requirements. Furthermore, MSVC _doesn't_ support C99, especially stdint.h which is a royal PITA.
Matt Joiner
@Anacrolix: Yes. I understood what he was pointing out. You seem to miss my point though: it's apropos nothing. Whether or not a particular compiler "supports C99" really has nothing at all to do with whether or not you should use the standard integer typedefs. They are portable and easy to define even if your compiler sucks. If you need to specify a certain integer width, then the standard typedefs should *always*, **always** be used, regardless of where the definitions come from.
Ben Collins
Thanks so much! It gets my back up when I use the Windows headers and you get typedef unsigned long ULONG. Like, seriously? Or, typedef float FLOAT.
DeadMG
+44  A: 

The comma operator isn't widely used. It can certainly be abused, but it can also be very useful. This use is the most common one:

for (int i=0; i<10; i++, doSomethingElse())
{
  /* whatever */
}

But you can use this operator anywhere. Observe:

int j = (printf("Assigning variable j\n"), getValueFromSomewhere());

Each operand is evaluated, left to right, but the value of the whole expression is that of the last operand.

Ben Collins
In 20 years of C I have NEVER seen that!
Martin Beckett
In C++ you can even overload it.
Wouter Lievens
can != should, of course. The danger with overloading it is that the built in applies to everything already, including void, so will never fail to compile for lack of available overload. Ie, gives programmer much rope.
Aaron
+2  A: 

Conversion of types by using unusual typecasts. Though not a hidden feature, it's quite tricky.

Example:

If you need to know how the compiler stores a float, just try this:

uint32_t Int;
float flt = 10.5; // say

Int = *(uint32_t *)&flt;

printf ("Float 10.5 is stored internally as %8X\n", Int);

or

float flt = 10.5; // say

printf ("Float 10.5 is stored internally as %8X\n", *(uint32_t *)&flt);

Note the clever use of typecasts: converting the address of the variable (here &flt) to the desired pointer type (here (uint32_t *)) and extracting its content (applying '*').

This also works on the left-hand side of an assignment:

*(float *)&Int = flt;

This could also be accomplished using a union:

typedef union
{
  uint32_t Int;
  float    flt;

} FloatInt_type;
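
Because the pointer cast runs into the aliasing rules, a byte copy with memcpy() is a commonly used way to get at the same bits. The sketch below assumes that float and uint32_t have the same size; float_bits is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Return the object representation of f as a 32-bit integer.
   Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* byte copy, no pointer aliasing involved */
    return bits;
}
```

On a platform with IEEE-754 single-precision floats, float_bits(10.5f) yields 0x41280000. Compilers generally turn the memcpy() into a plain register move.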
yogeesh
This falls under "common usage that I would recommend against". Type aliasing and optimizations don't get along. Use unions instead for clarity, both for the reader and the compiler.
ephemient
To be exact, "don't get along" means "this code might be actually miscompiled", because it's undefined behaviour in C.
Blaisorblade
@ephemient: Using multiple members of a union at the same time, e.g. assigning to one right before reading from the other, is undefined behaviour.
Joe D
+4  A: 

I liked the variable sized structures you could make:

typedef struct {
    unsigned int size;
    char buffer[1];
} tSizedBuffer;

tSizedBuffer *buff = (tSizedBuffer*)(malloc(sizeof(tSizedBuffer) + 99));

// can now refer to buff->buffer[0..99].

Also the offsetof macro which is now in ANSI C but was a piece of wizardry the first time I saw it. It basically uses the address-of operator (&) for a null pointer recast as a structure variable.
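
Since C99 the same layout can be written with a flexible array member, and offsetof comes from <stddef.h>. A sketch, with the type and function names made up for illustration:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Flexible array member: buffer has no declared length and uses
   whatever space is allocated past the header. */
typedef struct {
    size_t size;
    char   buffer[];
} sized_buffer;

/* Allocate a sized_buffer holding a copy of text; the caller frees it. */
sized_buffer *sized_buffer_from(const char *text)
{
    size_t len = strlen(text) + 1;
    sized_buffer *b = malloc(sizeof(sized_buffer) + len);
    if (b == NULL)
        return NULL;
    b->size = len;
    memcpy(b->buffer, text, len);
    return b;
}

/* Recover the enclosing struct from a pointer to its buffer member. */
sized_buffer *sized_buffer_of(char *buffer)
{
    return (sized_buffer *)(buffer - offsetof(sized_buffer, buffer));
}
```

The second function shows the typical use of offsetof: stepping back from a member to the struct that contains it.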

paxdiablo
+5  A: 

My favorite "hidden" feature of C is the use of %n in printf to write back to memory. Normally printf only reads its arguments as directed by the format string, but %n stores the number of characters written so far through the corresponding int pointer argument.

Check out section 3.4.2 here. Can lead to a lot of nasty vulnerabilities.

Sridhar Iyer
+21  A: 

gcc has a number of extensions to the C language that I enjoy, which can be found here. Some of my favorites are function attributes. One extremely useful example is the format attribute. This can be used if you define a custom function that takes a printf format string. If you enable this function attribute, gcc will do checks on your arguments to ensure that your format string and arguments match up and will generate warnings or errors as appropriate.

int my_printf (void *my_object, const char *my_format, ...)
            __attribute__ ((format (printf, 2, 3)));
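
Here is a complete sketch of such a function, with the attribute hidden behind a macro so that other compilers still accept the code. PRINTF_LIKE and log_message are hypothetical names.

```c
#include <stdarg.h>
#include <stdio.h>

#if defined(__GNUC__)
#  define PRINTF_LIKE(fmt, args) __attribute__((format(printf, fmt, args)))
#else
#  define PRINTF_LIKE(fmt, args)   /* other compilers: no checking */
#endif

/* Argument 2 is the format string; the variadic arguments start at 3. */
int log_message(const char *tag, const char *fmt, ...) PRINTF_LIKE(2, 3);

/* Print "[tag] " followed by the formatted message; returns the
   number of characters written. */
int log_message(const char *tag, const char *fmt, ...)
{
    va_list ap;
    int n = printf("[%s] ", tag);

    va_start(ap, fmt);
    n += vprintf(fmt, ap);
    va_end(ap);
    return n;
}
```

With -Wall, a call such as log_message("net", "%s bytes\n", 42) then draws a format warning, just as it would for printf itself.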
Russell Bryant
A: 

register variables

I used to declare some variables with the register keyword to help speed things up. This would give a hint to the C compiler to use a CPU register as local storage. This is most likely no longer necessary as modern day C compilers do this automatically.

Mark Stock
More to the point the C compiler knows better than you which variables would benefit most from being in a register. Most modern compilers are smart enough to entirely ignore the register keyword, but if they actually paid attention to it it would probably make your code slower
Mark Baker
I am pretty sure some compilers refuse to let you take the address of a variable declared with register. So that is useful, in order to keep your intentions clear.
Zan Lynx
+48  A: 

initializing structure to zero

struct mystruct a = {0};

This will zero all structure elements.
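
A quick way to convince yourself is a check like the sketch below. The struct and function names are made up; the members mix integer, floating-point, pointer and array types.

```c
#include <stddef.h>

struct settings {
    int         retries;
    double      timeout;
    const char *name;
    int         history[4];
};

/* Return 1 if a {0}-initialized struct has every member at its zero value. */
int zero_init_works(void)
{
    struct settings s = {0};

    return s.retries == 0
        && s.timeout == 0.0
        && s.name == NULL
        && s.history[0] == 0 && s.history[3] == 0;
}
```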

mike511
Does it zero out an entire array, as well?
drhorrible
It doesn't zero the padding, if any, however.
Mikeage
@drhorrible: You can do int array[50] = {0}; for example. But only at declaration (I think.)
Skurmedel
Doesn't this do something undefined if the structure contains non-integral types (e.g. floats and doubles)?
Simon Nickerson
@simonn, no it doesn't do undefined behavior if the structure contains non-integral types. memset with 0 on the memory of a float/double will still be zero when you interpret the float/double (float/double are designed like that on purpose).
Trevor Boyd Smith
@Trevor I thought that the effect this has on floats and such is "all bytes zero" which, in all sane cases, would give you a float equal to 0.0, but it's still implementation-defined.
Andrew Keeton
@Andrew: `memset`/`calloc` do "all bytes zero" (i.e. physical zeroes), which is indeed not defined for all types. `{ 0 }` is guaranteed to initialize *everything* with proper *logical* zero values. Pointers, for example, are guaranteed to get their proper null values, even if the null-value on the given platform is `0xBAADF00D`.
AndreyT
@AndreyT: Could you please elaborate on diff between logical and physical zeroes?
N 1.1
@nvl: You get *physical* zero when you just forcefully set all memory occupied by the object to all-bits-zero state. This is what `memset` does (with `0` as second argument). You get *logical* zero when you initialize/assign `0` ( or `{ 0 }`) to the object in the source code. These two kinds of zeros do not necessarily produce the same result. As in the example with pointer. When you do `memset` on a pointer, you get a `0x0000` pointer. But when you assign `0` to a pointer, you get *null pointer value*, which at the physical level might be `0xBAADF00D` or anything else.
AndreyT
In other words, *physical* value is the explicit bit-pattern that is stored in memory. *Logical* value is how that bit-pattern is interpreted at the program level.
AndreyT
@AndreyT I am clear about the physical zero thing. But cannot digest this logical zeroes completely. Any other data structure, other than pointer, which has different logical/physical representations?
N 1.1
@nvl: Well, in practice the difference is often only conceptual. But in theory, virtually any type can have it. For example, `double`. Usually it is implemented in accordance with IEEE-754 standard, in which the logical zero and physical zero are the same. But IEEE-754 is not required by the language. So it might happen that when you do `double d = 0;` (logical zero), physically some bits in memory occupied by `d` will not be zero.
AndreyT
Same with `bool` values, for another example. If you do `bool b = false;` (or, equivalently, `bool b = 0;`) it does not necessarily mean that in physical memory `b` will be zeroed out (even though it is usually the case in practice).
AndreyT
Got it. Thanks. I remember the floating point representation now. Makes sense.
N 1.1
@AndreyT : Excellent comments. +1
paercebal
+10  A: 

Struct assignment is cool. Many people don't seem to realize that structs are values too and can be assigned around; there is no need to use memcpy() when a simple assignment does the trick.

For example, consider some imaginary 2D graphics library, it might define a type to represent an (integer) screen coordinate:

typedef struct {
   int x;
   int y;
} Point;

Now you can do things that might look "wrong", like writing a function that creates a point initialized from its arguments and returns it by value, like so:

Point point_new(int x, int y)
{
  Point p;
  p.x = x;
  p.y = y;
  return p;
}

This is safe, as long (of course) as the return value is copied by value using struct assignment:

Point origin;
origin = point_new(0, 0);

In this way you can write quite clean and object-oriented-ish code, all in plain standard C.
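A small sketch of the same idea, showing that assignment and by-value parameters both produce independent copies. `point_translate` and `point_demo` are hypothetical helpers added for illustration, not part of any real library.

```c
#include <assert.h>

typedef struct {
    int x;
    int y;
} Point;

/* Builds a Point and returns it by value. */
static Point point_new(int x, int y)
{
    Point p;
    p.x = x;
    p.y = y;
    return p;
}

/* Hypothetical helper: receives a copy, modifies it, returns the copy. */
static Point point_translate(Point p, int dx, int dy)
{
    p.x += dx;   /* changes the local copy only */
    p.y += dy;
    return p;
}

static void point_demo(void)
{
    Point a = point_new(1, 2);
    Point b = a;          /* plain assignment copies both members */
    b.x = 10;
    assert(a.x == 1);     /* the original is untouched */
}
```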

unwind
Of course, there are performance implications to passing round large structs in this way; it's often useful (and is indeed something a lot of people don't realise you can do) but you need to consider whether passing pointers is better.
Mark Baker
Of course, there *might* be. It's also quite possible for the compiler to detect the usage and optimize it.
unwind
Be careful if any of the elements are pointers, as you'll be copying the pointers themselves, not their contents. Of course, the same is true if you use memcpy().
Adam Liss
The compiler can't optimize this by converting by-value passing into by-reference passing, unless it can do global optimizations.
Blaisorblade
It's probably worth noting that in C++ the standard specifically allows optimizing away the copy (the standard has to allow for it for compilers to implement it because it means the copy constructor which may have side effects may not be called), and since most C++ compilers are also C compilers, there's a good chance your compiler does do this optimization.
Joseph Garvin
I fixed the final "constructor" call, it was calling Point(0, 0) which of course is wrong, point_new() is the way to go. Oops.
unwind
+15  A: 

Well, I've never used it, and I'm not sure whether I'd ever recommend it to anyone, but I feel this question would be incomplete without a mention of Simon Tatham's co-routine trick.

Mark Baker
+14  A: 

Compile-time assertions, as already discussed here.

//--- size of static_assertion array is negative if condition is not met
#define STATIC_ASSERT(condition) \
    typedef struct { \
        char static_assertion[(condition) ? 1 : -1]; \
    } static_assertion_t

//--- ensure structure fits in 
STATIC_ASSERT(sizeof(mystruct_t) <= 4096);
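A possible variant, sketched under the assumption that you want several assertions in one scope: pasting `__LINE__` into the typedef name keeps the names distinct (as long as no two assertions share a line). The `SA_CONCAT` helpers and the 64-byte `mystruct_t` are invented for the example. Where C11 is available, `_Static_assert` does the same job with a readable message.

```c
#include <assert.h>

/* Two-step paste so that __LINE__ is expanded before ## is applied. */
#define SA_CONCAT_(a, b) a##b
#define SA_CONCAT(a, b)  SA_CONCAT_(a, b)

/* Array size is -1 (a compile error) when the condition is false. */
#define STATIC_ASSERT(condition) \
    typedef char SA_CONCAT(static_assertion_at_line_, __LINE__)[(condition) ? 1 : -1]

typedef struct {
    char payload[64];   /* hypothetical structure for the example */
} mystruct_t;

STATIC_ASSERT(sizeof(mystruct_t) <= 4096);
STATIC_ASSERT(sizeof(int) >= 2);

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
/* C11 and later: built into the language. */
_Static_assert(sizeof(mystruct_t) <= 4096, "mystruct_t must fit in 4096 bytes");
#endif
```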
philippe
+3  A: 

C99-style variable argument macros, aka

#define ERR(name, fmt, ...)   fprintf(stderr, "ERROR " #name ": " fmt "\n", \
                                  __VA_ARGS__)

which would be used like

ERR(errCantOpen, "File %s cannot be opened", filename);

Here I also use the stringize operator and string constant concatenation, other features I really like.
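To see the pieces at work without capturing stderr, here is a sketch of a variant that formats into a buffer instead. `FORMAT_ERR` and `format_err_demo` are hypothetical names; the stringizing (`#name`) and the literal concatenation work exactly as in the macro above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of ERR: writes into a caller-supplied buffer.
   #name turns the first macro argument into a string literal, and the
   adjacent literals are merged into one format string by the compiler. */
#define FORMAT_ERR(buf, size, name, fmt, ...) \
    snprintf((buf), (size), "ERROR " #name ": " fmt "\n", __VA_ARGS__)

static void format_err_demo(void)
{
    char buf[128];
    FORMAT_ERR(buf, sizeof buf, errCantOpen, "File %s cannot be opened", "data.txt");
    assert(strcmp(buf, "ERROR errCantOpen: File data.txt cannot be opened\n") == 0);
}
```

Note that with standard `__VA_ARGS__` at least one argument has to follow the format string.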

Ben Combee
You have an extra 'R' in __VA_ARGS__.
Blaisorblade
+1  A: 

Excerpt:

In this page, you will find a list of interesting C programming questions/puzzles, These programs listed are the ones which I have received as e-mail forwards from my friends, a few I read in some books, a few from the internet, and a few from my coding experiences in C.

http://www.gowrikumar.com/c/index.html

Comptrol
+9  A: 

When initializing arrays or enums, you can put a comma after the last item in the initializer list. e.g:

int x[] = { 1, 2, 3, };

enum foo { bar, baz, boom, };

This was done so that if you're generating code automatically you don't need to worry about eliminating the last comma.

Ferruccio
This is also important in a multi-developer environment where, for instance, Eric adds in "baz," and then George adds in "boom,". If Eric decides to pull his code out for the next project build, it still compiles with George's change. Very important for multi-branch source code control and overlapping development schedules.
Harold Bamford
I think this is C99 only.
Mikeage
Ferruccio
+6  A: 

Gcc (c) has some fun features you can enable, such as nested function declarations, and the a?:b form of the ?: operator, which returns a if a is not false.

-Alex

Alex Brown
+3  A: 

Say you have a struct with members of the same type:

struct Point {
    float x;
    float y;
    float z;
};

You can cast instances of it to a float pointer and use array indices:

struct Point a = { 1.0f, 2.0f, 3.0f };
float sum = 0;
int i;
for (i = 0; i < 3; i++)
    sum += ((float*)&a)[i];

Pretty elementary, but useful when writing concise code.

aeflash
Are you sure this is portable? I thought that the C standards made no guarantee about structure alignment besides the first element being at offset 0. There might be gaps between the elements. I.e. sizeof(Point) is not guaranteed to be sizeof(float)*3.
jmtd
@jmtd, Right. In practice it's exactly "portable" enough to get you in trouble. The offset of any member other than the first is implementation-defined, and the struct need not have the same effective packing as an array of that type. In practice, it is likely it does have the same packing as an array, so this code will work until it is ported to the *next* platform where it will fail mysteriously. A similar thing happened to a common implementation of MD5 when ported to 64-bit: it compiled and ran, but got a different answer.
RBerteig
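For what it's worth, here is a sketch of a loop that does not depend on how the members are packed. It looks each member up through `offsetof`, so any padding is skipped automatically. `point_sum` is a hypothetical helper written for this example.

```c
#include <assert.h>
#include <stddef.h>

struct Point {
    float x;
    float y;
    float z;
};

/* Sums the members by byte offset, so padding between them cannot matter. */
static float point_sum(const struct Point *p)
{
    static const size_t offsets[] = {
        offsetof(struct Point, x),
        offsetof(struct Point, y),
        offsetof(struct Point, z)
    };
    const unsigned char *base = (const unsigned char *)p;
    float sum = 0.0f;
    size_t i;

    assert(p != NULL);
    for (i = 0; i < sizeof offsets / sizeof offsets[0]; i++)
        sum += *(const float *)(base + offsets[i]);   /* each address holds a real float member */
    return sum;
}
```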
+12  A: 
A: 

Variable-sized structs, seen in common resolver libs among other places.

struct foo
{
  int a;
  int b;
  char c[1]; // using [0] is no longer correct
             // must come at end
};

char *str = "abcdef";
size_t len = strlen(str);
struct foo *bar = malloc(sizeof(struct foo) + len);

strcpy(bar->c, str); // try and stop me!
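Since C99 the same pattern can be written with a flexible array member (`char text[];`), which makes the intent explicit. A minimal sketch, with `struct message` and `message_new` as made-up names:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct message {
    size_t len;
    char   text[];   /* C99 flexible array member; must be the last member */
};

/* Allocates the header and the string in one block.
   Returns NULL on allocation failure; release the result with free(). */
static struct message *message_new(const char *str)
{
    size_t len;
    struct message *m;

    assert(str != NULL);
    len = strlen(str);
    m = malloc(sizeof *m + len + 1);   /* +1 for the terminating '\0' */
    if (m == NULL)
        return NULL;
    m->len = len;
    memcpy(m->text, str, len + 1);
    return m;
}
```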
+1  A: 

Here's three nice ones in gcc:

__FILE__ 
__FUNCTION__
__LINE__
__FILE__ and __LINE__ are standard ; C99 brings __func__
philippe
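A short sketch of how the standard forms combine. `WHERE` and `where_demo` are hypothetical names; `__func__` needs C99 or later.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical macro: records the current file, line and function.
   It has to be a macro so that the three names expand at the call site. */
#define WHERE(buf, size) \
    snprintf((buf), (size), "%s:%d in %s", __FILE__, __LINE__, __func__)

static void where_demo(void)
{
    char buf[256];
    WHERE(buf, sizeof buf);
    assert(strstr(buf, "where_demo") != NULL);   /* __func__ names this function */
}
```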
A: 

Wrap malloc and realloc like this:

#ifdef _DEBUG
#define mmalloc(bytes)                  malloc(bytes);printf("malloc: %d\t<%s@%d>\n", bytes, __FILE__, __LINE__);
#define mrealloc(pointer, bytes)        realloc(pointer, bytes);printf("realloc: %d\t<%s@%d>\n", bytes, __FILE__, __LINE__);
#else //_DEBUG
#define mmalloc(bytes)                  malloc(bytes)
#define mrealloc(pointer, bytes)        realloc(pointer, bytes)
#endif //_DEBUG

In fact, here is my full arsenal (the BailIfNot is for OO C):

#ifdef _DEBUG
#define mmalloc(bytes)                  malloc(bytes);printf("malloc: %d\t<%s@%d>\n", bytes, __FILE__, __LINE__);
#define mrealloc(pointer, bytes)        realloc(pointer, bytes);printf("realloc: %d\t<%s@%d>\n", bytes, __FILE__, __LINE__);
#define BAILIFNOT(Node, Check)  if(Node->type != Check) return 0;
#define NULLCHECK(var)          if(var == NULL) setError(__FILE__, __LINE__, "Null exception", " var ", FATAL);
#define ASSERT(n)               if( ! ( n ) ) { printf("<ASSERT FAILURE@%s:%d>", __FILE__, __LINE__); fflush(0); __asm("int $0x3"); }
#define TRACE(n)                printf("trace: %s <%s@%d>\n", n, __FILE__, __LINE__);fflush(0);
#else //_DEBUG
#define mmalloc(bytes)                  malloc(bytes)
#define mrealloc(pointer, bytes)        realloc(pointer, bytes)
#define BAILIFNOT(Node, Check)  {}
#define NULLCHECK(var)          {}
#define ASSERT(n)               {}
#define TRACE(n)                {}
#endif //_DEBUG

Here is some example output:

malloc: 12      <hash.c@298>
trace: nodeCreate <hash.c@302>
malloc: 5       <hash.c@308>
malloc: 16      <hash.c@316>
malloc: 256     <hash.c@320>
trace: dataLoadHead <hash.c@441>
malloc: 270     <hash.c@463>
malloc: 262144  <hash.c@467>
trace: dataLoadRecursive <hash.c@404>
please, don't do it like that... for example, this otherwise correct code `if (something) mmalloc(); else otherthing;` won't compile if _DEBUG is defined.
fortran
you want a comma on the malloc macros, not a semicolon (for the reasons @fortran described). That does ignore the return value, though (but then again I'm not sure why these macros are desirable).
meeselet
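One way around both problems raised in these comments is to move the logging into a small function and keep the macro as a single expression. This is only a sketch: `debug_malloc` is a hypothetical helper, and it logs to stderr rather than stdout.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: logs the request and passes malloc's result through. */
static void *debug_malloc(size_t bytes, const char *file, int line)
{
    void *p;

    assert(file != NULL);
    p = malloc(bytes);
    fprintf(stderr, "malloc: %lu\t<%s@%d>\n", (unsigned long)bytes, file, line);
    return p;
}

#ifdef _DEBUG
#define mmalloc(bytes) debug_malloc((bytes), __FILE__, __LINE__)
#else
#define mmalloc(bytes) malloc(bytes)
#endif
```

Because `mmalloc(n)` now expands to one expression in both configurations, it keeps its return value and is safe inside an unbraced `if`/`else`.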
A: 

I just read this article. It has some C and several other languages "hidden features".

Rigo Vides
Oh my! they're all stackoverflow contributions, sorry (I'm kinda new here and I didn't notice that there's a hidden features section)... Anyway, it may work as a reference and quick guide to these topics.
Rigo Vides
A: 

Object oriented C macros: You need a constructor (init), a destructor (dispose), an equal (equal), a copier (copy), and some prototype for instantiation (prototype).

With the declaration, you need to declare a constant prototype to copy and derive from. Then you can do C_OO_NEW. I can post more examples if needed. LibPurple is a large object oriented C code base with a callback system (if you want to see one in use)

#define C_copy(to, from) to->copy(to, from)

#define true 1
#define false 0
#define C_OO_PROTOTYPE(type)\
void type##_init (struct type##_struct *my);\
void type##_dispose (struct type##_struct *my);\
char type##_equal (struct type##_struct *my, struct type##_struct *yours); \
struct type##_struct * type##_copy (struct type##_struct *my, struct type##_struct *from); \
const type type##__prototype = {type##_init, type##_dispose, type##_equal, type##_copy}

#define C_OO_OVERHEAD(type)\
        void (*init) (struct type##_struct *my);\
        void (*dispose) (struct type##_struct *my);\
        char (*equal) (struct type##_struct *my, struct type##_struct *yours); \
        struct type##_struct *(*copy) (struct type##_struct *my, struct type##_struct *from); 

#define C_OO_IN(ret, type, function, ...)       ret (* function ) (struct type##_struct *my, __VA_ARGS__);
#define C_OO_OUT(ret, type, function, ...)      ret type##_##function (struct type##_struct *my, __VA_ARGS__);

#define C_OO_PNEW(type, instance)\
        instance = ( type *) malloc(sizeof( type ));\
        memcpy(instance, & type##__prototype, sizeof( type ));

#define C_OO_NEW(type, instance)\
        type instance;\
        memcpy(&instance, & type ## __prototype, sizeof(type));

#define C_OO_DELETE(instance)\
        instance->dispose(instance);\
        free(instance);

#define C_OO_INIT(type)         void type##_init (struct type##_struct *my){return;}
#define C_OO_DISPOSE(type)      void type##_dispose (struct type##_struct *my){return;}
#define C_OO_EQUAL(type)        char type##_equal (struct type##_struct *my, struct type##_struct *yours){return 0;}
#define C_OO_COPY(type)         struct type##_struct * type##_copy (struct type##_struct *my, struct type##_struct *from){return 0;}
+1  A: 

I like the typeof() operator. It works like sizeof() in that it is resolved at compile time. Instead of returning the number of bytes, it returns the type. This is useful when you need to declare a variable to be the same type as some other variable, whatever type it may be.

typeof(foo) copy_of_foo; //declare copy_of_foo to be a variable of the same type as foo
copy_of_foo = foo; //now copy_of_foo has a backup of foo, for any type

This might be just a gcc extension, I'm not sure.
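A typical use is a type-generic swap. The sketch below assumes GCC or Clang and uses the `__typeof__` spelling, which those compilers accept even in strict ISO modes. `SWAP` and `swap_demo` are names invented for the example.

```c
#include <assert.h>

/* Type-generic swap. The temporary gets whatever type 'a' has.
   Arguments are evaluated more than once, so avoid side effects. */
#define SWAP(a, b)                      \
    do {                                \
        __typeof__(a) swap_tmp_ = (a);  \
        (a) = (b);                      \
        (b) = swap_tmp_;                \
    } while (0)

static void swap_demo(void)
{
    double x = 1.5, y = 2.5;
    SWAP(x, y);
    assert(x == 2.5 && y == 1.5);
}
```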

Eyal
in the same family there is also an offsetof(), well it's a macro but it's nice anyway.
kriss
And if you want: `#define countof(array) (sizeof (array) / sizeof (array[0]))` ;)
Joe D
+9  A: 

the (hidden) feature that "shocked" me when I first saw it is about printf. It lets you supply parts of a format specifier, such as the precision, from variables at run time. look at the code and you will see it better:

#include <stdio.h>

int main() {
    int a = 3;
    float b = 6.412355;
    printf("%.*f\n",a,b);
    return 0;
}

the * character achieves this effect.
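The same `*` also works for the field width, and both can be combined. A small sketch that formats into a buffer so the result can be checked (`format_number` is a made-up helper):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats value with a field width and precision chosen at run time.
   Each '*' consumes one int argument, in order, before the value. */
static int format_number(char *buf, size_t size, int width, int precision, double value)
{
    return snprintf(buf, size, "%*.*f", width, precision, value);
}

static void format_number_demo(void)
{
    char buf[32];
    format_number(buf, sizeof buf, 8, 3, 6.412355);
    assert(strcmp(buf, "   6.412") == 0);   /* padded on the left to 8 columns */
}
```

A negative width argument is treated as the `-` flag, so the output is left-justified instead.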

kolistivra
+2  A: 

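For clearing the input buffer you can't use fflush(stdin): the C standard leaves its behavior undefined. One way that works is scanf("%*[^\n]%*c"), which discards the rest of the current input line together with its newline. Note that if the very next character is already a newline, %*[^\n] fails to match and the newline is left in the buffer.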

Good to know. Reference?
profjim
A: 
+2  A: 

I only discovered this after 15+ years of C programming:

struct SomeStruct
{
   unsigned a : 5;
   unsigned b : 1;
   unsigned c : 7;
};

Bitfields! The number after the colon is the number of bits the member requires, with members packed into the specified type. The exact bit order is implementation-defined, but the above could look like the following if unsigned is 16 bits:

xxxc cccc ccba aaaa
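A quick sketch of how the fields behave when written and read. `some_struct_make` and `bitfield_demo` are hypothetical helpers. The values stored do not depend on the bit order, only on the declared widths.

```c
#include <assert.h>

struct SomeStruct {
    unsigned a : 5;   /* holds 0..31 */
    unsigned b : 1;   /* holds 0..1 */
    unsigned c : 7;   /* holds 0..127 */
};

static struct SomeStruct some_struct_make(unsigned a, unsigned b, unsigned c)
{
    struct SomeStruct s;
    s.a = a;   /* unsigned bit-fields keep the value modulo 2^width */
    s.b = b;
    s.c = c;
    return s;
}

static void bitfield_demo(void)
{
    struct SomeStruct s = some_struct_make(33, 1, 100);
    assert(s.a == 1);     /* 33 mod 32 */
    assert(s.b == 1);
    assert(s.c == 100);
}
```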

Skizz

Skizz
+1  A: 

Compile-time assumption-checking using enums: Stupid example, but can be really useful for libraries with compile-time configurable constants.

#define D 1
#define DD 2

enum CompileTimeCheck
{
    MAKE_SURE_DD_IS_TWICE_D = 1/(2*(D) == (DD)),
    MAKE_SURE_DD_IS_POW2    = 1/((((DD) - 1) & (DD)) == 0)
};
S.C. Madsen
A: 

intptr_t: a signed integer type wide enough to hold a pointer to void converted to an integer and back again. It is C99-specific (and optional there) and declared in stdint.h
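A minimal sketch of the round trip the type is meant for. `roundtrip` is a made-up helper. The conversion goes through `void *` because that is the case the standard spells out.

```c
#include <assert.h>
#include <stdint.h>

/* Stores a pointer in an integer and converts it back. */
static int *roundtrip(int *p)
{
    intptr_t as_integer = (intptr_t)(void *)p;
    return (int *)(void *)as_integer;
}

static void roundtrip_demo(void)
{
    int value = 7;
    assert(roundtrip(&value) == &value);   /* compares equal to the original */
}
```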

Ramakrishnan Muthukrishnan
+2  A: 

Lambdas (i.e. anonymous functions) in GCC:

#define lambda(return_type, function_body) \
    ({ return_type fn function_body fn; })

This can be used as:

lambda (int, (int x, int y) { return x > y; })(1, 2)

Which is expanded into:

({ int fn (int x, int y) { return x > y; } fn; })(1, 2)
Joe D
+3  A: 

Constant string concatenation

I was quite surprised not to see it already in the answers, as all compilers I know of support it, but many programmers seem to ignore it. Sometimes it's really handy, and not only when writing macros.

Use case I have in my current code: I have a #define PATH "/some/path/" in a configuration file (really it is set by the makefile). Now I want to build the full path including filenames to open resources. It just goes to:

fd = open(PATH "/file", flags);

Instead of the horrible, but very common:

char buffer[256];
snprintf(buffer, 256, "%s/file", PATH);
fd = open(buffer, flags);

Notice that the common horrible solution is:

  • three times as long
  • much less easy to read
  • much slower
  • less powerful, as it is limited to an arbitrary buffer size (you would need even longer code to avoid that without constant string concatenation)
  • uses more stack space
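As a sketch of what the compiler does with this: adjacent literals are merged during translation, so the result is an ordinary constant array. `config_file` and `concat_demo` are illustrative names, and the example drops the extra leading slash on the file name.

```c
#include <assert.h>
#include <string.h>

#define PATH "/some/path/"

/* After macro expansion this is "/some/path/" "file"; the compiler joins
   the two literals, so nothing is formatted at run time. */
static const char config_file[] = PATH "file";

static void concat_demo(void)
{
    assert(strcmp(config_file, "/some/path/file") == 0);
}
```

This only works when both pieces are string literals at compile time. A path held in a variable still needs snprintf or similar.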
kriss
A: 

I like __LINE__ and __FILE__. See here: http://gcc.gnu.org/onlinedocs/cpp/Standard-Predefined-Macros.html

Steve Webb
+3  A: 

When using sscanf you can use %n to find out where you should continue to read:

sscanf ( string, "%d%n", &number, &length );
string += length;
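Put in a loop, this walks through a whole string. A minimal sketch, with `sum_integers` as a made-up helper. `%n` does not count towards sscanf's return value, so a result of 1 means the `%d` matched.

```c
#include <assert.h>
#include <stdio.h>

/* Sums the leading run of whitespace-separated integers in a string.
   Stops at the end of the string or at the first token that is not a number. */
static long sum_integers(const char *string)
{
    long sum = 0;
    int number;
    int length;

    assert(string != NULL);
    while (sscanf(string, "%d%n", &number, &length) == 1) {
        sum += number;
        string += length;   /* skip what was just consumed */
    }
    return sum;
}
```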
onemasse