Any C programmer who's been working for more than a week has encountered crashes that result from calling printf with more format specifiers than actual arguments, e.g.:

printf("Gonna %s and %s, %s!", "crash", "burn");

However, are there any similar bad things that can happen when you pass too many arguments to printf?

printf("Gonna %s and %s!", "crash", "burn", "dude");

My knowledge of x86/x64 assembly leads me to believe that this is harmless, though I'm not convinced that there's not some edge condition I'm missing, and I have no idea about other architectures. Is this condition guaranteed to be harmless, or is there a potentially crash-inducing pitfall here, too?

+3  A: 

All the arguments will be pushed onto the stack and removed when the stack frame is removed. This behaviour is independent of the specific processor. (I only remember one mainframe, designed in the 70s, that had no stack.) So yes, the second example won't fail.

stacker
+6  A: 

You probably know the prototype for the printf function as something like this

int printf(const char *format, ...);

A more complete version of that would actually be

int __cdecl printf(const char *format, ...);

The __cdecl defines the "calling convention", which, among other things, describes how arguments are handled. In this case it means that args are pushed onto the stack and that the stack is cleaned up by the function making the call.

One alternative to __cdecl is __stdcall; there are others. With __stdcall the convention is that arguments are pushed onto the stack and cleaned up by the function that is called. However, as far as I know, it isn't possible for a __stdcall function to accept a variable number of arguments. That makes sense, since it wouldn't know how much stack to clean.

The long and the short of it is that in the case of __cdecl functions it's safe to pass however many args you want, since the cleanup is performed in the code making the call. If you were to somehow pass too many arguments to a __stdcall function, it would result in corruption of the stack. One example of where this could happen is if you had the wrong prototype.

More information on calling conventions can be found here

torak
__cdecl is a Win32ism, created by the fact that some old DOS compilers supported both C and pascal calling conventions.
ninjalj
@ninjalj, `__cdecl` is only supported by MS compilers, but the general note about calling conventions is valid for all OSs.
JSBangs
@JSBangs: __cdecl was also supported by Borland compilers IIRC. Also, on most other OSes C uses only the C calling convention (right to left, caller cleans stack), possibly with variants for ISRs, compatibility with other compilers, and/or saving the args on registers (GCC's regparm). AFAIK, Win32 is the only platform where you can select a calling convention that does not support vararg functions.
ninjalj
@ninjalj: The 68000-based Macintosh operating system used "Pascal" calling convention (called function pops stack) for almost everything. A little ironic, actually, since on the 68000 that calling convention would require a sequence like: "mov.l (A7+),A0 / addq #4,A7 / jmp (A0)", whereas the C calling convention would allow use of the "RETURN" instruction.
supercat
@supercat: s/only platform/only still-in-active-use platform/ on my previous comment.
ninjalj
-1 for presenting MS-isms as if they were part of the C language.
R..
+3  A: 

printf is designed to accept any number of arguments. It reads the format string (the first argument) and pulls arguments from the argument list as needed. This is why too few arguments crash: the code simply starts using nonexistent arguments, accessing memory that doesn't exist, or doing some other bad thing. But with too many arguments, the extra arguments are simply ignored: the format string uses fewer arguments than were passed in.

Ned Batchelder
To add to this, your compiler might also eliminate the extra parameters if it can detect that they are unused. You would have to look at the assembly output to tell if the extra params are really passed to `printf` or if they get optimized away.
bta
If the compiler knows for sure that it is calling something that uses printf's format language (and GCC has an attribute for that which can be used to decorate your own printf-like functions) then it is in principle safe to do this optimization. It would still have to act as if it had computed all the parameters in case any of the unused ones happened to have side effects.
RBerteig
+9  A: 

Online C Draft Standard (n1256), section 7.19.6.1, paragraph 2:

The fprintf function writes output to the stream pointed to by stream, under control of the string pointed to by format that specifies how subsequent arguments are converted for output. If there are insufficient arguments for the format, the behavior is undefined. If the format is exhausted while arguments remain, the excess arguments are evaluated (as always) but are otherwise ignored. The fprintf function returns when the end of the format string is encountered.

Behavior for all the other *printf() functions is the same wrt excess arguments except for vprintf() (obviously).

John Bode
A: 

Comment: both gcc and clang produce warnings:

$ clang main.c 
main.c:4:29: warning: more '%' conversions than data arguments [-Wformat]
  printf("Gonna %s and %s, %s!", "crash", "burn");
                           ~^
main.c:5:47: warning: data argument not used by format string 
                      [-Wformat-extra-args]
  printf("Gonna %s and %s!", "crash", "burn", "dude");
         ~~~~~~~~~~~~~~~~~~                   ^
2 warnings generated.
J.F. Sebastian