ansaurus

Question

C's "bad" functions vs. their "good" alternatives

Answer 1

+12 A:

strtok() is generally considered to be evil because it stores state information between calls. Don't try running THAT in a multithreaded environment!

Dave Markle 2009-08-10 04:02:09

alternative strtok_r

Artyom 2009-08-10 04:05:54

It's also considered a bit nutty because of the way that it modifies its first argument.

caf 2009-08-10 04:49:39

Many CRT implementations use thread-local variables to keep this kind of state information, so it might technically be safe depending on the platform, but it's definitely not a good idea.

Tim Sylvester 2009-08-10 05:08:10

MSVC's strtok() uses TLS to store state info, so it is thread-safe.

f0b0s 2009-08-10 10:38:49
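
For readers who want to see the strtok_r alternative mentioned above, here is a minimal sketch. It assumes a POSIX system (strtok_r is not part of standard C), and the helper name `split_tokens` is made up for illustration. The point is that the parsing state lives in a caller-owned pointer, not in a hidden static.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r on POSIX systems */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Splits `text` in place on any character in `delims` and stores up to
 * `max_tokens` token pointers in `tokens`. Returns the number stored.
 * Like strtok(), this overwrites delimiters in `text` with '\0'.
 * (split_tokens is a hypothetical helper name.) */
size_t split_tokens(char *text, const char *delims,
                    char **tokens, size_t max_tokens)
{
    assert(text != NULL && delims != NULL && tokens != NULL);

    size_t count = 0;
    char *save = NULL;  /* caller-owned state, replaces strtok's hidden static */

    for (char *tok = strtok_r(text, delims, &save);
         tok != NULL && count < max_tokens;
         tok = strtok_r(NULL, delims, &save)) {
        tokens[count++] = tok;
    }
    return count;
}
```

Because each call site supplies its own `save` pointer, two threads (or two nested loops) can tokenize different strings without interfering. The first argument is still modified, as caf points out.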

Answer 2

+2 A:

scanf() is bad because it doesn't prevent buffer overflow. I just recently learned that.

Lucas McCoy 2009-08-10 04:03:29

It is possible to use scanf() safely, though. You just need to avoid an unadorned "%s" conversion specifier, and use maximum field widths ("%200s").

caf 2009-08-10 04:17:35

I know you can do things like this: `scanf("%10[0-9a-zA-Z ]", str);` but `fgets()` just seems simpler to use.

Lucas McCoy 2009-08-10 14:04:24
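
A small sketch of the bounded conversion discussed in these comments. It uses sscanf() on a string so the behaviour is easy to check, but the width rules are the same for scanf(). The name `scan_word` and the limit of 10 are illustrative assumptions. Note that a scanset conversion is written `%10[...]` with no trailing `s`.

```c
#include <assert.h>
#include <stdio.h>

enum { WORD_MAX = 10 };  /* must match the width in the format string */

/* Reads at most WORD_MAX characters from the set [0-9a-zA-Z ] into `out`,
 * which must have room for WORD_MAX + 1 bytes. Returns 1 on a match, 0 if
 * the input does not start with a character from the set.
 * (scan_word is a hypothetical helper name.) */
int scan_word(const char *input, char out[WORD_MAX + 1])
{
    assert(input != NULL && out != NULL);
    /* The field width (10) excludes the terminating '\0' that sscanf adds. */
    return sscanf(input, "%10[0-9a-zA-Z ]", out) == 1;
}
```

The destination needs one byte more than the field width, which is an easy detail to get wrong and part of why fgets() can feel simpler.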

Answer 3

+6 A:

Any function that does not take a maximum length parameter and instead relies on an end-of- marker to be present (such as many 'string' handling functions).

Any method that maintains state between calls.

Mitch Wheat 2009-08-10 04:05:11

ok but, this answer doesn't list them nor provide their alternatives. still +1

hasen j 2009-08-10 04:24:21

Not always. Strlen relies on the end-of marker yet is still a safe function to use.

Billy ONeal 2009-08-10 04:47:06

Probably more precise to say any *destructive* function.

Chuck 2009-08-10 05:02:47

@Chuck: care to define 'destructive'?

Mitch Wheat 2009-08-10 12:09:43

One that has side-effects.

Chuck 2009-08-10 21:33:29

Answer 4

+2 A:

View page 7 (PDF page 9) SAFECode Dev Practices

Edit: From the page -

strcpy family
strncpy family
strcat family
scanf family
sprintf family
gets family

Dan McGrath 2009-08-10 04:06:07

That also refers to Microsoft's Security Development Lifecycle (SDL) Banned Function Calls at http://msdn.microsoft.com/en-us/library/bb288454.aspx

dajobe 2009-08-10 05:35:21

Answer 5

A:

strcpy() - You should use strncpy instead, to explicitly define the number of bytes to copy, and avoid a buffer overflow.

Matthew Iselin 2009-08-10 04:06:57

Um, no. strncpy() (1) does not guarantee that its output is null-terminated, and (2) does guarantee to write n characters even if most of them are nulls. It should be used only in the specific circumstances for which it was designed, i.e. handling old-style Unix directory entries.

mlp 2009-08-10 05:25:38

A valid point. I was writing from the optimistic perspective where the parameters make sense and don't end up running into the (multiple) problems you can hit with strncpy.

Matthew Iselin 2009-08-10 05:42:57
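
The two strncpy() behaviours mlp describes can be observed with a short sketch. The helper name `strncpy_terminated` is hypothetical; it only wraps the standard call and reports whether a terminator ended up inside the first n bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `src` into `dst` with strncpy() and reports whether the result is
 * a NUL-terminated string within the first `n` bytes of `dst`.
 * (strncpy_terminated is a hypothetical helper name.) */
int strncpy_terminated(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    strncpy(dst, src, n);
    return memchr(dst, '\0', n) != NULL;
}
```

Copying "hello" into a 5-byte buffer with n = 5 leaves no terminator, while copying "hi" into an 8-byte buffer with n = 8 writes the two characters and then fills the remaining six bytes with '\0'.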

Answer 6

+5 A:

sprintf is bad, does not check size, use snprintf
gmtime, localtime -- use gmtime_r, localtime_r

Artyom 2009-08-10 04:08:12
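
A rough sketch of both replacements, assuming a POSIX system for gmtime_r (it is not in standard C99). The function names `format_point` and `format_utc` are invented for the example. snprintf() reports how many characters it wanted to write, which makes truncation detectable; gmtime_r() writes into a caller-supplied struct tm.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes gmtime_r on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Formats "x,y" into buf. Returns 0 on success, -1 on an encoding error or
 * if the output had to be truncated to fit `size` bytes.
 * (format_point is a hypothetical helper name.) */
int format_point(char *buf, size_t size, int x, int y)
{
    assert(buf != NULL && size > 0);
    int needed = snprintf(buf, size, "%d,%d", x, y);
    return (needed >= 0 && (size_t)needed < size) ? 0 : -1;
}

/* Formats `t` as UTC "YYYY-MM-DD HH:MM:SS". The broken-down time lives in a
 * caller-owned struct tm rather than gmtime()'s shared static buffer.
 * (format_utc is a hypothetical helper name.) */
int format_utc(time_t t, char *buf, size_t size)
{
    assert(buf != NULL);
    struct tm parts;
    if (gmtime_r(&t, &parts) == NULL)
        return -1;
    return strftime(buf, size, "%Y-%m-%d %H:%M:%S", &parts) > 0 ? 0 : -1;
}
```

Even when format_point() returns -1 for truncation, the buffer is still NUL-terminated, which is the main practical difference from sprintf().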

Answer 7

+25 A:

In the old days, most of the string functions had no bounds checking. Of course they couldn't just delete the old functions, or modify their signatures to include an upper bound, that would break compatibility. Now, for almost every one of those functions, there is an alternative "n" version. For example:

strcpy -> strncpy
strlen -> strnlen
strcmp -> strncmp
strcat -> strncat
strdup -> strndup
sprintf -> snprintf
wcscpy -> wcsncpy
wcslen -> wcsnlen

And more.

Adam Batkin 2009-08-10 04:08:46

+1. nice example.

Mitch Wheat 2009-08-10 04:13:16

strncpy is not what you should be changing to either. Take, for example, copying "hello" into a char[5] with a size param of 5. This results in the char[5] not being NUL-terminated! Use strlcpy or an equivalent, please! Or, if you must use something like strncpy, ALWAYS set the size param to 1 less than the max buffer size and manually NUL-terminate yourself. http://www.usenix.org/events/usenix99/millert.html

Dan McGrath 2009-08-10 04:21:33

Note that strlen(), strcmp() and strdup() are safe. The 'n' alternatives give you additional functionality.

caf 2009-08-10 04:22:24

@Dan strlcpy is a BSDism - a standard C alternative is to set dest[0] = '\0'; and then call strncat() - unlike strncpy(), strncat() always nul-terminates the destination.

caf 2009-08-10 04:24:08

I assume you meant dest[n] (where n = 4 for char[5]), since setting the first char to null does not stop the NUL-termination issue when the buffers being copied are of maximum size. This is why I put "param to 1 less than the max buffer size and manually NUL-terminate yourself".

Dan McGrath 2009-08-10 04:29:05

Nope, I meant dest[0] = '\0'; strncat(dest, source, dest_size - 1); - the idea is to use strncat(), since it has sane destination-terminating behaviour.

caf 2009-08-10 04:32:01

Slight caveat: strncpy() doesn't just copy up to n chars of a string but also pads out with NUL chars up to the length n if the string is shorter than n. This can be a massive performance hit if you have allocated big buffers "to be safe" but normally handle small strings (e.g. copying file names to buffer of size PATH_MAX, which is typically 4K). [And, yes, I know PATH_MAX is deprecated].

Dipstick 2009-08-10 05:34:02

Instead of mucking about with badly designed low-level functions for manipulating arrays of char for string processing, it makes much more sense to use a library for a higher level string abstraction.

Lars Wirzenius 2009-08-10 10:23:43

@liw.fi: Absolutely, I agree. But C still doesn't include a standard string library, other than null-terminated `char*` and the above (and other) manipulation functions

Adam Batkin 2009-08-10 18:28:42

And, if you're on MSVC, the *_s variants (e.g. `strcpy_s`/`strncpy_s`). Shame they're not more widely used, since `strncpy` is still open to programmer slip-ups and thus is not entirely safe from buffer overflows.

romkyns 2010-07-19 16:35:34

Note that it appears that "strnlen()" is a non-standard function (not part of C99). Some implementations may have it, others might not: http://cboard.cprogramming.com/c-programming/101860-strnlen-implicit-declaration.html

Dave Gallagher 2010-09-25 20:47:42
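
The strncat() idiom caf describes in this thread, written out as a small function. The wrapper name `copy_with_strncat` is hypothetical; the two statements inside are exactly the ones from the comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `src` into `dest`, truncating if necessary, and always leaves
 * `dest` NUL-terminated. `dest_size` is the full size of the buffer.
 * (copy_with_strncat is a hypothetical helper name.) */
void copy_with_strncat(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL && dest_size > 0);
    dest[0] = '\0';                     /* start from an empty string */
    strncat(dest, src, dest_size - 1);  /* appends at most dest_size - 1 chars, then '\0' */
}
```

With a char[5] destination, copying "hello" yields "hell". Unlike strncpy(), nothing is padded, but truncation is still silent, so a caller who cares has to compare lengths separately.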

Answer 8

+8 A:

Yes, fgets( , , stdin) is a good alternative to gets(), because it takes a size parameter.

scanf() is considered problematic in some cases, rather than straight-out "bad", because if the input doesn't conform to the expected format it can be impossible to recover sensibly (it doesn't let you rewind the input and try again). If you can just give up on badly formatted input, it's useable. A "better" alternative here is to use an input function like fgets() or fgetc() to read chunks of input, then scan it with sscanf() or parse it with string handling functions like strchr() and strtol(). Also see below for a specific problem with the "%s" conversion specifier in scanf().
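
As a sketch of the read-a-line-then-parse approach, assuming input lines of the form "<int> <int>" (the format and the names `parse_pair` and `read_pair` are made up for illustration):

```c
#include <assert.h>
#include <stdio.h>

/* Parses one line of the form "<int> <int>". Returns 1 if both numbers
 * were found, 0 otherwise. The line itself is left untouched, so the caller
 * can report it or try another format. */
int parse_pair(const char *line, int *a, int *b)
{
    assert(line != NULL && a != NULL && b != NULL);
    return sscanf(line, "%d %d", a, b) == 2;
}

/* Reads the next line from `in` and parses it. Returns 1 on success, 0 on a
 * malformed line (which has been consumed, so the next call moves on), and
 * -1 on end of file or read error. Lines longer than the buffer arrive in
 * pieces; a real program would need to handle that case. */
int read_pair(FILE *in, int *a, int *b)
{
    char line[128];
    if (fgets(line, sizeof line, in) == NULL)
        return -1;
    return parse_pair(line, a, b);
}
```

A bad line costs exactly one line of input here, whereas a failed scanf() leaves the stream at some position in the middle of the offending text.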

It's not a standard C function, but the BSD and POSIX function mktemp() is generally impossible to use safely, because there is always a race condition between testing for the file's existence and creating it. mkstemp() or tmpfile() are good replacements.
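
A minimal mkstemp() sketch, assuming a POSIX system. The helper name `write_temp_file` is hypothetical, and the template path a caller passes in (for example "/tmp/exampleXXXXXX") is only an example. mkstemp() creates and opens the file in one step, so there is no window between the existence check and the creation.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes mkstemp on POSIX systems */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Creates a unique temporary file from the template in `path` (a writable
 * buffer ending in "XXXXXX", which mkstemp() overwrites with the generated
 * name) and writes `data` to it. Returns the open descriptor, or -1 on error.
 * (write_temp_file is a hypothetical helper name.) */
int write_temp_file(char *path, const char *data)
{
    assert(path != NULL && data != NULL);

    int fd = mkstemp(path);  /* creates and opens atomically */
    if (fd < 0)
        return -1;

    size_t len = strlen(data);
    if (write(fd, data, len) != (ssize_t)len) {
        close(fd);
        unlink(path);
        return -1;
    }
    return fd;
}
```

The caller is responsible for close() and unlink() when done. The template must be a modifiable array, not a string literal.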

strncpy() is a slightly tricky function, because it doesn't nul-terminate the destination if there was no room for it. You can work around this either by adding the nul-terminator to the destination yourself, or setting the destination to an empty string and then using strncat() instead.

atoi() can be a bad choice in some situations, because you can't tell when there was an error doing the conversion (eg. if the number exceeded the range of an int). Use strtol() if this matters to you.
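
One way the strtol() version might look, as a sketch; the wrapper name `parse_long` and its 1/0 return convention are assumptions for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Converts `text` to a long in base 10. Returns 1 and stores the value in
 * `*out` on success. Returns 0 if `text` has no digits, has trailing junk,
 * or is out of range for long: the cases atoi() cannot report.
 * (parse_long is a hypothetical helper name.) */
int parse_long(const char *text, long *out)
{
    assert(text != NULL && out != NULL);

    char *end;
    errno = 0;
    long value = strtol(text, &end, 10);

    if (end == text)     return 0;  /* no digits consumed */
    if (*end != '\0')    return 0;  /* trailing characters */
    if (errno == ERANGE) return 0;  /* overflow or underflow */

    *out = value;
    return 1;
}
```

Whether trailing characters count as an error is a design choice; a parser that reads several numbers from one line would keep `end` and continue from there.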

strcpy(), strcat() and sprintf() suffer from a similar problem to gets() - they don't allow you to specify the size of the destination buffer. It's still possible, at least in theory, to use them safely - but you are much better off using strncat() and snprintf() instead (you could use strncpy(), but see above). On the same theme, if you use the scanf() family of functions, don't use a plain "%s" - specify the size of the destination eg. "%200s".

caf 2009-08-10 04:10:02

Answer 9

+2 A:

strcpy - again!

Most people agree that strcpy is dangerous, but strncpy is only rarely a useful replacement. It is usually important that you know when you've needed to truncate a string in any case, and for this reason you usually need to examine the length of the source string anyway. If this is the case, usually memcpy is the better replacement as you know exactly how many characters you want copied.

e.g. truncation is error:

n = strlen( src );

if( n >= buflen )
    return ERROR;

memcpy( dst, src, n + 1 );

truncation allowed, but number of characters must be returned so caller knows:

n = strlen( src );

if( n >= buflen )
    n = buflen - 1;

memcpy( dst, src, n );
dst[n] = '\0';

return n;

Charles Bailey 2009-08-10 05:05:31
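
The two fragments in this answer, wrapped into complete functions so they can be compiled and tried. The function names and the -1 error value are assumptions; the logic is the same strlen-then-memcpy pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `src` into `dst` only if it fits, terminator included.
 * Returns 0 on success, -1 if the string would have been truncated.
 * (copy_exact is a hypothetical helper name.) */
int copy_exact(char *dst, size_t buflen, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t n = strlen(src);
    if (n >= buflen)
        return -1;
    memcpy(dst, src, n + 1);  /* n characters plus the '\0' */
    return 0;
}

/* Copies as much of `src` as fits and returns the number of characters
 * stored (excluding '\0'), so the caller can compare it with strlen(src).
 * (copy_truncated is a hypothetical helper name.) */
size_t copy_truncated(char *dst, size_t buflen, const char *src)
{
    assert(dst != NULL && src != NULL && buflen > 0);
    size_t n = strlen(src);
    if (n >= buflen)
        n = buflen - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

In the failing case copy_exact() leaves the destination untouched, which is usually what a caller treating truncation as an error wants.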

Answer 10

+2 A:

Strictly speaking, there is one really dangerous function. It is gets() because its input is not under the control of the programmer. All other functions mentioned here are safe in and of themselves. "Good" and "bad" boils down to defensive programming, namely preconditions, postconditions and boilerplate code.

Let's take strcpy() for example. It has some preconditions that the programmer must fulfill before calling the function. Both strings must be valid, non-NULL pointers to zero terminated strings, and the destination must provide enough space with a final string length inside the range of size_t. Additionally, both strings are not allowed to overlap.

Those are quite a lot of preconditions, and none of them is checked by strcpy(). The programmer must be sure they are fulfilled, or he must explicitly test them with additional boilerplate code before calling strcpy():

n = DST_BUFFER_SIZE;
if ((dst != NULL) && (src != NULL) && (strlen(dst)+strlen(src)+1 <= n))
{
    strcpy(dst, src);
}

Already silently assuming the non-overlap and zero-terminated strings.

strncpy() does include some of these checks, but it adds another postcondition the programmer must take care of after calling the function, because the result may not be zero-terminated.

strncpy(dst, src, n);
if (n > 0)
{
    dst[n-1] = '\0';
}

Why are these functions considered "bad"? Because they would require additional boilerplate code for each call to really be on the safe side when the programmer assumes wrong about the validity, and programmers tend to forget this code.

Or even argue against it. Take the printf() family. These functions return a status that indicates error or success. Who checks whether the output to stdout or stderr succeeded? The usual argument is that you can't do anything at all when the standard channels are not working. Well, what about rescuing the user data and terminating the program with an error-indicating exit code, instead of the possible alternative of crashing and burning later with corrupted user data?
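
For illustration, checking the printf() family's status might look like the sketch below. The wrapper name `write_line` is hypothetical; fprintf() returns a negative value on an output error and fflush() returns EOF.

```c
#include <assert.h>
#include <stdio.h>

/* Writes `text` plus a newline to `out` and flushes it. Returns 0 on
 * success, -1 if either the formatted write or the flush reported an
 * error, so the caller can save its data and exit with a failure code.
 * (write_line is a hypothetical helper name.) */
int write_line(FILE *out, const char *text)
{
    assert(out != NULL && text != NULL);
    if (fprintf(out, "%s\n", text) < 0)
        return -1;
    if (fflush(out) == EOF)
        return -1;
    return 0;
}
```

The flush matters because buffered output can appear to succeed in fprintf() and only fail later, for example when the disk is full or a pipe has closed.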

In a time- and money-limited environment, it is always a question of how many safety nets you really want and what the resulting worst-case scenario is. If it is a buffer overflow, as in the case of the str-functions, then it makes sense to forbid them and probably provide wrapper functions with the safety nets already built in.

One final question about this: What makes you sure that your "good" alternatives are really good?

Secure 2009-08-10 10:17:26

These functions are bad because they make it easy to write buggy code. Worse, the buggy code is often the type that introduces security vulnerabilities. Sure, they often *can* be used safely and correctly, but it's too easy to use them incorrectly. I know this both from experience and from studies Microsoft has done on millions of lines of code. MS doesn't ban the use of functions 'just because'. They do it because they have hard statistics about the types of code that cause bugs (in particular security bugs). This is hard data whether or not Microsoft is a company that you like or respect.

Michael Burr 2009-08-10 14:27:00

And as far as `strncpy()` is concerned, it's not a 'good' or safe function. It's banned by Microsoft and in my code.

Michael Burr 2009-08-10 14:29:02

Just what I said, more verbosely. The "badness" of a function depends on the amount and the type of the conditions the programmer has to ensure, their locality (the destination memory can be defined anywhere in the program) and the severity of the errors that are possible when done wrong. No surprise that string-related functions are on top. And I don't use strncpy() myself, but for the reason that I consider a silent string truncation without any indication that it happened to be an error.

Secure 2009-08-10 16:21:54

BTW, reading it again after 3 months, I've confused strcpy and strcat. But nobody cared, anyway. ;)

Secure 2009-11-02 17:18:06

Very nice post +1. I wonder if you are the same Secure who invariably provided (provides?) definitive and reliable C information on the Joel on Software forums ?

Bill Forster 2009-12-01 22:09:10

Misleading post -1. In the real world, the importance of never calling these deprecated functions is real. Adding all the "boilerplate" is an avenue for errors to enter. The "preconditions" you check for `strcpy()` are mindless; for one thing, why add the source length; for another, checking for NULL is useless and misleading code.

Heath Hunnicutt 2010-07-06 14:13:23

Answer 11

A:

I would say that scanf is good sometimes, more specifically when you really need to read something FAST. It is orders of magnitude faster than cin>>.

I recall a task on the international olympiad in informatics (IOI), where you needed to use scanf, since cin took too much time.

Paxinum 2009-08-16 10:05:31

cin>> doesn't exist in C

Carson Myers 2009-08-18 23:14:13

Oh, that's right.

Paxinum 2009-08-19 09:03:05

Answer 12

+2 A:

To add something about strncpy most people here forgot to mention. strncpy can result in performance problems as it clears the buffer to the length given.

char buff[1000];
strncpy(buff, "1", sizeof buff);

will copy 1 char and overwrite 999 bytes with 0

Another reason why I prefer strlcpy (I know strlcpy is a BSDism but it is so easy to implement that there's no excuse to not use it).

tristopia 2009-10-22 14:52:48
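
To back up the "easy to implement" remark, here is one possible strlcpy-style function. This is a sketch, not the BSD source; it is named `bounded_copy` (a made-up name) so it does not clash with a system-provided strlcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A strlcpy-style copy. Copies at most size - 1 characters, always
 * terminates when size > 0, and never pads the rest of the buffer.
 * Returns strlen(src); a result >= size means the copy was truncated.
 * (bounded_copy is a hypothetical name.) */
size_t bounded_copy(char *dst, const char *src, size_t size)
{
    assert(src != NULL && (dst != NULL || size == 0));

    size_t src_len = strlen(src);
    if (size > 0) {
        size_t n = (src_len < size) ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

With the 1000-byte buffer from the answer, copying "1" this way touches two bytes and leaves the other 998 alone, and the return value lets the caller detect truncation with a single comparison.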
