views:

1228

answers:

12

I've read on stackoverflow that some C functions are 'obsolete' or should be 'avoided'. Can you please give me some examples of this kind of functions + the reason why.

What alternatives of those function exists ?

Can we use them safely - any good practices ?

Later edit: From the answers:

  • gets : can cause buffer overflows.
  • scanf : for strings it would be better to use fgets (followed by self parsing using sscanf or strchr).
  • sprintf (instead use snprintf)
  • strcat : avoid it when dealing with user input (instead use strncat, even if strncat has its pitfalls.)
  • strcpy : avoid it when dealing with user input (instead use strncpy, even if strncopy has its pitfalls.)
  • strtok : is not thread safe.
+5  A: 

Avoid

  • strtok for multithreaded programs as its not thread-safe.
  • gets as it could cause buffer overflow
codaddict
They are slightly different cases. You can use strtok safely if you know your program isn't multithreaded or you somehow lock access to it, but you can't use gets safely so should never use it.
John Burton
The problem with `strtok()` goes a little beyond thread safety - it's not safe even in a single threaded program unless you know for certain that no functions your code might call while using `strtok()` don't also use it (or they'll mess up `strtok()`'s state right from under you). In fact, most compilers that target multi-threaded platforms take care of `strtok()`'s potential problems as far as threads go by using thread-local storage for `strtok()`'s static data. But that still doesn't fix the problem of other functions using it while you are (in the same thread).
Michael Burr
Indeed and I'll shut up now as I don't want to encourage anyone to use strtok as it does suffer from many problems. I just wanted to point out that it's different from gets though as it *is* possible to use it safely whereas it's impossible to use gets safely.
John Burton
What's a good alternative to strtok?
Jimmy
@Jimmy: There are often nonstandard library extensions, or you could write your own. The big problem with `strtok()` is that it keeps precisely one buffer to work in, so most good replacements require keeping a buffer value around and passing it in.
David Thornley
+33  A: 

Deprecated Functions
Insecure
A perfect example of such a function is gets(), because there is no way to tell it how big the destination buffer is. Consequently, any program that reads input using gets() has a buffer overflow vulnerability. For similar reasons, one should use strncpy() in place of strcpy() and strncat() in place of strcat().

Yet some more examples include the tmpfile() and mktemp() function due to potential security issues with overwriting temporary files and which are superseded by the more secure mkstemp() function.

Non-Reentrant
Other examples include gethostbyaddr() and gethostbyname() which are non-reentrant (and, therefore, not guaranteed to be threadsafe) and have been superseded by the reentrant getaddrinfo() and freeaddrinfo().

You may be noticing a pattern here... either lack of security (possibly by failing to include enough information in the signature to possibly implement it securely) or non-reentrance are common sources of deprecation.

Outdated, Non-Portable
Some other functions simply become deprecated because they duplicate functionality and are not as portable as other variants. For example, bzero() is deprecated in favor of memset().

Thread Safety and Reentrance
You asked, in your post, about thread safety and reentrance. There is a slight difference. A function is reentrant if it does not use any shared, mutable state. So, for example, if all the information it needs is passed into the function, and any buffers needed are also passed into the function (rather than shared by all calls to the function), then it is reentrant. That means that different threads, by using independent parameters, do not risk accidentally sharing state. Reentrancy is a stronger guarantee than thread safety. A function is thread safe if it can be used by multiple threads concurrently. A function is thread safe if:

  • It is reentrant (i.e. it does not share any state between calls), or:
  • It is non-reentrant, but it uses synchronization/locking as needed for shared state.

In general, in the Single UNIX Specification and IEEE 1003.1 (i.e. "POSIX"), any function which is not guaranteed to be reentrant is not guaranteed to be thread safe. So, in other words, only functions which are guaranteed to be reentrant may be portably used in multithreaded applications (without external locking). That does not mean, however, that implementations of these standards cannot choose to make a non-reentrant function threadsafe. For example, Linux frequently adds synchronization to non-reentrant functions in order to add a guarantee (beyond that of the Single UNIX Specification) of threadsafety.

Strings (and Memory Buffers, in General)
You also asked if there is some fundamental flaw with strings/arrays. Some might argue that this is the case, but I would argue that no, there is no fundamental flaw in the language. C and C++ require you to pass the length/capacity of an array separately (it is not a ".length" property as in some other languages). This is not a flaw, per-se. Any C and C++ developer can write correct code simply by passing the length as a parameter where needed. The problem is that several APIs that required this information failed to specify it as a parameter. Or assumed that some MAX_BUFFER_SIZE constant would be used. Such APIs have now been deprecated and replaced by alternative APIs that allow the array/buffer/string sizes to be specified.

Scanf (In Answer to Your Last Question)
Personally, I use the C++ iostreams library (std::cin, std::cout, the << and >> operators, std::getline, std::istringstream, std::ostringstream, etc.), so I do not typically deal with that. If I were forced to use pure C, though, I would personally just use fgetc() or getchar() in combination with strtol(), strtoul(), etc. and parse things manually, since I'm not a huge fan of varargs or format strings. That said, to the best of my knowledge, there is no problem with [f]scanf(), [f]printf(), etc. so long as you craft the format strings yourself, you never pass arbitrary format strings or allow user input to be used as format strings, and you use the formatting macros defined in <inttypes.h> where appropriate. (Note, snprintf() should be used in place of sprintf(), but that has to do with failing to specify the size of the destination buffer and not the use of format strings). I should also point out that, in C++, boost::format provides printf-like formatting without varargs.

Michael Aaron Safyan
Is gethostbyaddr, gethostbyname from the standard library ?
Andrei Ciobanu
@nomemory, no; they are defined in the Single UNIX Specification.
Michael Aaron Safyan
Thank you for your clarifications.
Andrei Ciobanu
+1, best answer. Very thorough and detailed.
Tim Post
"Deprecated" is a strong word, which has a specific meaning when discussing the C++ Standard. In that sense, gets(), strcpy() etc. are not deprecated.
anon
@Neil, "deprecated" is a generic programming term derived from "depreciated" and simply means held or looked on with contempt. In that sense, I believe I am using the terms correctly.
Michael Aaron Safyan
Just so long as you distinguish between "deprecated by the C standard", "deprecated by Michael Aaron Safyan", and "deprecated by person or persons unknown who hopefully know what they're talking about [citation needed]". The question *is* about preferred coding style, not about the C standard, so the second two are appropriate. But like Neil I required a double-take before realising that your statements did not intend to imply the first meaning.
Steve Jessop
@Steve, fair enough. I only use the term when the Single UNIX Specification's informative section recommends against its use and suggests using an alternative function.
Michael Aaron Safyan
@Steve, for example, with bzero, it says "The memset() function is preferred over this function."
Michael Aaron Safyan
Lastly, I'd like to point out that the original question did not specify the ISO C standard library... and so my answer includes many functions that are commonly used in C but not from the standard library.
Michael Aaron Safyan
Sure, in the case of the POSIX functions, if the POSIX specification deprecates them, then they "are deprecated", and it advises against using certain standard C functions *on POSIX systems*. `gets` is deprecated in the latest C99 draft, `strcpy` isn't. In C++, deprecation is meaningless: "deprecated" is defined to mean "not guaranteed to be part of the Standard in future revisions", but C++0x is removing a feature which was not deprecated in C++03. So non-deprecated features aren't guaranteed not to be removed either. Hence IMO the need to resolve the ambiguity of "is deprecated".
Steve Jessop
`strncpy` should generally be avoided as well. It doesn't do what most programmers assumes it does. It does not guarantee termination (leading to buffer overruns), and it pads shorter strings (possibly degrading performance in some cases).
Adrian McCarthy
@Adrian: I agree with you - neither `strncpy()` nor the even worse `strncat()` is a sensible replacement for the n-less variants.
Jonathan Leffler
@Adrian, strncpy() and strcat() actually *do NUL-terminate strings*, but _only if_ the destination buffer has remaining space. The non-portable strlcpy() and strlcat(), while they will forcibly terminate the strings, also lead to truncations and detecting them is not entirely intuitive. Moreover, since buffers should be dynamically allocated (to allow for arbitrary input sizes), it is really not that difficult to ensure that buffers are always one more character in length than the data (in order to provide space for the NUL-terminator).
Michael Aaron Safyan
@Michael: I didn't say `strncpy` doesn't terminate; I said it doesn't _guarantee_ termination, which is true. Many programmers believe that it does guarantee termination, and that leads to buffer overruns, which is why many shops ban it from their code.
Adrian McCarthy
+2  A: 

Almost any function that deals with NUL terminated strings is potentially unsafe. If you are receiving data from the outside world and manipulating it via the str*() functions then you set yourself up for catastrophe

rep_movsd
+5  A: 

Some people would claim that strcpy and strcat should be avoided, in favor of strncpy and strncat. This is somewhat subjective, in my opinion.

They should definitely be avoided when dealing with user input - no doubt here.

In code "far" from the user, when you just know the buffers are long enough, strcpy and strcat may be a bit more efficient because computing the n to pass to their cousins may be superfluous.

Eli Bendersky
And if available, better yet `strlcat` and `strlcpy`, since the 'n' version does not guarantee NULL termination of the destination string.
Dan Andreatta
Sounds like premature optimization to me. But IIRC, `strncpy()` will write write exactly `n` bytes, using nul characters if necessary. As Dan, using a safe version is the best choice IMO.
Bastien Léonard
@Dan, these functions are unfortunately non-standard, and hence don't belong to this discussion
Eli Bendersky
@Eli: but the fact that `strncat()` can be difficult to use correctly and safely (and should be avoided) is on topic.
Michael Burr
@Michael: I wouldn't say it's difficult to use correctly and safely. With care, it works just fine
Eli Bendersky
@Eli: also `strcpy`, with care, works just fine.
Dan Andreatta
I don't think it matters whether it's user input or not, it depends on your coding style. If you are correctly calculating required buffer sizes before using `strcpy`, then replacing it with `strncpy` will not improve your code. If you belong to the "oh, 1024 will be enough" school of string-handling, then `strncpy` will improve your code. User input might be malicious, whereas trusted input will not be, but if there are errors in your code then (a) trusted input can trip them too, and (b) *there can be errors even using strncpy*, because you still have to pass the right length in.
Steve Jessop
So the classic strncpy/strncat error is `char buf[128]; strncpy(buf, "http://", 128); strncat(buf, url, 128);`. You should have subtracted the length of the first string from 128. If you're careful enough to do that, then you're careful enough to have just computed `strlen("http://") + strlen(url) + 1` in the first place. And likewise with ensuring that strings (`url` in particular in this example) actually do get nul-terminated.
Steve Jessop
@Dan - you're right, that's exactly what I wrote in my answer. But to get it right with user input is so hard that it's better not to use it at all there.
Eli Bendersky
@Eli: I've updated my answer (http://stackoverflow.com/questions/2565727/what-are-the-c-functions-from-the-standard-library-that-must-should-be-avoided/2565831#2565831) with more information about why `strncat()` should be avoided (even if it's possible to use correctly).
Michael Burr
+2  A: 

All functions that have a n-version should be avoided. Use strncpy instead and not strcpy for example.

Frank
Careful. `strncpy` has its own pitfalls. Also, I first misinterpreted your first sentence to mean to avoid n-versions.
jamesdlin
+7  A: 

Several answers here suggest using strncat() over strcat(); I'd suggest that strncat() (and strncpy()) should also be avoided. It has problems that make it difficult to use correctly and lead to bugs:

  • the length parameter to strncat() is related to (but not quite exactly - see the 3rd point) the maximum number of characters that can be copied to the destination rather than the size of the destination buffer. This makes strncat() more difficult to use than it should be, particularly if multiple items will be concatenated to the destination.
  • it can be difficult to determine if the result was truncated (which may or may not be important)
  • it's easy to have an off-by-one error. As the C99 standard notes, "Thus, the maximum number of characters that can end up in the array pointed to by s1 is strlen(s1)+n+1" for a call that looks like strncat( s1, s2, n)

strncpy() also has an issue that can result in bugs you try to use it in an intuitive way - it doesn't guarantee that the destination is null terminated. To ensure that you have to make sure you specifically handle that corner case by dropping a '\0' in the buffer's last location yourself (at least in certain situations).

I'd suggest using something like OpenBSD's strlcat() and strlcpy() (though I know that some people dislike those functions; I believe they're far easier to use safely than strncat()/strncpy()).

Here's a little of what Todd Miller and Theo de Raadt had to say about problems with strncat() and strncpy():

There are several problems encountered when strncpy() and strncat() are used as safe versions of strcpy() and strcat(). Both functions deal with NUL-termination and the length parameter in different and non-intuitive ways that confuse even experienced programmers. They also provide no easy way to detect when truncation occurs. ... Of all these issues, the confusion caused by the length parameters and the related issue of NUL-termination are most important. When we audited the OpenBSD source tree for potential security holes we found rampant misuse of strncpy() and strncat(). While not all of these resulted in exploitable security holes, they made it clear that the rules for using strncpy() and strncat() in safe string operations are widely misunderstood.

OpenBSD's security audit found that bugs with these functions were "rampant". Unlike gets(), these functions can be used safely, but in practice there are a lot of problems because the interface is confusing, unintuitive and difficult to use correctly. I know that Microsoft has also done analysis (though I don't know how much of their data they may have published), and as a result have banned (or at least very strongly discouraged - the 'ban' might not be absolute) the use of strncat() and strncpy() (among other functions).

Some links with more information:

Michael Burr
It's far better to keep track of the length of strings outside the string itself. Like that, concatenating two strings(-with-length) safely becomes a matter of a simple calculation (to get the required buffer size) a possible reallocation, and a `memmove()`. (Well, you can use `memcpy()` in the normal case when the strings are independent.)
Donal Fellows
Your second point is dead wrong - `strncat()` *always* terminates the destination string.
caf
Your second point is wrong for `strncat()`. However, it is correct for `strncpy()`, which has some other problems. `strncat()` is a reasonable replacement for `strcat()`, but `strncpy()` is not a reasonable replacement for `strcpy()`.
David Thornley
caf and David are 100% right about my claim that `strncat()` doesn't always null terminate. I was confusing the behaviors of `strncat()` and `strncpy()` a bit (another reason they're functions to avoid - they have names that imply similar behaviors, but in fact behave differently in important ways...). I've amended my answer to correct this as well as add additional information.
Michael Burr
+2  A: 

Don't forget about sprintf - it is the cause of many problems. This is true because the alternative, snprintf has sometimes different implementations which can make you code unportable.

  1. linux: http://linux.die.net/man/3/snprintf

  2. windows: http://msdn.microsoft.com/en-us/library/2ts7cx93%28VS.71%29.aspx

In case 1 (linux) the return value is the amount of data needed to store the entire buffer (if it is smaller than the size of the given buffer then the output was truncated)

In case 2 (windows) the return value is a negative number in case the output is truncated.

Generally you should avoid functions that are not:

  1. buffer overflow safe (a lot of functions are already mentioned in here)

  2. thread safe/not reentrant (strtok for example)

In the manual of each functions you should search for keywords like: safe, sync, async, thread, buffer, bugs

Iulian Şerbănoiu
Thank you (Multumesc :) )
Andrei Ciobanu
`_sprintf()` was created by Microsoft before the standard `snprintf()` arrived, IIRC. [´StringCbPrintf()´](http://msdn.microsoft.com/en-us/library/ms647510%28VS.85%29.aspx) is quite similar to `snprintf()` though.
Bastien Léonard
+10  A: 

Once again people are repeating, mantra-like, the ludicrous assertion that the "n" version of str functions are safe versions.

If that was what they were intended for then they would always null terminate the strings.

The "n" versions of the functions were written for use with fixed length fields (such as directory entries in early file systems) where the nul terminator is only required if the string does not fill the field. This is also the reason why the functions have strange side effects that are pointlessly inefficient if just used as replacements - take strncpy() for example:

If the array pointed to by s2 is a string that is shorter than n bytes, null bytes are appended to the copy in the array pointed to by s1, until n bytes in all are written.

As buffers allocated to handle filenames are typically 4kbytes this can lead to a massive deterioration in performance.

If you want "supposedly" safe versions then obtain - or write your own - strl routines (strlcpy, strlcat etc) which always nul terminate the strings and don't have side effects. Please note though that these aren't really safe as they can silently truncate the string - this is rarely the best course of action in any real-world program. There are occasions where this is OK but there are also many circumstances where it could lead to catastrophic results (e.g. printing out medical prescriptions).

Dipstick
You are right about `strncpy()`, but wrong about `strncat()`. `strncat()` was not designed for use with fixed-length fields - it was actually designed to be a `strcat()` that limits the number of characters concatenated. It is quite easy to use this as a "safe `strcat()`" by keeping track of the space remaining in the buffer when doing multiple concatenations, and even easier to use it as a "safe `strcpy()`" (by setting the first character of the destination buffer to `'\0'` before calling it). `strncat()` *always* terminates the destination string, and it does not write extra `'\0'`s.
caf
@caf - yes but strncat() is totally useless as it takes as a parameter the maximum length to be copied - not the length of the destination buffer. To prevent buffer overflow you need to know the length of the current destination string, and if you know that why would you use strncat() - which has to work it out the dest length again - rather than just strlcat() the source string to the end of the dest string.
Dipstick
That still doesn't change the fact that you imply that `strncat()` doesn't always nul-terminate the destination, and that it was designed for use with fixed-length fields, both of which are wrong.
caf
@chrisharris: `strncat()` will work properly regardless of the length of the source string, while `strcat()` won't. The problem with `strlcat()` here is that it's not a standard C function.
David Thornley
@caf - you are correct about strncat(). I've never actually used it because - as I pointed out above - it is totally useless. It should therefore still be avoided.
Dipstick
@caf, @David, not only is "strlcat()" not a standard C function, but it is also not in IEEE Std 1003.1 or in the Single UNIX Specification. It is only in BSD-based OSs.
Michael Aaron Safyan
+1 for giving a correct explanation of the idiotic strn* functions.
R..
+1  A: 

It is very hard to use scanf safely. Good use of scanf can avoid buffer overflows, but you are still vulnerable to undefined behavior when reading numbers that don't fit in the requested type. In most cases, fgets followed by self-parsing (using sscanf, strchr, etc.) is a better option.

But I wouldn't say "avoid scanf all the time". scanf has its uses. As an example, let's say you want to read user input in a char array that's 10 bytes long. You want to remove the trailing newline, if any. If the user enters more than 9 characters before a newline, you want to store the first 9 characters in the buffer and discard everything until the next newline. You can do:

char buf[10];
scanf("%9[^\n]%*[^\n]", buf));
getchar();

Once you get used to this idiom, it's shorter and in some ways cleaner than:

char buf[10];
if (fgets(buf, sizeof buf, stdin) != NULL) {
    char *nl;
    if ((nl = strrchr(buf, '\n')) == NULL) {
        int c;
        while ((c = getchar()) != EOF && c != '\n') {
            ;
        }
    } else {
        *nl = 0;
    }
}
Alok
+3  A: 

It is probably worth adding again that strncpy() is not the general-purpose replacement for strcpy() that it's name might suggest. It is designed for fixed-length fields that don't need a nul-terminator (it was originally designed for use with UNIX directory entries, but can be useful for things like encryption key fields).

It is easy, however, to use strncat() as a replacement for strcpy():

if (dest_size > 0)
{
    dest[0] = '\0';
    strncat(dest, source, dest_size - 1);
}

(The if test can obviously be dropped in the common case, where you know that dest_size is definitely nonzero).

caf
A: 

atoi is not thread safe. I use strtol instead, per recommendation from the man page.

Fred
This sounds like it applies to one particular implementation. There's no reason why `strtol()` would be thread-safe and `atoi()` wouldn't be.
David Thornley
+2  A: 

Also check out Microsoft's list of banned APIs. These are APIs (including many already listed here) that are banned from Microsoft code because they are often misused and lead to security problems.

You may not agree with all of them, but they are all worth considering. They add an API to the list when its misuse has led to a number of security bugs.

Adrian McCarthy