Is there an elegant, cross-platform, industry standard way of implementing substr() in C?

Or is it a case of every developer reinventing the wheel?

EDIT: Added 'cross-platform'.

+3  A: 

It's like this:

#include <string.h>
char* substr(int offset, int len, const char* input)
{
    /* this one assumes that strlen(input) >= offset */
    return strndup(input+offset, len);
}

EDIT: added the handling of offset>strlen and removed the strndup usage

#include <string.h>
#include <stdlib.h>

char* substr(size_t offset, size_t len, const char* input)
{
    const size_t inputLength = strlen(input);
    if(offset <= inputLength)
    {
        size_t resultLength = inputLength-offset;
        char* result = NULL;
        if (len < resultLength)
        {
            resultLength = len;
        }
        result = malloc(resultLength+1);
        if(NULL != result)
        {
            strncpy(result, input+offset, resultLength);
            result[resultLength] = 0;
        }
        return result;
    }
    else
    {
        /* Offset is larger than the string length */
        return NULL;
    }
}
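
A minimal sketch of how the second listing might be used. It repeats the function so that it compiles on its own, and swaps in memcpy() for strncpy(), which is safe here because the length is already clamped to the input. The helper substr_equals() is a hypothetical name introduced only for illustration. The point is the ownership contract: the caller must free() the result, and must be prepared for NULL (offset past the end, or allocation failure).

```c
#include <stdlib.h>
#include <string.h>

/* Same logic as the second listing, repeated so the sketch stands alone. */
char *substr(size_t offset, size_t len, const char *input)
{
    const size_t inputLength = strlen(input);
    size_t resultLength;
    char *result;

    if (offset > inputLength)
        return NULL;                 /* offset is past the end of the string */

    resultLength = inputLength - offset;
    if (len < resultLength)
        resultLength = len;          /* clamp to the requested length */

    result = malloc(resultLength + 1);
    if (result != NULL) {
        memcpy(result, input + offset, resultLength);
        result[resultLength] = '\0';
    }
    return result;
}

/* Hypothetical helper: returns 1 if the substring equals `expected`,
   0 otherwise (including when substr() returns NULL). */
int substr_equals(size_t offset, size_t len, const char *input,
                  const char *expected)
{
    char *s = substr(offset, len, input);
    int same;

    if (s == NULL)
        return 0;
    same = (strcmp(s, expected) == 0);
    free(s);                         /* the caller owns the returned buffer */
    return same;
}
```

With this, substr(7, 5, "Hello, world!") yields "world", an over-long len is clamped to the rest of the string, and an offset beyond the terminating nul yields NULL.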
Rudi
Is this cross-platform?
Gary Willoughby
@gary: yes. @rudi: You should really handle the case where `offset > strlen(input)`
JeremyP
@JeremyP: By what measure do you consider that "cross-platform"? `strndup` is a POSIX function not an ISO function and therefore not universally available.
Clifford
Even with POSIX, strndup() did not appear until 2006 as part of the standard. For this to be portable, your function would have to implement the functionality of strndup(). See the notes at the bottom of http://www.opengroup.org/onlinepubs/9699919799/functions/strdup.html
Tim Post
So it's cross most platforms. The only major missing one is possibly Windows and it would be fairly easy to implement.
JeremyP
@JeremyP +1 for cross most platforms
Rudi
@rudi +1 Your rewrite is better than relying on POSIX because it uses only an ISO C99 function (strncpy).
JeremyP
@Clifford: It turns out that an ISO conforming program is not necessarily "universally available". Visual Studio 2010 *still* does not fully support C99. However, strncpy is there, so Rudi's amended program will work.
JeremyP
You might as well use `memcpy()` there instead of `strncpy()`, since you know the exact length you're copying and you're adding the nul-terminator manually anyway.
caf
@JeremyP: "most platforms" only in terms of OS distributions, not in terms of deployed systems, and therefore not most end-users' systems or the largest target market. That excludes Windows, and also most embedded systems, where the function is either not provided or its use would be ill-advised. I did not specify or intend C99 compliance; if portability were an issue, that would be foolish, but the point is irrelevant with respect to strndup in any case.
Clifford
+4  A: 

My comment to the original question notwithstanding, the problem is that such a function needs a buffer to store the result, and it is not clear whether this should be provided by the caller or instantiated by the function (leaving the question of how it is later destroyed), or even created in place by modifying the original string. In different situations you may want different behaviour, so rolling your own may be beneficial (and is trivial in any case).

// Caller supplied destination
char* substr( const char* source, size_t start, size_t end, char* dest )
{
    memmove( dest, &source[start], end - start ) ;
    dest[end - start] = 0 ;
    return dest ;
}

// Automatically instantiated destination (caller must free() it, or it leaks!)
char* substr( const char* source, size_t start, size_t end )
{
    char* dest = malloc( end - start + 1) ;
    if( dest != NULL )   // malloc() may fail; NULL is then returned
    {
        memcpy( dest, &source[start], end - start ) ;
        dest[end - start] = 0 ;
    }
    return dest ;
}

// Modify in-place (original string truncated)
char* substr( char* source, size_t start, size_t end )
{
    source[end] = 0 ;   // end is exclusive, as in the versions above
    return &source[start] ;
}

Note that in all of the above, validating the arguments (bounds checking and determining that start < end) is left to the caller; in this respect they reflect the philosophy of the standard string library. In most cases I'd prefer the first, as it has the greatest utility and is most in keeping with the design of the standard library. memmove() is used rather than memcpy() so that the source and destination may overlap or be the same location; if you do not need that, memcpy() may be more efficient.
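
If you would rather not leave validation to the caller, a checked variant of the caller-supplied-destination form could look like the sketch below. The name substr_checked and the extra dest_size parameter are assumptions made for illustration, not part of any standard library. It keeps memmove(), so the destination may be the source buffer itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical checked variant: copies source[start..end) into dest.
   Returns dest on success, or NULL (without copying) if the range is
   invalid or dest_size cannot hold the result plus the terminator. */
char *substr_checked(const char *source, size_t start, size_t end,
                     char *dest, size_t dest_size)
{
    size_t source_len = strlen(source);
    size_t count;

    if (start > end || end > source_len)
        return NULL;                       /* range outside the string */

    count = end - start;
    if (count >= dest_size)
        return NULL;                       /* no room for count chars + '\0' */

    memmove(dest, source + start, count);  /* safe if dest overlaps source */
    dest[count] = '\0';
    return dest;
}
```

For example, given char buf[] = "Hello, world!", the call substr_checked(buf, 7, 12, buf, sizeof buf) leaves "world" in buf, which would be undefined behaviour with memcpy() if the regions overlapped.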

Clifford
In the last substr function, the source pointer should be of type char* and not const char*, because you are going to modify the string.
quinmars
Clifford