ansaurus

Question

Homework: In C, how does one get a substring of an array using only pointers?

Answer 1

+2 A:

#include <stdio.h>

int main(void)
{
    char string1[] = "something"; //[s][o][m][e][t][h][i][n][g][\0]
    int start = 2, count = 3;
    char string2[10] = {0};
    char *ptr1 = &string1[start];
    char *stop = ptr1 + count;
    char *ptr2 = string2;
    while ((ptr1 < stop) && (*ptr2++ = *ptr1++));
    printf("%s",string2);
    return 0;
}

Graham Lee 2010-02-21 15:54:20

thanks! I was missing the plain ole addition of ptr1 + count to test the memory location at the end. Not perfect, but enough to get me going in the right direction.

randy melder 2010-02-21 17:04:48

Missing the null terminator for string2, I fear...

Jonathan Leffler 2010-02-21 17:53:07

@Jonathan: `char string2[10] = {0};`

Graham Lee 2010-02-21 19:23:19

@Graham: Oh, yes...in this case, the target buffer is pre-initialized with zero bytes. In the general case, adding the null is necessary.

Jonathan Leffler 2010-02-21 19:56:17

@Jonathan: or you can take my general solution to include zero-initialisation. I'm a security guy, I don't like junk data.

Graham Lee 2010-02-21 20:32:16

@Graham - me too; me neither :D. But it depends how the buffer is going to be used. If the buffer will be treated as a null terminated string, there is no junk data in the initialized part that will be used. If the buffer will be written in its entirety somewhere (like disk), then the junk after the terminal null does matter. If you're unsure, then nulling out the residual data is a good idea - one possible advantage of `strncpy()` after all. But most often, in my experience, the buffer is treated as a string and there is little or no danger in simple null termination. YMMV, of course.

Jonathan Leffler 2010-02-21 22:31:39

@Jonathan: I'd never use `strncpy()`, by the way. It doesn't guarantee to null-terminate the destination. Consider `strlcpy()` if your platform supports it.

Graham Lee 2010-02-21 23:05:15

Answer 2

A:

To get the size (i.e. the number of elements) of an array whose size is known at compile time, you would usually write

sizeof(string1) / sizeof(*string1)

which will divide the size (in bytes) of the array by the size (in bytes) of each element, thus giving you the number of elements in the array.
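
As a rough sketch of that idiom (the macro name ARRAY_LEN and the two helper functions are made up for illustration), the division can be wrapped in a macro. It only works where the operand is a true array; once an array has been passed to a function it has decayed to a pointer, and sizeof would report the pointer's size instead.

```c
#include <stddef.h>

/* Hypothetical helper macro: element count of a true array (not a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof(*(a)))

/* Element count of the char array from the question. */
size_t string_array_len(void)
{
    char string1[] = "something";   /* 9 characters plus the terminating '\0' */
    return ARRAY_LEN(string1);
}

/* Element count of an int array, where sizeof(*numbers) is usually not 1. */
size_t int_array_len(void)
{
    int numbers[7] = {0};
    return ARRAY_LEN(numbers);
}
```

For the char array the division is by 1, so the result equals sizeof(string1); for the int array the division is what turns bytes into elements.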

But as you're obviously trying to implement a strcpy clone, you could simply break the loop if the source character *ptr1 is '\0' (C strings are zero-terminated). If you only want to copy N characters, you could break if ptr1 >= string1 + start + count.

AndiDog 2010-02-21 15:54:46

sizeof(char)==1. That's guaranteed by the standard.

Graham Lee 2010-02-21 15:59:03

@Graham Lee: Yes, but this is the generic solution for arrays.

AndiDog 2010-02-21 16:56:45

Answer 3

+3 A:

When you write ptr1++;, it is equivalent to ptr1 = ptr1 + 1;. Adding an integer to a pointer moves the memory location of the pointer by the size (in bytes) of the type being pointed to. If ptr1 is a char pointer with value 0x5678 then incrementing it by one makes it 0x5679, because sizeof(char) == 1. But if ptr1 was a Foo *, and sizeof(Foo) == 12, then incrementing the pointer would make its value 0x5684.
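
A small sketch can make the stride visible. The struct Foo here is a hypothetical 12-byte stand-in for the Foo mentioned above, and the function names are invented for the example; each function measures, in bytes, how far p + 1 is from p.

```c
#include <stddef.h>

/* Hypothetical 12-byte type standing in for Foo (an assumption for this sketch). */
struct Foo { char bytes[12]; };

/* Distance between p and p + 1 for a char pointer. */
size_t char_stride(void)
{
    char buf[2];
    char *p = buf;
    return (size_t)((p + 1) - p);   /* counts chars, and sizeof(char) == 1 */
}

/* Distance in bytes between p and p + 1 for a struct Foo pointer. */
size_t foo_stride(void)
{
    struct Foo arr[2];
    struct Foo *p = arr;
    /* Cast to char * so the subtraction counts bytes, not elements. */
    return (size_t)((char *)(p + 1) - (char *)p);
}
```

The first function reports 1 and the second reports sizeof(struct Foo), which matches the 0x5678 to 0x5684 jump described above.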

If you want to point to an element that is 3 elements away from an element you already have a pointer to, you just add 3 to that pointer. In your question, you wrote:

char *ptr1 = &string1[start]; // array notation

Which is the same thing as:

char *ptr1 = string1 + start; // pointer arithmetic
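
The two spellings agree because s[i] is defined as *(s + i), so &s[i] is just s + i. A quick check, using a hypothetical helper named same_address, might look like this:

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if &s[i] and s + i are the same address
   for every index from 0 up to and including len (the terminating '\0'). */
int same_address(const char *s, size_t len)
{
    size_t i;
    for (i = 0; i <= len; i++) {
        if (&s[i] != s + i)
            return 0;
    }
    return 1;
}
```

Calling same_address("something", 9) returns 1.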

You could rewrite as follows:

#include <assert.h>
#include <stdio.h>

int main(void)
{
    char string1[] = "something"; //[s][o][m][e][t][h][i][n][g][\0]
    int start = 2, count = 3;
    char string2[10] = {0};

    // Ensure there is enough room to copy the substring
    // and a terminating null character.
    assert(count >= 0 && (size_t)count < sizeof(string2));

    // Set pointers to the beginning and end of the substring.
    const char *from = string1 + start;
    const char *end = from + count;

    // Set a pointer to the destination.
    char *to = string2;

    // Copy the indicated characters from the substring,
    // possibly stopping early if the end of the source string
    // is reached before count characters have been copied.
    while (from < end && *from)
    {
        *to++ = *from++;
    }

    // Ensure the destination string is null terminated
    *to = '\0';

    printf("%s", string2);
    return 0;
}

Using const and meaningful variable names (from, to, or src, dst, instead of ptr1, ptr2) helps you avoid mistakes. Using assert and ensuring the string is null-terminated helps you avoid having to debug segfaults and other weirdness. In this case the destination buffer is already zeroed, but when you copy parts of this code to use in another program it may not be.

Dan 2010-02-21 16:44:26

Answer 4

+1 A:

I usually use a specific set of variable names in these situations, called:

src - source
dst - destination
end - the end of either the source (used here) or the destination

So:

#include <assert.h>
#include <stdio.h>

int main(void)
{
    char string1[] = "something";
    int start = 2;
    int count = 3;
    char string2[10] = {0};

    const char *src = &string1[start];
    const char *end = &string1[start+count];
    char *dst = string2;

    // Leave room for the substring plus the terminating null.
    assert(count >= 0 && (size_t)count < sizeof(string2));
    while (src < end)
        *dst++ = *src++;
    *dst = '\0';             // Null-terminate copied string!

    printf("%s", string2);
    return(0);
}

Or, more plausibly, packaged as a function:

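#include <stddef.h>
#include <stdio.h>

// Copies len characters of str, starting at offset start, into dst.
// The caller must ensure dst has room for len + 1 characters and that
// start + len does not run past the end of str.
char *copy_substr(char *dst, const char *str, size_t start, size_t len)
{
    const char *src = str + start;
    const char *end = src + len;
    while (src < end)
        *dst++ = *src++;
    *dst = '\0';
    return(dst);             // Points at the terminating null
}

int main(void)
{
    char string1[] = "something";
    char *end;
    char string2[10] = {0};

    end = copy_substr(string2, string1, 2, 3);
    (void)end;               // Not needed in this small example
    printf("%s", string2);
    return(0);
}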

The function returns a pointer to the end of the string, which is unconventional and doesn't provide a marked benefit in this example, but it does have some merits when you are building a string piecemeal:

struct substr
{
    const char *str;
    size_t      off;
    size_t      len;
};

static struct substr list[] =
{
    { "abcdefghijklmnopqrstuvwxyz",  2, 5 },
    ...
    { "abcdefghijklmnopqrstuvwxyz", 18, 3 },
};

int main(void)
{
    char buffer[256];
    char *str = buffer;
    char *end = buffer + sizeof(buffer) - 1;
    size_t i;

    for (i = 0; i < 5; i++)
    {
         if (str + list[i].len >= end)
             break;
         str = copy_substr(str, list[i].str, list[i].off, list[i].len);
    }

    printf("%s\n", buffer);
    return(0);
}

The main point is that the return value - a pointer to the NUL at the end of the string - is what you need for string concatenation operations. (In this example, where the strings have known lengths, you could manage without the return value and still avoid calling strlen() or strcat() repeatedly; in contexts where the called function copies an amount of data that cannot be determined by the calling routine, the pointer to the end is even more useful.)

Jonathan Leffler 2010-02-21 16:56:28
