Using pointer arithmetic, it's possible to assign characters from one array to another. My question is, how does one do it given arbitrary start and stop points?

int main(void)
{
    char string1[] = "something"; //[s][o][m][e][t][h][i][n][g][\0]
    int start = 2, count = 3;
    char string2[10] = {0};
    char *ptr1 = &string1[start];
    char *ptr2 = string2;
    while (*ptr2++ = *ptr1++) { } //but stop after 3 elements???
    printf("%s",&string2);
}

There's some kind of pointer arithmetic I'm missing to count/test the number of elements copied from a particular array. I do NOT want to declare an integer to count the loop iterations! I want to do it all using pointers. Thanks!

+2  A: 
#include <stdio.h>

int main(void)
{
    char string1[] = "something"; //[s][o][m][e][t][h][i][n][g][\0]
    int start = 2, count = 3;
    char string2[10] = {0};
    char *ptr1 = &string1[start];
    char *stop = ptr1 + count;
    char *ptr2 = string2;
    while ((ptr1 < stop) && (*ptr2++ = *ptr1++));
    printf("%s",string2);
    return 0;
}
Graham Lee
thanks! I was missing the plain ole addition of ptr1 + count to test the memory location at the end. Not perfect, but enough to get me going in the right direction.
randy melder
Missing the null terminator for string2, I fear...
Jonathan Leffler
@Jonathan: `char string2[10] = {0};`
Graham Lee
@Graham: Oh, yes...in this case, the target buffer is pre-initialized with zero bytes. In the general case, adding the null is necessary.
Jonathan Leffler
@Jonathan: or you can take my general solution to include zero-initialisation. I'm a security guy, I don't like junk data.
Graham Lee
@Graham - me too; me neither :D. But it depends how the buffer is going to be used. If the buffer will be treated as a null terminated string, there is no junk data in the initialized part that will be used. If the buffer will be written in its entirety somewhere (like disk), then the junk after the terminal null does matter. If you're unsure, then nulling out the residual data is a good idea - one possible advantage of `strncpy()` after all. But most often, in my experience, the buffer is treated as a string and there is little or no danger in simple null termination. YMMV, of course.
Jonathan Leffler
@Jonathan: I'd never use `strncpy()`, by the way. It doesn't guarantee to null-terminate the destination. Consider `strlcpy()` if your platform supports it.
Graham Lee
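To make the strncpy() point from the comments above concrete, here is a minimal sketch. The helper name bounded_copy is hypothetical (it is not a library function); it assumes the caller passes the full size of the destination buffer. It relies only on standard behaviour: strncpy() copies at most n characters, pads with zero bytes if the source is shorter, and does not add a terminator when the source is n characters or longer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dst_size bytes),
 * always leaving dst null-terminated, truncating if necessary. */
static void bounded_copy(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;                       /* no room even for the terminator */

    /* Copy at most dst_size - 1 characters; strncpy() zero-fills the
     * remainder when src is shorter than that. */
    strncpy(dst, src, dst_size - 1);

    /* strncpy() does not terminate on truncation, so do it explicitly. */
    dst[dst_size - 1] = '\0';
}
```

Calling bounded_copy(buf, sizeof buf, "something") with a 4-byte buf leaves "som" in it. Where strlcpy() is available it does roughly the same job, but it is not part of standard C.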
A: 

In order to get the size (i.e. number of elements) of a fixed-size array (an actual array, not a pointer to its first element), you would usually do

sizeof(string1) / sizeof(*string1)

which will divide the size (in bytes) of the array by the size (in bytes) of each element, thus giving you the number of elements in the array.
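A small sketch of that division, assuming a hypothetical macro name ARRAY_LEN and two demo functions that exist only for illustration. It only works where the array itself is in scope; applied to a pointer (for example, a function parameter), sizeof yields the size of the pointer instead.

```c
#include <stddef.h>

/* Hypothetical convenience macro: element count of a real array. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof(*(a)))

/* Element count of the question's string, including the trailing '\0'. */
static size_t demo_char_len(void)
{
    char string1[] = "something";   /* 9 characters + terminator */
    return ARRAY_LEN(string1);
}

/* The same macro works for any element type. */
static size_t demo_int_len(void)
{
    int values[] = { 1, 2, 3, 4 };
    return ARRAY_LEN(values);
}
```

Here demo_char_len() gives 10 and demo_int_len() gives 4.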

But as you're obviously trying to implement a strcpy clone, you could simply break the loop if the source character *ptr1 is '\0' (C strings are zero-terminated). If you only want to copy N characters, you could break if ptr1 >= string1 + start + count.

AndiDog
sizeof(char)==1. That's guaranteed by the standard.
Graham Lee
@Graham Lee: Yes, but this is the generic solution for arrays.
AndiDog
+3  A: 

When you write ptr1++;, it is equivalent to ptr1 = ptr1 + 1;. Adding an integer n to a pointer advances the address it holds by n times the size (in bytes) of the type being pointed to. If ptr1 is a char pointer with value 0x5678, then incrementing it by one makes it 0x5679, because sizeof(char) == 1. But if ptr1 were a Foo * and sizeof(Foo) == 12, then incrementing the pointer would make its value 0x5684 (0x5678 + 12).
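The scaling can be checked directly by measuring the byte distance between p and p + 1. This is only a sketch: struct Foo here is a hypothetical three-int record, and its actual size is whatever your compiler reports for sizeof(struct Foo), not necessarily 12.

```c
#include <stddef.h>

/* Hypothetical record type; its size depends on the platform. */
struct Foo { int a; int b; int c; };

/* Bytes between a char pointer and its successor. */
static ptrdiff_t char_stride(void)
{
    char bytes[2] = { 0 };
    char *p = bytes;
    return (char *)(p + 1) - (char *)p;
}

/* Bytes between a struct Foo pointer and its successor. */
static ptrdiff_t foo_stride(void)
{
    struct Foo items[2] = { { 0, 0, 0 }, { 0, 0, 0 } };
    struct Foo *p = items;
    /* Casting to char * turns the difference into a byte count. */
    return (char *)(p + 1) - (char *)p;
}
```

char_stride() is 1, and foo_stride() equals sizeof(struct Foo), whatever that is on your machine.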

If you want to point to an element that is 3 elements away from an element you already have a pointer to, you just add 3 to that pointer. In your question, you wrote:

char *ptr1 = &string1[start]; // array notation

Which is the same thing as:

char *ptr1 = string1 + start; // pointer arithmetic

You could rewrite as follows:

#include <assert.h>
#include <stdio.h>

int main(void)
{
    char string1[] = "something"; //[s][o][m][e][t][h][i][n][g][\0]
    int start = 2, count = 3;
    char string2[10] = {0};

    // Ensure there is enough room to copy the substring
    // and a terminating null character.
    assert(count < sizeof(string2));

    // Set pointers to the beginning and end of the substring.
    const char *from = string1 + start;
    const char *end = from + count;

    // Set a pointer to the destination.
    char *to = string2;

    // Copy the indicated characters from the substring,
    // possibly stopping early if the end of the substring
    // is reached before count characters have been copied.
    while (from < end && *from)
    {
        *to++ = *from++;
    }

    // Ensure the destination string is null terminated
    *to = '\0';

    printf("%s", string2);
}

Using const and meaningful variable names (from, to, or src, dst, instead of ptr1, ptr2) helps you avoid mistakes. Using assert and ensuring the string is null-terminated helps you avoid having to debug segfaults and other weirdness. In this case the destination buffer is already zeroed, but when you copy parts of this code to use in another program it may not be.

Dan
+1  A: 

I usually use a specific set of variable names in these situations, called:

  • src - source
  • dst - destination
  • end - the end of either the source (used here) or the destination

So:

#include <assert.h>
#include <stdio.h>

int main(void)
{
    char string1[] = "something";
    int start = 2;
    int count = 3;
    char string2[10] = {0};

    const char *src = &string1[start];
    const char *end = &string1[start+count];
    char *dst = string2;

    assert(count < sizeof(string2));
    while (src < end)
        *dst++ = *src++;
    *dst = '\0';             // Null-terminate copied string!

    printf("%s", string2);
    return(0);
}

Or, more plausibly, packaged as a function:

#include <stddef.h>
#include <stdio.h>

char *copy_substr(char *dst, const char *str, size_t start, size_t len)
{
    const char *src = str + start;
    const char *end = src + len;
    while (src < end)
        *dst++ = *src++;
    *dst = '\0';
    return(dst);
}

int main(void)
{
    char string1[] = "something";
    char *end;
    char string2[10] = {0};

    end = copy_substr(string2, string1, 2, 3);
    printf("%s", string2);
    return(0);
}

The function returns a pointer to the end of the string, which is unconventional and offers no marked benefit in this example, but which does have some merits when you are building a string piecemeal:

struct substr
{
    const char *str;
    size_t      off;
    size_t      len;
};

static struct substr list[] =
{
    { "abcdefghijklmnopqrstuvwxyz",  2, 5 },
    ...
    { "abcdefghijklmnopqrstuvwxyz", 18, 3 },
};

int main(void)
{
    char buffer[256];
    char *str = buffer;
    char *end = buffer + sizeof(buffer) - 1;
    size_t i;

    for (i = 0; i < 5; i++)
    {
         if (str + list[i].len >= end)
             break;
         str = copy_substr(str, list[i].str, list[i].off, list[i].len);
    }

    printf("%s\n", buffer);
    return(0);
}

The main point is that the return value - a pointer to the NUL at the end of the string - is what you need for string concatenation operations. (In this example, where the strings have known lengths, you could manage without the return value and still avoid calling strlen() or strcat() repeatedly; in contexts where the called function copies an amount of data that cannot be determined by the calling routine, the pointer to the end is even more useful.)
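For anyone who wants something that compiles as it stands, here is a cut-down sketch of the same idea. copy_substr is repeated so the block is self-contained; build_demo, its two fixed pieces and its slightly different bounds check are hypothetical stand-ins for the list-driven loop, not a replacement for it.

```c
#include <stddef.h>

/* Copy len characters of str starting at offset start; return a
 * pointer to the terminating NUL written into dst. */
static char *copy_substr(char *dst, const char *str, size_t start, size_t len)
{
    const char *src = str + start;
    const char *end = src + len;
    while (src < end)
        *dst++ = *src++;
    *dst = '\0';
    return dst;
}

/* Hypothetical demo: append two substrings of the alphabet to buffer,
 * skipping any piece that would not fit. Returns the final length. */
static size_t build_demo(char *buffer, size_t size)
{
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz";
    char *str = buffer;
    const char *end;

    if (size == 0)
        return 0;
    end = buffer + size - 1;          /* last slot is reserved for the NUL */
    *str = '\0';

    /* Each call resumes at the NUL the previous call returned, so no
     * strlen() or strcat() is needed to find the append position. */
    if ((size_t)(end - str) >= 5)
        str = copy_substr(str, alphabet, 2, 5);    /* "cdefg" */
    if ((size_t)(end - str) >= 3)
        str = copy_substr(str, alphabet, 18, 3);   /* "stu" */

    return (size_t)(str - buffer);
}
```

With a 32-byte buffer this leaves "cdefgstu" in it and returns 8; with a 7-byte buffer only "cdefg" fits.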

Jonathan Leffler