views:

639

answers:

13

I’m much less experienced in C than I am in higher-level languages. At Cisco, we use C, and I sometimes run into something that would be easy to do in Java or Python, but very difficult to do in C. Now is one of those times.

I have a dynamically-allocated array of unsigned integers which I need to convert to a comma-separated string for logging. While the integers are not likely to be very large, they could conceptually be anywhere from 0 to 4,294,967,295 In Python, that’s one short line.

my_str = ','.join(my_list)

How elegantly can people do this in C? I came up with a way, but it’s gross. If anyone knows a nice way to do it, please enlighten me.

A: 

Assuming that when you mention "for logging" you mean writing to a log file, then your solution could look something like this (pseudocoded):

for (int x in array) {
    fprintf(log, "%d", x);
    if (! last element)
        fputc(log, ',');
}
Steven Schlansker
A: 
char buf [11 * sizeof (my_list)];
for (int n = 0, int j = 0;  j < sizeof (my_list) / sizeof (my_list [0]);  ++j)
    n += sprintf (&buf [n], "%s%u",   (j > 0) ? "," : "",  my_list [j]);
wallyk
Need to be careful with `sprintf` here - you don't want a buffer overflow.
Michael E
Is sprintf safe to call in this way? To make the code comparable to the python posted, you should show the initialization of buf.
Suppressingfire
Of course one would allocate buf[] to be big enough. 11 times the number of numbers in my_list should do it.
wallyk
+6  A: 

Code now tested and builds under gcc.

In contrast to other answers, does not mandate C99.

The real problem here is not knowing the length of the string you're going to need. Getting a number is as easy as sprintf("%u", *num) using num to walk your array of ints, but how much space are you going to need? To avoid overrunning a buffer, you have to keep track of a lot of integers.

size_t join_integers(const unsigned int *num, size_t num_len, char *buf, size_t buf_len) {
    size_t i;
    unsigned int written = 0;

    for(i = 0; i < num_len; i++) {
        written += snprintf(buf + written, buf_len - written, (i != 0 ? ", %u" : "%u"),
            *(num + i));
        if(written == buf_len)
            break;
    }

    return written;
}

Notice, that I keep track of how much of the buffer I have used and use snprintf so I don't overrun the end. snprintf will tack on a \0, but since I'm using buf + written I will start at the \0 of the previous snprintf.

In use:

int main() {
    size_t foo;
    char buf[512];

    unsigned int numbers[] = { 10, 20, 30, 40, 1024 };

    foo = join_integers(numbers, 5, buf, 512);
    printf("returned %u\n", foo);
    printf("numbers: %s\n", buf);
}

Outputs:

returned 20
numbers: 10, 20, 30, 40, 1024

Forcing the limiting to kick in, instead of overrunning:

char buf[15];    
foo = join_integers(numbers, 5, buf, 14);
buf[14] = '\0';

Outputs, expectedly:

returned 14
numbers: 10, 20, 30, 4
Jed Smith
Why do you say it does not mandate C99? snprintf is from C99.
Roger Pate
Technically, `snprintf()` is a C99 function and not in C89. However, it is widely enough available not to be a major issue. Also, use sizeof(buf) rather than 512, even in test code.
Jonathan Leffler
I suppose one could argue for some platforms that `snprintf()` *is* in SUSV2.
DigitalRoss
@Jonathan Leffler: then why don't just use sizeof(buf) inside the function?
Spidey
Feel free to leave a message as to why you're downvoting -- this answer is costing me some rep, and I'm curious why.
Jed Smith
+1 for actually being safe with the C string.
Doug T.
Heh, I was downvoted too. We both had snarky titles, I guess we asked for it.
DigitalRoss
Instead of: "%s%u", i != 0 ? ", " : ""I'd suggest: i != 0 ? ", %u" : "%u"Simpler and faster. Also, you don't appear to use the variable declared here: const unsigned int *p = num;
Darius Bacon
@Darius: I did in a prior version, thanks for the catch.
Jed Smith
@Spidey: you can't use 'sizeof(buf)' inside a function because C passes pointers, so the size given would be the size of the pointer (probably 4 or 8) and not the size of the array it points at.
Jonathan Leffler
A: 

Personally, for simplicity, and probably speed too, I would malloc a large buffer, with space for an array the size of "4,294,967,295" and ", " for each element. It's not space efficient during the creation of the list though!

Then I'd sprintf the ints into there, and append ", " to all elements

At last I'd realloc the pointer to have no more space than required. (size = strlen)

sprintf: On success, the total number of characters written is returned. This count does not include the additional null-character automatically appended at the end of the string.

That's how you keep track of where strcpy to in the string. :)

Hope that helps! :)

And if you just want to print them out, see the other replies instead. (for-loop and printf)

pbos
size = strlen + 1 ;)
Heath Hunnicutt
Actually, size = strlen-1, since you don't want the trailing ", ". :)
pbos
A: 

Unfortunately there will always be three cases:

  • Empty list (no commas, no items)
  • One item (no commas, one item)
  • Two or more items (n-1 commas, n items)

The join method hides this complexity for you, which is why it is so nice.

In C, I'd do:

for (i = 0; i < len; i++)
{
    if (i > 0)   /* You do need this separate check, unfortunately. */
        output(",");
    output(item[i]);
}

Where output is however you're appending to the string. It could be as simple as strcat on a pre-allocated buffer, or a printf to some stream (like a memory stream I learned about today in http://stackoverflow.com/questions/1741191/creating-a-file-stream-that-results-in-a-string :-).

If you were annoyed about doing that check every time for all i >= 1, you could do:

if (i > 0)
{
    output(item[0]);
    for (i = 1; i < len; i++)
    {
        output(",");
        output(item[i]);
    }
}
Edmund
Use Duff's Device (see my answer) for how to avoid both the check-per-loop and duplicating output logic.
Roger Pate
+2  A: 

What about this?

char *join_int_list(const unsigned int *list, size_t n_items)
{
     enum { SIZEOF_INT_AS_STR = sizeof("4294967295,")-1 };
     char *space = malloc(SIZEOF_INT_AS_STR * n_items);
     if (space != 0)
     {
         size_t i;
         char *pad = "";
         char *dst = space;
         char *end = space + SIZEOF_INT_AS_STR * n_items;
         for (i = 0; i < n_items; i++)
         {
              snprintf(dst, end - dst, "%s%u", pad, list[i]);
              pad = ",";
              dst += strlen(dst);
         }
         space = realloc(space, dst - space + 1);
     }
     return(space);
}

It is the responsibility of the caller to release the returned pointer - and to check that it is not null before using it. The 'realloc()' releases extra space if the amount allocated was enough too large to make it worth while. This code is blithely assuming that the values are indeed 32-bit unsigned integers; if they can be bigger, then the enum needs appropriate adjustment.

Tested code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *join_int_list(const unsigned int *list, size_t n_items)
{
    enum { SIZEOF_INT_AS_STR = sizeof("4294967295,")-1 };
    char *space = malloc(SIZEOF_INT_AS_STR * n_items);
    if (space != 0)
    {
        size_t i;
        char *pad = "";
        char *dst = space;
        char *end = space + SIZEOF_INT_AS_STR * n_items;
        for (i = 0; i < n_items; i++)
        {
            snprintf(dst, end - dst, "%s%u", pad, list[i]);
            pad = ",";
            dst += strlen(dst);
        }
        space = realloc(space, dst - space + 1);
    }
    return(space);
}

int main(void)
{
    static unsigned int array[]= { 1, 2, 3, 49, 4294967295U, 0, 332233 };
    char *str = join_int_list(array, sizeof(array)/sizeof(array[0]));
    printf("join: %s\n", str);
    free(str);
    return(0);
}

Checked with valgrind - seems to be OK.


Discussion of converting INT_MAX or UINT_MAX to string:

You can use sizeof("," STRINGIZE(INT_MAX)) instead of hard coding it. The stringize macro is a common cpp tool that can be defined as #define STRINGIZE_(v) #v and #define STRINGIZE(v) STRINGIZE_(v). – R. Pate

@R Pate: good idea - yes, you could do that quite effectively. Actually, there are two interesting ideas in there: the use of string concatenation with sizeof() (parentheses required for clarity - but string concatenation happens early enough that the compiler isn't worried) and the use of a stringize operation on INT_MAX. – Jonathan Leffler

Using a stringize operation on INT_MAX is not a good idea - it just has to be a "constant expression", not necessarily a sequence of digits. It could be defined as ((1<<32)-1), or even something whacky like __int_max, as long as the compiler lets you use it anywhere you can use a constant expression. – caf

And @caf is right. Consider this code:

#include <limits.h>
#include <stdio.h>

#undef INT_MAX
#define INT_MAX (INT_MIN-1 - 100 + 100)

#define QUOTER(x)   #x
#define STRINGIZER(x)   QUOTER(x)

enum { SIZEOF_INT_AS_STR = sizeof("4294967295,")-1 };
enum { SIZEOF_INT_AS_STR_1 = sizeof(STRINGIZER(INT_MAX) ",")-1 };

int main(void)
{
    printf("size = %d = %d\n", SIZEOF_INT_AS_STR, SIZEOF_INT_AS_STR_1);
    printf("INT_MAX  = %d\n", INT_MAX);
    printf("UINT_MAX = %u\n", UINT_MAX);
    return(0);
}

This doesn't even compile on MacOS X 10.5.8 with GCC 4.0.1 - because the identifier INT_MAX is not defined. A preliminary version of the code that did not print INT_MAX or UINT_MAX worked; it showed that the value of SIZEOF_INT_AS_STR_1 was 31 - so @caf was correct. Adding the double-checking on the values of INT_MAX and UINT_MAX then failed to compile, which did surprise me. A look at the output from gcc -E reveals why:

enum { SIZEOF_INT_AS_STR = sizeof("4294967295,")-1 };
enum { SIZEOF_INT_AS_STR_1 = sizeof("((-INT_MAX - 1)-1 - 100 + 100)" ",")-1 };

int main(void)
{
 printf("size = %d = %d\n", SIZEOF_INT_AS_STR, SIZEOF_INT_AS_STR_1);
 printf("INT_MAX  = %d\n", ((-INT_MAX - 1)-1 - 100 + 100));
 printf("UINT_MAX = %u\n", (((-INT_MAX - 1)-1 - 100 + 100) * 2U + 1U));
 return(0);
}

As predicted, the string for SIZEOF_IN_AS_STR_1 is not a digit string at all. The pre-processor can evaluate the expression (as much as it needs to), but does not have to produce a digit string.

The expansion of INT_MAX turns out to be in terms of INT_MIN, and INT_MIN is, in turn, defined in terms of INT_MAX, so when the rewritten INT_MAX macro is evaluated, the 'recursive expansion' is prevented by the C pre-processor rules of operation, and INT_MAX appears in the pre-processed output - to the confusion of all.

So, there are multiple reasons why it the superficially attractive idea turns out to be a bad idea.

Jonathan Leffler
Note that SIZEOF_INT_AS_STR is only valid for systems with 32-bit ints.
Michael E
Yup: added a comment to that effect, but the range quoted by the asker effectively enforces that, so it meets the specification.
Jonathan Leffler
You can use `sizeof("," STRINGIZE(INT_MAX))` instead of hard coding it. The stringize macro is a common cpp tool that can be defined as `#define STRINGIZE_(v) #v` and `#define STRINGIZE(v) STRINGIZE_(v)`.
Roger Pate
@R Pate: good idea - yes, you could do that quite effectively. Actually, there are two interesting ideas in there: the use of string concatenation with `sizeof()` (parentheses required for clarity - but string concatenation happens early enough that the compiler isn't worried) and the use of a stringize operation on `INT_MAX`.
Jonathan Leffler
Using a stringize operation on `INT_MAX` is not a good idea - it just has to be a "constant expression", not necessarily a sequence of digits. It could be defined as `((1<<32)-1)`, or even something whacky like `__int_max`, as long as the compiler lets you use it anywhere you can use a constant expression.
caf
A: 
unsigned *a; /* your input a[N] */
unsigned i,N;
char *b,*m;
b=m=malloc(1+N*11); /* for each of N numbers: 10 digits plus comma (plus end of string) */
for (i=0;i<N;++i)
  b+=sprintf(b,"%u,",a[i]);
if (N>0) b[-1]=0; /* delete last trailing comma */
/* now use m */
free(m);

pretty, right? :)

wrang-wrang
Mine is the first good implementation posted. -1 for what?
wrang-wrang
A: 

If you want it to a file, Steven Schlansker's answer is fine.

If you want to put it in a string, though, things get more complicated. You can use sprintf, but you need to be careful that you don't run out of room in your string. If you have a C99-compatble snprintf (Linux, BSD, not Windows), the following (untested, uncompiled) code should work:

char *buf = malloc(1024); /* start with 1024 chars */
size_t len = 1024;
int pos = 0;
int rv;
int i;
for (i = 0; i < n; i++) {
    rv = snprintf(buf+pos, len - pos, "%s%d", i = 0 ? "" : ",", my_list[i]);
    if (rv < len - pos) {
        /* it fit */
        pos += rv;
    } else {
        len *= 2;
        buf = realloc(buf, len);
        if (!buf) abort();
        i--; /* decrement i to repeat the last iteration of the loop */
    }
}
return buf;

The caller must then free buf.

Michael E
Why start with 1024, and why allocate it yourself? Allow the caller to specify the size of his buffer, and keep it his responsibility. Work within what you're given.
Jed Smith
Agreed. If caller frees, it makes sense to have the caller allocate too.
Seamus
+1  A: 

You can actually use a library like Glib, which contains functions like

gchar* g_strjoin (const gchar *separator, ...);

Joins a number of strings together to form one long string, with the optional separator inserted between each of them. The returned string should be freed with g_free().

(you still need to use g_snprintf(), possibly with g_printf_string_upper_bound() to ensure space)

Suppressingfire
+1  A: 
#include <stdio.h>
#include <stdlib.h>

/* My approach is to count the length of the string required. And do a single alloc.
     Sure you can allocate more, but I don't know for how long this data will be retained.
*/ 

#define LEN(a) (sizeof a / sizeof *a)

int main(void) {

    unsigned a[] = {1, 23, 45, 523, 544};
    int i, str_len=0, t_written=0;
    char tmp[11]; /* enough to fit the biggest unsigned int */

    for(i = 0; i < LEN(a); i++) 
        str_len += sprintf(tmp, "%d", a[i]);

    /* total: we need LEN(a) - 1 more for the ',' and + 1 for '\0' */
    str_len += LEN(a);
    char *str = malloc(str_len); 
    if (!str) 
        return 0;

    if (LEN(a) > 1) {
        t_written += sprintf(str+t_written, "%d", a[0]);
        for(i = 1; i < LEN(a); i++)
            t_written += sprintf(str+t_written, ",%d", a[i]);
    } else if (LEN(a) == 1) 
        t_written += sprintf(str+t_written, "%d", a[0]);

    printf("%s\n", str);

    free(str);
    return 0;
}
prime_number
`#define LEN(a) (sizeof a / sizeof *a)`
Roger Pate
How much do you need to change to convert this main program into a reusable function?
Jonathan Leffler
I cringe every time I see people storing string or array lengths in `int` types. What's so terrible about `size_t`?
Chris Lutz
Chill out Chris. Everyone knows that sizeof returns size_t. For this demonstration, an int is just fine.
prime_number
Chris one more thing, I checked your profile, followed your website link, just to find out horrible C code like this: char ext[16], *lang = malloc(16*sizeof(char));while(--start) { /* go backwards, find the last period in the filename */sizeof(char) ? really?lol. How about you check the return value of lang before using it as well?And what's up with the formatting?
prime_number
+2  A: 

You guys and your needless special cases to take care of the trailing comma... It's cheaper just to destroy the last comma then to have a conditional check every time the loop runs.

:)

#include <stdio.h>

char* toStr(int arr[], unsigned int arrSize, char buff[])
{
    if (arr && arrSize && buff)
    {
        int* currInt = arr;
        char* currStr = buff;
        while (currInt < (arr + arrSize))
        {
            currStr += sprintf(currStr, "%d,", *currInt++);
        }
        *--currStr = '\0';
    }
    return buff;
}

int main()
{
    int arr[] = {1234, 421, -125, 15251, 15251, 52};
    char buff[1000];

    printf("Arr is:%s\n", toStr(arr, 6, buff));    
}

Assumes buff is big enough allocate it to be (length of largest int + 2) * arrSize). Inspired my memcpy :)

Edit I realized I had a brainfart earlier and could have just incremented by the return value of sprintf instead of storing temp. Apparently other answers do this as well, edited my answer to remove the 2 needless lines.

Edit2 Looks like wrang-wrang beat me to the punch! His answer is pretty much identical to mine and was submitted earlier. I humbly suggest giving him a + 1.

Doug T.
`*--currStr = '\0'` needs an explanatory comment to anybody who has to maintain the code after you. `(i == 0 ? :)` does not.
Jed Smith
I guess its a matter of opinion, but I +1'd yours for actually being safer about the string len.
Doug T.
This breaks if arrSize == 0. Also you need to s/size/arrSize/, at least.
Darius Bacon
@Darius - thanks. Fixed the function.
Doug T.
Still a problem: if arrSize is 0 then you miss setting buff[0] to '\0'.
Darius Bacon
A: 

Are you guys getting paid by the line? :-)


f() is declared with a char * parameter for prototyping purposes, just change char -> int. I interpreted the question as requiring a string as output rather than just code to write to a file.

#define PRINT(s, l, x, i) snprintf((s), (l), "%s%d", (i) ? ",":"", (x)[i]);

char *f(size_t len, char *x) {
  size_t  i, j = 0, k;

  for(i = 0; i < len; ++i)
      j += PRINT(NULL, 0, x, i);
  char *result = malloc(k = ++j);
  for(*result = i = j = 0; i < len; ++i)
      j += PRINT(result + j, k - j, x, i);
  return result;
}

Here is a test framework:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

// put f() here

int main(int ac, char **av) {
    for(int i = 1; i < ac; ++i) { 
        printf("%s\n", f(strlen(av[i]), av[i]));
    }
    return 0;
}
DigitalRoss
If you are willing to use a statically allocated temporary buffer just get rid of the first loop...
DigitalRoss
A: 
void join(int arr[], int len, char* sep, char* result){
    if(len==0){
        *result='\0';
    } else {
        itoa(arr[0],result,10);
        if(len > 1){
            strcat(result,sep);
            join(arr+1,len-1,sep,result+strlen(result));
        }
    }
}
ipeev