ansaurus

Question

Allocating sufficient memory for unknown length tokens

Answer 1

+5 A:

This is pretty much the right approach - with two tweaks.

Firstly, instead of adding a constant BUFF_CHUNK_SIZE to the buffer size, it's usually better to multiply the size by a fixed factor. This means that the number of reallocs on a long string of length N becomes proportional to log N rather than N, so the total time spent copying in realloc() drops from N² to at most N log N (in fact to about N, since the successive buffer sizes form a geometric series). It doesn't really matter what the factor is, as long as it is greater than 1 - 1.5 might be a good choice (n += n / 2;).
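
To get a feel for the difference, here is a rough sketch that only counts how many times the buffer would have to grow under each strategy. The helper names (next_capacity, count_geometric, count_additive) are made up for illustration, and the extra +1 in the growth step is my own assumption so that a capacity of 0 or 1 doesn't get stuck at n += n / 2.

```c
#include <stddef.h>

/* Hypothetical helper: next capacity under roughly 1.5x growth.
   The +1 is an added assumption so that capacities 0 and 1 still grow. */
static size_t next_capacity(size_t n)
{
    return n + n / 2 + 1;
}

/* Number of grow steps needed to reach at least `target` bytes
   when multiplying the capacity each time. */
static size_t count_geometric(size_t start, size_t target)
{
    size_t n = start, steps = 0;
    while (n < target) {
        n = next_capacity(n);
        steps++;
    }
    return steps;
}

/* Number of grow steps needed to reach at least `target` bytes
   when adding a constant `chunk` each time (chunk must be > 0). */
static size_t count_additive(size_t start, size_t chunk, size_t target)
{
    size_t n = start, steps = 0;
    while (n < target) {
        n += chunk;
        steps++;
    }
    return steps;
}
```

Starting from 4 bytes and aiming for a million, count_additive(4, 4, 1000000) comes out at roughly a quarter of a million steps, while count_geometric(4, 1000000) needs only a few dozen.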

Secondly, in a longer program you should really check for realloc() failing.
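
One common way to do that check is to assign the result of realloc() to a temporary, so the original block isn't leaked if the call fails. The following is only a sketch: struct strbuf, strbuf_push and the initial capacity of 16 are hypothetical names and values, not anything from the question's code.

```c
#include <stdlib.h>

/* Hypothetical growable string buffer; names are illustrative only. */
struct strbuf {
    char  *data;  /* NUL-terminated contents, or NULL while empty */
    size_t len;   /* characters stored, not counting the NUL */
    size_t cap;   /* bytes currently allocated */
};

/* Append one character. Returns 0 on success, -1 if realloc() fails.
   On failure the buffer is left untouched and is still the caller's to free. */
static int strbuf_push(struct strbuf *b, char c)
{
    if (b->len + 2 > b->cap) {                 /* need room for c plus the NUL */
        size_t ncap = b->cap ? b->cap + b->cap / 2 + 1 : 16;
        char *tmp = realloc(b->data, ncap);    /* don't overwrite b->data yet */
        if (tmp == NULL)
            return -1;                         /* old block is still valid */
        b->data = tmp;
        b->cap = ncap;
    }
    b->data[b->len++] = c;
    b->data[b->len] = '\0';
    return 0;
}
```

A caller would start from struct strbuf b = {NULL, 0, 0};, call strbuf_push(&b, c) for each character read, stop as soon as it returns -1, and free(b.data) when done.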

caf 2010-08-14 07:24:51

Yep, otherwise that code just wastes all its time reallocating and copying instead of doing real work. Also, maybe the initial block size should be bigger, depending on how frequent tokens of different lengths are.

sharptooth 2010-08-14 08:59:28

I know I should be checking the return value of the *alloc calls, but this was a quick and dirty mock-up. But taking the advice from @caf and previously @RBerteig, I should resize like this, correct?

n *= 1.5;
buffer = (char *)realloc(buffer, n);
memset(buffer + i, 0, n - i);

Timothy 2010-08-14 15:07:46

Answer 2

A:

realloc was the right choice, but shouldn't you use a character as the token separator?

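#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUFF_CHUNK_SIZE 4
#define TOKSEP ";"

/* Read up to n-1 alphanumeric characters from stdin into s, skipping
   everything else (newlines included). Stops early at end of input.
   Returns s, which is an empty string once the input is exhausted. */
char *getOneToken(char *s, size_t n)
{
  int c;
  char *p = s;
  while ((size_t)(p - s) < n - 1 && (c = getchar()) != EOF)
    if (isalnum(c))
      *p++ = (char)c;
  *p = '\0';
  return s;
}

int main(void)
{
  char *buffer = calloc(1, 1),  /* start with an empty string */
        tok[BUFF_CHUNK_SIZE + 1];

  if (!buffer)
    return 1;

  while (*getOneToken(tok, sizeof tok))
  {
    /* room for the old contents, one separator, the token and the NUL */
    char *tmp = realloc(buffer, strlen(buffer) + strlen(tok) + 2);
    if (!tmp)
    {
      free(buffer);
      return 1;
    }
    buffer = tmp;
    if (*buffer) strcat(buffer, TOKSEP);
    strcat(buffer, tok);
  }

  puts(buffer);
  free(buffer);
  return 0;
}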

2010-08-14 07:34:46
