views:

215

answers:

5

Hi I'm trying to tokenize a string by loading an entire file into a char[] using fread. For some strange reason it is not always working, and valgrind complains in this very small sample program.

Given an input like test.txt

first
second

And the following program

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/stat.h>


//returns the filesize in bytes
size_t fsize(const char* fname){
  struct stat st ;
  stat(fname,&st);
  return st.st_size;
}

int main(int argc, char *argv[]){
  FILE *fp = NULL;
  if(NULL==(fp=fopen(argv[1],"r"))){
    fprintf(stderr,"\t-> Error reading file:%s\n",argv[1]);
    return 0;
  }
  char buffer[fsize(argv[1])];
  fread(buffer,sizeof(char),fsize(argv[1]),fp);
  char *str = strtok(buffer," \t\n");

  while(NULL!=str){
    fprintf(stderr,"token is:%s with strlen:%lu\n",str,strlen(str));
    str = strtok(NULL," \t\n");
  }
  return 0;
}

compiling like

gcc test.c -std=c99 -ggdb

running like

./a.out test.txt

thanks

+5  A: 

Your buffer size should be filesize + 1. The +1 is for the null char.

filesize = fsize(argv[1]);
char buffer[filesize + 1];

Also fread does not put a \0 at the end of the string. So you'll have to do it yourself as:

fread(buffer,sizeof(char),filesize,fp);
buffer[filesize] = 0;
codaddict
+2  A: 

buffer is not null-terminated. You need to make it one byte larger than the size of the file, and you need to set the last byte to be \0.

James McNellis
+1  A: 

Your buffer is too small. Try this:

int fileSize = fsize(argv[1]);
char buffer[fileSize + 1]; 
buffer[fileSize] = 0;

right before your call to fread.

RTBarnard
+2  A: 

Your buffer must be filesize + 1 and you will also need to set the terminating 0:

int size = fsize(argv[1]);
char buffer[size + 1];
buffer[size] ='\0';

Also, you should probably allocate the buffer on the heap instead of the stack...

Jen
+4  A: 

From this site:

int main(int argc, char* argv[])
{
  std::string str = "The quick brown fox";

  // construct a stream from the string
  std::istringstream stream(str);

  // use stream iterators to copy the stream to the vector
  // as whitespace separated strings
  std::istream_iterator<std::string> it(stream), end;

  std::vector<std::string> results(it, end);

  // results = ["The", "quick", "brown", "fox"]
}

SO much easier than dealing with those nasty C-strings that keep banging you on the head.

And you know what's great about using higher-order methods ? It takes less screen estate and is easier to understand.

Matthieu M.
+1, The question is tagged both C and C++, but many times that means that *I am programming C++, but can use a C solution* even if in this case the code and filename seem to indicate plain C. Anyway this is worth the upvote (people doing C++ will surely read this) even if it may not suit the user.
David Rodríguez - dribeas