views:

93

answers:

6

So i am trying to read a text file line by line and save each line into a char array.

From my printout in the loop I can tell it is counting the lines and the number of characters per line properly but I am having problems with strncpy. When I try to print the data array it only displays 2 strange characters. I have never worked with strncpy so I feel my issue may have something to do with null-termination.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char* argv[])
{
    FILE *f = fopen("/home/tgarvin/yes", "rb");
    fseek(f, 0, SEEK_END);
    long pos = ftell(f);
    fseek(f, 0, SEEK_SET);
    char *bytes = malloc(pos); fread(bytes, pos, 1, f);
    int i = 0; 
    int counter = 0; 
    char* data[counter]; 
    int length; 
    int len=strlen(data); 
    int start = 0;
    int end = 0;

    for(; i<pos; i++)
    {
        if(*(bytes+i)=='\n'){
            end = i;
            length=end-start;
            data[counter]=(char*)malloc(sizeof(char)*(length)+1);
            strncpy(data[counter], bytes+start, length);
            printf("%d\n", counter);
            printf("%d\n", length);
            start=end+1;
            counter=counter+1;
        }
    }
    printf("%s\n", data);
    return 0;
}
+1  A: 

Your "data[]" array is declared as an array of pointers to characters of size 0. When you assign pointers to it there is no space for them. This could cause no end of trouble.

The simplest fix would be to make a pass over the array to determine the number of lines and then do something like "char **data = malloc(number_of_lines * sizeof(char *))". Then doing assignments of "data[counter]" will work.

You're right that strncpy() is a problem -- it won't '\0' terminate the string if it copies the maximum number of bytes. After the strncpy() add "data[counter][length ] = '\0';"

The printf() at the end is wrong. To print all the lines use "for (i = 0; i < counter; i++) printf("%s\n", data[counter]);"

George Phillips
It's also possible to use `memcpy`, since he's always copying exactly `length` characters.
Matthew Flaschen
He has allocated memory for length +1
Praveen S
Right, the + 1 is needed for the `NUL`.
Matthew Flaschen
Yes and hencee '\0' will be copied is what i meant :).
Praveen S
A: 

You are trying to print data with a format specifier %s, while your data is a array of pointer s to char.

Now talking about copying a string with giving size:

As far as I like it, I would suggest you to use strlcpy() instead of strncpy()

size_t strlcpy( char *dst, const char *src, size_t siz);

as strncpy wont terminate the string with NULL, strlcpy() solves this issue.

strings copied by strlcpy are always NULL terminated.

Kumar Alok
+1  A: 

Allocate proper memory to the variable data[counter]. In your case counter is set to 0. Hence it will give segmentation fault if you try to access data[1] etc.

Declaring a variable like data[counter] is a bad practice. Even if counter changes in the subsequent flow of the program it wont be useful to allocate memory to the array data. Hence use a double char pointer as stated above.

You can use your existing loop to find the number of lines first.

The last printf is wrong. You will be printing just the first line with it. Iterate over the loop once you fix the above issue.

Praveen S
+1  A: 

Several instances of bad juju, the most pertinent one being:

int counter = 0;  
char* data[counter];  

You've just declared data as a variable-length array with zero elements. Despite their name, VLAs are not truly variable; you cannot change the length of the array after allocating it. So when you execute the lines

data[counter]=(char*)malloc(sizeof(char)*(length)+1);   
strncpy(data[counter], bytes+start, length);   

data[counter] is referring to memory you don't own, so you're invoking undefined behavior.

Since you don't know how many lines you're reading from the file beforehand, you need to create a structure that can be extended dynamically. Here's an example:

/**
 * Initial allocation of data array (array of pointer to char)
 */
 char **dataAlloc(size_t initialSize)
 {
   char **data= malloc(sizeof *data * initialSize);
   return data;
 }

 /**
  * Extend data array; each extension doubles the length
  * of the array.  If the extension succeeds, the function
  * will return 1; if not, the function returns 0, and the 
  * values of data and length are unchanged.
  */
 int dataExtend(char ***data, size_t *length)
 {
   int r = 0;
   char **tmp = realloc(*data, sizeof *tmp * 2 * *length);
   if (tmp)
   {
     *length= 2 * *length;
     *data = tmp;
     r = 1;
   }
   return r;
 }

Then in your main program, you would declare data as

char **data;

with a separate variable to track the size:

size_t dataLength = SOME_INITIAL_SIZE_GREATER_THAN_0;

You would allocate the array as

data = dataAlloc(dataLength);

initially. Then in your loop, you would compare your counter against the current array size and extend the array when they compare equal, like so:

if (counter == dataLength)
{
  if (!dataExtend(&data, &dataLength))
  {
    /* Could not extend data array; treat as a fatal error */
    fprintf(stderr, "Could not extend data array; exiting\n");
    exit(EXIT_FAILURE);
  }
}
data[counter] = malloc(sizeof *data[counter] * length + 1);
if (data[counter])
{
  strncpy(data[counter], bytes+start, length); 
  data[counter][length] = 0; // add the 0 terminator
}
else
{
  /* malloc failed; treat as a fatal error */
  fprintf(stderr, "Could not allocate memory for string; exiting\n");
  exit(EXIT_FAILURE);
}
counter++;
John Bode
A: 

Change

int counter = 0;
char* data[counter];
...
int len=strlen(data);
...
for(; i<pos; i++)
...
      strncpy(data[counter], bytes+start, length);
...

to

int counter = 0;
#define MAX_DATA_LINES 1024
char* data[MAX_DATA_LEN]; //1
...
for(; i<pos && counter < MAX_DATA_LINES ; i++) //2
...
       strncpy(data[counter], bytes+start, length);
...

//1: to prepare valid memory storage for pointers to lines (e.g. data[0] to data[MAX_DATA_LINES]). Without doing this, you may hit into 'segmentation fault' error, if you do not, you are lucky.

//2: Just to ensure that if the total number of lines in the file are > MAX_DATA_LINES. You do not run into 'segmentation fault' error, because the memory storage for pointer to line data[>MAX_DATA_LINES] is no more valid.

ttchong
A: 

I think that this might be a quicker implementation as you won't have to copy the contents of all the strings from the bytes array to a secondary array. You will of course lose your '\n' characters though.

It also takes into account files that don't end with a new line character and as pos is defined as long the array index used for bytes[] and also the length should be long.

#include <stdio.h>
#include <stdlib.h>

#define DEFAULT_LINE_ARRAY_DIM 100

int main(int argc, char* argv[])
{
    FILE *f = fopen("test.c", "rb");
    fseek(f, 0, SEEK_END);
    long pos = ftell(f);
    fseek(f, 0, SEEK_SET);
    char *bytes = malloc(pos+1); /* include an extra byte incase file isn't '\n' terminated */
    fread(bytes, pos, 1, f);
    if (bytes[pos-1]!='\n')
    {
        bytes[pos++] = '\n';
    }
    long i;
    long length = 0;
    int counter = 0;
    size_t size=DEFAULT_LINE_ARRAY_DIM;
    char** data=malloc(size*sizeof(char*));
    data[0]=bytes;

    for(i=0; i<pos; i++)
    {
        if (bytes[i]=='\n') {
            bytes[i]='\0';
            counter++;
            if (counter>=size) {
                size+=DEFAULT_LINE_ARRAY_DIM;
                data=realloc(data,size*sizeof(char*));
                if (data==NULL) {
                    fprintf(stderr,"Couldn't allocate enough memory!\n");
                    exit(1);
                }
            }
            data[counter]=&bytes[i+1];
            length = data[counter] - data[counter - 1] - 1;
            printf("%d\n", counter);
            printf("%ld\n", length);
        }
    }

    for (i=0;i<counter;i++)
        printf("%s\n", data[i]);

    return 0;
}
roguenut