views:

156

answers:

5

I am trying to teach myself C from a python background. My current mini-problem is trying to do less hard-coding of things like array lengths and allocate memory dynamically based on input.

I've written the following program. I was hoping for suggestions from the community for modifying it in the following ways:

1.) Make first and last Name elements variable length. Currently their length is hardcoded as MAX_NAME_LENGTH. This will involve both change Names structdeclaration as well as the way I'm assigning values to its elements.

2.) Bonus: Figure out some way to progressively add new elements to the name_list array without having to determine its length beforehand. Basically make it an expandable list.

/* namelist.c 

   Loads up a list of names from a file to then do something with them.

*/
#include <stdlib.h>
#include <stdio.h>
#include <memory.h>

#define DATAFILE "name_list.txt"
#define DATAFILE_FORMAT "%[^,]%*c%[^\n]%*c"
#define MAX_NAME_LENGTH 100

typedef struct {
  char first[MAX_NAME_LENGTH];
  char last[MAX_NAME_LENGTH];
} Name;


int main() {
  FILE *fp = fopen(DATAFILE, "r");

  // Get the number of names in DATAFILE at runtime.
  Name aName;
  int lc = 0;
  while ((fscanf(fp, DATAFILE_FORMAT, aName.last, aName.first))!=EOF) lc++;
  Name *name_list[lc];

  // Now actually pull the data out of the file
  rewind(fp);
  int n = 0;
  while ((fscanf(fp, DATAFILE_FORMAT, aName.last, aName.first))!=EOF)
  {
    Name *newName = malloc(sizeof(Name));
    if (newName == NULL) {
      puts("Warning: Was not able to allocate memory for ``Name`` ``newName``on the heap.");
    } 
    memcpy(newName, &aName, sizeof(Name));
  name_list[n] = newName;
  n++;
  }

  int i = 1;
  for (--n; n >= 0; n--, i++) {
    printf("%d: %s %s\n", i, name_list[n]->first, name_list[n]->last);
    free(name_list[n]);
    name_list[n] = NULL;
  }

  fclose(fp);
  return 0;
}

Sample contents of name_list.txt:

Washington,George
Adams,John 
Jefferson,Thomas
Madison,James

Update 1:

I've implemented a linked list and some helper functions as @Williham suggested, results are below.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define DATAFILE "name_list.txt"
#define MAX_NAME_LENGTH 30
#define DATAFILE_FORMAT "%29[^,]%*c%29[^\n]%*c"

static const int INPUT_BUFFER_SIZE_DEFAULT = sizeof(char) * MAX_NAME_LENGTH;

typedef struct _Name Name;

struct _Name {
  char *first;
  char *last;
  Name *next;
};

int get_charcount(char *str);

Name * create_name_list(char *filename);
void print_name_list(Name *name);
void free_name_list (Name *name);

int main() {

  // Read a list of names into memory and 
  // return the head of the linked list.
  Name *head = create_name_list(DATAFILE);

  // Now do something with all this data.
  print_name_list(head);

  // If you love something, let it go.
  free_name_list(head);
  head = NULL;
  return 0;
}

int get_charcount (char *str) 
{
  int input_length = 1;
    while (str[input_length] != '\0')
    {
      input_length++;
    }
  return input_length;
}

Name * create_name_list(char *filename)
{
  FILE *fp = fopen(DATAFILE, "r");
  char *first_input_buffer = malloc(INPUT_BUFFER_SIZE_DEFAULT);
  char *last_input_buffer = malloc(INPUT_BUFFER_SIZE_DEFAULT);

  Name *firstNamePtr;
  Name *previousNamePtr;
  while ((fscanf(fp, DATAFILE_FORMAT, last_input_buffer, first_input_buffer))!=EOF)
  {
    Name *newNamePtr = malloc(sizeof(Name));

    if (previousNamePtr) 
    {
      previousNamePtr->next = newNamePtr;
      previousNamePtr = newNamePtr;
    } 
    else 
    {
      firstNamePtr = previousNamePtr = newNamePtr;
    }

    char *temp_buffer = malloc(get_charcount(first_input_buffer));
    strcpy(temp_buffer, first_input_buffer);
    newNamePtr->first = malloc(get_charcount(first_input_buffer));
    strcpy(newNamePtr->first, temp_buffer);


    realloc(temp_buffer, get_charcount(last_input_buffer));

    strcpy(temp_buffer, last_input_buffer);
    newNamePtr->last = malloc(get_charcount(last_input_buffer));
    strcpy(newNamePtr->last, temp_buffer);

    free(temp_buffer);
    temp_buffer = NULL;
  }
  previousNamePtr->next = NULL;
  previousNamePtr = NULL;
  free(first_input_buffer);
  free(last_input_buffer);
  first_input_buffer = NULL;
  last_input_buffer = NULL;
  fclose(fp);

  return firstNamePtr;
}

void print_name_list (Name *name)
{
  static int first_iteration = 1;
  if (first_iteration) 
  {
    printf("\nList of Names\n");
    printf("=============\n");
    first_iteration--;
  }
  printf("%s %s\n",name->first, name->last);
  if (name->next)
    print_name_list(name->next);
  else
    printf("\n");
}

void free_name_list (Name *name)
{
  if (name->next)
    free_name_list(name->next);
  free(name->first);
  free(name->last);
  name->first = NULL;
  name->last = NULL;
  name->next = NULL;
  free(name);
  name = NULL;
}
+3  A: 

There isn't a way to expand an array in C. All that you can do is allocate a larger block of memory and copy the items over. This is what Python does under the covers.

There's also no way to determine the size of an array, since it stores only the items an no additional information (Python arrays store the length alongside the elements.) You could put a marker at the end of the array and count elements until you hit the marker, this is how strings work (the null char '\0' is the marker.)

Adam Ruth
On the contrary, the `realloc()` call's main purpose in life is to resize arrays, though it only works on arrays that were originally allocated with `malloc()`.
caf
@caf Thanks, I experimented with `realloc()` in Update 1.
Prairiedogg
@caf: `realloc` will often actually allocate new memory and then copy the contents. You just don't see it happening because it's inside the memory allocator.
Borealid
@Borealid: Sure, it will when it has to - but it's also possible for `realloc()` just to expand the size of the allocation (eg with the OpenBSD allocator ( http://www.openbsd.org/papers/eurobsdcon2009/otto-malloc.pdf ) it will almost always be able to do so)
caf
+5  A: 

A very simple approach is to not use an array at all, but rather a linked list:

This can be done in a number of ways, but the simplest is probably to modify the Name struct as follows:

typedef struct _Name Name;

struct _Name {
  char *first;
  char *last;
  Name *next;
};

The use of char * instead of char[] will necessitate some strcpying, but that's really neither here nor there. To expand the array you can now simply malloc these elements one at a time; and set next appropriately.

Note: Remember to set next to NULL when you create new tail elements.

Williham Totland
Thanks, I've implemented your suggestion in update 1. The only part that I'm not happy with is when I read the first and last names into their buffers, things will go awry if a name is longer than ``INPUT_BUFFER_SIZE_DEFAULT``. Any tips on how that is generally dealt with? Otherwise, great answer and thanks for the code snippet!
Prairiedogg
If you are on Linux, there is a rather nice little function called `getline()` that will allocate a buffer for the entire line for you. The information can then be extracted from this at a leisurely pace. If you're not, you're probably better off using a suitably sized buffer (80 or 120 bytes would *probably* do), and reading entire lines using `gets()`. If you encounter a line that is too long, expand your buffer using `realloc()`, and keep it at the new bigger size, in case you need it. (Remember to keep track of how big it is.)
Williham Totland
+2  A: 

You'll have to have a maximum length array to take the input - you don't know how long the input is going to be. However, you only need one array of that length. Once you've got the input you can get it's length and allocate an array of just the right size to store the names.

As for growing your list you can use realloc or use a linked list of course.

ChrisF
So that maximum length array would just be an input buffer, and there's no way of getting around hard-coding that value? If I try and stick a char array that is longer than the hardcoded max length of the input buffer array, do I get undetermined behavior?
Prairiedogg
@Prairiedogg - that's right, you have to have an input buffer of a known length and if you try to read more than it will hold you will get undefined behaviour - buffer overruns etc.
ChrisF
+1  A: 

Have you started reading about linked lists? It may help you to have a dynamic structure.

Also you can check what is the difference between array and linked list. A char pointer can be member of your linked list which can be allocated memory dynamically.

Check this link from stackoverflow for more information on the same.

Praveen S
+1  A: 

Since noone mentioned it yet, if you use *scanf to read untrusted input strings, you should always use maximum field width specifiers. Like

scanf("%19s", str);

Note that this specify maximum string length not including the terminating NUL, so in this example, str should be 20 chars long.

If you don't limit scanf input conversions like this, on buffer overflow you will not only get undefined behaviour, but also a security vulnerability.

Back to your question about dinamically growing buffers. Imagine you are reading some input composed of lines, and each line can be at a maximum 78 chars wide. In situations like this, you know the maximum length of each line, but have no idea of the maximum length of your input as a whole. A common way to do this is to allocate some default space, and if you need to grow that space, grow it by *2, so you don't need to realloc() many times.

ninjalj
Thanks for the tip, I've added it to the code for Update 1
Prairiedogg