tags:

views:

279

answers:

7

I am new to C. (I'm using C89 on Visual Studio 2010.) I wrote a little program that uses a tree to give a histogram of arguments presented to the program. Are there any obvious beginner mistakes I've make? Everything appears to run fine in my extremely limited testing.

hist.c

#include "tree.h"
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    unsigned int i;
    struct tree *tree;
    tree = NULL;

    for (i = 1; i < argc; i++) {
        tree = tree_add(tree, argv[i]); // adds strings to tree
    }

    tree_dump(tree); // prints results
    tree_free(tree); // frees memory
    getchar();

    return 0;
}

tree.h

struct tree {
    struct tree *left;
    struct tree *right;
    char *value;
    unsigned count;
};

struct tree *tree_add(struct tree *tree, char *value);
struct tree *new_tree();
void tree_dump(struct tree *tree);
void tree_free(struct tree *tree);

tree_free.c

#include "tree.h"
#include <stdlib.h>

void tree_free(struct tree *tree) {
    if (tree->left != NULL) {
        tree_free(tree->left);
    }
    if (tree->right != NULL) {
        tree_free(tree->right);
    }
    free(tree);
}

tree_dump.c

#include "tree.h"
#include <stdlib.h>
#include <stdio.h>

void tree_dump(struct tree *tree) {
    if (tree != NULL) {
        tree_dump(tree->left);
        printf("%4u\t%s\n", tree->count, tree->value);
        tree_dump(tree->right);
    }
}

tree_add.c

#include "tree.h"
#include <stdlib.h>
#include <string.h>

struct tree *tree_add(struct tree *tree, char *value) {
    if (tree == NULL) {
        tree = new_tree();
        tree->value = value;
        tree->count = 1;
    }
    else if (strcmp(tree->value, value) == 0) {
        tree->count++;
    }
    else if (strcmp(value, tree->value) < 0) {
        tree->left = tree_add(tree->left, value);
    }
    else if (strcmp(value, tree->value) > 0) {
        tree->right = tree_add(tree->right, value);
    }
    return tree;
}

struct tree *new_tree() {
    struct tree * tree;
    tree = malloc(sizeof *tree);
    tree->left = NULL;
    tree->right = NULL;
    tree->value = NULL;
    tree->count = 0;
    return tree;
}
A: 

One small errors: cast the return value of malloc to the correct pointer-type.

I didn't do a full check of your code. If the output is correct, it probably works. I see no major flaws.

Some tips:

  • Take a look at C++. Writing a C++ class is not that hard and will make your code easier to understand (once you get used to C++)
  • Write a unit test to test the correctness of your application. Test some obvious and non-obvious cases.
Patrick
"cast the return value of malloc to the correct pointer-type" - in C89 definitely *not* required, and I might go as far as to say *not recommended*: http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc
Mark E
I didn't know that, Mark.Something learned today.
Patrick
According to Wikipedia: Omitting the cast, however, creates an incompatibility with C++, which does require it.So, adding the cast might still be a good idea, if you plan to go to C++ later.
Patrick
If you want to change it to C++ in the future, you'll be wanting to rewrite tree.c anyways -- at the very least to change 'malloc' to 'new', so having it give an error is actually a good thing...
Chris Dodd
@Chris If he changes to C++ in the future, he will probably want to forget implementing your own trees and use something like std::set instead.
anon
A: 

I don't see any obvious errors. The code looks clean and is easy to read, which is always a good sign. One issue for a production application is that checks for allocation failures would be needed, but you are probably aware of that.

Mark Wilkins
+4  A: 

You could improve on tree_add.c by storing the result of the strcmp() call in a local variable so you don't have to call it three times. Also, if it's not == 0 and it's not < 0, it's gotta be > 0.

You're not checking for malloc() to return NULL in new_tree. For a simple program like this, you could simply exit with a "memory allocation failed" error message, or return NULL to the caller of new_tree() (without setting the left/right/value/count members) and let the caller handle the error.

Overall, looks like a nice implementation.

tomlogic
Most OSes, malloc will never fail, as the OS overcommits. malloc will return you some address space, but not allocated ram. When you touch it, the OS will attempt to allocate, and if it can't, it will take drastic action, possibly sending you a SIGKILL for example.
smcameron
@smcameron: thanks, I wasn't aware of that. I spend most of my time doing embedded development -- platforms without VM and often without malloc (heap fragmentation is a real problem on systems that run 24/7 for years without restarting).
tomlogic
@smcameron On HP-UX it is easy to limit how much memory from heap a process can get. As a result it is really easy to get 0 from `malloc()`.
skwllsp
+4  A: 

I see one thing wrong with it - you have needlessly split the implementation up into several C source files. This is not a giant code base - it will be easier for you and for future maintainers if you put all the implementation in a single source file called tree.c.

anon
+1, there's really no need to split functions in multiple files for such a simple program. In general, a better approach would probably be organizing code into modules each consisting of two files (.h for public interface and .c for implementation). You'd use tree.h and tree.c in the OP's example.
Ree
+1  A: 

Some minor suggestions:

  • you should add an inclusion guard to your header file

  • a const qualifier should be added to the value member as modifying it can invalidate the tree's order

  • on Windows, I prefer system("pause") over getchar() to keep the console window open; on *nix, don't do it at all

and a small note on the coding style: in my opinion,

if(tree) { ... }
if(!tree) { ... }

is more readable (and 'in the spirit' of C - remember, any scalar value can be used in boolean contexts!) than

if(tree != NULL) { ... }
if(tree == NULL) { ... }

Other people might disagree.

Christoph
+1 for the inclusion guard. I didn't know about that.
Rosarch
A: 
struct tree *new_tree() {
    struct tree * tree;
    tree = malloc(sizeof *tree);

Has an error.

malloc(sizeof(tree)); is what you want. In your code, you allocate the size of a 'pointer'. Not the struct.

edit: I'm wrong. Keeping this answer alive for future reference by myself and others.

Paul Nathan
sizeof *tree is correct here - it gives the size of the struct. sizeof tree gives the size of the pointer. Having the struct and the pointer have the same name is NOT good practice however. Remember struct names and variable names are in different namespaces. Without the 'struct' qualifier, we are in the variable namespace.
anon
@Paul: you are mistaken; Rosarch's code is actually recommended to avoid duplicating the type information
Christoph
Guh. I didn't catch that - didn't know struct and variable names had different namespaces.
Paul Nathan
+1  A: 

I don't immediately see any mistakes in the code that would affect its functionality... Only some stylistic stuff.

(1) C89 is not very friendly to declarations with initializer, but in many cases it is perfectly possible. So, for example, instead of

int main(int argc, char *argv[]) { 
    unsigned int i; 
    struct tree *tree; 
    tree = NULL;
    ...

I'd write

int main(int argc, char *argv[]) { 
    unsigned int i; 
    struct tree *tree = NULL; 
    ...

(2) I'm not a big fan of premature optimizations, but nevertheless several consecutive calls to strcmp with the same arguments are too much for me. In tree_add I'd do

int cmp = strcmp(tree->value, value);

only once and then branch on the value of cmp.

(3) A matter of very personal style, but when you are working with values of with essentially identical semantics, a chained assignment might actually improve the readability of the code. In your case

tree->left = NULL;            
tree->right = NULL;

might look better as

tree->left = tree->right = NULL;

But again, this is easily abused. And again, it is a matter of personal style.

(4) Is your tree intended to provide a modifiable access path to the string values stored in it? If not (and ordered data structures usually don't), then using const char * everywhere you currently have a char * might be a better idea (in struct tree as well).

(5) As others already noted, there's usually no reason to split a module into so many small implementation files.

Anyway, your code looks great. I'd give your code high marks for tree = malloc(sizeof *tree) (target-based expression under sizeof instead of type), for if (tree != NULL) (instead of if (!tree)), for using unsigned type to represent unsigned value, for making an effort to properly zero-initialize an object instead of just hitting it with memset or calloc.

AndreyT