views:

128

answers:

2

I'm doing K&R exercise 6-4, which is:

6-4. Write a program that prints the distinct words in its input sorted into decreasing order of frequency of occurrence. Precede each word by its count.

What I decided to do is create a struct called dstncFreqNode6_4:

struct dstncFreqNode6_4
{
   char *word;
   int count;
   struct dstncFreqNode6_4 *left;
   struct dstncFreqNode6_4 *right;
};

I then parsed the input for words, and for each distinct word created one "dstncFreqNode6_4" node and two pointers to it: one to insert in to a BST (to add new words/update the counts of already encountered words), and one to insert in to a global array of "dsntFredNode6_4" pointers.

This was done so that I could update the counts of words (nodes) by traversing the BST (which contains pointers to all the words encountered so far). After the entire input has been parsed, the array of pointers would be sorted by the members' "count" variables, and then printed. Since the

The code the adding of new words/updating the counts is here:(I don't think anything is wrong with it,since the BST and array appear to be populated correctly, so you may ignore this):

//"wordCounts" is originally a global dstncFreqNode6_4** declared outside the function, numWords is the current length of the array

struct dstncFreqNode6_4 *addFreqNode6_4(struct dstncFreqNode6_4 *root, char *word)
{

    if(root == NULL)
    {

       root = (struct dstncFreqNode6_4 *) malloc(sizeof(struct dstncFreqNode6_4));
       root -> word = word;
       root -> count = 1;

       root -> left = NULL;
       root -> right = NULL;


       struct dstncFreqNode6_4 **p;

       wordCounts = (struct dstncFreqNode6_4 **) realloc(wordCounts, sizeof(struct dstncFreqNode6_4*) * (numWords +1));
       p = wordCounts + numWords++;

       (*p) = root;

    }
    else if(strcmp(word, root -> word) < 0)
        root -> left = addFreqNode6_4(root -> left, word);
    else if(strcmp(word, root -> word) > 0)
        root -> right = addFreqNode6_4(root -> right, word);
    else
        root -> count++;

return root;

}

So I've got everything working right except the sorting; it simply won't sort the array. That is to say... the order of the elements remains unchanged EDIT: I get a segmentation error. EDIT #2: No segmentation error now; the original problem still exists.

I'm using stlib.h's qsort method; the compare function i'm using is:

int qcmp6_4(const void *a, const void *b)
{
    return (*(struct dstncFreqNode6_4 **)a) -> count - (*(struct dstncFreqNode6_4 **)b) -> count;
}

I can't seem to figure out why it won't sort correctly. I actually implemented my own quicksort algorithm and am getting the same results. I really don't know at this point.

It would be nice to get some fresh, and expert eyes to point me in the right direction. Thanks.

EDIT

Sorry, here's the call to qsort:

qsort(wordCounts, numWords, sizeof(struct dstncFreqNode6_4 *), qcmp6_4);

EDIT #2:

Following the suggestion, I've made "wordCounts" an array of pointers to nodes (all the code in this post has been updated to reflect this). So in essence, the BST and the array contain identical information (in fact the array pointers are initialized to the corresponding pointer in the BST), but their uses are different. The BST is used to add new words/update the counts of already encountered ones, and the array is sorted at the end (by the each of the words' counts) and printed. However, i'm running in to the same problem I originally ran in to: the order of the array remains the same after the call to qsort.

A: 

It looks to me like you're trying to sort an array of nodes which each contain pointers to other nodes in the array. After sorting, these pointers will of course point to the wrong elements of the array. Perhaps that is your problem?

By the way, I see that you're using realloc. The same problem with pointers to array elements will come up with realloc whenever it has to move the allocated array to a new location to satisfy the new size requirement, only much worse: the node pointers will now all point to invalid addresses and using them will result in undefined behavior.

One possible solution would be to never modify your original array of nodes, and instead make a second array of pointers to nodes and sort that array of pointers using a comparison function that compares the nodes they point to.

R..
The problem with realloc is something that I was concerned with and aware of, but I didn't see how that would have an effect on what i'm trying to do.
Kevin
So i've implemented an array of pointers, however the problem still persists
Kevin
A: 

So i've managed to get it to sort with my own implementation of quicksort. The stlib.h's version still leaves my array unsorted. Can it only be used on arrays of primitive types? Because that is the only reason I can see that prevents it from working.

Here is my custom qsort code for reference:

void swapFreq6_4(struct dstncFreqNode6_4 **a, struct dstncFreqNode6_4 **b)
{
   struct dstncFreqNode6_4 *t=*a; *a=*b; *b=t;
}

void **sortFreq6_4(struct dstncFreqNode6_4 **arr, int beg, int end)
{

   if (end > beg + 1)
   {
      struct dstncFreqNode6_4 *piv = *(arr + beg);
      int l = beg + 1;
      int r = end;


     while (l < r)
     {
         if ( (*(arr + l))-> count <= piv -> count)
             l++;
         else
             swapFreq6_4((arr + l), (arr + --r));
     }



     swapFreq6_4((arr + --l), (arr + beg));
     sortFreq6_4(arr, beg, l);
     sortFreq6_4(arr, r, end);

  }
}

Does anyone know why the stdlib's qsort code doesn't work with my struct? Am I using it wrong (my call to it, as well as the comparing function are in the original post)?

Kevin