Put simply, a Soundex algorithm converts a series of characters into a code. Words that produce the same Soundex code are said to sound the same.

  • The code is 4 characters wide
  • The first character of the code is always the first character of the word

Each letter of the alphabet belongs to a particular group (this is the rule I'll be sticking with in this example and in the code that follows):

  • b, p, v, f = 1
  • c, g, j, k, q, s, x, z = 2
  • d, t = 3
  • l = 4
  • m, n = 5
  • r = 6
  • Every other letter in the alphabet belongs to group 0.

Other notable rules include:

  • All letters that belong to group 0 are ignored UNLESS you have run out of letters in the provided word, in which case the rest of the code is filled with 0's.
  • The same number cannot appear twice or more in a row, so a character that would repeat the previous number is ignored. The only exception is the rule above with multiple 0's.

For example, the word "Ray" will produce the following Soundex code: R000 (R is the first character of the provided word, a is a part of group 0 so it's ignored, y is a part of group 0 so it's ignored, and there are no more characters so the 3 remaining characters in the code are 0).
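The rules above can be sketched as follows. This is only a minimal sketch of the simplified rules described in this question, not the full standard Soundex algorithm. The names groupDigit and makeSoundex are hypothetical, and it sticks to plain character arrays.

```cpp
#include <cctype>   // std::tolower, std::toupper
#include <cstddef>  // std::size_t
#include <cstring>  // std::strchr

// Hypothetical helper: returns '1'..'6' for a letter in one of the groups
// listed above, or '0' for every other character.
char groupDigit(char ch)
{
    static const char* const groups[] = {"bpvf", "cgjkqsxz", "dt", "l", "mn", "r"};
    const char lower = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    if (lower == '\0')
        return '0';  // strchr would otherwise match the terminator
    for (int g = 0; g < 6; ++g)
        if (std::strchr(groups[g], lower) != nullptr)
            return static_cast<char>('1' + g);
    return '0';
}

// Hypothetical function: writes a 4-character code plus '\0' into code[5].
void makeSoundex(const char word[], char code[5])
{
    if (word[0] == '\0') { code[0] = '\0'; return; }  // empty input, empty code
    code[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(word[0])));
    int filled = 1;
    for (std::size_t i = 1; word[i] != '\0' && filled < 4; ++i)
    {
        const char digit = groupDigit(word[i]);
        // Ignore group 0 and any digit equal to the one written last.
        if (digit != '0' && digit != code[filled - 1])
            code[filled++] = digit;
    }
    while (filled < 4)
        code[filled++] = '0';  // pad with the character '0'
    code[4] = '\0';
}
```

Calling makeSoundex("Ray", code) with a 5-element char array leaves "R000" in code, which matches the example above.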

I've created a function that is passed 1) a 128-character array which is used to create the Soundex code and 2) an empty 5-character array which is used to store the Soundex code when the function completes (arrays are effectively passed by reference, so the result is available to the rest of my program).

My problem, however, is with the conversion process. The logic I've described above isn't working in my code, and I don't know why.

// CREATE A SOUNDEX CODE
// * Parameter list includes the string of characters that are to be converted to code and a variable to save the code respectively.
void SoundsAlike(const char input[], char scode[])
{
    scode[0] = toupper(input[0]); // First character of the string is added to the code

    int matchCount = 1;
    int codeCount = 1;
    while((matchCount < strlen(input)) && (codeCount < 4))
    {
     if(((input[matchCount] == 'b') || (input[matchCount] == 'p') || (input[matchCount] == 'v') || (input[matchCount] == 'f')) && (scode[codeCount-1] != 1))
     {
      scode[codeCount] = 1;
      codeCount++;
     }
     else if(((input[matchCount] == 'c') || (input[matchCount] == 'g') || (input[matchCount] == 'j') || (input[matchCount] == 'k') || (input[matchCount] == 'q') || (input[matchCount] == 's') || (input[matchCount] == 'x') || (input[matchCount] == 'z')) && (scode[codeCount-1] != 2))
     {
      scode[codeCount] = 2;
      codeCount++;
     }
     else if(((input[matchCount] == 'd') || (input[matchCount] == 't')) && (scode[codeCount-1] != 3))
     {
      scode[codeCount] = 3;
      codeCount++;
     }
     else if((input[matchCount] == 'l') && (scode[codeCount-1] != 4))
     {
      scode[codeCount] = 4;
      codeCount++;
     }
     else if(((input[matchCount] == 'm') || (input[matchCount] == 'n')) && (scode[codeCount-1] != 5))
     {
      scode[codeCount] = 5;
      codeCount++;
     }
     else if((input[matchCount] == 'r') && (scode[codeCount-1] != 6))
     {
      scode[codeCount] = 6;
      codeCount++;
     }
     matchCount++;
    }

    while(codeCount < 4)
    {
     scode[codeCount] = 0;
     codeCount++;
    }
    scode[4] = '\0';

    cout << scode << endl;
}

I'm not sure if it's because of my overuse of strlen, but for some reason while the program is running within the first while loop none of the characters are actually converted to code (i.e. none of the if statements are actually run).

So what am I doing wrong? Any help would be greatly appreciated.

A: 

C++ does not support dynamic arrays, which you seem to be attempting to use. You need to investigate the use of the std::string class. In essence, your loop becomes something like this:

void Soundex( const string & input, string & output ) {
   for ( int i = 0; i < input.length(); i++ ) {
       char c = input[i];        // get character from input
       if ( c == .... ) {        // if some decision
            output += 'X';       // add some character to output
       }
       else if ( ..... )  {       // more tests
       }
   }
}
anon
My character arrays have a definite length. I forgot to mention that (I've updated my question to reflect this). The matchstr character array has 128 elements and the scode character array has 5 elements.
tmhai
You should still be using std::string.
anon
While I truly appreciate your support and your promptness in answering my question, I cannot use functions or dependencies that we have not yet covered in class. Thank you very much for your support all the same. Is there an alternative?
tmhai
I've updated the code to reflect suggestions by everyone. However, it still doesn't do what it's supposed to.
tmhai
It's customary to tag homework questions as "homework". That means you won't get answers like "use <external library>" or "you should be doing that instead", which are very useful in real applications.
David Thornley
A: 

You are calling strlen() on a string that has no null terminator yet, so the return value of strlen() could be anything. You could fix this by filling "scode" with '\0's before you begin, although it would be better to keep a separate counter for that and just add the '\0' when you are done.

danbystrom
I followed your advice and used another counter for scode, rather than using strlen (seeing as I know that scode is to be 4 characters long anyway this makes sense).
tmhai
A: 

This is actually a C implementation and not C++. Anyway, are you sure that your strings are null terminated? Otherwise strlen will not work.

Here is some advice that will make your code easier to read and debug:

  • Convert your input to lower case before starting. Test for illegal characters.
  • Define a variable, set it to input[matchCount] and use this. It will make the code more readable.
  • I would recommend replacing the if-else statements with a switch-case.
  • Handle the default case (when none of the if-else or case branches match).
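The suggestions above might look something like this for a single character. It is a sketch only: the name classify and the '?' marker for non-letters are assumptions made for illustration, not something from the question.

```cpp
#include <cctype>  // std::isalpha, std::tolower

// Hypothetical helper: classifies one input character.
// Returns '1'..'6' for a grouped letter, '0' for any other letter,
// and '?' for characters that are not letters at all.
char classify(char raw)
{
    const unsigned char uc = static_cast<unsigned char>(raw);
    if (!std::isalpha(uc))
        return '?';  // illegal character, the caller decides what to do
    const char c = static_cast<char>(std::tolower(uc));  // one readable local
    switch (c)
    {
        case 'b': case 'p': case 'v': case 'f':
            return '1';
        case 'c': case 'g': case 'j': case 'k':
        case 'q': case 's': case 'x': case 'z':
            return '2';
        case 'd': case 't':
            return '3';
        case 'l':
            return '4';
        case 'm': case 'n':
            return '5';
        case 'r':
            return '6';
        default:
            return '0';  // vowels and the remaining letters
    }
}
```

Lower-casing once up front means an input such as "Robert" and "ROBERT" goes through the same cases, and the default branch makes the group 0 behaviour explicit.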
kgiannakakis
I've set scode[4] to '\0'. Thanks. matchstr is already null terminated. Now I'm just getting the first character of the input string (minus the gibberish that followed it before, which I guess was memory I shouldn't have been reading). However, there's still no Soundex code.
tmhai
+2  A: 

Instead of

scode[codeCount] = 1;

you should write

scode[codeCount] = '1';

Because you are building a char array, the former stores the character whose ASCII code is 1 (a non-printable control character), while the latter stores the character '1'.
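If the group is held as a number, one way to get the matching character is to add it to '0'. This is a small sketch with a hypothetical helper name, toDigitChar, and it assumes the group is in the range 0 to 9.

```cpp
// Hypothetical helper: turns a group number 0..9 into its digit character.
// The digit characters '0'..'9' are guaranteed to be contiguous in C++.
char toDigitChar(int group)
{
    return static_cast<char>('0' + group);
}
```

With this, scode[codeCount] = toDigitChar(1) stores '1'. The same reasoning applies to the padding loop in the question: scode[codeCount] = 0 stores a null terminator and ends the string early, whereas '0' stores the visible digit.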

Recep
Perfect! You solved all my problems! Amazing...
tmhai