I'm writing a program that deciphers sentences, syllables, and words given in a basic text file.

The program cycles through the file character by character. It first checks whether the character is an end-of-sentence marker such as ! ? : ; or . Otherwise, if the character is not a space or tab, it is treated as a letter. Finally, if the character is a space or tab and the character before it was a valid letter (i.e. not an end-of-sentence marker), the program counts a word.

I was a bit light on the details, but here is the problem: my word count is equal to my sentence count. That means the program only notices that a word has ended when it reaches an end-of-sentence marker. The real problem is that spaces are being treated as valid letters.

Here's my if statement, which decides whether the character in question is a valid letter in a word:

else if(character != ' ' || character != '\t')

I've already ruled out end-of-sentence markers by that point in the program (in the original if, actually). According to an ASCII table, 32 should be the space character. However, when I output all of the characters that make it into that block of code, spaces are among them.

So what am I doing wrong? How can I stop spaces from getting through this if?

Thanks in advance, and I have a feeling the question may be a bit vague, or poorly worded. If you have any questions or need clarification, let me know.

A: 

It would probably be better to compare against the specific characters you consider whitespace, and to combine the tests with && rather than ||:

if ((character != ' ') &&
    (character != '\t'))
Mark Synowiec
Yes, I know that is a valid way; I actually tried it before the other way. But regardless of how I tell it to avoid characters that are spaces or tabs, it does not.
Blackbinary
Alok
I agree with Alok. I didn't think about the code, but every character is always going to be != ' ' OR != '\t'. I'll update my code; I didn't catch that issue.
Mark Synowiec
+4  A: 

I note that

(character != 32 || character != 9)

is always true: if the character is 32 it is not 9 (and vice versa), so at least one side of the OR is true, and true OR false is true.

You probably mean

(character != ' ' && character != '\t')
dmckee
Blackbinary
oops double post
Blackbinary
Why the cast to `int`?
Thomas Matthews
@Thomas: Because it was in the original code and---having spotted the logic error and the string literal thing---I had gotten busy typing and stopped thinking. Basic cut-n-paste error. Thanks.
dmckee
+8  A: 

You should not rely on actual numbers for characters: that depends upon the encoding your platform uses, and may not be ASCII. You can check for any particular character by simply testing against it. For example, to test if c is a space character:

if (c == ' ')

will work, is easier to read, and is portable.

If you want to skip all white-space, you should use #include <ctype.h> and then use isspace():

if (isspace((unsigned char)c))

Edit: As others said, your condition to check for "not a space" is wrong, but the above point still applies. So, your condition can be replaced by:

if (!isspace((unsigned char)c))
Alok
Blackbinary
@Blackbinary: please see my edit: you probably won't need more code, but you should replace your condition with `if (!isspace(...))` anyway.
Alok
yay for using the proper library!
rampion