Hello

I have an assignment to print only the alphabetic characters in a C++ string. It works for most input, but when [ and ] are present in the input, they are printed as well.

#include <iostream>
#include <string>

using namespace std;

int main()
{
        string input = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG]";

        for(int i=0;i<input.size();i++) 
        {
                if(input[i] >='A' && input[i] <= 'z')
                //if(isalpha(input[i]))
                        cout<<input[i];
        }
        cout<<endl;
        return 0;
}
+7  A: 

The problem is here:

if(input[i] >='A' && input[i] <= 'z')
              ^^^                ^^^ 

You are using uppercase 'A' and lowercase 'z'.

The range A-z is not the same as A-Z plus a-z.
In ASCII, the value of Z is 90 and that of a is 97.

Between them there are 6 other characters, which your check treats as letters.

ASCII value   Character
------------------------
90            Z 
91            [ \ 
92            \  \
93            ]   \  your range A-z includes these.
94            ^   /
95            _  /
96            ` /
97            a
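
The gap shown in the table can be checked directly. This is a minimal sketch that assumes an ASCII-compatible execution character set; the function name `chars_between_Z_and_a` is made up for illustration and is not from the original post.

```cpp
#include <string>

// Collects every character whose code lies strictly between 'Z' and 'a'.
// Assumption: ASCII-compatible character set, where 'Z' is 90 and 'a' is 97.
std::string chars_between_Z_and_a()
{
    std::string result;
    for (char c = 'Z' + 1; c < 'a'; ++c)
        result += c;   // on ASCII these are codes 91 through 96
    return result;
}
```

On an ASCII platform the returned string holds the six characters listed in the table, which is why a single A-z range test lets them through.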

To allow only uppercase and lowercase letters, you should use:

if( (input[i] >='A' && input[i] <= 'Z') || (input[i] >='a' && input[i] <= 'z') )

or, even better, just use isalpha:

if(isalpha(input[i])) 
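
Put together, the isalpha version of the loop might look like the sketch below. The helper name `letters_only` is hypothetical, and the cast to unsigned char is an addition that is not in the original code: passing a negative char value to isalpha is undefined behavior, so the cast is the usual precaution.

```cpp
#include <cctype>
#include <string>

// Returns a copy of `input` that keeps only its alphabetic characters.
// `letters_only` is a hypothetical helper name, not part of the original post.
std::string letters_only(const std::string& input)
{
    std::string result;
    for (std::string::size_type i = 0; i < input.size(); ++i)
    {
        // Cast to unsigned char first: isalpha expects a value representable
        // as unsigned char (or EOF), and plain char may be negative.
        if (std::isalpha(static_cast<unsigned char>(input[i])))
            result += input[i];
    }
    return result;
}
```

Calling `letters_only(input)` with the string from the question and printing the result drops the trailing ] along with the spaces, just as the original loop already dropped the spaces.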
codaddict
I now see codeaddict. Thanks.
Ggnaik
Omnifarious
+2  A: 

That happens because there are some special characters in the A-z range. If you want to filter out special characters like [ and ], you need to test the A-Z range and the a-z range separately.

Luis Miguel
Won't isalpha take care of those characters?
Tanuj
It should, and it does for me at least;)
Michał Trybus
To be pedantic, contiguity of the A-Z and a-z ranges isn't guaranteed by the standard. Of course, they happen to be contiguous in most implementations out there today, but in any case, why not let `isalpha` take care of that?
usta
Yes. I just mentioned why some special characters were appearing. Indeed isalpha is the best solution.
Luis Miguel
+1  A: 

You don't need this line:

if(input[i] >='A' && input[i] <= 'z')

And your program is working fine.

Ruel
A: 

Or #include <cctype> and use std::isalpha(input[i]).

Armen Tsirunyan
+1  A: 

AFAIK isalpha should return 0 on square brackets, since it is defined by the standard (§7.4.1.2.2) to

tests for any character for which isupper or islower is true, or any character that is one of a locale-specific set of alphabetic characters for which none of iscntrl, isdigit, ispunct, or isspace is true. In the "C" locale, isalpha returns true only for the characters for which isupper or islower is true.

and they shouldn't be considered uppercase or lowercase characters.

On the other hand, your

if(input[i] >='A' && input[i] <= 'z')

is wrong, since the ['A','z'] range usually also includes nonalphabetic characters; in standard ASCII these are the characters [ \ ] ^ _ `.

So, you should either split your check into two parts (to check whether the character is in the range ['A','Z'] or ['a','z']) or simply use isalpha and forget about this stuff.

By the way, IIRC the standard doesn't even mandate that the ['A','Z'] and ['a','z'] ranges must be contiguous, and in fact the original EBCDIC code page was a real mess to deal with, since you couldn't tell whether a character was alphabetic just by checking that it fell within those ranges. Thus, going strictly by the standard, you can't even expect that

if((input[i] >='A' && input[i] <= 'Z') || (input[i] >='a' && input[i] <= 'z'))

will work as you expect.

Long story short: if it's not just for homework, use isalpha, which is guaranteed to work whichever strange codepage your platform uses.
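
One way to see whether the two-range check is safe on a given platform is to compare it against isalpha for every character value. This is only a sketch: the function names `in_letter_ranges` and `count_disagreements` are invented for illustration, and it assumes the program is still in the default "C" locale, where isalpha is true only for upper- and lowercase letters.

```cpp
#include <cctype>
#include <climits>

// The two-range test discussed above. It is only correct when the
// A-Z and a-z ranges each contain nothing but letters.
bool in_letter_ranges(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

// Counts the character values on which the range test and isalpha disagree.
// Assumption: the default "C" locale is in effect.
int count_disagreements()
{
    int mismatches = 0;
    for (int value = 0; value <= UCHAR_MAX; ++value)
    {
        const bool by_range = in_letter_ranges(static_cast<unsigned char>(value));
        const bool by_isalpha = std::isalpha(value) != 0;
        if (by_range != by_isalpha)
            ++mismatches;
    }
    return mismatches;
}
```

On an ASCII platform `count_disagreements()` should return 0; a nonzero result would indicate a character set, such as EBCDIC, where the letter ranges have gaps and only isalpha gives the right answer.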

Matteo Italia