Simply put, if the input is always in the same case (here, lower case), and if the characters are always ASCII, can one use string::compare to determine reliably the alphabetical order of two strings?

So, with stringA.compare(stringB): if the result is 0, the strings are the same; if it is negative, stringA comes before stringB alphabetically; and if it is positive, stringA comes after?

+2  A: 

Yes, as long as all of the characters in both strings are of the same case, and as long as both strings consist only of letters, this will work.

compare is a member function, though, so you would call it like so:

stringA.compare(stringB);
James McNellis
Oh, right. yes! Duh... Thanks for correcting my syntax!
MPelletier
+2  A: 

According to the docs at cplusplus.com,

The member function returns 0 if all the characters in the compared contents compare equal, a negative value if the first character that does not match compares to less in the object than in the comparing string, and a positive value in the opposite case.

So it will sort strings in ASCII order, which will be alphabetical for English strings (with no diacritical marks or other extended characters) of the same case.
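As a rough illustration of the rule quoted above, here is a minimal sketch. The helper name comes_before is made up for this example, and the uppercase/lowercase remark in the comments assumes an ASCII execution character set.

```cpp
#include <string>

// Hypothetical helper: true when a sorts strictly before b.
// Only the sign of compare() is meaningful; the magnitude is unspecified.
bool comes_before(const std::string& a, const std::string& b) {
    // compare() looks at character codes one position at a time and
    // stops at the first mismatch. If one string is a prefix of the
    // other, the shorter one sorts first.
    // Assuming ASCII, every uppercase letter sorts before every
    // lowercase letter, which is why mixed case breaks alphabetical order.
    return a.compare(b) < 0;
}
```

With lower-case ASCII input, comes_before("apple", "banana") and comes_before("apple", "apples") should both be true. On an ASCII system comes_before("Zebra", "apple") is also true, which is the mixed-case pitfall mentioned above.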

Moishe
IIRC, apostrophes and hyphens are not diacritical marks, right?
MPelletier
Right, they'll sort in ASCII order.
Moishe
Generally, when alphabetizing, one does not consider punctuation marks in determining order, so punctuation should be removed from the strings. In any case, the apostrophe and hyphen are both below the alphabetic ranges in ASCII.
James McNellis
+1 for mentioning the problem of diacritical marks. `string::compare` only gives a meaningful alphabetical order for the 7-bit ASCII range (0-127).
Matthieu M.
A: 

Yes:

The member function returns 0 if all the characters in the compared contents compare equal, a negative value if the first character that does not match compares to less in the object than in the comparing string, and a positive value in the opposite case.

For string objects, the result of a character comparison depends only on its character code (i.e., its ASCII code), so the result has some limited alphabetical or numerical ordering meaning.

aJ
+2  A: 

In C++, string is the instantiation of the template class basic_string with the default parameters: basic_string<char, char_traits<char>, allocator<char> >. The compare function in the basic_string template will use the char_traits<TChar>::compare function to determine the result value.

For std::string the ordering will be that of the default character code for the implementation (compiler), and that is usually ASCII order. If you require a different ordering (say you want to consider { a, á, à, â } as equivalent), you can instantiate a basic_string with your own char_traits<> implementation that provides a different compare function.
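A minimal sketch of the custom-traits idea follows. It swaps in case-insensitive comparison rather than accent folding, because accented letters do not fit in a single char under UTF-8. The names ci_char_traits and ci_string are made up for this example and are not part of the standard library.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits class: reuses std::char_traits<char> and replaces
// only the comparison-related static member functions.
struct ci_char_traits : public std::char_traits<char> {
    // Fold a character to lower case before comparing it.
    static char fold(char c) {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }

    // Called by basic_string::compare on the common prefix of both strings.
    static int compare(const char* s1, const char* s2, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(s1[i], s2[i])) return -1;
            if (lt(s2[i], s1[i])) return 1;
        }
        return 0;
    }

    // Kept consistent with eq() so searches are case-insensitive too.
    static const char* find(const char* s, std::size_t n, char c) {
        for (std::size_t i = 0; i < n; ++i) {
            if (eq(s[i], c)) return s + i;
        }
        return nullptr;
    }
};

// A string type whose compare() ignores case.
using ci_string = std::basic_string<char, ci_char_traits>;
```

With this in place, ci_string("Hello").compare("hello") should return 0. Note that ci_string is a distinct type from std::string, so converting between the two takes an explicit step such as constructing from c_str().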

David Rodríguez - dribeas
A: 

The specifications for the C and C++ language guarantee for lexical ordering, 'A' < 'B' < 'C' ... < 'Z'. The same is true for lowercase.

The ordering for text digits is also guaranteed: '0' < ... < '9'.

When working with multiple languages, many people create an array of characters. The array is searched for the character. Instead of comparing characters, the indices are compared.
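The index-lookup approach could be sketched roughly as below. The alphabet string, rank_of, and compare_by_alphabet are all made up for illustration; the alphabet here interleaves lower and upper case only to show that the order is independent of the character codes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical alphabet: a character's position here is its sort rank.
const std::string kAlphabet =
    "aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ";

// Rank of c in kAlphabet. Characters that are not listed get
// std::string::npos, so they sort after every listed character.
std::size_t rank_of(char c) {
    return kAlphabet.find(c);
}

// Returns a negative, zero, or positive value, like std::string::compare,
// but compares alphabet indices instead of character codes.
int compare_by_alphabet(const std::string& a, const std::string& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t ra = rank_of(a[i]);
        const std::size_t rb = rank_of(b[i]);
        if (ra < rb) return -1;
        if (ra > rb) return 1;
    }
    // Common prefix is equal: the shorter string sorts first.
    if (a.size() < b.size()) return -1;
    if (a.size() > b.size()) return 1;
    return 0;
}
```

Under this alphabet, compare_by_alphabet("apple", "Banana") is negative, whereas std::string::compare on an ASCII system would put "Banana" first. The linear find per character is fine for a sketch; a lookup table indexed by character code would be the usual optimization.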

Thomas Matthews
The C++ language standard does _not_ specify an execution character set, so it does not guarantee that the letters will be in such an order; the ordering is implementation-dependent.
James McNellis