ansaurus

Question

Answer 1

+2 A:

http://www.boost.org/doc/libs/1_38_0/doc/html/string_algo.html

David Lehavi 2009-04-09 17:41:40

Answer 2

A:

Try the toupper() function (#include <ctype.h>). It accepts a single character as its argument, and a string is made up of characters, so you'll have to iterate over the string and convert each character individually.
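A minimal sketch of the loop described above, assuming a std::string holding plain single-byte text. The helper name to_upper_ascii is made up for illustration. The cast to unsigned char is there because toupper() must not be given a negative value.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns an upper-cased copy, converting one character at a time.
std::string to_upper_ascii(std::string s) {
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        // toupper() takes an int whose value must fit in unsigned char (or be EOF).
        s[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
    }
    return s;
}
```

Calling to_upper_ascii("Hello World") leaves the caller's string untouched, since the argument is taken by value.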

zmf 2009-04-09 17:41:41

Answer 3

+25 A:

#include <algorithm>
#include <ctype.h>   // declares ::toupper
#include <string>

std::string str = "Hello World";
std::transform(str.begin(), str.end(), str.begin(), ::toupper);

Pierre 2009-04-09 17:41:53

Actually, `toupper()` can be implemented as a macro. This may cause an issue.

dirkgently 2009-04-09 17:43:32

Good point dirk (unfortunately). Otherwise I think this is certainly the cleanest and clearest way.

j_random_hacker 2009-04-09 17:44:39

I haven't checked, but doesn't C++ require that these functions are implemented as actual functions, even when C allowed them to be macros?

jalf 2009-04-09 17:46:56

I believe C required there to be functions also, in case you wanted to take the address of the function or whatever, but I don't have a reference handy.

David Thornley 2009-04-09 17:50:55

Updated my post with a quote from the recent draft. This solution has two perils -- so please beware.

dirkgently 2009-04-09 18:01:26

@Pierre: Please correct your post (and the quotes too).

dirkgently 2009-04-09 18:02:50

a bind(::toupper, construct<unsigned char>(_1)) with boost.lambda will serve perfectly fine i think.

Johannes Schaub - litb 2009-04-09 18:49:08

i've corrected the quotes, thinking that's quite non-controversial.

Johannes Schaub - litb 2009-04-09 21:11:54

You can easily guarantee that toupper() won't be called as a macro. Look here, at the end of subsection 9.1.1: http://publications.gbdirect.co.uk/c_book/chapter9/introduction.html

Bastien Léonard 2009-04-10 14:22:24

This approach works fine for ASCII, but fails for multi-byte character encodings, or for special casing rules like German 'ß'.

dan04 2010-08-01 04:32:14

Answer 4

+5 A:

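#include <algorithm>
#include <cctype>
#include <string>

// Functor that upper-cases one character in place.
struct convert {
   // Cast to unsigned char first: a negative argument to toupper() is undefined.
   void operator()(char& c) { c = static_cast<char>(std::toupper(static_cast<unsigned char>(c))); }
};

// ...
std::string uc_str;
std::for_each(uc_str.begin(), uc_str.end(), convert());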

Note: A couple of problems with the top solution:

21.5 Null-terminated sequence utilities

The contents of these headers shall be the same as the Standard C Library headers <ctype.h>, <wctype.h>, <string.h>, <wchar.h>, and <stdlib.h> [...]

Which means that the cctype members may well be macros not suitable for direct consumption in standard algorithms.
Another problem with the same example is that it does not cast the argument or verify that this is non-negative; this is especially dangerous for systems where plain char is signed. (The reason being: if this is implemented as a macro it will probably use a lookup table and your argument indexes into that table. A negative index will give you UB.)
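Both concerns can be sidestepped with a small wrapper. The sketch below is one way to do it, not the only one; the names upper_char and upper_in_place are hypothetical. Because the wrapper is an ordinary function, it can be handed to std::transform regardless of how the library implements toupper, and the cast keeps negative char values away from it.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical wrapper: a real function, so it can be passed to an algorithm.
inline char upper_char(char c) {
    // Convert to unsigned char first so a negative char never reaches toupper().
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// Upper-cases the string in place.
inline void upper_in_place(std::string& s) {
    std::transform(s.begin(), s.end(), s.begin(), upper_char);
}
```

Bytes outside the basic character set are passed through toupper() as values in the range 0 to UCHAR_MAX, which is what the function requires.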

dirkgently 2009-04-09 17:42:12

The normal cctype members are macros. I remember reading that they also had to be functions, although I don't have a copy of the C90 standard and don't know if it was explicitly stated or not.

David Thornley 2009-04-09 18:08:51

they have to be functions in C++ - even if C allows them to be macros. i agree with your second point about the casting though. the top solution could pass negative values and cause UB with that. that's the reason i didn't vote it up (but i didn't vote it down either) :)

Johannes Schaub - litb 2009-04-09 18:32:23

@litb: Can you cite a reference, I couldn't find anything to that effect in the standard.

dirkgently 2009-04-09 18:33:40

standard quote must not be missing: 7.4.2.2/1 (poor litb, that's referencing a C99 TC2 draft only), and C++ 17.4.1.2/6 in the glory c++98 standard.

Johannes Schaub - litb 2009-04-09 18:34:00

(note the foot-note to it: "This disallows the common practice of providing a masking macro.... blah blupp .. only way to do it in C++ is to provide a extern inline function.") :)

Johannes Schaub - litb 2009-04-09 18:35:47

@litb: Footnotes are not part of the normative text, are they? I have had this confusion :P

dirkgently 2009-04-09 18:38:38

you are right, they are not part of the normative text :) but they describe the intent of their authors of course. which means if my cited text isn't really making sure there must not be macros, another paragraph will make it sure. hold on i'll see whether i find it.

Johannes Schaub - litb 2009-04-09 18:41:00

well but even if the note has no backing normative text, then there will still be a ::toupper function (beside the macro), because of that normative text i cited. since ::toupper will not be replaced by that macro (parens for the arguments are missing), it will work nicely, the same as in C :)

Johannes Schaub - litb 2009-04-09 18:52:34

hmm, i think i quoted the paragraph wrongly. It seems that when it talks about "Standard C++ Library", it means only those "cname" and "name" headers, but excludes those "name.h" headers, which it refers to by "Standard C Library". so ctype.h is not at all affected by that rule. :)

Johannes Schaub - litb 2009-04-09 19:15:34

However, D.5/1 seems to contradict. It says "For compatibility with the Standard C library, the C++ Standard library provides the 18 C headers, as shown in Table 100:" this looks like a defect i think. i'll report it.

Johannes Schaub - litb 2009-04-09 19:20:58

@litb: Thanks for taking the trouble. Are you co-consulting with C99?

dirkgently 2009-04-09 19:23:48

Johannes Schaub - litb 2009-04-09 19:28:07

... that's achieved by this trickery: http://stackoverflow.com/questions/650461/what-are-some-tricks-i-can-use-with-macros/650711#650711

Johannes Schaub - litb 2009-04-09 19:28:48

Actually, in order to force a function call we need to write (toupper) instead of just toupper in the transform
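A function-like macro is only expanded when its name is followed directly by an opening parenthesis, so wrapping the name in parentheses selects the real function. A small sketch of that trick, assuming <ctype.h> is the header in use; the helper names are made up for illustration:

```cpp
#include <ctype.h>
#include <string>

// Calls the real function even if <ctype.h> also defines a function-like
// macro named toupper: in "(toupper)(c)" the name is followed by ')',
// not '(', so no macro expansion takes place.
inline int call_real_toupper(int c) {
    return (toupper)(c);
}

// Hypothetical helper that upper-cases a string in place via the real function.
inline void upper_via_function(std::string& s) {
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        s[i] = static_cast<char>(call_real_toupper(static_cast<unsigned char>(s[i])));
    }
}
```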

dirkgently 2009-04-09 19:45:10

Answer 5

A:

not sure there is a built in function. Try this:

Include either ctype.h or cctype (for toupper and tolower), as well as string, in your preprocessor directives. The code assumes `using namespace std;`.

string StringToUpper(string strToConvert)
{//change each element of the string to upper case
   for(string::size_type i=0;i<strToConvert.length();i++)
   {
      //cast to unsigned char: toupper() must not receive a negative value
      strToConvert[i] = (char)toupper((unsigned char)strToConvert[i]);
   }
   return strToConvert;//return the converted string
}

string StringToLower(string strToConvert)
{//change each element of the string to lower case
   for(string::size_type i=0;i<strToConvert.length();i++)
   {
      //same cast as above, for tolower()
      strToConvert[i] = (char)tolower((unsigned char)strToConvert[i]);
   }
   return strToConvert;//return the converted string
}

Brandon Stewart 2009-04-09 17:43:14

Answer 6

+21 A:

Boost string algorithms:

#include <boost/algorithm/string.hpp>
#include <string>

std::string str = "Hello World";

boost::to_upper(str);

std::string newstr = boost::to_upper_copy<std::string>("Hello World");

Tony Edgecombe 2009-04-09 17:47:37

That was useful.

quant_dev 2009-12-19 02:46:22

This also has the benefit of i18n, whereas `::toupper` most likely assumes ASCII.

Ben Straub 2010-03-09 22:07:01

Answer 7

+3 A:

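#include <algorithm>
#include <iostream>
#include <iterator>
#include <locale>
#include <string>

typedef std::string::value_type char_t;

// Upper-case one character using the ctype facet of the current global locale.
char_t up_char( char_t ch )
{
    return std::use_facet< std::ctype< char_t > >( std::locale() ).toupper( ch );
}

// Return an upper-cased copy of src.
std::string toupper( const std::string &src )
{
    std::string result;
    std::transform( src.begin(), src.end(), std::back_inserter( result ), up_char );
    return result;
}

// Usage:
const std::string src  = "test test TEST";

std::cout << toupper( src );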

bb 2009-04-09 17:55:02

I wouldn't recommend a back_inserter, as you already know the length; use std::string result(src.size(), '\0'); std::transform( src.begin(), src.end(), result.begin(), up_char );
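Spelled out, the pre-sized variant might look like the sketch below. It assumes a C++11 compiler (for the lambda), and the function name to_upper_presized is hypothetical. Note that std::string has no constructor taking only a size, so a fill character is supplied.

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical variant of the answer's function: the result is sized up front,
// so std::transform writes through result.begin() instead of a back_inserter.
inline std::string to_upper_presized(const std::string& src) {
    const std::locale loc;  // copy of the current global locale
    const std::ctype<char>& facet = std::use_facet<std::ctype<char> >(loc);
    std::string result(src.size(), '\0');  // size-and-fill constructor
    std::transform(src.begin(), src.end(), result.begin(),
                   [&facet](char ch) { return facet.toupper(ch); });
    return result;
}
```

Looking the facet up once, outside the loop, also avoids repeating the use_facet call for every character.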

Viktor Sehr 2010-03-09 21:24:57

Although I am sure you know this.

Viktor Sehr 2010-03-09 21:25:15

Answer 8

+9 A:

Do you have ASCII or International characters in strings?

If it's the latter, "uppercasing" is not that simple, and it depends on the alphabet in use. There are bicameral and unicameral alphabets. Only bicameral alphabets have different characters for upper and lower case. Also, there are composite characters, like Latin capital letter 'DZ' (\u01F1 'DZ'), which use the so-called title case. This means that only the first character (D) gets changed.

I suggest you look into ICU, and the difference between Simple and Full Case Mappings. This might help:

http://userguide.icu-project.org/transforms/casemappings

Milan Babuškov 2009-04-09 17:58:58

Or the German eszet (sp?), the thing that looks like the Greek letter beta, and means "ss". There is no single German character that means "SS", which is the uppercase equivalent. The German word for "street", when uppercased, gets one character longer.
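The length change is exactly what a character-by-character conversion cannot express. The sketch below illustrates the limitation rather than a solution: it assumes the string is UTF-8 encoded, and the helper names are made up. Each input char is mapped to exactly one output char, so the result always has the same size as the input and can never become "STRASSE".

```cpp
#include <cctype>
#include <string>

// Per-character upper-casing: one char in, one char out, so the length never changes.
inline std::string upper_per_char(std::string s) {
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        s[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
    }
    return s;
}

// "Straße" as UTF-8 (an assumption about the encoding): ß is the two bytes C3 9F.
// The literal is split so the trailing 'e' is not read as part of the hex escape.
inline std::string strasse_utf8() {
    return "Stra\xC3\x9F" "e";
}
```

A full case mapping, such as the one ICU provides, is needed to get from "Straße" to "STRASSE".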

David Thornley 2009-04-09 18:11:17

Another special case is the Greek letter sigma (Σ), which has *two* lowercase versions, depending on whether it's at the end of a word (ς) or not (σ). And then there are language specific rules, like Turkish having the case mapping I↔ı and İ↔i.

dan04 2010-08-01 04:30:59

Answer 9

A:

On all the machines I tested, it was faster. Perhaps because it is not concerned with a very wide range of characters, or because switch() produces a jump table. I don't know how it works in the assembly ... I just know that it is faster :P

string Utils::String::UpperCase(string CaseString) {
    for (string::size_type i = 0, tamanho = CaseString.length(); i < tamanho; i++) { // size_type: unsigned short would overflow on long strings
        switch (CaseString[i]) {
            case 'a':
                CaseString[i] = 'A';
                break;
            case 'b':
                CaseString[i] = 'B';
                break;
            case 'c':
                CaseString[i] = 'C';
                break;
            case 'd':
                CaseString[i] = 'D';
                break;
            case 'e':
                CaseString[i] = 'E';
                break;
            case 'f':
                CaseString[i] = 'F';
                break;
            case 'g':
                CaseString[i] = 'G';
                break;
            case 'h':
                CaseString[i] = 'H';
                break;
            case 'i':
                CaseString[i] = 'I';
                break;
            case 'j':
                CaseString[i] = 'J';
                break;
            case 'k':
                CaseString[i] = 'K';
                break;
            case 'l':
                CaseString[i] = 'L';
                break;
            case 'm':
                CaseString[i] = 'M';
                break;
            case 'n':
                CaseString[i] = 'N';
                break;
            case 'o':
                CaseString[i] = 'O';
                break;
            case 'p':
                CaseString[i] = 'P';
                break;
            case 'q':
                CaseString[i] = 'Q';
                break;
            case 'r':
                CaseString[i] = 'R';
                break;
            case 's':
                CaseString[i] = 'S';
                break;
            case 't':
                CaseString[i] = 'T';
                break;
            case 'u':
                CaseString[i] = 'U';
                break;
            case 'v':
                CaseString[i] = 'V';
                break;
            case 'w':
                CaseString[i] = 'W';
                break;
            case 'x':
                CaseString[i] = 'X';
                break;
            case 'y':
                CaseString[i] = 'Y';
                break;
            case 'z':
                CaseString[i] = 'Z';
                break;
        }
    }
    return CaseString;
}

osmano807 2010-03-09 21:21:33

What advantage does this code have over the other solutions posted?

Konrad Rudolph 2010-03-09 21:27:16

=) It indeed does the job, but I'd say it's a strange coding style.

Viktor Sehr 2010-03-09 21:27:42

On all the machines I tested, it was faster. Perhaps because it is not concerned with a very wide range of characters, or because switch() produces a jump table. I don't know how it works in the assembly ... I just know that it is faster :P

osmano807 2010-03-10 01:59:29

It seems that only simple answers are accepted here ... I wrote this code for raw performance, and it works well for this use.

osmano807 2010-03-10 18:21:23

I think this is a case of sacrificing memory for speed. However, don't reinvent the wheel: that code can be shortened to a couple of lines by just subtracting 32 from each lowercase character, assuming you are dealing with the English alphabet in ASCII. A single subtraction would be far faster than your solution. I won't up- or downvote. Brush up on your coding skills a little bit; I'm not saying what you posted is a bad thing, but I have seen that coding style many times with CS students in college and it certainly isn't the best.

Nathan Adams 2010-09-12 16:54:04

Answer 10

A:

//works for ASCII -- no clear advantage over what is already posted...

std::string toupper(const std::string & s)
{
    std::string ret(s.size(), char());
    for(unsigned int i = 0; i < s.size(); ++i)
        ret[i] = (s[i] <= 'z' && s[i] >= 'a') ? s[i]-('a'-'A') : s[i];
    return ret;
}

David 2010-08-01 04:24:06

Convert a String In C++ To Upper Case