I'm trying to parse a simple string in C++. I know the string contains some text with a colon, followed immediately by a space, then a number. I'd like to extract just the number part of the string. I can't just tokenize on the space (using sstream and >>) because the text in front of the colon may or may not have spaces in it.

Some example strings might be:

Total disk space: 9852465

Free disk space: 6243863

Sectors: 4095

I'd like to use the standard library, but if you have another solution you can post that too, since others with the same question might like to see different solutions.

+7  A: 
std::string strInput = "Total disk space: 9852465";
std::string strNumber = "0";
size_t iIndex = strInput.rfind(": ");
if(iIndex != std::string::npos && strInput.length() >= 2)
{
  strNumber = strInput.substr(iIndex + 2, strInput.length() - iIndex - 2);
}
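
The same idea can be wrapped in a small function. This is only a sketch, not part of the original answer: the name `text_after_colon` is made up for illustration, and it returns an empty string (rather than "0") when no ": " is found. It takes the separator length from the string itself instead of hard-coding 2.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the text after the last ": ",
// or an empty string when no ": " separator is present.
std::string text_after_colon(const std::string& input) {
    const std::string separator = ": ";
    const std::size_t pos = input.rfind(separator);
    if (pos == std::string::npos) {
        return std::string();
    }
    // substr with a single argument runs to the end of the string.
    return input.substr(pos + separator.size());
}
```

With this, text_after_colon("Total disk space: 9852465") gives "9852465". The result is still a string, so it needs a separate conversion if an integer is wanted.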
Brian R. Bondy
DO NOT use `!= -1`! Use `!= string::npos` instead.
Konrad Rudolph
Why not, Konrad, does it not work, is it inefficient??? Inquiring minds want to know.
paxdiablo
Good catch Konrad!
BobbyShaftoe
-1 is an implementation detail, string::npos is correct by definition.
dmckee
It's not guaranteed to work. The standard says that `find` returns `npos` if nothing was found, and it defines that `npos = -1`. However, `npos` is of type `std::size_t`. Some compilers will issue a warning for comparing an unsigned to a signed number and, depending on settings, treat it as an error.
Konrad Rudolph
That's better - my personality type likes to see explanations for why to do things a certain way, not just some edict handed down from above :-)
paxdiablo
@Brian: Thanks, that works great. That's not the first time you've answered a C++ question of mine within a single digit number of minutes. You must subscribe to my feed. :)
Bill the Lizard
Thanks also to Konrad for the correction and Pax for demanding an explanation. ;)
Bill the Lizard
@Pax: good personality trait. Unfortunately, comments aren't really a great place for technical explanations. :-/
Konrad Rudolph
+1  A: 
const std::string pattern(": ");
std::string s("Sectors: 4095");
size_t num_start = s.find(pattern) + pattern.size();
orip
Brian's solution checks errors, mine doesn't - use his :)
orip
I'll still give you +1 for not using a magic number. :)
Bill the Lizard
+5  A: 

For completeness, here's a simple solution in C:

int value;
if (sscanf(mystring.c_str(), "%*[^:]:%d", &value) == 1) {
    // parsing succeeded
} else {
    // parsing failed
}

Explanation: %*[^:] reads as many characters as possible that aren't colons, and the * suppresses assignment. The literal : then matches the colon itself, and %d reads the integer, skipping any white space between the colon and the digits.
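
For anyone who wants to try the format string from C++, here is a minimal sketch. The helper name `parse_after_colon` and its bool/out-parameter shape are assumptions made for illustration. Note that %[ needs at least one character to match, so a line that starts with a colon is reported as a failure.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: stores the integer after the first colon in `out`
// and returns true, or returns false (leaving `out` alone) on a mismatch.
bool parse_after_colon(const std::string& line, int& out) {
    int value = 0;
    // %*[^:] skips (without storing) one or more non-colon characters,
    // ':' matches the colon, and %d skips whitespace before the digits.
    if (std::sscanf(line.c_str(), "%*[^:]:%d", &value) != 1) {
        return false;
    }
    out = value;
    return true;
}
```

The return value of sscanf counts only assigned conversions, so the suppressed %*[^:] does not contribute and a successful parse returns exactly 1.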

Adam Rosenfield
Thanks, I like when people give alternate solutions. I'm sure this will prove helpful to future C programmers.
Bill the Lizard
Great. Personally, I like your solution as much as Konrad's :) Even though they don't search for a substring, they show how to parse it cleanly.
Johannes Schaub - litb
+4  A: 

I can't just tokenize on the space (using sstream and >>) because the text in front of the colon may or may not have spaces in it.

Right, but you can use std::getline:

string not_number;
int number;
if (not (getline(cin, not_number, ':') and cin >> number)) {
    cerr << "No number found." << endl;
}
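
The snippet above reads from cin. Since the question is about a string that is already in memory, the same technique can be applied through std::istringstream. A rough sketch, with the function name `number_after_colon` invented for illustration:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: reads everything up to the first ':' into a
// throwaway string, then extracts the integer that follows.
bool number_after_colon(const std::string& line, int& number) {
    std::istringstream in(line);
    std::string label;
    // operator>> skips the space after the colon before reading digits.
    return std::getline(in, label, ':') && (in >> number);
}
```

If the line has no colon, getline consumes the whole string and the following extraction fails at end of input, so the function returns false.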
Konrad Rudolph
That looks for a newline; I'm assuming you intended to use the overload of getline that takes a delimiter as a third parameter and to pass in ':' for that parameter?
Adam Rosenfield
Thanks Adam, I forgot the third argument, which is really what his posting was all about, as you guessed. :-/
Konrad Rudolph
… and yet another stupid mistake. Seems like I'm done in.
Konrad Rudolph
Thanks, Konrad. I didn't know about the overloaded getline() function. It seems strange to me to overload the function in this way, since it no longer gets a line.
Bill the Lizard
@Adam: Thanks for explaining this. It had me thoroughly confused. :)
Bill the Lizard
alternatively, if(!(cin.ignore(numeric_limits<streamsize>::max(), ':') >> number)) cout << "no number found" << endl; could be used too i think
Johannes Schaub - litb
+2  A: 

Similar to Konrad's answer, but using istream::ignore:

int number;
std::streamsize max = std::numeric_limits<std::streamsize>::max();
if (!(std::cin.ignore(max, ':') >> number)) {
    std::cerr << "No number found." << std::endl;
} else {
    std::cout << "Number found: " << number << std::endl;
}
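
As with the getline approach, this works on any input stream, not just cin. Here is a sketch for a string held in memory, assuming a made-up helper name `number_after_ignore`:

```cpp
#include <ios>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: skips past the first ':' and extracts the integer
// that follows; returns false if there is no colon or no number after it.
bool number_after_ignore(const std::string& line, int& number) {
    std::istringstream in(line);
    // Discard characters up to and including the first ':'.
    in.ignore(std::numeric_limits<std::streamsize>::max(), ':');
    in >> number;
    return !in.fail();
}
```

Unlike the getline version, nothing is stored for the text before the colon, which avoids building a string that is thrown away immediately.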
Johannes Schaub - litb
Yeah, that's actually the better answer. However, IIRC there are some problems with `ignore` and `max` on some platforms (probably due to signed/unsigned mismatch). This information might be dated, however.
Konrad Rudolph
Yeah, I read in the standard that streamsize must be a signed type. I actually just looked it up, since I wondered why people don't use streamsize(-1) :)
Johannes Schaub - litb
+2  A: 

I'm surprised that no one mentioned regular expressions. They were added as part of TR1 and are included in Boost as well. Here's a solution using regexes:

typedef std::tr1::match_results<std::string::const_iterator> Results;

std::tr1::regex re(":[[:space:]]+([[:digit:]]+)", std::tr1::regex::extended);
std::string     str("Sectors: 4095");
Results         res;

if (std::tr1::regex_search(str, res, re)) {
    std::cout << "Number found: " << res[1] << std::endl;
} else {
    std::cerr << "No number found." << std::endl;
}

It looks like a lot more work but you get more out of it IMHO.
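
On a C++11 or later compiler the same pattern should work with std::regex from the <regex> header, without the tr1 namespace. A minimal sketch, assuming the default ECMAScript grammar (which also accepts the [[:space:]] and [[:digit:]] classes) and a hypothetical helper name `digits_after_colon`:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the digits that follow ":<whitespace>",
// or an empty string when the pattern does not match.
std::string digits_after_colon(const std::string& text) {
    static const std::regex re(":[[:space:]]+([[:digit:]]+)");
    std::smatch match;
    if (std::regex_search(text, match, re)) {
        return match[1].str();  // first capture group: the digits
    }
    return std::string();
}
```

Because the pattern uses [[:space:]]+, at least one whitespace character is required after the colon, so "Sectors:4095" does not match.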

D.Shawley