Possible Duplicates:
C++: How to split a string?
Splitting a string

What is the best way to split a string by whitespace in C++?

I'd like to split on tabs, spaces, and so on, treating a run of several whitespace characters as a single separator and not tripping over whitespace at the end of the string.

Ultimately, I am going to end up storing this in a vector, but I can easily convert between data types if there is some easy built-in standard library way of splitting.

I am building this on a UNIX machine with g++, not with Microsoft Visual C++.

+2  A: 

It may well be overkill for this particular problem, but consider Boost.Regex.

(Honestly, I could probably just write a script that responded to every C++ question on SO with 'use Boost' and come out ahead in karma. But it really does help.)
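As a rough idea of what the regex approach looks like, here is a minimal sketch. It uses std::regex from C++11, whose interface closely follows Boost.Regex, as a stand-in so that it compiles without Boost; the function name split_regex is made up for this illustration. Empty tokens are skipped so that leading whitespace does not produce an empty first entry.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split on runs of whitespace using a regex.
std::vector<std::string> split_regex(const std::string& input) {
    static const std::regex whitespace("\\s+");
    std::vector<std::string> tokens;

    // Submatch index -1 selects the text *between* matches of the pattern.
    std::sregex_token_iterator it(input.begin(), input.end(), whitespace, -1);
    std::sregex_token_iterator end;
    for (; it != end; ++it) {
        if (it->length() != 0) {   // skip the empty piece before leading whitespace
            tokens.push_back(it->str());
        }
    }
    return tokens;
}
```

With Boost itself, the same shape should carry over by swapping the std:: names for their boost:: counterparts, though that is not tested here.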

David Seiler
+1 for your second paragraph. Couldn't leave you without increased rep for that.
David Thornley
+6  A: 

It may be open to question whether it's best, but one really easy way to do this is to put your string into a stringstream, then read the data back out:

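// warning: untested code.
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

std::vector<std::string> split(std::string const &input) {
    std::stringstream buffer(input);
    std::vector<std::string> ret;

    // operator>> skips leading whitespace and stops at the next whitespace
    std::copy(std::istream_iterator<std::string>(buffer),
              std::istream_iterator<std::string>(),
              std::back_inserter(ret));
    return ret;
}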

This should work with any reasonable C++ compiler.
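A slightly more compact variant of the same idea, offered only as a sketch: it reads from a std::istringstream and builds the vector directly from the iterator pair instead of calling std::copy. The name split_ws is hypothetical. The iterators are declared as named variables to avoid the "most vexing parse" that the one-line form would run into.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant: whitespace split via istream_iterator and the
// vector range constructor.
std::vector<std::string> split_ws(const std::string& input) {
    std::istringstream buffer(input);
    // 'last' is default-constructed, which makes it the end-of-stream iterator.
    std::istream_iterator<std::string> first(buffer), last;
    return std::vector<std::string>(first, last);
}
```

Because extraction with operator>> skips any run of whitespace, a call such as split_ws("  a\t\tb  c \n") should yield the three tokens a, b and c, with nothing extra for the leading or trailing whitespace.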

Jerry Coffin
I'd prefer `std::istringstream`, but otherwise it's good. +1
sbi
+1  A: 

This is what I use:

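/* Tokenizing a string */
#include <sstream>
#include <string>
#include <vector>

// Parser is the author's own class; its declaration is not shown here.
std::vector<std::string> Parser::tokenizer( const std::string& p_pcstStr, char delim )  {
    std::vector<std::string> tokens;
    std::stringstream   mySstream( p_pcstStr );
    std::string         temp;

    // Read up to the next occurrence of delim; the delimiter itself is discarded.
    while( std::getline( mySstream, temp, delim ) ) {
        tokens.push_back( temp );
    }

    return tokens;
}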

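Your delim would be a single whitespace character (a space, say), p_pcstStr would be the string to tokenize, and the return value would be a vector of the substrings found between delimiters. Note that getline splits on exactly one character: with a space as delim, tabs are not treated as separators, and two delimiters in a row (or one at the start of the string) produce an empty token.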
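Since getline accepts only a single delimiter character, handling a mix of tabs and spaces needs a different loop. One possible sketch, not part of the code above, uses std::string::find_first_not_of and find_first_of with a set of delimiter characters; the name tokenize_any and its default delimiter set are assumptions made for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split on any character in 'delims', treating runs of
// delimiters as one separator and ignoring them at either end.
std::vector<std::string> tokenize_any(const std::string& text,
                                      const std::string& delims = " \t\n\r\f\v") {
    std::vector<std::string> tokens;

    // Find the start of the first token (first non-delimiter character).
    std::string::size_type start = text.find_first_not_of(delims);
    while (start != std::string::npos) {
        // The token ends at the next delimiter, or at the end of the string.
        std::string::size_type stop = text.find_first_of(delims, start);
        // If stop is npos, substr clamps the count and takes the rest.
        tokens.push_back(text.substr(start, stop - start));
        // Skip the run of delimiters to reach the next token, if any.
        start = text.find_first_not_of(delims, stop);
    }
    return tokens;
}
```

Passing a different delimiter string, for example tokenize_any(line, ",;"), would split on commas and semicolons in the same way.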

Layne