I might look into std::stringstream
from <sstream>
. C-style strtok
has a number of usability problems, and C-style strings are just troublesome.
Here's an ultra-trivial example of it tokenizing a sentence into words:
#include <sstream>
#include <iostream>
#include <string>
int main(void)
{
std::stringstream sentence("This is a sentence with a bunch of words");
while (sentence)
{
std::string word;
sentence >> word;
std::cout << "Got token: " << word << std::endl;
}
}
janks@phoenix:/tmp$ g++ tokenize.cc && ./a.out
Got token: This
Got token: is
Got token: a
Got token: sentence
Got token: with
Got token: a
Got token: bunch
Got token: of
Got token: words
Got token:
The std::stringstream
class is "bi-directional", in that it supports input and output. You'd probably want to do just one or the other, so you'd use std::istringstream
or std::ostringstream
.
The beauty of them is that they are also std::istream
and std::ostream
s respectively, so you can use them as you'd use std::cin
or std::cout
, which are hopefully familiar to you.
Some might argue these classes are expensive to use; std::strstream
from <strstream>
is mostly the same thing, but built on top of cheaper C-style 0-terminated strings. It might be faster for you. But anyway, I wouldn't worry about performance right away. Get something working, and then benchmark it. Chances are you can get enough speed by simply writing well-written C++ that minimizes unnecessary object creation and destruction. If it's still not fast enough, then you can look elsewhere. These classes are probably fast enough, though. Your CPU can waste thousands of cycles in the amount of time it takes to read a block of data from a hard disk or network.