Hello,

I'm using the Boost library to match substrings in a text. To iterate over the results, I need to use regex_iterator (see http://www.boost.org/doc/libs/1_42_0/libs/regex/doc/html/boost_regex/ref/regex_iterator.html).

That's the only usage example I have found, but it's not clear to me (I don't understand the callback).

Could someone familiar with Boost post a usage example of it?

Let's assume that my input text is:

"Hello everybody this is a sentense
Bla bla 14 .. yes 
date 04/15/1986 
"

I want to get "Hello" "everybody" "this" "is" "a" "sentense" "bla" "yes" "date".

Thanks.

A: 

From your explanation, you may be able to use a tokenizer and put some more logic on top of it. Look at boost::tokenizer.

Example:

// Requires <string>, <boost/tokenizer.hpp> and <boost/foreach.hpp>.
boost::char_separator<char> sep_1(" ");  // split on spaces

std::string msg_copy("Hello everybody this is a sentense Bla bla 14 .. yes date 04/15/1986 ");
boost::tokenizer< boost::char_separator<char> > tokens(msg_copy, sep_1);
BOOST_FOREACH(std::string t, tokens)
{
    // here t is the current token
}
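
As a rough idea of what "some more logic" could look like: splitting on spaces alone also yields tokens such as "14", ".." and "04/15/1986", which the question does not want. The sketch below is one possible filter, using only the standard library. The helper name is_word is hypothetical (it is not part of Boost), and it assumes a "word" means a non-empty run of ASCII letters.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of Boost): returns true if the token is
// non-empty and made up only of alphabetic characters.
bool is_word(const std::string& token)
{
    if (token.empty()) return false;
    for (std::string::size_type i = 0; i < token.size(); ++i) {
        // Cast to unsigned char: std::isalpha is undefined for negative values.
        if (!std::isalpha(static_cast<unsigned char>(token[i]))) return false;
    }
    return true;
}
```

Inside the BOOST_FOREACH loop one could then write something like "if (is_word(t)) words.push_back(t);", where words is a std::vector<std::string> declared before the loop.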

Edit:

You can put as many special characters into the separator as you want, for example:

boost::char_separator<char> sep_1(" *^&%~/|");
bua
Yes, this is a possible solution, but I forgot to mention that the text I really want to process contains words separated by anything: spaces, commas, dashes, pipes, and so on. The best approach is to use regular expressions, and Boost is the best option for C++. I have tried boost::regex_search(), but it only returns the first match and I need all matches. For that I was told to use boost::regex_iterator, but I don't understand it; Boost's documentation really sucks...
youssef azari
Then use: boost::char_separator<char> sep_1(" *^ ... It will tokenize on all of the special characters ;) Post updated as well.
bua
A: 

If the only part of the example you don't understand is the callback, consider that:

std::for_each(m1, m2, &regex_callback);

is roughly equivalent to:

for (; m1 != m2; ++m1){
    class_index[(*m1)[5].str() + (*m1)[6].str()] = (*m1).position(5);
}
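
To make the callback form concrete, here is a minimal sketch of the same idea with a callback that just collects every match. The names collect_match, g_matches and collect_with_for_each are made up for illustration. It is written against std::regex so that it builds without Boost; the assumption is that pointing the rx alias at boost and including <boost/regex.hpp> gives the Boost.Regex version, since the iterator interface is the same.

```cpp
#include <algorithm>
#include <regex>
#include <string>
#include <vector>

// Assumption: for Boost.Regex, include <boost/regex.hpp> instead of <regex>
// and change this alias to "namespace rx = boost;".
namespace rx = std;

// File-scope storage filled by the callback (plays the role of class_index
// in the documentation example).
std::vector<std::string> g_matches;

// The callback: std::for_each calls it once per match with the dereferenced
// iterator, i.e. a match_results object. The return value is ignored.
bool collect_match(const rx::smatch& what)
{
    g_matches.push_back(what.str());  // what.str() is the whole matched text
    return true;
}

std::vector<std::string> collect_with_for_each(const std::string& text,
                                               const rx::regex& expression)
{
    g_matches.clear();
    rx::sregex_iterator m1(text.begin(), text.end(), expression);
    rx::sregex_iterator m2;  // default-constructed iterator marks the end
    std::for_each(m1, m2, &collect_match);
    return g_matches;
}
```

So the callback is nothing special: it is the loop body moved into a function that receives *m1 as its argument.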

Assuming that, in your case, you want to store all the matches in a vector, you would write something like:

//Warning, untested:
boost::sregex_iterator m1(text.begin(), text.end(), expression);
boost::sregex_iterator m2;
std::vector<std::string> tokens;
for (; m1 != m2; ++m1){
    tokens.push_back(m1->str());
}
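
For the input text in the question, a self-contained version might look like the sketch below. The function name extract_words and the pattern [A-Za-z]+ are assumptions: the pattern treats a word as a run of ASCII letters, which skips "14", ".." and the date. The code uses std::regex so it compiles without Boost; changing the rx alias to boost and including <boost/regex.hpp> should give the Boost.Regex equivalent.

```cpp
#include <regex>
#include <string>
#include <vector>

// Assumption: for Boost.Regex, include <boost/regex.hpp> instead of <regex>
// and change this alias to "namespace rx = boost;".
namespace rx = std;

// Hypothetical helper: returns every run of ASCII letters found in text.
std::vector<std::string> extract_words(const std::string& text)
{
    const rx::regex expression("[A-Za-z]+");  // one or more letters
    std::vector<std::string> tokens;

    rx::sregex_iterator m1(text.begin(), text.end(), expression);
    rx::sregex_iterator m2;  // default-constructed iterator marks the end
    for (; m1 != m2; ++m1) {
        tokens.push_back(m1->str());  // m1->str() is the whole matched text
    }
    return tokens;
}
```

Called on the text from the question, this collects "Hello", "everybody", "this", "is", "a", "sentense", "Bla", "bla", "yes" and "date". Note that both "Bla" and "bla" are kept; removing duplicates or lowercasing would be an extra step.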
Éric Malenfant
Thanks a lot :)
youssef azari