How can I split a text into tokens using strtok without deleting the delimiters? I just want to split the text at the positions where they occur.

A: 

You can't. The behavior of strtok is that it replaces the delimiter with a NUL character. This behavior is not configurable. To return each substring, including the delimiter, you will have to find a function other than strtok, or else combine strtok with some of your own processing.
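As a rough illustration of combining strtok with your own processing, the sketch below runs strtok on a scratch copy and reads back the delimiter that followed each token from the untouched original. The function name split_keep_delims is hypothetical, and the sketch assumes the text contains no embedded NUL characters. Because strtok skips runs of consecutive delimiters, only the first delimiter after each token is kept.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: tokenizes with strtok on a scratch copy, then reads
// the delimiter that followed each token from the untouched original.
std::vector<std::string> split_keep_delims(const std::string& text,
                                           const char* delims)
{
    std::vector<std::string> pieces;
    std::vector<char> scratch(text.begin(), text.end());
    scratch.push_back('\0');            // strtok needs a writable C string

    for (char* tok = std::strtok(scratch.data(), delims);
         tok != nullptr;
         tok = std::strtok(nullptr, delims))
    {
        std::size_t begin = static_cast<std::size_t>(tok - scratch.data());
        std::size_t end = begin + std::strlen(tok);
        std::string piece(tok);
        // strtok wrote '\0' at scratch[end]; the original still holds the
        // delimiter at the same offset (unless the token ended the text).
        if (end < text.size())
            piece += text[end];
        pieces.push_back(piece);
    }
    return pieces;
}
```

Calling split_keep_delims("a;b,c", ";,") yields the pieces "a;", "b," and "c".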

Ben Voigt
+2  A: 

You can't. strtok does the splitting by replacing the delimiter with a '\0'. Without doing that, no splitting would happen.

You could, however, create a function that did splitting kind of like strtok does, but by finding where the string should be split and (for example) allocating storage and copying the characters up to the delimiter into that storage. strcspn or strpbrk would probably be a useful start at this.
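A minimal sketch of that idea, assuming std::string is acceptable as the allocated storage: strcspn measures the run of characters before the next delimiter, and each piece is copied out together with the delimiter that ended it. The name split_copy is hypothetical, and the input string is never modified.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copies each token into its own std::string and keeps
// the delimiter that ended it. The input is only read, never written.
std::vector<std::string> split_copy(const char* text, const char* delims)
{
    std::vector<std::string> pieces;
    const char* p = text;
    while (*p != '\0') {
        std::size_t len = std::strcspn(p, delims);  // chars before next delimiter
        if (p[len] != '\0')
            ++len;                                  // include the delimiter itself
        pieces.push_back(std::string(p, len));
        p += len;
    }
    return pieces;
}
```

With this approach, split_copy("one,two;three", ",;") gives "one,", "two;" and "three", and concatenating the pieces reproduces the input exactly.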

Jerry Coffin
A: 

If your libc implementation has it, take a look at strsep(3).

Matt Edlefsen
Docs http://kernel.org/doc/man-pages/online/pages/man3/strsep.3.html
Bklyn
+1  A: 

Can you use boost? boost::algorithm::split does exactly what you want.

You can, of course, write one yourself; split is not complicated. (Note: I have not actually tested this.)

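#include <algorithm>
#include <string>
#include <vector>

std::wstring source(L"Test\nString");
std::vector<std::wstring> result;
std::wstring::iterator start = source.begin();
std::wstring::iterator end = std::find(start, source.end(), L'\n');
// Each later piece begins at its delimiter, so no delimiter is dropped.
// Search from end + 1; searching from end would find the same delimiter forever.
for (; end != source.end(); start = end, end = std::find(end + 1, source.end(), L'\n'))
    result.push_back(std::wstring(start, end));
result.push_back(std::wstring(start, end));  // the final piece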
Billy ONeal
A: 

Simple: don't use strtok.

Use the C++ stream operators instead.
The std::getline() function accepts an optional third parameter that specifies the character marking the end of each line.

#include <string>
#include <sstream>
#include <vector>

int main()
{
    std::string         text("This is text; split by; the semicolon; that we will split into bits.");
    std::stringstream   textstr(text);

    std::string               line;
    std::vector<std::string>  data;
    while(std::getline(textstr,line,';'))
    {
        data.push_back(line);
    }
}
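
Note that std::getline extracts the delimiter and discards it. If the delimiter should stay attached to each piece, one possible variation of the loop above is to append it again whenever getline stopped at a delimiter rather than at the end of the input. This is only a sketch; the helper name split_keep is hypothetical.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: same loop as above, but re-appends the delimiter
// that std::getline extracted and discarded.
std::vector<std::string> split_keep(const std::string& text, char delim)
{
    std::istringstream in(text);
    std::vector<std::string> pieces;
    std::string piece;
    while (std::getline(in, piece, delim)) {
        // eof() is set only when getline ran out of input before
        // finding delim, so the last piece gets no delimiter added.
        if (!in.eof())
            piece += delim;
        pieces.push_back(piece);
    }
    return pieces;
}
```

For example, split_keep("a;b", ';') produces "a;" and "b".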

With a tiny bit more work we can even get the STL algorithms to play their part; we just need to define how a token is streamed. To do this, define a token class (or struct), then define an operator>> that reads up to the token separator.

#include <string>
#include <sstream>
#include <vector>
#include <iterator>
#include <algorithm>
#include <iostream>

struct Token
{
    std::string data;
    operator std::string() const { return data;}
};
std::istream& operator>>(std::istream& stream,Token& data)
{
    return std::getline(stream,data.data,';');
}

int main()
{
    std::string         text("This is text; split by; the semicolon; that we will split into bits.");
    std::stringstream   textstr(text);

    std::vector<std::string>  data;

    // This statement does the work of the loop from the last example.
    std::copy(std::istream_iterator<Token>(textstr),
              std::istream_iterator<Token>(),
              std::back_inserter(data)
             );

    // Print the vector to std::cout, one token per line, to show that it worked.
    std::copy(data.begin(),data.end(),std::ostream_iterator<std::string>(std::cout,"\n"));
}
Martin York