I am new to programming. I have been trying to write a function in C++ that explodes (splits) the contents of a string into a string array at a given delimiter. For example:

string str = "___this_ is__ th_e str__ing we__ will use__";

should return string array:

cout << stringArray[0]; // 'this'
cout << stringArray[1]; // ' is'
cout << stringArray[2]; // ' th'
cout << stringArray[3]; // 'e str'
cout << stringArray[4]; // 'ing we'
cout << stringArray[5]; // ' will use'

I can tokenize the string just fine, but the hardest part for me is how to specify the number of elements in stringArray before assigning it the current string token, and also how to return stringArray from the function.

Thanks.

Edit1: I don't necessarily need the results to be in a string array; any container that I can use as a regular variable with some sort of indexing will do.

+1  A: 

If you insist on making stringArray an array as opposed to a std::vector<> (which would be the right thing to do), you have to either:

  1. Make two passes (one to count, you see)
  2. Implement a dynamic array yourself.

Using a vector is easier: vector::push_back() appends new elements to the end. So:

vector<string>* explode(string s){
  vector<string> *v = new vector<string>;
  //...
  // in a loop
    v->push_back(string_fragment);
  //...
  return v;
}


Not needed after all; left in for completeness.

To return the array of strings you use char **.

As in

char ** explode(const char *in){
  ...

}

BTW-- How will the calling function know how many elements are in the returned array? You'll have to solve that too. Use std::vector<> unless you are constrained by outside forces...
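One way to sidestep both the element-count question and the question of who deletes the result is to let the caller own the vector and pass it in by reference. The following is only a minimal sketch, assuming a single-character delimiter and that empty fragments should be dropped; the name explode_into and its parameters are illustrative and do not come from the answer above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: appends each non-empty fragment of `s` to `out`.
// The caller owns `out`, so nothing has to be deleted afterwards.
void explode_into(const std::string& s, char delim, std::vector<std::string>& out) {
    std::string fragment;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] != delim) {
            fragment += s[i];
        } else if (!fragment.empty()) {
            out.push_back(fragment);
            fragment.clear();
        }
    }
    // Keep a trailing fragment that is not followed by a delimiter.
    if (!fragment.empty()) {
        out.push_back(fragment);
    }
}
```

The caller would declare a std::vector<std::string>, pass it to explode_into, and then read size() for the element count and operator[] for each element.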

dmckee
v->push_back(string_fragment);
Magnus Skog
@Magnus: Oops. Thanks.
dmckee
I think passing a stack-allocated vector back, or taking in a reference to a vector to fill, would be a better choice than allocating a raw vector in the function and then passing the delete responsibility to the client.
GMan
@GMan: Perhaps you are right. I still speak c++ in translation, and have a lot of c habits that hang around...
dmckee
+1  A: 

You can use a vector of string (std::vector<std::string>), append each token to it with push_back, and then return it from your tokenize function.
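As a rough illustration of that idea, here is a sketch that locates delimiters with std::string::find and returns the vector by value. The function name tokenize, the single-character delimiter, and the choice to drop empty tokens are assumptions made for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical tokenizer: returns the non-empty pieces of `s`
// found between occurrences of `delim`.
std::vector<std::string> tokenize(const std::string& s, char delim) {
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    while (start <= s.size()) {
        std::string::size_type end = s.find(delim, start);
        if (end == std::string::npos) {
            end = s.size();  // the last piece runs to the end of the string
        }
        if (end > start) {
            tokens.push_back(s.substr(start, end - start));
        }
        start = end + 1;
    }
    return tokens;
}
```

Because the vector is returned by value, the caller gets the element count from size() and has no memory to free.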

Jem
+1  A: 

Use std::vector as a dynamic array and return that as your result.

workmad3
+1  A: 

Use std::vector<string>.

Andrew
+1  A: 

Perhaps you should use a list instead of an array. That way you would not need to know the number of elements ahead of time. You may also consider using the STL containers.

+2  A: 

Here's my first attempt at this using vectors and strings:

vector<string> explode(const string& str, const char& ch) {
    string next = "";
    vector<string> result;

    // For each character in the string
    for (string::const_iterator it = str.begin(); it != str.end(); it++) {
        // If we've hit the terminal character
        if (*it == ch) {
            // If we have some characters accumulated
            if (next.length() > 0) {
                // Add them to the result vector
                result.push_back(next);
                next = "";
            }
        } else {
            // Accumulate the next character into the sequence
            next += *it;
        }
    }

    // Add any trailing characters that were not followed by the terminal character
    if (next.length() > 0) {
        result.push_back(next);
    }

    return result;
}

Hopefully this gives you some sort of idea of how to go about this. On your example string it returns the correct results with this test code:

int main (int, char const **) {
    std::string blah = "___this_ is__ th_e str__ing we__ will use__";
    std::vector<std::string> result = explode(blah, '_');

    for (size_t i = 0; i < result.size(); i++) {
        cout << "\"" << result[i] << "\"" << endl;
    }
    return 0;
}
Eric Scrivner
The first parameter to explode() should be a constant reference. Compiler will then complain, so 'it' needs to be a string::const_iterator.
Magnus Skog
@Magnus: Thanks, added that fix :)
Eric Scrivner
+1  A: 

Using the STL (sorry, no compiler at hand, so this is not tested):

#include <vector>
#include <string>
#include <sstream>

int main()
{
    std::vector<std::string>   result;

    std::string str = "___this_ is__ th_e str__ing we__ will use__";

    std::stringstream  data(str);

    std::string line;
    while(std::getline(data,line,'_'))
    {
        result.push_back(line); // Note: You may get a couple of blank lines
                                // When multiple underscores are beside each other.
    }
}
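The blank entries mentioned in the comment inside the loop can be filtered out with a simple check before push_back. This is a sketch of that variation only; the wrapper function split_skip_empty and its parameters are hypothetical names introduced for the example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical wrapper around the getline loop above.
// Pieces that come back empty (adjacent delimiters) are skipped.
std::vector<std::string> split_skip_empty(const std::string& str, char delim) {
    std::vector<std::string> result;
    std::istringstream data(str);
    std::string piece;
    while (std::getline(data, piece, delim)) {
        if (!piece.empty()) {
            result.push_back(piece);
        }
    }
    return result;
}
```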

// or define a token

#include <vector>
#include <string>
#include <iterator>
#include <algorithm>
#include <sstream>

struct Token: public std::string  // Yes I know this is nasty.
{                                 // But it is just to demonstrate the principle.
};

std::istream& operator>>(std::istream& s,Token& t)
{
    std::getline(s,t,'_');

    // ***
    // Remove extra '_' characters from the stream.
    // peek() looks at the next character without consuming it.
    while (s.good() && s.peek() == '_')
    {
        s.ignore();
    }
    return s;
}

int main()
{   
    std::vector<std::string>   result;

    std::string str = "___this_ is__ th_e str__ing we__ will use__";

    std::stringstream  data(str);

    std::copy(std::istream_iterator<Token>(data),
              std::istream_iterator<Token>(),
              std::back_inserter(result)
             );
}
Martin York
So add the check inside the while loop to skip empty "line"s.
jmucchiello
That's an exercise for the user.
Martin York
A: 

Wait until your data structures class and then code it with a linked list. If it is for homework, though, you may be able to get away with just initializing the array to be very large.

JasonRShaver
A: 

The code below:

template <typename OutputIterator>
int explode(const string &s, const char c, OutputIterator output) {
    stringstream data(s);
    string line;
    int i = 0;
    while (std::getline(data, line, c)) {
        *output++ = line;
        i++;
    }
    return i;
}

int main() {
    string test = "H:AMBV4:2:182.45:182.45:182.45:182.45:182.41:32:17700:3229365:201008121711:0";
    cout << test << endl;
    vector<string> event;
    // This is the main call
    int evts = explode(test, ':', back_inserter(event));
    for (int k = 0; k < evts; k++)
        cout << event[k] << "~";
    cout << endl;
}

Outputs:

H:AMBV4:2:182.45:182.45:182.45:182.45:182.41:32:17700:3229365:201008121711:0
H~AMBV4~2~182.45~182.45~182.45~182.45~182.41~32~17700~3229365~201008121711~0~

Paulo Lellis