Let's say I'm trying to solve a parsing problem: converting a string into a char **. For some reason the code below prints a lot of garbage. Could someone have a look at it, please?

Here's what it's supposed to do:

  1. Dump all of argv into a string_array container
  2. Dump everything in the string_array container into a std::string, separated by spaces
  3. Break the string down into string tokens using boost/algorithm/string
  4. Create a new char **, dump all tokens into it, print out the new char **, and clean up

What have I done wrong?

#include <string>
#include <vector>
#include <iostream>

#include <boost/algorithm/string.hpp>

using namespace std;
using namespace boost;

typedef vector<string> string_array;

int main(int argc, char ** argv)
{
    string_array args;
    string_array tokens;
    cout << "Real arguments :" << endl;
    for(int i = 0; i < argc; i++)
    { cout << argv[i] << endl;}

    string arg = "";
    for(int i = 1; i < argc; i++)
    {
     args.push_back(argv[i]);
    }
    for(int i = 0; i < (int)args.size(); i++)
    {
     arg += args[i];
     if(i != (int)args.size() - 1)
      arg += " ";
    }

    split(tokens, arg, is_any_of(" "));

    char ** new_args = NULL;
    new_args = new char*[(int)tokens.size()];
    for(int i = 0; i < (int)tokens.size(); i++)
    { 
     new_args[i] = new char[(int)tokens[i].size()];
     for(int j = 0; j < (int)tokens[i].size(); j++)
     {
      new_args[i][j] = tokens[i][j];
     }
    }

    for(int i = 0; i < (int)tokens.size(); i++)
    { std::cout << new_args[i] << std::endl; }
    delete [] new_args;
}
+4  A: 

C-style strings (char*) are meant to be zero-terminated. So instead of new char[tokens[i].size()], you need to add 1 to the allocation: new char[tokens[i].size() + 1]. You also need to set new_args[i][tokens[i].size()] = 0 to zero-terminate the string.

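Without the zero terminator, the code that prints the string has no way to know where it ends, because a char* does not carry a length the way std::string does. It keeps reading past the end of the buffer, which is where the garbage comes from.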
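
As a rough sketch of the fix, the copy loop could be pulled into a pair of small helpers. The names to_c_strings and free_c_strings are made up for illustration and are not part of the original program or of Boost; the token vector is assumed to come from the split call in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copies each token into a freshly allocated,
// zero-terminated C string. The caller owns the returned array.
char** to_c_strings(const std::vector<std::string>& tokens)
{
    char** out = new char*[tokens.size()];
    for (std::size_t i = 0; i < tokens.size(); ++i)
    {
        const std::size_t len = tokens[i].size();
        out[i] = new char[len + 1];                  // +1 leaves room for the terminator
        std::memcpy(out[i], tokens[i].data(), len);  // copy the characters
        out[i][len] = '\0';                          // zero-terminate the string
    }
    return out;
}

// Hypothetical helper: releases everything to_c_strings allocated.
void free_c_strings(char** args, std::size_t count)
{
    assert(args != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i)
        delete [] args[i];   // free each string first
    delete [] args;          // then the array of pointers
}
```

In the question's main this would be used as char ** new_args = to_c_strings(tokens); followed by the printing loop and free_c_strings(new_args, tokens.size());. Note that the original delete [] new_args releases only the outer array, so each new_args[i] leaks; the second helper frees the inner strings as well.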

Chris Jester-Young
Technically, that should be "null-terminated" lest someone start terminating their strings with '0' rather than '\0'.
Jeff Yates
Well, zero means `(char) 0`, not `'0'`. But I hear what you're saying. I don't like using "null-terminated" because it causes people to mix up NULL (pointer) and NUL (the character), and start writing NULL instead of 0 when they meant NUL, which is seriously uncool in my opinion. Suggestions welcome.
Chris Jester-Young
That did it, cheers!
Maciek
@Chris: I write NUL-terminated for strings and NULL-terminated for pointer arrays. I'm never sure whether anyone ever notices the difference, though...
Steve Jessop