views:

472

answers:

7

Hi, I'm trying to make a constructor for a graph class that accepts a string as a parameter and uses it to build the graph.

The string is formatted as follows: |vertex list|Edges list|
e.g. |1,2,3,4,15|(1->2),(3->2),(4->15)|

The idea is that the constructor will take the values from the string and then know to perform the following actions (inserting the vertexes into the vertex list and then inserting the edges into the edges list):

addVertex(1)
addVertex(2)
addVertex(3)
addVertex(4)
addVertex(15)
addEdge(1,2)
addEdge(3,2)
addEdge(4,15)

I would have just made a couple of "for" loops to scan the string, but I don't know what to do about double(or more) digit numbers. I'm starting to imagine all sorts of seriously complicated for loops and I'm wondering if anyone here could share with me any more intelligent ways to extract and use this data.

+2  A: 

I've never used it before, but there's a Boost tokenizer class. You could have it easily break the thing down into components for you without all the for-looping.

Michael Kohne
boost tokenizer is a great option on this case. +1
Edison Gustavo Muenz
Boost Tokenizer is wonderful for breaking up strings. Even if you don't use it in this situation, its good to know about for the future.
Dr. Watson
+1  A: 

You can use a stringstream and use the stream extraction operator to get your integers.

string s("12 34");
istringstream ss(s);
int x, y;
ss >> x >> y;

Since this is homework, I urge you to explore the possibilities and figure out the complete code for yourself.

dirkgently
stringstream won't make things much easier here because you're not guaranteed any delimiting whitespace. nobody said anything about integer either; they were just an example.
wilhelmtell
Stringstream seems like a good solution, but how do I deal with commas? my list is not separated by spaces, but by commas. e.g (12,34)
Dave
You'd do 'ss >> x >> comma >> y;' where 'comma' is a char.
dirkgently
Thanks, I've managed to use this answer to parse the vertexes and edges in a pretty efficient and simple way. I learned a lot from this. Your help is really appreciated.
Dave
A: 

Use a stringstream. Note the example on that page for reading in numbers with an istringstream.

Naaff
+5  A: 

You seem to be getting overwhelmed looking at the whole thing. Break it into pieces...tasks. What you're trying to do seems to be separate functionalities here.

  1. Tokenizing
  2. Parsing Vertices
  3. Parsing Edges
  4. Execution on Vertices
  5. Execution on Edges

That's 5 functions more-or-less.

You're wanting to tokenize based on the pipe (|) so take a substring based on the pipe and pass each side to the appropriate parser, parse on the commas and so on.

Not going to do it for you, but hopefully I can get you thinking in the right direction. Learning to program isn't so much about any particular language, but about changing the way you think.

McAden
+1 for homework-friendly advice.
Pontus Gagge
+1  A: 

Without doing your homework for you, this will give you a good head start. I have given you the basic work flow to parse vertex list, you should be able to do the edge list yourself. I also leave the error checking to you, for example in parseVertex() you may want to give an error when you encounter invalid characters.

void skipWhiteSpace(const char*& first , const char* last) {
    // do whatever need to be done to skip white space
}

// parse integer only, no error checking is performed
bool parseVertex(const char*& first , const char* last) {
    skipWhiteSpace(first, last);
    const char* numBegin = first;
    for (; first != last && ::isdigit(static_cast<unsigned char>(*first)); 
     ++first) {}
    if (numBegin != first) {
     std::cout << "addVertex(" << std::string(numBegin, first) << ")" << std::endl;
     return true;
    }

    return false;
}

bool parseComma(const char*& first , const char* last) {
    skipWhiteSpace(first, last);
    if (first != last && ',' == *first) {
     ++first;
     return true;
    }

    return false;
}

// VL := V (, VL)
// a vertex list (VL) is a vertex (V) followed by a comma than another vertex list
bool parseVertexList(const char*& first, const char* last) {
    if (parseVertex(first, last)) {
     parseComma(first, last) && parseVertexList(first, last);
     return true;
    }

    return false;
}
}

void test() {
    const char* str = "1,2,3,4,15";
    parseVertexList(str, str + sizeof("1,2,3,4,15"));
}
Shing Yip
Really appreciate the code. The sstream seems a bit more elegant, but your answer is great.
Dave
A: 

I would certainly use this problem as a pretext to play with boost spirit! Writing a little grammar for this tiny language should be a lot of fun.

Martin Cote
+1  A: 

Parsing this kind of thing is fairly straightforward (though tedious) with recursive descent techniques. The idea is to separate the language being parsed into logical units, then write a function to parse each of those units.

If we figure in the example "|1,2,3,4,15|(1->2),(3->2),(4->15)|" that the whole string is a "polygon", we'd write parsePolygon(), which would look something like this:

void parsePolygon (Buffer& b)
{
  parseVertices (b);
  parseEdges (b);
}

Let's assume Buffer is a class that runs through your string. You'll need two basic operations: peek at the next character without consuming it, and consume the next character.

parseVertices might look like this:

void parseVertices (Buffer& b)
{
  if (b.peek() != '|') { /* error */ }
  b.consume (); // burn the '|'
  parseVertexList (b);
  if (b.peek() != '|') { /* error */ }
  b.consume (); // burn the '|'
}

You'd want to handle errors much better, obviously. If the stream encounters any error it needs to pass the error code up the callstack or throw an exception.

Two more examples... parseVertexList and parseNumber might look like this:

void parseVertexList (Buffer& b)
{
  addVertex (parseNumber (b));
  while (b.peek() == ',')
  {
     b.consume (); // eat the comma
     addVertex (parseNumber (b));
  }
}

int parseNumber (Buffer& b)
{
  char accum[80] = { '0' };  // sensible default in case of failure
  int accumPos = 0;
  while (isDigit (b.peek())
  {
    accum[accumPos++] = b.consume();
  }
  return atoi(accum);
}

This is all very quick and dirty, but hopefully it gives you an idea how the technique works. You can intermix your handling with your parsing as shown above, where the parseVertexList function actually does the addVertex calls for you.

I think this is really one of the simplest methods of manual parsing. Ideally we'd always be able to use generated parsers like boost spirit or pyparsing or lex/yacc, but life isn't always that good, especially for homework.

Also I suppose it's worth noting that the above technique can be a lot of overkill for some parsing situations.

Dan Olson
Thanks for your help. I can see how your solution would be great if I was parsing something a bit more complicated. Have a great day!
Dave
I just used this to help on another project. Thanks so much!
Dave