Can anyone give me a hand with a touch of regex?

I'm reading in a list of "locations" for a simple text adventure (those so popular back in the day). However, I'm unsure as to how to obtain the input.

The locations all follow the format:

<location_name>, [<item>]
    [direction, location_name]

Such as:

Albus Square, Flowers, Traffic Cone
    NORTH, Franklandclaw Lecture Theatre
    WEST, Library of Enchanted Books
    SOUTH, Furnesspuff College

Library of Enchanted Books
    EAST, Albus Square
    UP, Reading Room

(Subsequent locations are separated by a blank line.)

I'm storing these as Location objects with the structure:

public class Location {

    private String name;

    private Map<Direction, Location> links;

    private List<Item> items;

}

I use a method to retrieve the data from a URL and create the Location objects from the text it returns, but I'm completely stuck on how to do this. I think regex would help. Can anyone lend me a much-needed hand?

A: 

Can you change the format of the data? That format is clunky. I suspect that you're busy reinventing the square wheel... This screams "Just use XML" to me.

corlettk
But I would suspect that re-formatting the data as XML would require it to be parsed by a RegExp (Or some other technique) in the first place.
belugabob
The idea is not to use text in the first place but something which is more structured.
Aaron Digulla
The issue is that I do not have the data, it's at an external URL, in the text-based format outlined above. Normally I would use XML as well.
Beau Martínez
+3  A: 

You don't want to use a text-only format for this:

  • What happens when you have more than a single flower item? Are they all the same? Can't an adventurer collect a bouquet by picking single flowers at several locations?

  • There will probably be several rooms with the same name ("cellar", "street corner"), i.e. filler rooms which add to the atmosphere but nothing to the game. They don't get a description of their own, though. How to keep them apart?

  • What if a name contains a comma?

  • Eventually, you'll want to use Unicode for foreign names or formatting instructions.

Since this is structured data which can contain lots of odd cases, I suggest using XML for this:

<locations>
    <location>
        <name>Albus Square</name>
        <summary>Short description for returning adventurer</summary>
        <description>Long text here ... with formatting, etc.</description>
        <items>
            <item>Flowers</item>
            <item>Traffic Cone</item>
            </items>
        <directions>
            <north>Franklandclaw Lecture Theatre</north>
            <west>Library of Enchanted Books</west>
            <south>Furnesspuff College</south>
        </directions>
    </location>
    <location>
        <name>Library of Enchanted Books</name>
        <directions>
            <east>Albus Square</east>
            <up>Reading Room</up>
        </directions>
    </location>
</locations>

This allows for much greater flexibility, solves a lot of issues like formatting description text, Unicode characters, etc. plus you can use more than a single item/location with the same name by using IDs (numbers) instead of text.

Use JDom or DecentXML to parse the game config.
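
As a rough sketch of what reading such a file could look like, the snippet below uses the DOM parser that ships with the JDK (javax.xml.parsers) rather than JDom or DecentXML, purely so that it runs without extra dependencies. The class and method names (XmlLocations, itemsByLocation) are made up for illustration, and it assumes the element names from the sample above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the original answer.
class XmlLocations {

    /** Maps each location's <name> text to the texts of its <item> elements. */
    static Map<String, List<String>> itemsByLocation(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            // Malformed XML (for example an unclosed tag) ends up here.
            throw new IllegalArgumentException("Invalid XML", e);
        }

        Map<String, List<String>> result = new LinkedHashMap<>();
        NodeList locations = doc.getElementsByTagName("location");
        for (int i = 0; i < locations.getLength(); i++) {
            Element location = (Element) locations.item(i);
            // Assumes every <location> has exactly one <name> descendant.
            String name = location.getElementsByTagName("name").item(0)
                    .getTextContent().trim();

            List<String> items = new ArrayList<>();
            NodeList itemNodes = location.getElementsByTagName("item");
            for (int j = 0; j < itemNodes.getLength(); j++) {
                items.add(itemNodes.item(j).getTextContent().trim());
            }
            result.put(name, items);
        }
        return result;
    }
}
```

The directions could be read the same way by walking the child elements of each <directions> node. A strict parser like this one rejects a document whose tags are not properly closed, so the file has to be well-formed.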

Aaron Digulla
That assumes he has control over the input format. His description sounds like he reads some external URL, which isn't under his control.
rudolfson
Yes, I assume that he also writes the server since this format doesn't look like something which you can find in many places on the 'net.
Aaron Digulla
Thanks for the extensive answer, but unfortunately I do not have control over the server's data, it is just provided to me as is. Normally I would resort to using an XML format solution as well.
Beau Martínez
+3  A: 

Agree w/ willcodejavaforfood, regex could be used but isn't a big boost here.

Sounds like you just need a little algorithm help (sloppy p-code follows)...

currloc = null
while( line from file )
    if line begins w/ whitespace
        (dir, loc) = split( line, ", " )
        add dir, loc to currloc
    else
        newlocdata = split( line, ", " )
        currloc = newlocdata[0]
        for i = 1 to size( newlocdata ) - 1
            item = newlocdata[i]
            add item to currloc
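
A possible Java rendering of the pseudocode above, offered as a sketch rather than a finished parser. LocationParser and ParsedLocation are hypothetical names; exits are kept as plain strings because the asker's Direction and Item types are not shown, and a blank line is assumed to end the current location.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names; the asker's own Location class is not reproduced here.
class LocationParser {

    /** Plain holder: exit targets stay as names until every location is known. */
    static class ParsedLocation {
        final String name;
        final List<String> items = new ArrayList<>();
        final Map<String, String> exits = new LinkedHashMap<>(); // direction -> target name

        ParsedLocation(String name) {
            this.name = name;
        }
    }

    static Map<String, ParsedLocation> parse(String text) {
        Map<String, ParsedLocation> byName = new LinkedHashMap<>();
        ParsedLocation current = null;

        for (String line : text.split("\\R")) {          // \R matches any line break
            if (line.trim().isEmpty()) {
                current = null;                           // blank line ends a location
                continue;
            }
            String[] parts = line.trim().split("\\s*,\\s*");

            if (Character.isWhitespace(line.charAt(0))) {
                // Indented line: "DIRECTION, target location"
                if (current == null || parts.length != 2) {
                    throw new IllegalArgumentException("Unexpected exit line: " + line);
                }
                current.exits.put(parts[0], parts[1]);
            } else {
                // Unindented line: "name[, item[, item ...]]"
                current = new ParsedLocation(parts[0]);
                for (int i = 1; i < parts.length; i++) {
                    current.items.add(parts[i]);
                }
                byName.put(current.name, current);
            }
        }
        return byName;
    }
}
```

Because a location can link to one that appears later in the file, the links are probably easiest to resolve in a second pass: once parse has returned, look each exit's target name up in the map and only then fill in the Map<Direction, Location> (for instance via Direction.valueOf, assuming Direction is an enum with matching constant names).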
John Pirie
Beautiful! Nice simple pseudocode, thanks.
Beau Martínez
+2  A: 

Can't get my head into Java-mode right now, so here's some pseudo-code that should do it:

Data = MyString.split('\n\n++\s*+');

for ( i=0 ; i<Data.length ; i++ )
{
 CurLocation = Data[i].split('\n\s*+');

 LocationInfo = CurLocation[0].split(',\s*+');

 LocationName = LocationInfo[0];

 for ( n=1 ; n<LocationInfo.length ; n++ )
 {
  Items[n-1] = LocationInfo[n];
 }


 for ( n=1 ; n<CurLocation.length ; n++ )
 {
  DirectionInfo = CurLocation[n].split(',\s*+');

  DirectionName = DirectionInfo[0];

  for ( x=1 ; x<DirectionInfo.length ; x++ )
  {
   DirectionLocation[x-1] = DirectionInfo[x];
  }

 }


}
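
For comparison, the split-based idea might look roughly like this in Java. This is only a sketch: SplitParser, blocks and fields are invented names, and the patterns are simplified versions of the ones in the pseudo-code (plain greedy quantifiers, \R for line breaks).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the regex-split approach.
class SplitParser {

    /**
     * Splits the whole text into one String[] per location.
     * Element 0 is the "name, items..." line; the rest are exit lines
     * with their leading indentation already removed.
     */
    static List<String[]> blocks(String text) {
        List<String[]> result = new ArrayList<>();
        // A blank (or whitespace-only) line separates two locations.
        for (String block : text.trim().split("\\R\\s*\\R")) {
            // Within a block, split on each line break plus the indentation after it.
            result.add(block.split("\\R\\s*"));
        }
        return result;
    }

    /** Splits one line on commas, dropping the whitespace around them. */
    static String[] fields(String line) {
        return line.trim().split("\\s*,\\s*");
    }
}
```

For each block, fields(block[0]) gives the name followed by the items, and fields(block[n]) for n >= 1 gives a direction and its target. Note that splitting on commas breaks as soon as a name itself contains one.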
Peter Boughton
A: 

I think using XML is overkill (shooting sparrows with cannons) while regexps are "underkill" (using a too weak tool, scrubbing floors with a toothbrush).

The right balance sounds like "the .ini format" or "mail headers with sections". For Python, the library docs are at http://docs.python.org/library/configparser.html.

A brief example:

[albus_square]
name: Albus Square
items: Flowers, Traffic Cone
north: lecture_theatre
west: library_enchanted_books
south: furnesspuff_college

I'd assume there's a Java library for this format. As another poster has pointed out, you might have name collisions, so I took the liberty of adding a "name:" field. The name in the square brackets would be the unique identifier.
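
If no library is at hand, a format this small can also be read by hand. The following is a minimal sketch in Java under the assumption that the file contains only "[section]" headers and "key: value" lines as in the example; IniSketch is a hypothetical name, and comments, escaping and multi-line values are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical hand-rolled reader; a real .ini library would cover more cases.
class IniSketch {

    /** Parses the text into section id -> (key -> value), keeping file order. */
    static Map<String, Map<String, String>> parse(String text) {
        Map<String, Map<String, String>> sections = new LinkedHashMap<>();
        Map<String, String> current = null;

        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("[") && line.endsWith("]")) {
                // "[albus_square]" opens a new section keyed by its unique id.
                current = new LinkedHashMap<>();
                sections.put(line.substring(1, line.length() - 1), current);
            } else {
                int colon = line.indexOf(':');
                if (current == null || colon < 0) {
                    throw new IllegalArgumentException("Unexpected line: " + line);
                }
                current.put(line.substring(0, colon).trim(),
                            line.substring(colon + 1).trim());
            }
        }
        return sections;
    }
}
```

The "items" value still comes back as one comma-separated string, so it would need a further split on ", " to become a list.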

Jonas Kölker
Python, sweet. Wish I could use it here.
Beau Martínez