I am working on a personal project that uses a custom config file. The basic format of the file looks like this:

[users]
name: bob
attributes:
    hat: brown
    shirt: black
another_section:
    key: value
    key2: value2

name: sally
sex: female
attributes:
    pants: yellow
    shirt: red

There can be an arbitrary number of users, each with different key/value pairs, and keys/values can be nested under a section using tab stops. I know that I can use JSON, YAML, or even XML for this config file; however, I'd like to keep it custom for now.

Parsing shouldn't be difficult, as I have already written code to parse it. My question is: what is the best way to parse this with clean, structured code, written so that future changes won't be difficult (there might be multiple levels of nesting in the future)? Right now, my code looks utterly disgusting. For example:

private void parseDocument() {  
    String current;
    while((current = reader.readLine()) != null) {
        if(current.equals("") || current.startsWith("#")) {
            continue; //comment
        } 
        else if(current.startsWith("[users]")) {
            parseUsers();
        }
        else if(current.startsWith("[backgrounds]")) {
            parseBackgrounds();
        }
    }
}

private void parseUsers()  {        
    String current;
    while((current = reader.readLine()) != null) {
        if(current.startsWith("attributes:")) {
            while((current = reader.readLine()) != null) {
                if(current.startsWith("\t")) {
                    //add user key/values to User object
                }
                else if(current.startsWith("another_section:")) {
                    while((current = reader.readLine()) != null) {
                        if(current.startsWith("\t")) {
                            //add user key/values to new User object
                        } 
                        else if (current.equals("")) {
                            //newline means that a new user is up to parse next
                        }
                    }
                }
            }
        }
        else if(!current.isEmpty()) {
            //
        }


    }
}

As you can see, the code is pretty messy, and I have cut it short for the presentation here. I feel there are better ways to do this, maybe without using BufferedReader. Can someone suggest a better approach that is not as convoluted as mine?

+4  A: 

I would suggest not creating custom code for config files. What you're proposing isn't too far removed from YAML (getting started). Use that instead.

See Which java YAML library should I use?

cletus
As commented on codemeit's answer, I had considered YAML in the beginning, but needed the config file to be extremely simplistic in nature.
trinth
@trinth I don't know why exactly you think YAML is heavyweight. It's as lightweight as your data is. I stand by my recommendation; better to fit your data into an existing model than to needlessly invent your own configuration format with corresponding libraries.
cletus
+1  A: 

If you could use XML, JSON, or another well-known data encoding as the data format, it would be a lot easier to parse/deserialize the text content and extract the values. For example:

name: bob
attributes:
    hat: brown
    shirt: black
another_section:
    key: value
    key2: value2

can be expressed as the following XML (there are other ways to express it in XML as well):

<config>
  <User hat="brown" shirt="black" >
    <another_section>
      <key>value</key>
      <key2>value2</key2>
    </another_section>
  </User>
</config>

Custom (extremely simple): as I mentioned in the comment below, you can just make them all name and value pairs, e.g.

name                 :bob
attributes_hat       :brown
attributes_shirt     :black
another_section_key  :value
another_section_key2 :value2

and then split the text on '\n' (newline) and each line on ':' to extract the key and value, or build a dictionary/map object.
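The split described above could look roughly like the following. This is only a minimal sketch: the class name FlatConfig and the choice to skip blank lines and '#' comments are assumptions made for illustration, not part of the original suggestion.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FlatConfig {
    /** Parses "key :value" lines into a map that preserves file order. */
    static Map<String, String> parse(String text) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // assumption: blank lines and '#' comments are ignored
            }
            // Limit of 2 so that a value may itself contain ':'
            String[] parts = trimmed.split(":", 2);
            if (parts.length < 2) {
                throw new IllegalArgumentException("Missing ':' in line: " + line);
            }
            result.put(parts[0].trim(), parts[1].trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "name                 :bob\n"
                    + "attributes_hat       :brown\n"
                    + "another_section_key2 :value2\n";
        System.out.println(parse(text));
    }
}
```

Nesting is encoded in the key prefix here (attributes_hat), so a lookup such as parse(text).get("attributes_hat") replaces walking a nested structure.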

codemeit
The OP's data format looks remarkably like YAML. All other suggested data formats are moot.
George Jempty
Yep, I had initially chosen YAML for the project; however, I didn't think the syntax would be appropriate. The config file needs to be as simple as can be. XML and JSON were too verbose, and YAML still didn't make the cut.
trinth
In that case you can just make them all name and value pairs, e.g. name:bob;attributes_hat:brown;attributes_shirt:black;another_section_key:value;another_section_key2:value2 and then do a string split on ';' and ':'.
codemeit
+1  A: 

I'd recommend changing the configuration file's format to JSON and using an existing library, such as FlexJSON, to parse the JSON objects.

{
"users": [
    {
        "name": "bob",
        "attributes": {
            "hat": "brown",
            "shirt": "black"
        },
        "another_section": {
            "key": "value",
            "key2": "value2"
        }
    },
    {
        "name": "sally",
        "sex": "female",
        "attributes": {
            "pants": "yellow",
            "shirt": "red"
        }
    }
]
}

johnnieb
I'd love to use JSON, but the braces, brackets, and quotes are too much clutter, and I'd like to keep the config file simple.
trinth
I understand. What if you select a standard file format (such as JSON or XML), but provide a user interface for editing the file? This way you can hide the complexity from your users, extend it more easily, and reduce maintenance costs over time. Moreover, this will keep your users focused on the data instead of the config file syntax.
johnnieb
+1  A: 

Everyone will recommend using XML because it's simply better.

However, in case you're on a quest to prove your programmer's worth to yourself...

...there is nothing really fundamentally wrong with the code you posted in the sense that it's clear and it's obvious to potential readers what's going on, and unless I'm totally out of the loop on file operations, it should perform pretty much as well as it could.

The one criticism I could offer is that it's not recursive: every level of nesting requires a new level of code to support it. I would probably write a recursive function (one that calls itself with the sub-content as its parameter, and again if there's sub-sub-content, and so on) that reads all of this into a hashtable of hashtables or something similar, and then I'd use that hashtable as the configuration object.

Then again, at that point I would probably stop seeing the point and use XML. ;)
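For what it's worth, the recursive idea might be sketched as below. This is an illustration under assumptions, not a finished parser: the class name NestedConfig is made up, nesting depth is taken from the amount of leading whitespace, a line of the form "key:" followed by deeper-indented lines opens a nested map, and the input is the lines of a single user record (splitting the file into records on blank lines is left out).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NestedConfig {
    private final List<String> lines;
    private int pos = 0; // index of the next unread line

    private NestedConfig(List<String> lines) { this.lines = lines; }

    /** Parses the lines of one record into maps nested to any depth. */
    static Map<String, Object> parse(List<String> rawLines) {
        List<String> nonBlank = new ArrayList<>();
        for (String l : rawLines) {
            if (!l.trim().isEmpty()) nonBlank.add(l);
        }
        return new NestedConfig(nonBlank).parseBlock(0);
    }

    // Counts leading whitespace characters (a tab counts as one).
    private static int indentOf(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) i++;
        return i;
    }

    private Map<String, Object> parseBlock(int indent) {
        Map<String, Object> block = new LinkedHashMap<>();
        while (pos < lines.size()) {
            String line = lines.get(pos);
            int lineIndent = indentOf(line);
            if (lineIndent < indent) break; // line belongs to an enclosing block
            pos++;
            String[] parts = line.trim().split(":", 2);
            String key = parts[0].trim();
            String value = parts.length > 1 ? parts[1].trim() : "";
            boolean hasChildren = value.isEmpty() && pos < lines.size()
                    && indentOf(lines.get(pos)) > lineIndent;
            if (hasChildren) {
                // Recurse: the child block uses the indent of its first line.
                block.put(key, parseBlock(indentOf(lines.get(pos))));
            } else {
                block.put(key, value);
            }
        }
        return block;
    }

    public static void main(String[] args) {
        List<String> user = Arrays.asList(
                "name: bob", "attributes:", "\that: brown", "\tshirt: black");
        System.out.println(parse(user));
    }
}
```

Because parseBlock calls itself for every deeper level, adding another level of nesting to the file needs no new code, which is the point of the criticism above.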

Helgi Hrafn Gunnarsson
Thanks for the answer Helgi. The quote, "However, in case you're on a quest to prove your programmer's worth to yourself..." was pretty applicable in this case. I will consider your advice about making this more recursive. So far, your post has been most helpful for me but I'll wait just a bit more though, before I make this my answer ;)
trinth
+1  A: 

It looks simple enough for a state machine.

while((current = reader.readLine()) != null) {
  if(current.startsWith("[users]"))
    state = PARSE_USER;
  else if(current.startsWith("[backgrounds]"))
    state = PARSE_BACKGROUND;
  else if (current.equals("")) {
    // Store the user or background that you've been building up if you have one.
    switch(state) {
      case PARSE_USER:
      case USER_ATTRIBUTES:
      case USER_OTHER_ATTRIBUTES:
        state = PARSE_USER;
        break;
      case PARSE_BACKGROUND:
      case BACKGROUND_ATTRIBUTES:
      case BACKGROUND_OTHER_ATTRIBUTES:
        state = PARSE_BACKGROUND;
        break;
    }
  } else switch(state) {
    case PARSE_USER:
    case USER_ATTRIBUTES:
    case USER_OTHER_ATTRIBUTES:
      if(current.startsWith("attributes:"))
        state = USER_ATTRIBUTES;
      else if(current.startsWith("another_section:"))
        state = USER_OTHER_ATTRIBUTES;
      else {
        // Split the line into key/value and store into user
        // object being built up as appropriate based on state.
      }
      break;
    case PARSE_BACKGROUND:
    case BACKGROUND_ATTRIBUTES:
    case BACKGROUND_OTHER_ATTRIBUTES:
      if(current.startsWith("attributes:"))
        state = BACKGROUND_ATTRIBUTES;
      else if(current.startsWith("another_section:"))
        state = BACKGROUND_OTHER_ATTRIBUTES;
      else {
        // Split the line into key/value and store into background
        // object being built up as appropriate based on state.
      }
      break;
  }
}
// If you have an unstored object, store it.
Glomek
+1  A: 

A nice way to clean it up would be to use a table, i.e. replace your conditionals with a Map. You can then invoke your parsing methods through reflection (simple) or create a few more classes implementing a common interface (more work but more robust).
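A rough sketch of the common-interface variant follows. The names SectionDispatcher and SectionHandler are hypothetical, and the sketch assumes that section headers are lines of the form [name] and that every other line belongs to the most recent header.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SectionDispatcher {
    /** Common interface implemented by each per-section parser. */
    interface SectionHandler {
        void handleLine(String line);
    }

    // The table: section header -> handler, replacing the if/else chain.
    private final Map<String, SectionHandler> handlers = new HashMap<>();

    void register(String header, SectionHandler handler) {
        handlers.put(header, handler);
    }

    void parse(List<String> lines) {
        SectionHandler current = null;
        for (String line : lines) {
            if (line.startsWith("#")) continue; // comment
            String trimmed = line.trim();
            if (trimmed.startsWith("[") && trimmed.endsWith("]")) {
                current = handlers.get(trimmed); // header switches the handler
                if (current == null) {
                    throw new IllegalArgumentException("Unknown section: " + trimmed);
                }
            } else if (current != null) {
                // Blank lines are passed through so a handler can detect
                // the boundary between two records.
                current.handleLine(line);
            }
        }
    }

    public static void main(String[] args) {
        List<String> userLines = new ArrayList<>();
        SectionDispatcher dispatcher = new SectionDispatcher();
        dispatcher.register("[users]", userLines::add);
        dispatcher.register("[backgrounds]", line -> { /* parse backgrounds */ });
        dispatcher.parse(Arrays.asList("[users]", "name: bob", "", "name: sally"));
        System.out.println(userLines);
    }
}
```

Supporting a new section then means registering one more handler rather than adding another else-if branch to parseDocument().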

CurtainDog