I have an existing system that I do not want to change, but I would like to add meta-data/configuration/annotations to an existing user object/entity.

I do not want to change the schema or UI, so I am planning to let the user add this meta-data through the object's description field, where users normally enter a description. It turns out this field is rarely used; however, I would still like people to be able to enter a description followed by the meta-data.

Basically, I want the parser to be forgiving, like HTML parsers, rather than fail-fast.

My gut feeling is to do something similar to the Java Properties format, but parse it with regex. However, property files are pretty weak at representing complex data.

Is there an existing non-fail-fast format I should use?

A: 

It seems that your problem isn't really wanting something that isn't strict - rather, you want to be able to tell the description and metadata apart.

You could probably just use XML and strip anything before the opening tag and after the closing one before presenting it to the parser. Alternatively, you could use whatever format you like but require a fairly unique character sequence (say, >>>METADATA<<< on a line by itself) between the description and metadata.
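
A minimal sketch of the separator idea, in Python purely for illustration. The function name `split_description` is hypothetical, and the sketch assumes the marker sits on a line by itself; text without a marker is treated as description only.

```python
MARKER = ">>>METADATA<<<"  # separator line suggested above


def split_description(text, marker=MARKER):
    """Return (description, metadata_text).

    metadata_text is an empty string when no marker line is present,
    so plain descriptions keep working unchanged.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        # Tolerate stray whitespace around the marker.
        if line.strip() == marker:
            description = "\n".join(lines[:i]).strip()
            metadata = "\n".join(lines[i + 1:]).strip()
            return description, metadata
    return text.strip(), ""
```

The metadata half can then be handed to whichever parser you settle on, while the description half is shown to users as before.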

Anon.
Yeah, I thought of that, but I don't want people to have to type XML. I mean, XML is for machines. But I do agree with your approach of splitting, which is what I am currently doing with regex.
Adam Gent
Strictness is still a problem even with separation. For example, my current solution, which splits and then parses the metadata as JSON, requires the JSON to be valid... so you had better remember not to leave a Python-style trailing comma at the end of your object or array.
Adam Gent
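
One possible way to soften the trailing-comma problem, sketched in Python with the standard `json` module. The helper name `loads_lenient` is hypothetical, and the regex repair is only attempted after strict parsing fails. It is a heuristic: on already-invalid input it could also alter a `,]` or `,}` that appears inside a string value.

```python
import json
import re

# A comma followed only by whitespace and a closing bracket/brace.
_TRAILING_COMMA = re.compile(r",\s*([\]}])")


def loads_lenient(text, default=None):
    """Parse JSON, forgiving trailing commas; return default if still invalid."""
    try:
        return json.loads(text)  # valid JSON is never rewritten
    except json.JSONDecodeError:
        pass
    try:
        # Heuristic repair: drop commas that directly precede ] or }.
        return json.loads(_TRAILING_COMMA.sub(r"\1", text))
    except json.JSONDecodeError:
        return default
```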
+1  A: 

Here's a good list of standard configuration formats with pros/cons for each:

http://www.faqs.org/docs/artu/ch05s02.html

All of those formats are designed to be easily edited by hand.

EDIT: You described in a comment that you want at most two "layers" of data, in which case the best formats from that page I linked to would be the Windows-style .ini format or the "Record-Jar" format.
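
As a rough sketch of what a forgiving reader for the .ini style could look like (Python, for illustration only): the function name `parse_lenient_ini` and the choice to return skipped lines alongside the data are assumptions, not part of any standard library. Lines that are neither a section header nor a key/value pair are collected instead of raising an error.

```python
import re

SECTION = re.compile(r"^\[([^\]]+)\]$")           # [section]
PAIR = re.compile(r"^([^=:]+?)\s*[=:]\s*(.*)$")   # key = value  or  key: value


def parse_lenient_ini(text):
    """Parse INI-like text without failing fast.

    Returns (data, skipped): data maps section name -> {key: value},
    skipped lists (line_number, raw_line) for lines that made no sense.
    """
    data = {}
    skipped = []
    section = data.setdefault("", {})  # keys that appear before any [section]
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line[0] in "#;":  # blank lines and comments
            continue
        m = SECTION.match(line)
        if m:
            section = data.setdefault(m.group(1).strip(), {})
            continue
        m = PAIR.match(line)
        if m:
            section[m.group(1).strip()] = m.group(2).strip()
            continue
        skipped.append((lineno, raw))  # remember the line, keep going
    return data, skipped
```

The skipped list could be shown back to the user as warnings, so a typo costs them one line of metadata rather than all of it.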

Max E.
I think the Windows .ini style format is the way to go. I will probably still have to write my own fail-safe parser that accepts bad input and groks what it can.
Adam Gent
A: 

Would you consider writing the configuration as a DSL? Although I have never done this myself, I have heard about this technique several times at conferences, especially for cases where you intend to have your users manage the configuration. The reasoning is that a DSL looks more like English, so it is easier for your users to understand and change than painful XML angle brackets or configuration syntax that is "foreign" to them (yet looks so normal to us because we have stared at it for 10 years :) ). Parsing a DSL wouldn't be too difficult either; I have messed with Groovy and Ruby, and they are pretty straightforward.

My 2 cents...

limc
That is kind of the plan, although the one issue with a DSL compared to a friendly data format like YAML or JSON is that you have to plan the syntax and semantics of the language. It's hard to adjust a DSL iteratively compared to just adding a field in JSON/YAML.
Adam Gent