I've got the DTD for OFX 1.03 (their latest version despite having developed and released 1.60, but I digress...)

I would like to use a regex with groups that split an entity, element, or other tag into its parts for further processing, so that I could take a tag like this:

<!ENTITY % ACCTTOMACRO "(BANKACCTTO | CCACCTTO | INVACCTTO)">

And create an object like this:

new EntityTag { Name = "%ACCTTOMACRO", ChildTypes = new string[] { "BANKACCTTO", "CCACCTTO", "INVACCTTO" } };

I've got a regular expression that looks like this:

Regex re = new Regex(@"<!(\b)+([\s\S])?[^>]+>");

Admittedly, I'm new to regex, so I think I've done well to get this far: it gives me a match collection over the DTD with one match per tag, excluding comments.

I would like to leverage grouping to facilitate creation of the previously mentioned object.
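To make the goal concrete, here is a minimal sketch of what named groups could look like for the ENTITY case only. The `EntityTag` class, the `DtdEntityParser` helper, and the pattern itself are assumptions for illustration; the pattern handles only parameter entities with a double-quoted value, and has not been checked against the full OFX DTD.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical type mirroring the object described above.
public class EntityTag
{
    public string Name { get; set; }
    public string[] ChildTypes { get; set; }
}

public static class DtdEntityParser
{
    // "name" captures the macro name after '%'; "body" captures the quoted value.
    // Assumption: the value is double-quoted and contains no embedded quotes.
    private static readonly Regex EntityPattern = new Regex(
        @"<!ENTITY\s+%\s+(?<name>[\w.\-]+)\s+""(?<body>[^""]*)""\s*>",
        RegexOptions.Compiled);

    // Returns null when the text is not a parameter-entity declaration.
    public static EntityTag Parse(string declaration)
    {
        Match m = EntityPattern.Match(declaration);
        if (!m.Success) return null;

        // Split the content model on its structural characters, then drop
        // any trailing occurrence indicators (?, *, +) from each name.
        string[] children = m.Groups["body"].Value
            .Split(new[] { '(', ')', '|', ',', ' ', '\t', '\r', '\n' },
                   StringSplitOptions.RemoveEmptyEntries)
            .Select(s => s.TrimEnd('?', '*', '+'))
            .Where(s => s.Length > 0)
            .ToArray();

        return new EntityTag
        {
            Name = "%" + m.Groups["name"].Value,
            ChildTypes = children
        };
    }
}
```

Calling `DtdEntityParser.Parse` on the ACCTTOMACRO line above should give a `Name` of `%ACCTTOMACRO` and the three child names. ELEMENT and ATTLIST declarations would each need their own pattern, and nested content models would lose their structure with this flat split.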

If I'm on totally the wrong path, please tell me. However, if you do download this document, I think you may find it's not standard. (Visual Studio throws up some red flags over the way the document is formatted.)

I don't expect anyone to go to the trouble, but for the curious here is the link to download the specs.

+2  A: 

It looks like they've got a schema available as well. Why not download the schema instead and parse that with an XML parser (for instance, LINQ-to-XML)?

TrueWill
Unfortunately, version 1.03 is in SGML, not XML, so an XML schema document doesn't exist for the version 1 branch. It's also unfortunate because the 1.02/1.03 version of OFX is what I'm required to implement. Fortunately, I have a working, rough SGMLTag engine. Now I have to validate it. Sorry if my question wasn't clear about which version I was using.
fauxtrot
There's a free DTD-to-schema converter at http://www.hitsw.com/xml_utilites/ - I haven't tried it, but that or something similar might help.
TrueWill
OK, so it's been a while, and now I'll give you an update. I'm using a little sleight of hand here. I took the 2.11 spec for OFX and used xsd.exe to generate some code. I'm marking which items are compliant with which versions using attributes, and then using an intermediate layer object to handle formatting the tags back and forth between the different versions. While your answer doesn't really address the regex portion of my question, you get the answer flag for giving me a different direction that worked! Thank you very much!
fauxtrot
@fauxtrot - Glad I could help.
TrueWill