Does anyone know of a tool to convert a COBOL copybook to XSD, or to XML?

A: 

See CB2XML on SourceForge, or search Google for more information.

kloucks
+2  A: 

A long time ago, I built some code to parse COBOL copybooks and generate XSD files.

Since the structure of COBOL data descriptions is pretty regular, I crafted a regular expression to extract variable names and identify field lengths. With that parsed structure, I could also create XML test data, MSXML DOM code to manipulate that structure, and HTML forms to test those IMS transactions.

Bottom line: regular expressions could be really useful to do that.
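To illustrate the idea (this is not the original code, just a minimal Python sketch), the snippet below matches one data description entry per line and emits a flat XSD. The pattern `FIELD_RE`, the helper names and the sample copybook are all hypothetical. It assumes every entry fits on a single line, has no sequence numbers, and uses nothing beyond a level number, a name and an optional PIC clause; every elementary field is mapped to `xs:string`.

```python
import re

# Hypothetical pattern: level number, data name, optional PIC clause, final period.
FIELD_RE = re.compile(
    r"^\s*(?P<level>\d{2})\s+(?P<name>[A-Za-z0-9-]+)"
    r"(?:\s+PIC(?:TURE)?\s+(?:IS\s+)?(?P<pic>\S+?))?\s*\.\s*$",
    re.IGNORECASE,
)


def parse_copybook(text):
    """Return (level, name, pic) tuples for lines that look like data entries."""
    fields = []
    for line in text.splitlines():
        m = FIELD_RE.match(line)
        if m:
            fields.append((int(m.group("level")), m.group("name"), m.group("pic")))
    return fields


def to_xsd(fields):
    """Emit a flat sequence: the first entry is the root, PIC fields become strings."""
    root = fields[0][1]
    lines = [
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">',
        f'  <xs:element name="{root}">',
        "    <xs:complexType>",
        "      <xs:sequence>",
    ]
    for _level, name, pic in fields[1:]:
        if pic is not None:  # skip nested group items in this flat sketch
            lines.append(f'        <xs:element name="{name}" type="xs:string"/>')
    lines += [
        "      </xs:sequence>",
        "    </xs:complexType>",
        "  </xs:element>",
        "</xs:schema>",
    ]
    return "\n".join(lines)


# Made-up sample copybook, for illustration only.
SAMPLE = """\
       01  CUSTOMER-RECORD.
           05  CUSTOMER-ID      PIC 9(6).
           05  CUSTOMER-NAME    PIC X(30).
           05  BALANCE          PIC S9(7)V99.
"""

if __name__ == "__main__":
    print(to_xsd(parse_copybook(SAMPLE)))
```

A real converter would also need OCCURS, REDEFINES, USAGE and nested groups, which this sketch ignores.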

Rubens Farias
thanks for the regex hint
lemotdit
+4  A: 

Building a full-blown parser for COBOL copybooks poses a few challenges:

Copybooks are incorporated into COBOL programs during the text manipulation phase of compilation. The copybook source by itself may be incomplete. The only way to obtain a complete source for parsing is to pre-process it as if it had been brought into a COBOL source program. Normally copybooks are brought into a COBOL program via the COPY directive. Bringing this up may seem a bit pointless, but consider the following:

1) The COPY directive comes with a REPLACING option. On the surface this may seem simple enough to deal with, but once you get into the details it becomes very "interesting". See: COPY DIRECTIVE

2) The REPLACE directive. This directive may also manipulate source text after the COPY directive has done its bit. See: REPLACE DIRECTIVE

3) Nested copybooks. This one may not be as nasty as the previous two but keep nesting in mind too.

4) The syntax of COBOL Picture strings is nothing to laugh at either. Have a look at: Picture String Symbols

5) Your parser will need to deal with COBOL continuation rules as well. See: Continuation Lines, and continuation of PSEUDO TEXT in particular.
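To give a feel for point 4, here is a rough Python sketch of what even the simplest Picture strings require. The function names are made up for illustration. It assumes USAGE DISPLAY with no SIGN IS SEPARATE clause, so `S` and `V` occupy no storage, and it deliberately rejects everything except `X`, `A`, `9`, `S` and `V` (no editing symbols, no `P`, no currency sign).

```python
import re

# A symbol followed by a repetition count in parentheses, e.g. X(30) or 9(5).
_REPEAT = re.compile(r"(.)\((\d+)\)")


def expand_picture(pic):
    """Expand repetition counts: 'X(3)' -> 'XXX', 'S9(2)V9' -> 'S99V9'."""
    return _REPEAT.sub(lambda m: m.group(1) * int(m.group(2)), pic.upper())


def display_length(pic):
    """Count character positions for a simple picture under the stated assumptions."""
    length = 0
    for ch in expand_picture(pic):
        if ch in "XA9":
            length += 1
        elif ch in "SV":
            continue  # sign and implied decimal point take no position here
        else:
            raise ValueError(f"unsupported picture symbol {ch!r} in {pic!r}")
    return length
```

For example, `display_length("S9(7)V99")` counts nine digit positions, while an edited picture such as `ZZ9` raises `ValueError` because this sketch does not attempt to cover editing symbols.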

I don't want to discourage you, but parsing COBOL is not a trivial task.

On the bright side, if your copybooks have a drop-dead-simple structure to them, as many do, it may be possible to get this done using a cascade of regular expressions. This approach is fairly common among those who need to parse COBOL programs (and copybooks) on software renovation projects. Maybe have a look at: RegReg
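A cascade of this kind might look roughly like the following Python sketch. The function names and sample are hypothetical, and the code assumes fixed-format source (sequence area in columns 1-6, indicator in column 7, text in columns 8-72) with no continuation lines and no COPY or REPLACE directives. Each pass does one simple job and hands its output to the next.

```python
import re


def strip_fixed_format(source):
    """Pass 1: drop comment lines and keep only columns 8-72."""
    kept = []
    for line in source.splitlines():
        if line[6:7] in ("*", "/"):
            continue  # indicator column marks a comment line
        kept.append(line[7:72])
    return "\n".join(kept)


def split_entries(text):
    """Pass 2: split at a period followed by whitespace or end of text.

    A period inside a picture such as 9(3).99 is followed by a digit,
    so it does not end the entry.
    """
    entries = re.split(r"\.(?=\s|$)", text)
    return [" ".join(e.split()) for e in entries if e.strip()]


# Hypothetical pattern for pass 3: level, data name, then whatever clauses remain.
ENTRY_RE = re.compile(
    r"^(?P<level>\d{1,2})\s+(?P<name>[A-Z0-9-]+)(?:\s+(?P<clauses>.*))?$",
    re.IGNORECASE,
)


def parse_entries(entries):
    """Pass 3: pull level, name and remaining clause text out of each entry."""
    result = []
    for entry in entries:
        m = ENTRY_RE.match(entry)
        if m:
            result.append(
                (int(m.group("level")), m.group("name"), m.group("clauses") or "")
            )
    return result


# Made-up sample with sequence numbers, a comment line and a two-line entry.
SAMPLE = (
    "000100* Customer master record\n"
    "000200 01  CUSTOMER-RECORD.\n"
    "000300     05  CUSTOMER-ID\n"
    "000400             PIC 9(6).\n"
    "000500     05  CUSTOMER-NAME    PIC X(30).\n"
)

if __name__ == "__main__":
    for entry in parse_entries(split_entries(strip_fixed_format(SAMPLE))):
        print(entry)
```

Splitting the work into passes keeps each regular expression small, and an entry that spans several lines is handled by pass 2 rather than by a more complicated pattern.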

Cheers...

NealB
What you need is a full COBOL parser front end to do this right. See http://www.semanticdesigns.com/Products/FrontEnds/COBOLFrontEnd.html
Ira Baxter