I am converting XML child elements into attributes of their parent element, using a crude regex I wrote in TextMate. I know that dot (.) doesn't match newlines, so this is how I worked around it.

Search

language="(.*)"
(.*)<education>(.*)(\n)?(.*)?(\n)?(.*)?(\n)?(.*)?</education>
(.*)<years>(.*)</years>
(.*)<grade>(.*)</grade>

Replace

grade="$13" language="$1" years="$11">
        <education>$3$4$5$6$7$8$9</education>

I know there's a better way to do this. Please help me build my regex skills further.

+2  A: 

Use an XML parser; don't use regex to parse XML.
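
As a rough illustration of the parser approach, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The question doesn't show the full document, so the <skills> and <skill> tag names and the sample values are assumptions made up for the example, and children_to_attributes is a hypothetical helper, not part of any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the <skills>/<skill> tag names and the values are
# assumptions; only language, education, years and grade come from the question.
SAMPLE = """<skills>
    <skill language="Python">
        <education>Self-taught,
then two courses</education>
        <years>5</years>
        <grade>A</grade>
    </skill>
</skills>"""


def children_to_attributes(parent, names=("years", "grade")):
    """Move the text of the named child elements into attributes on parent."""
    for name in names:
        child = parent.find(name)
        if child is not None:
            parent.set(name, (child.text or "").strip())
            parent.remove(child)


root = ET.fromstring(SAMPLE)
# Materialise the list first so the tree is not modified while iterating over it.
for skill in list(root.iter("skill")):
    children_to_attributes(skill)

print(ET.tostring(root, encoding="unicode"))
```

The <education> child is left in place, as in the original replacement, and its text can span any number of lines without special handling.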

compie
A: 

If there are no other tags inside the <education> element, I would change that part to:

<education>([^<>]*)</education>

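A negated character class such as [^<>] also matches newlines, so the chain of optional (\n)?(.*)? groups is no longer needed. If possible, I would use the same technique everywhere else you're using .*. In the case of the language attribute, it would take this form: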

language="([^"]*)"
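
To show how the pieces might fit together, here is a sketch of the whole search and replace rewritten with negated character classes, using Python's re module rather than TextMate. It assumes the closing > follows the language attribute directly and that the elements appear in the order education, years, grade; the <skill> tag name and the sample values are made up for the example.

```python
import re

# Hypothetical sample; the <skill> tag name and the values are assumptions.
SAMPLE = '''<skill language="Python">
    <education>Self-taught,
then two courses</education>
    <years>5</years>
    <grade>A</grade>
</skill>'''

PATTERN = re.compile(
    r'language="([^"]*)">'                    # 1: attribute value, stops at the closing quote
    r'(\s*)<education>([^<>]*)</education>'   # 2: indentation, 3: text (may span lines)
    r'\s*<years>([^<>]*)</years>'             # 4: years text
    r'\s*<grade>([^<>]*)</grade>'             # 5: grade text
)

# Same attribute order as the original replacement: grade, language, years.
REPLACEMENT = r'grade="\5" language="\1" years="\4">\2<education>\3</education>'

result = PATTERN.sub(REPLACEMENT, SAMPLE)
print(result)
```

Because [^<>]* and \s* both match newlines, the pattern needs five groups instead of thirteen and does not care how many lines the <education> text spans.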
Alan Moore