I have an xml file of the form:

<property name="foo" value="this is a long value">stuff</property>

There are many properties, but I want to match the one with the name foo and then replace its value attribute with something else, like so:

<property name="foo" value="yet another long value">stuff</property>

I was thinking of writing a regular expression to match everything after "foo" up to the end of the tag ( ">" ) and replace that, but I can't seem to get the syntax right.

I'm trying to do this using sed, if that's any help.

+4  A: 

You probably don't want to use a regex for manipulating an xml file. Please instead consider xslt, which is aware of xml rules and won't cause your transformed document to become malformed.

TokenMacGuy
While I agree with this in general, I'm not using this code to process transactions. Rather I just want to modify a build configuration file that will let me change the contents of an about box. I'd like to do it all as part of a bash script and writing an XSL is overkill.
Daniel
@Daniel: If you specify your needs some more, you may find that an XSL transformation to change one attribute is a lot less difficult than you might think.
Tomalak
@Tomalak: It's not about the difficulty; I'm coding against a static build script target which is seldom changing. To use XSL I'd have to write a stylesheet and some other code to actually do the transformation (I don't think bash has tools for working with xml). I feel it's all a bit much just to change the build number in an about box.
Daniel
+1  A: 

/property name=\"foo\" value=\"([^\"]*)\"/

Then just replace the first submatch with whatever new value you want.
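
Since the question mentions sed, here is a minimal sketch of how the same pattern might be applied there. It is only an illustration, not a general XML-safe solution: the function name replace_foo_value is made up for this example, it assumes the name and value attributes appear on one line in exactly this order, and it assumes the new value contains no "/", "&" or "\" characters (which sed would treat specially). Because sed substitutes the whole match rather than a single submatch, the sketch captures the part before the value and puts it back with \1.

```shell
# Hypothetical helper (name is illustrative): reads XML lines on stdin and
# replaces the value attribute of the property named "foo" with $1.
replace_foo_value() {
  # \( ... \) captures everything up to the opening quote of value,
  # [^"]* matches the old value, and \1 puts the captured prefix back.
  sed "s/\(<property name=\"foo\" value=\"\)[^\"]*\"/\1$1\"/"
}

# Sample line from the question; a real run would read the build file instead.
input='<property name="foo" value="this is a long value">stuff</property>'
result=$(printf '%s\n' "$input" | replace_foo_value 'yet another long value')
printf '%s\n' "$result"
```

To run it against a file, the function can read from it with a redirect (replace_foo_value 'new value' < file.xml); GNU sed also offers the -i option for editing in place.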

Tommi Forsström
Hi Tommi, thanks for the helpful post. If I read your regex correctly, only the first word of the value attribute is matched, yes? What if it contains many words? (I've revised the original post to highlight what I mean.)
Daniel
From what I can read of this regex, group #1 will contain all the words inside the double-quoted value attribute.
VonC
Yep, it contains everything between the double quotes. Basically, the part \"([^\"]*)\" translates to "something that starts with a double quote, has any number of non-double-quote characters in it, and ends in a double quote", saving the stuff in between the double quotes in a submatch. A really, really cool tool for testing out regexps is RegEx Coach: http://www.weitz.de/regex-coach/ Three thumbs up for that one!
Tommi Forsström
That's fantastic. Thank you! I misread the syntax earlier and was having some trouble setting up the search/replace, but everything is sorted now. RegEx Coach also looks like a great little tool; I think they have an older version for Linux/Mac too. Yay!
Daniel