I want to capture the text of an attribute within an XML tag. For example:

<tag1 name="tag^*&,+">

I want to capture the value within the name attribute (which in this case would be tag^*&,+). This regular expression

name=\"([a-z0-9]+)\"  

will only return the value if the text in the attribute is alphanumeric. Is there any syntax that will capture the value regardless of which symbols and characters it contains? Thanks!

+1  A: 

You should use:

name=\"([^\"]+)\"

In other words, the capturing group matches one or more of "any character other than the closing quotation mark".

wsorenson
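For illustration, here is a minimal sketch of the negated character class in action. It is written in Python rather than the asker's Excel VBA, and the names NAME_ATTR, line and value are made up for the example; the pattern itself is the one from the answer above.

```python
import re

# Negated character class: capture one or more characters that are not a
# double quote, i.e. everything up to the closing quote of the attribute.
NAME_ATTR = re.compile(r'name="([^"]+)"')

line = '<tag1 name="tag^*&,+">'  # sample tag from the question

match = NAME_ATTR.search(line)
value = match.group(1) if match else None
print(value)
```

Because the class excludes only the quote itself, symbols such as ^, *, & and + are captured without any special handling.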
A: 

Check out regular-expressions.info

This will do what you want:

([^"]+)
jasonbar
Thanks, works perfectly
Axsuul
And of course the obligatory "use an XML parser, regular expressions aren't suitable blah blah.."
jasonbar
Your lust for rep has aided and abetted Axsuul's descent into regex hell!
rjh
Haha, thanks. Working with Excel VBA, know of any good ones?
Axsuul
@rjh: hahah..although in this case he appears to have a fairly regular subset he is looking to handle...maybe just purgatory..?
jasonbar
A: 

. will match any character.

name=\"(.+)\"
Jesse
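One caveat worth checking before relying on a greedy .+ here: it runs to the last quote on the line, not the first. The sketch below (Python, for illustration only) assumes a hypothetical second attribute, id="42", which is not in the original question, and compares the greedy pattern with the negated class and a lazy .+? variant.

```python
import re

# Hypothetical input: the id attribute is an assumption added to show the effect.
line = '<tag1 name="tag^*&,+" id="42">'

# Greedy: .+ consumes as much as possible, then backs up to the LAST quote.
greedy = re.search(r'name="(.+)"', line).group(1)

# Negated class: stops at the first closing quote.
bounded = re.search(r'name="([^"]+)"', line).group(1)

# Lazy quantifier: consumes as little as possible, so it also stops early.
lazy = re.search(r'name="(.+?)"', line).group(1)

print(greedy)
print(bounded)
print(lazy)
```

With a single attribute per tag all three patterns return the same value, so the difference only shows up once a second quoted attribute follows on the same line.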
+1  A: 

It seems that you're better off using an XML parser. I don't know what language you're using, but there's an XML parser for every language out there.

JeremySpouken
Excel VBA. Thanks for the suggestion, will look into it in the future. Do you know of any good ones for VBA?
Axsuul
Yeah, you can look into the MSXML parser. http://stackoverflow.com/questions/11305/how-to-parse-xml-in-vba
JeremySpouken
+5  A: 

At the risk of beating a dead horse, don't try to "parse" XML with regular expressions. Use your programming language's XML library. It is then dead simple to select all tag1 elements and get the contents of their name attributes.

Not only is it easier for you to code, but you won't have to deal with nasty things like strings spanning multiple lines, string escapes (e.g. &quot;), weird edge cases that cause your regex to fail, etc.

rjh
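As a rough sketch of the parser route, here is what selecting every tag1 element and reading its name attribute might look like with Python's standard xml.etree.ElementTree module. The asker is on Excel VBA, so this is only an illustration of the idea; the root wrapper element and the sample values are assumptions made for the example. Note that a strict parser requires the ampersand in the question's sample value to be written as &amp;amp; in the source.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the <root> wrapper and the second element are made up.
# The second tag spans two lines and uses &quot; escapes inside the attribute.
xml_text = '''<root>
  <tag1 name="tag^*&amp;,+"/>
  <tag1
      name="say &quot;hi&quot;"/>
</root>'''

root = ET.fromstring(xml_text)

# Collect the name attribute of every tag1 element; entities are decoded for us.
names = [el.get("name") for el in root.iter("tag1")]
print(names)
```

The parser takes care of the line break inside the tag and decodes the entities, which are two of the cases a single-line regex would need extra work to handle.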
+1 - with the caveat that there may be times when you don't want/need the overhead of an XML parser.
wsorenson
Reluctantly agreed... if you have a huge document and you're very confident about the form the XML will take, regexes can be a useful and seductive tool. But I've been burned by their fiery kiss too many times.
rjh