Hi,

I have a string like this:

This <span class="highlight">is</span> a very "nice" day!

What should my regex pattern in VB look like to find the quotes within the tag? I want to replace them with something, like this:

This <span class=^highlight^>is</span> a very "nice" day!

Something like <(")[^>]+> doesn't work :(

Thanks

A: 

Try this: <span class="([^"]+?)?">

Dario
A: 

This should get you the first attribute value in a tag:

<[^">]+"(?<value>[^"]*)"[^>]*>
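A minimal sketch of reading that named group back, written in Python only so it can be run as-is (the question itself is about VB). Python spells named groups `(?P<value>...)` where .NET uses `(?<value>...)`; the rest of the pattern is unchanged. The helper name `first_attribute_value` is made up for this example.

```python
import re

# Same pattern as above, with the named group in Python's (?P<name>...) syntax.
FIRST_VALUE = re.compile(r'<[^">]+"(?P<value>[^"]*)"[^>]*>')

def first_attribute_value(html):
    """Return the first double-quoted attribute value of the first matching tag, or None."""
    match = FIRST_VALUE.search(html)
    return match.group("value") if match else None

sample = 'This <span class="highlight">is</span> a very "nice" day!'
print(first_attribute_value(sample))  # highlight
```

Note that this only reads the value; it does not replace the quotes around it.
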
matthewthurlow
A: 

If your intention is to replace ALL quotation marks within tags, you could use the following regular expression:

(<[^>"]*)(")([^>]*>)

That will isolate the substrings before and after your quotation mark. Note that this does not attempt to match opening and closing quotation marks. It simply matches a quotation mark within a tag.
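
Because the pattern matches only the first remaining quotation mark in each tag, a single replace pass is not enough. One way to handle that, sketched here in Python rather than VB so it can be run directly, is to repeat the substitution until nothing changes. The function name `replace_quotes_in_tags` and the `^` default are assumptions for illustration.

```python
import re

# Group 1: tag text before the quote, group 2: the quote, group 3: the rest of the tag.
TAG_QUOTE = re.compile(r'(<[^>"]*)(")([^>]*>)')

def replace_quotes_in_tags(html, replacement="^"):
    """Replace every double quote inside a tag by repeating the substitution."""
    if '"' in replacement:
        # A replacement containing a quote would be matched again on every pass.
        raise ValueError("replacement must not contain a double quote")
    while True:
        # Each pass rewrites the first remaining quote in every tag.
        html, count = TAG_QUOTE.subn(
            lambda m: m.group(1) + replacement + m.group(3), html
        )
        if count == 0:
            return html

sample = 'This <span class="highlight">is</span> a very "nice" day!'
print(replace_quotes_in_tags(sample))
# This <span class=^highlight^>is</span> a very "nice" day!
```
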

Krsna
Yes, my intention is to replace all quotation marks within tags. Do I have to loop through all submatches then?
So, are you able to use Regex.Replace? http://msdn.microsoft.com/en-us/library/xwewhkd1.aspx
Krsna
Yes, I'd use the replace function. But I don't know how to use it with the pattern. It doesn't find the quotes within a tag.
+3  A: 

It depends on your regex flavor, but this works for most of them:

"(?=[^<]*>)

EDIT: For anyone curious how this works. This translates into English as "Find a quote that is followed by a > before the next <".
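
A quick way to try the lookahead, sketched in Python so it runs without a .NET project. The function name `replace_quotes_in_tags` is made up for the example. As far as I know the same pattern can be passed to .NET's `Regex.Replace(input, pattern, replacement)`, with the quote doubled inside a VB string literal.

```python
import re

# A quote that is followed by a '>' before the next '<'.
QUOTE_IN_TAG = re.compile(r'"(?=[^<]*>)')

def replace_quotes_in_tags(html, replacement="^"):
    """Replace each double quote that the lookahead places inside a tag."""
    # A function is used so the replacement text is taken literally.
    return QUOTE_IN_TAG.sub(lambda m: replacement, html)

sample = 'This <span class="highlight">is</span> a very "nice" day!'
print(replace_quotes_in_tags(sample))
# This <span class=^highlight^>is</span> a very "nice" day!
```

The lookahead consumes nothing, so every quote in a tag is found in a single pass and no loop is needed.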

Nick Whaley
Thanks, this works great with VB.
Note that the plain `>` character is allowed in attribute values.
Gumbo
@Gumbo Interesting note but the '>' character will not be a problem if it appears in an attribute. The '<' character however will be.
Nick Whaley
@Nick The pattern has problems if the string looks like this: This "is" > great! How can we improve it?
@Moo, the '>' character is not valid between tags. It needs to be escaped as '&gt;'. But if you need to be that picky, you need a real HTML parser.
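
The edge cases discussed in these comments can be reproduced with a small Python sketch (Python is used only to make it runnable; `swap` is a hypothetical helper name):

```python
import re

QUOTE_IN_TAG = re.compile(r'"(?=[^<]*>)')

def swap(text):
    """Apply the lookahead pattern, replacing matched quotes with '^'."""
    return QUOTE_IN_TAG.sub("^", text)

# A bare '>' in the text makes the quotes before it look like tag content.
print(swap('This "is" > great!'))      # This ^is^ > great!

# With the '>' escaped, the text is left alone.
print(swap('This "is" &gt; great!'))   # This "is" &gt; great!

# A '<' inside an attribute value hides the opening quote from the lookahead.
print(swap('<a title="a < b">'))       # <a title="a < b^>
```
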
Nick Whaley
+2  A: 

Regexes are fundamentally bad at parsing HTML (see "Can you provide some examples of why it is hard to parse XML and HTML with a regex?" for why). What you need is an HTML parser. See "Can you provide an example of parsing HTML with your favorite parser?" for examples using a variety of parsers.

If you are using VB.NET, you should be able to use the HTML Agility Pack.
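
To give a feel for the parser route without guessing at the HTML Agility Pack API, here is a rough sketch that uses Python's standard-library `html.parser` as a stand-in. The class `QuoteSwapper` and the function `swap_tag_quotes` are made-up names, and comments, doctypes and processing instructions are not handled, so treat it as an illustration rather than a finished tool.

```python
from html.parser import HTMLParser

class QuoteSwapper(HTMLParser):
    """Rebuild the input, replacing double quotes only inside start tags."""

    def __init__(self, replacement="^"):
        # Keep entities such as &gt; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.replacement = replacement
        self.parts = []

    def _emit_start_tag(self):
        # get_starttag_text() returns the raw text of the tag just parsed.
        self.parts.append(self.get_starttag_text().replace('"', self.replacement))

    def handle_starttag(self, tag, attrs):
        self._emit_start_tag()

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags like <br/>: emit once, with no separate end tag.
        self._emit_start_tag()

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)  # text between tags is copied unchanged

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")

def swap_tag_quotes(html, replacement="^"):
    parser = QuoteSwapper(replacement)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)

sample = 'This <span class="highlight">is</span> a very "nice" day!'
print(swap_tag_quotes(sample))
# This <span class=^highlight^>is</span> a very "nice" day!
```

Unlike the regex approaches, the parser decides where a tag starts and ends, so a stray `>` in the text, as in This "is" > great!, does not cause quotes outside tags to be replaced.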

Chas. Owens