I have this regular expression that extracts meta tags from HTML documents, but it gives me errors when I incorporate it into my web application.

The expression is:

@"<meta[\\s]+[^>]*?name[\\s]?=[\\s\"\']+(.*?)[\\s\"\']+content[\\s]?=[\\s\"\']+(.*?)[\"\']+.*?>" ;

Is there anything wrong with it?

+7  A: 

You're using both the @ (verbatim string) syntax and escaped backslashes in the sample you posted. You need to either remove the @, or remove the extra backslashes and escape your double quotes by doubling them up. Then it should work.
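To make the two options concrete, here is a minimal sketch that writes the pattern both ways and checks that they produce the same string. The class and constant names are made up for illustration and are not from the original post.

```csharp
using System;

static class MetaPatternForms
{
    // Option 1: regular string literal (no @). Backslashes are doubled and
    // the double quote is escaped with a backslash.
    public const string Regular =
        "<meta[\\s]+[^>]*?name[\\s]?=[\\s\"']+(.*?)[\\s\"']+content[\\s]?=[\\s\"']+(.*?)[\"']+.*?>";

    // Option 2: verbatim string literal (@). Backslashes are written once and
    // the double quote is escaped by doubling it.
    public const string Verbatim =
        @"<meta[\s]+[^>]*?name[\s]?=[\s""']+(.*?)[\s""']+content[\s]?=[\s""']+(.*?)[""']+.*?>";

    static void Main()
    {
        // Both literals compile to the same sequence of characters.
        Console.WriteLine(Regular == Verbatim); // prints "True"
    }
}
```

Mixing the two styles, as in the question, fails because inside a verbatim string \" is a literal backslash followed by the end of the string.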

(For what it's worth, if you're going to be working with regular expressions on an ongoing basis, I would suggest investing in a copy of RegexBuddy.)

Jeromy Irvine
They're called "verbatim strings". And one of the benefits of RegexBuddy is that, after helping you create the right regex, it can export the regex in whatever format you need, including C# verbatim strings.
Alan Moore
+3  A: 

When using a verbatim string literal (@"..."), you don't need to double the backslashes. Everything in the string is taken as is, except for double quotes, which need to be doubled:

@"<meta[\s]+[^>]*?name[\s]?=[\s""']+(.*?)[\s""']+content[\s]?=[\s""']+(.*?)[""']+.*?>"
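As a quick check, the corrected pattern can be run against a sample tag. This is only a sketch: the sample HTML, the class name, and the Extract helper are assumptions for illustration, and RegexOptions.IgnoreCase is my addition rather than something from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class MetaTagDemo
{
    // The corrected verbatim-string pattern from above.
    public const string Pattern =
        @"<meta[\s]+[^>]*?name[\s]?=[\s""']+(.*?)[\s""']+content[\s]?=[\s""']+(.*?)[""']+.*?>";

    // Hypothetical helper: returns "name = content" for every match.
    public static List<string> Extract(string html)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(html, Pattern, RegexOptions.IgnoreCase))
        {
            // Group 1 is the name attribute, group 2 is the content attribute.
            results.Add(m.Groups[1].Value + " = " + m.Groups[2].Value);
        }
        return results;
    }

    static void Main()
    {
        // Made-up sample input.
        string html = "<meta name=\"description\" content=\"A sample page\">";

        foreach (string line in Extract(html))
        {
            Console.WriteLine(line); // prints "description = A sample page"
        }
    }
}
```

Note that the pattern expects name to come before content inside the tag, so tags written in the other order will not match.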

James Curran
A: 

Jeromy is right. You're mixing escaped-string syntax with a verbatim string literal. The regex itself is fine, so I guess that's where the problem is.

Jeroen Landheer