I'm trying to write a regex that extracts a single value from a style attribute. The example below should explain what I'm after.

source: font-size:11pt;font-color:red;text-align:left;

I want to ask for a property name and get its value back:

  1. font-size returns 11pt
  2. font-color returns red
  3. text-align returns left

Can someone point me in the right direction?

Thanks

Lee

+3  A: 

This question reminded me of a Jeff Atwood blog post, Parsing Html The Cthulhu Way. It isn't exactly the same question, but it's the same sentiment. Don't parse CSS with regular expressions! There are plenty of libraries out there that will do this for you.

David Pfeffer
A: 

Logically you'd want:

[exact phrase] + 1 colon + 0 or more white space characters + 0 or more characters up to the first semicolon or closing quote.

I think this will get you headed in the right direction:

font-size[:][\s]*[^;'"]*

Gotchas:

  • the closing quote might be single or double, and there may be a valid quote inside the value (e.g., a quoted background image URL)

  • this is all dependent on the styles not being written in shorthand
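
As a rough sketch of how the pattern above could be wrapped for a single-property lookup in C#: the value part is put in a capture group so it can be read back, and a lookbehind is added so that a shorter name such as "size" does not match inside "font-size". The class and method names here (StyleLookup, TryGetStyleValue) are made up for illustration, and the gotchas listed above still apply.

```csharp
using System;
using System.Text.RegularExpressions;

static class StyleLookup
{
    // Hypothetical helper: looks up one property in an inline style string.
    // Returns false (and an empty value) when the property is not present.
    public static bool TryGetStyleValue(string style, string property, out string value)
    {
        // (?<![\w-]) stops "size" from matching inside "font-size".
        // The value is everything up to the first semicolon or quote.
        var pattern = @"(?<![\w-])" + Regex.Escape(property) + @"\s*:\s*([^;'""]*)";
        var match = Regex.Match(style, pattern, RegexOptions.IgnoreCase);
        value = match.Success ? match.Groups[1].Value.Trim() : string.Empty;
        return match.Success;
    }
}
```

Calling TryGetStyleValue with the source string from the question and "font-size" should hand back 11pt in the out parameter.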

DA
A: 
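using System;
using System.Text.RegularExpressions;

// Group 1 is the property name, group 2 is everything up to the next semicolon.
var regex = new Regex(@"([\w-]+)\s*:\s*([^;]+)");
var match = regex.Match("font-size:11pt;font-color:red;text-align:left;");
while (match.Success)
{
    var key = match.Groups[1].Value;
    var value = match.Groups[2].Value.Trim(); // drop whitespace around the value
    Console.WriteLine("{0} : {1}", key, value);
    match = match.NextMatch(); // advance to the next declaration
}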

Edit: This is not supposed to be a 'complete' solution. It probably does the job in about 80% of cases, and as ever the last 20% would be orders of magnitude more expensive ;-)
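
To get from the loop above to "give me font-size and return 11pt", one option is to collect the matches into a dictionary and look values up by name. This is only a sketch using the same pattern, so it has the same limits; StyleParser and Parse are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class StyleParser
{
    // Hypothetical helper: turns "a:b;c:d;" into a name -> value dictionary.
    public static Dictionary<string, string> Parse(string style)
    {
        // Property names are matched case-insensitively.
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (Match m in Regex.Matches(style, @"([\w-]+)\s*:\s*([^;]+)"))
        {
            // If a property appears twice, the later declaration wins.
            result[m.Groups[1].Value] = m.Groups[2].Value.Trim();
        }
        return result;
    }
}
```

With the source string from the question, Parse(...)["font-size"] should give 11pt.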

Maxwell Troy Milton King
Of course, it can't match a property like `content:";)";`, where the value contains a semicolon inside quotes.
KennyTM
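
One way to cover the quoted-semicolon case is to let the value be a run of quoted strings and ordinary characters, so a semicolon inside quotes no longer ends the match. The following is a sketch under the assumption that quotes are balanced and that there are no backslash-escaped quotes inside a string; QuoteAwareStyleParser is a hypothetical name, and a real CSS parser remains the safer choice.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class QuoteAwareStyleParser
{
    // A value is one or more of: a "double-quoted" string, a 'single-quoted'
    // string, or any single character that is not a semicolon or a quote.
    private static readonly Regex Declaration =
        new Regex(@"([\w-]+)\s*:\s*((?:""[^""]*""|'[^']*'|[^;""'])+)");

    // Hypothetical helper: parses an inline style into name -> value pairs.
    public static Dictionary<string, string> Parse(string style)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (Match m in Declaration.Matches(style))
        {
            result[m.Groups[1].Value] = m.Groups[2].Value.Trim();
        }
        return result;
    }
}
```

For the input `content:";)";color:red;` this should keep `";)"` together as the value of content and still pick up color as red.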