views:

290

answers:

2

Hi, I'm trying to parse a string of HTML tag attributes in PHP. There are three possible cases:

attribute="value"  // the quotes may contain anything, including escaped quotes
attribute          // no value
attribute=value    // no quotes, so the value contains only alphanumeric characters

Can someone help me find a regex that captures the attribute name in the first group and the attribute value (if present) in the second?

+7  A: 

Never ever use regular expressions for processing HTML, especially if you're writing a library and don't know what your input will look like. Take a look at SimpleXML, for example.
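As a rough sketch of the parser route: SimpleXML expects well-formed XML, so unquoted or valueless attributes would make it fail. The example below swaps in DOMDocument::loadHTML instead, which is more forgiving. The function name parse_attributes_dom and the dummy div wrapper are assumptions made for illustration, not part of any library.

```php
<?php
// Hypothetical helper: wrap the attribute string in a dummy <div> and let
// the HTML parser do the work. Not a definitive implementation.
function parse_attributes_dom(string $attrs): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<div ' . $attrs . '></div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $result = [];
    $div = $doc->getElementsByTagName('div')->item(0);
    foreach ($div->attributes as $attr) {
        // $attr is a DOMAttr; name and value are its public properties.
        $result[$attr->name] = $attr->value;
    }
    return $result;
}

$parsed = parse_attributes_dom('class="a b" data-flag width=100');
```

Keep in mind that HTML has no backslash escapes, so a value such as "say \"hi\"" ends at the first inner quote as far as the parser is concerned.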

soulmerge
+2  A: 

Hello mck89,

Give this a try and see if it is what you want to extract from the tags.

preg_match_all('/( \\w{1,}="\\w{1,}"| \\w{1,}=\\w{1,}| \\w{1,})/i', 
    $content, 
    $result, 
    PREG_PATTERN_ORDER);
$result = $result[0];

The regex matches each attribute, skips the tag name, and puts the matches in an array, so you can loop over the attributes one by one.
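The pattern above returns each attribute as a single string. If you want the name and the value in separate groups, a variant along these lines might work. This is only a sketch: the function name parse_attributes_regex is made up for the example, it assumes backslash-escaped quotes inside quoted values as described in the question, and it uses a PCRE branch reset group (?|...) so that quoted and unquoted values both land in group 2.

```php
<?php
// Hypothetical helper, not a definitive implementation.
function parse_attributes_regex(string $attrs): array
{
    // Group 1: attribute name.
    // Group 2: value, either quoted (backslash escapes allowed) or bare \w+.
    // The whole "=value" part is optional, so valueless attributes match too.
    $pattern = '/([\w-]+)(?:\s*=\s*(?|"((?:[^"\\\\]|\\\\.)*)"|(\w+)))?/';

    preg_match_all($pattern, $attrs, $matches, PREG_SET_ORDER);

    $result = [];
    foreach ($matches as $m) {
        // Group 2 is absent when the attribute has no value.
        $result[$m[1]] = $m[2] ?? null;
    }
    return $result;
}

$parsed = parse_attributes_regex('id="main" disabled width=100 title="say \"hi\""');
```

Valueless attributes come back as null here, and escaped quotes are kept as written (backslashes included), so you may want to run stripslashes() on the values afterwards.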

JasonBartholme
I found a faster and more precise solution, but I tried your regex and it seems to work, so it's a good starting point and I'm accepting your answer. Thank you!
mck89