Hi all!

Still on regex! ;-)))

Assume we have an HTML file with many <tr> rows that all share the structure below, where each (.*?) marks content I need to extract:

<tr align=center><th width=5%><a OnClick="(.*?)"href=#>(.*?)</a><td width=5%>(.*?)<td width=5% align=center >(.*?)</td></tr>

UPDATED

Maybe with a nice preg_match_all()?

I need a result something like this:

match[0] . match[1] . match[2] . match[3]

Just in case someone needs something similar!

THE SOLUTION to my little problem is:

/<a\s*OnClick=\"(.*?)\"href=#>(.*?)<\/a><td[^>]+>(.*?)<td[^>]+>(.*?)<\/td><\/tr>/m
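A minimal sketch of how this pattern could be applied with preg_match_all(). The two sample rows and the show(...) handlers are made up for illustration and only follow the row structure shown above. PREG_SET_ORDER is used here so that each entry holds one row's captures together.

```php
<?php
// Hypothetical sample rows that follow the structure from the question.
$html = <<<'HTML'
<tr align=center><th width=5%><a OnClick="show(1)"href=#>Alpha</a><td width=5%>10<td width=5% align=center >20</td></tr>
<tr align=center><th width=5%><a OnClick="show(2)"href=#>Beta</a><td width=5%>30<td width=5% align=center >40</td></tr>
HTML;

// The pattern from above, unchanged, inside a single-quoted PHP string.
$pattern = '/<a\s*OnClick=\"(.*?)\"href=#>(.*?)<\/a><td[^>]+>(.*?)<td[^>]+>(.*?)<\/td><\/tr>/m';

// PREG_SET_ORDER groups the results per match (one entry per row).
preg_match_all($pattern, $html, $rows, PREG_SET_ORDER);

$lines = [];
foreach ($rows as $row) {
    // $row[0] is the full match; $row[1] to $row[4] are the four captures.
    $lines[] = implode(' . ', array_slice($row, 1));
}

echo implode("\n", $lines), "\n";
```

With the default PREG_PATTERN_ORDER the same data comes back grouped by capture instead of by row, so which flag is nicer depends on how the values are consumed afterwards.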

thanks for the time!

Luca Filosofi!

A: 

Wildly guessing here without actual sample data to match the regex against - also quite unhappy with having to use a regex here. Unless your tables always look exactly alike, I doubt you'll have much fun with regexes.

Anyway, all the caveats aside, this might work:

<tr[^>]+><th[^>]+><a OnClick="([^"]+)"\s*href="([^"]+)">([^<]+)</a><td[^>]+>([^<]+)<td[^>]+>([^<]+)</td></tr>

It expects the tags (and the attributes within the <a> tag) exactly in this order, no angle brackets within quoted strings, no escaped quotes within quoted strings etc. etc. (all those things that you wouldn't have to worry about if you used a parser).

In PHP:

preg_match_all('%<tr[^>]+><th[^>]+><a OnClick="([^"]+)"\s*href="([^"]+)">([^<]+)</a><td[^>]+>([^<]+)<td[^>]+>([^<]+)</td></tr>%', $subject, $result, PREG_PATTERN_ORDER);

$result is then an array of arrays: with PREG_PATTERN_ORDER, $result[0] holds every full match, $result[1] holds every capture of group no. 1, and so on. So $result[1][0] is group 1 of the first matched row.
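For comparison, the parser route mentioned in the caveats could look roughly like the sketch below. It assumes PHP's DOM extension is available, that the rows sit inside a <table> (the wrapper is added to the made-up sample data), and that libxml's lenient HTML parser closes the unterminated <th> and <td> cells on its own. Treat it as an illustration, not a tested replacement for the regex.

```php
<?php
// Hypothetical sample input; the <table> wrapper and the values are assumptions.
$html = <<<'HTML'
<table>
<tr align=center><th width=5%><a OnClick="show(1)"href=#>Alpha</a><td width=5%>10<td width=5% align=center >20</td></tr>
<tr align=center><th width=5%><a OnClick="show(2)"href=#>Beta</a><td width=5%>30<td width=5% align=center >40</td></tr>
</table>
HTML;

libxml_use_internal_errors(true); // collect warnings about the sloppy markup instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$rows = [];
foreach ($xpath->query('//tr') as $tr) {
    $link  = $xpath->query('.//a', $tr)->item(0);
    $cells = $xpath->query('.//td', $tr);
    if ($link === null || $cells->length < 2) {
        continue; // skip rows that do not have the expected shape
    }
    $rows[] = [
        $link->getAttribute('onclick'), // the HTML parser lowercases attribute names
        trim($link->textContent),
        trim($cells->item(0)->textContent),
        trim($cells->item(1)->textContent),
    ];
}

print_r($rows);
```

Unlike the regex, this does not depend on attribute order or on the exact spacing inside the tags.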

Tim Pietzcker
Not exactly what I was looking for, but it helped me a lot! PS: I'm learning regex here by asking questions, step by step! ;-) Thanks again! `<a\s*OnClick=\"(.*?)\"href=#>(.*?)<\/a><td[^>]+>(.*?)<td[^>]+>(.*?)<\/td><\/tr>`
aSeptik