I need to pick specific content out of an HTML string. To do that, I need to loop through the tr's in a table that has an ID. I could really use some help getting this regex going.

I appreciate any help I can get.

+1  A: 

Regular expressions cannot be used to parse HTML; HTML is not regular. Use a proper HTML parser library.
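
To make the parser suggestion concrete, here is a minimal sketch of looping through the rows of a table selected by its id. It uses Python's standard-library html.parser purely for illustration; the asker is on C#, where a dedicated HTML parser library would play the same role. The names TableRowCollector and rows_of_table, the sample markup, and the id "fruits" are all hypothetical, not taken from the question.

```python
from html.parser import HTMLParser


class TableRowCollector(HTMLParser):
    """Collect the cell text of every <tr> in the table with a given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.depth = 0        # 0 = outside the target table, 1 = directly inside it
        self.rows = []        # one list of cell strings per <tr>
        self.in_cell = False  # True while inside a <td>/<th> of the target table

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.depth > 0:
                self.depth += 1            # nested table: track depth, skip its rows
            elif dict(attrs).get("id") == self.table_id:
                self.depth = 1             # found the table we are looking for
        elif self.depth == 1:
            if tag == "tr":
                self.rows.append([])
            elif tag in ("td", "th") and self.rows:
                self.rows[-1].append("")
                self.in_cell = True

    def handle_endtag(self, tag):
        if self.depth == 0:
            return
        if tag == "table":
            self.depth -= 1
        elif self.depth == 1 and tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell and self.depth == 1:
            self.rows[-1][-1] += data


def rows_of_table(html, table_id):
    """Return a list of rows, each a list of cell strings (hypothetical helper)."""
    parser = TableRowCollector(table_id)
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample input: attributes on the tags do not disturb the parser.
sample = """
<table id="other"><tr><td>ignored</td></tr></table>
<table id="fruits" class="grid">
  <tr class="odd"><td>1</td><td>Apple</td></tr>
  <tr><td class="num">2</td><td>Ball</td></tr>
</table>
"""

for row in rows_of_table(sample, "fruits"):
    print(row)
```

Unlike a pattern match, this does not care about attributes, whitespace, or other tables on the page, which is the practical reason to prefer a parser here.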

Ignacio Vazquez-Abrams
Do you have any suggestions for this? I use ASP.NET with C#.
Dejan.S
Nope. http://stackoverflow.com/questions/100358/looking-for-c-html-parser
Ignacio Vazquez-Abrams
+1  A: 

It depends on how regular the HTML text is. For example, given this table:

<table>
  <tr><td>1</td><td>Apple</td></tr>
  <tr><td>2</td><td>Ball</td></tr>
  <tr><td>3</td><td>Cookie</td></tr>
</table>

The following regular expression finds the IDs in the first column, provided every row starts with exactly <tr><td> (no attributes or whitespace in between):

(?<=<tr><td>).*?(?=</td>)
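
As a quick check of the pattern, here is a small sketch that runs it with Python's re module against the table above (the variable names are just for illustration). It also shows the limitation: as soon as a row carries an attribute, the fixed <tr><td> lookbehind no longer matches.

```python
import re

# The example table from this answer, with the closing tag written correctly.
html = """<table>
  <tr><td>1</td><td>Apple</td></tr>
  <tr><td>2</td><td>Ball</td></tr>
  <tr><td>3</td><td>Cookie</td></tr>
</table>"""

pattern = re.compile(r"(?<=<tr><td>).*?(?=</td>)")

first_column = pattern.findall(html)
print(first_column)  # ['1', '2', '3']

# Less regular markup: an attribute on <tr> defeats the lookbehind entirely.
irregular = '<table><tr class="odd"><td>1</td><td>Apple</td></tr></table>'
print(pattern.findall(irregular))  # []
```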
Mike Hanson
A: 

If you run the page through an HTML parser like BeautifulSoup, you can prettify it so that this kind of regex has a chance. But if you are parsing the HTML anyway...

Charles Stewart