I'm using libxml2 in my C project. How can I grab all the tables in an HTML file using XPath? Sample code would be enough.

I need to parse the data in the HTML tables.

Thanks

EDIT:

This is a row of the table:

<tr class="report-data-row-even">

<td class="NormalTxt report-data-cell report-data-column-even"><nobr>0.0285</nobr></td><td class="NormalTxt report-data-cell report-data-column-odd"><nobr>&#1508;&#1512;&#1496;&#1504;&#1512;</nobr></td><td class="NormalTxt report-data-cell report-data-column-even"><nobr>SMS</nobr></td><td class="NormalTxt report-data-cell report-data-column-odd"><nobr>1</nobr></td><td class="NormalTxt report-data-cell report-data-column-even"><nobr>054-2570130</nobr></td><td class="NormalTxt report-data-cell report-data-column-odd"><nobr>00:14:09</nobr></td><td class="NormalTxt report-data-cell report-data-column-even"><nobr>27/09/2010</nobr></td>

I need to be able to pull the data inside the <nobr> tags.

+1  A: 

I need more information. What does the HTML look like? What kind of data are you extracting? And why C? Building the DOM is fast in C, but the string manipulation afterwards takes some effort. Why not Python? Anyway, here's an XPath expression you could try:

//table[@class='table_class']

This selects every table in your HTML page whose class attribute is exactly 'table_class'. Adjust it to match how your HTML is organized.

MovieYoda
+1  A: 

The XPath would simply be "//table".

alexanderb
That did the trick.
embedded
How do I iterate over all the rows of the tables and print the values?
embedded
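With a table node as the context node, ask for its descendant rows with ".//tr". (A bare "//tr" searches from the document root, so it returns the rows of every table, not just that one.)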
alexanderb
A: 

For that you can use a SAX callback method, the one for character data:

void characters(void *ctx, const xmlChar *ch, int len)

See the libxml2 documentation.

iSight