So, I can easily use LINQ to XML to traverse a properly set-up XML document. But I'm having some issues figuring out how to apply it to an HTML table. Here is the setup:

<table class='inner' width='100%'>
    <tr>
        <th>
            Area
        </th>
        <th>
            Date
        </th>
        <th>
            ID
        </th>
        <th>
            Name
        </th>
        <th>
            Email
        </th>
        <th>
            Zip Code
        </th>
        <th>
            Type
        </th>
        <th>
            Amount
        </th>
    </tr>
    <tr>
        <td>
           Data
        </td>
        <td>
            Data
        </td>
        <td>
            Data
        </td>
        <td>
            Data
        </td>
        <td>
            Data
        </td>
        <td>
            Data
        </td>
        <td>
            Data
        </td>
        <td>
            Data
        </td>
    </tr>
    <tr>
        <td>
            Data
        </td>
        <td>
            Data
        </td>
        <td>
            Data
        </td>
        <td>
            Data
        </td>
        <td>
            Data
        </td>
        <td>
            Data
        </td>
        <td>
            Data
        </td>
        <td>
            Data
        </td>
    </tr>                            
</table>

Essentially, there can be any number of rows, and I want to go through them row by row to check the data. Can anyone point me in the right direction? Should I be using tools other than LINQ for this?

EDIT: Sorry about the confusion. My issue is that the page I am trying to gather data from is HTML, not XML. The exact extension is ".aspx.htm". The page doesn't seem to load properly, and even if it did, I'm not certain how to traverse it, given that there is another table before the one I'm trying to get data from.

For example, here is the XPath to the table I'm trying to get info from:

/html/body/form/div[3]/table/tbody/tr[5]/td/table
+1  A: 

Once you have an XElement with the <table>, you can loop through its child Elements().
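
To make that concrete, here is a minimal sketch of the loop. It assumes the table markup is well-formed XML (a cut-down copy of the question's table is, but a full real-world page may not be), and the cell values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sample: a cut-down version of the table from the question.
string html = @"<table class='inner' width='100%'>
    <tr><th>Area</th><th>Date</th></tr>
    <tr><td> North </td><td> 2010-07-01 </td></tr>
    <tr><td> South </td><td> 2010-07-02 </td></tr>
</table>";

// XElement.Parse only accepts well-formed XML, which this fragment happens to be.
XElement table = XElement.Parse(html);

var rows = new List<string[]>();
foreach (XElement tr in table.Elements("tr"))
{
    // The header row contains only <th> cells, so it yields no <td> elements.
    string[] cells = tr.Elements("td").Select(td => td.Value.Trim()).ToArray();
    if (cells.Length == 0) continue;

    rows.Add(cells);
    Console.WriteLine(string.Join(" | ", cells));
}
```

Passing a name to Elements() restricts the loop to children with that tag, which is an easy way to skip the header row.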

SLaks
A: 

LINQ is like SQL: it performs set-based operations.

Use a LINQ query to select the set of XElements you care about, then use a foreach loop to iterate over that set.
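
A rough sketch of that split, with the query doing the selection and foreach doing the row-by-row work. The sample rows and the "10001" filter are invented for illustration; only the column names come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sample data; only the column names come from the question.
XElement table = XElement.Parse(@"<table class='inner'>
    <tr><th>Name</th><th>Zip Code</th><th>Amount</th></tr>
    <tr><td>Ann</td><td>10001</td><td>25.00</td></tr>
    <tr><td>Bob</td><td>94105</td><td>40.00</td></tr>
    <tr><td>Cy</td><td>10001</td><td>15.50</td></tr>
</table>");

// Column names from the header row, in document order.
List<string> headers = table.Elements("tr").First()
    .Elements("th").Select(th => th.Value.Trim()).ToList();
int zipColumn = headers.IndexOf("Zip Code");
int nameColumn = headers.IndexOf("Name");

// Set-based step: describe which rows are wanted. Nothing is evaluated yet.
IEnumerable<string[]> matches =
    from tr in table.Elements("tr")
    let cells = tr.Elements("td").Select(td => td.Value.Trim()).ToArray()
    where cells.Length == headers.Count && cells[zipColumn] == "10001"
    select cells;

// Iteration step: the query runs as foreach pulls each row.
var names = new List<string>();
foreach (string[] row in matches)
{
    names.Add(row[nameColumn]);
    Console.WriteLine(string.Join(", ", row));
}
```

The length check in the where clause also drops the header row, since it has no td cells.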

John Nicholas
+2  A: 
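using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// The (string) cast yields null instead of throwing when a <table> has no class attribute.
XElement myTable = xdoc.Descendants("table").FirstOrDefault(xelem => (string)xelem.Attribute("class") == "inner");
if (myTable != null) // FirstOrDefault returns null when no table matches
{
    // One inner sequence of cells per row; the first row holds the <th> header cells.
    IEnumerable<IEnumerable<XElement>> myRows = myTable.Elements().Select(xelem => xelem.Elements());
    foreach (IEnumerable<XElement> tableRow in myRows)
    {
        foreach (XElement rowCell in tableRow)
        {
            // tada.. rowCell.Value.Trim() is the cell text
        }
    }
}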
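
Since the question mentions an XPath, here is a sketch of picking the table that way instead of with Descendants(). It assumes the page has already been loaded as well-formed XML, and the two-table page below is a made-up stand-in for the real one. Note that a path copied from browser tools may include a tbody step that the raw markup does not contain, because browsers add that element when building the DOM.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical page: a layout table first, then the table of interest.
XDocument xdoc = XDocument.Parse(@"<html><body>
    <table><tr><td>Navigation</td></tr></table>
    <table class='inner'>
        <tr><th>ID</th><th>Name</th></tr>
        <tr><td>1</td><td>Ann</td></tr>
    </table>
</body></html>");

// XPathSelectElement is an extension method from System.Xml.XPath.
// It returns null when nothing matches the expression.
var inner = xdoc.XPathSelectElement("//table[@class='inner']");

// Count the rows that contain at least one <td>, i.e. the data rows.
int dataRows = inner == null
    ? 0
    : inner.Elements("tr").Count(tr => tr.Elements("td").Any());
Console.WriteLine(dataRows);
```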
Jimmy Hoffa