




Hello! I'm looking to parse some information in my application. Let's say I have a string that contains this somewhere:

<tr class="tablelist_bg1">


<td class="text_center">---</td>

<td class="text_center">19.1</td>

<td class="text_center">10.8</td>

<td class="text_center">NW</td>

<td class="text_center">50.9</td>

<td class="text_center">0</td>

<td class="text_center">1016.6</td>

<td class="text_center">---</td>

<td class="text_center">---</td>


Everything above or below this doesn't matter. Remember, this is all inside a string. I want to get the values inside the td tags: ---, 19.1, 10.8, etc. It's worth knowing that there are many entries like this on the page. Probably also a good idea to link the page here.

As you probably guessed, I have absolutely no idea how to do this... None of the string functions I know (Split etc.) help.

Thanks in advance


Assuming your string is valid XHTML, you can use an XML parser to get the content you want. There's a simple example here that shows how to use XmlTextReader to parse XML content. The example reads from a file, but you can change it to read from a string:

new XmlTextReader(new StringReader(someString));

You specifically want to keep track of td element nodes, and the text node that follows them will contain the values you want.
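
As a rough illustration of that idea, here is a minimal sketch. It assumes the input is well-formed XML (every tag closed, no HTML-only entities such as &nbsp;), which real-world pages often are not. The class and method names (TdReader, ReadTdValues) are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class TdReader
{
    // Hypothetical helper: collects the text found inside every <td> element.
    // Assumes the input is well-formed XML; XmlTextReader throws an
    // XmlException on malformed markup.
    public static List<string> ReadTdValues(string xhtml)
    {
        var values = new List<string>();
        using (var reader = new XmlTextReader(new StringReader(xhtml)))
        {
            bool insideTd = false;
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "td")
                {
                    // A self-closing <td/> has no text node to wait for.
                    insideTd = !reader.IsEmptyElement;
                }
                else if (reader.NodeType == XmlNodeType.EndElement && reader.Name == "td")
                {
                    insideTd = false;
                }
                else if (reader.NodeType == XmlNodeType.Text && insideTd)
                {
                    values.Add(reader.Value);
                }
            }
        }
        return values;
    }
}
```

Calling TdReader.ReadTdValues on a string holding a complete, closed tr row should return one entry per td cell, in document order. If the page is not valid XHTML, this approach will fail and a more forgiving method is needed.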


Just use String.IndexOf(string, int) to find a "<td", again to find the next ">", and again to find "</td>". Then use String.Substring to pull out a value. Put this in a loop.

    public static List<string> ParseTds(string input)
    {
        List<string> results = new List<string>();

        int index = 0;

        while (true)
        {
            // ParseTd advances index past each cell it finds.
            string next = ParseTd(input, ref index);

            if (next == null)
                return results;

            results.Add(next);
        }
    }

    private static string ParseTd(string input, ref int index)
    {
        // Find the start of the next opening tag, the end of that tag,
        // and the matching closing tag. Give up if any of them is missing.
        int tdIndex = input.IndexOf("<td", index);
        if (tdIndex == -1)
            return null;
        int gtIndex = input.IndexOf(">", tdIndex);
        if (gtIndex == -1)
            return null;
        int endIndex = input.IndexOf("</td>", gtIndex);
        if (endIndex == -1)
            return null;

        // Resume the next search from the closing tag.
        index = endIndex;

        return input.Substring(gtIndex + 1, endIndex - gtIndex - 1);
    }

A very nice answer and easy to understand.
Thank you!
Use a loop to load each non-empty line from the file into a String
Process the string character by character
 Check for characters indicating the beginning of a td tag
  Use a substring function, or just build a new string character by character, to get all the content until the </td> tag begins.
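
The steps above could be sketched roughly as follows. This is only an illustration: it works on a string already in memory rather than reading a file line by line, it treats any text starting with "<td" as a cell (so a tag such as "<tdata" would also match), and the names TdScanner, ScanTdValues and StartsWithAt are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TdScanner
{
    // Hypothetical helper: walks the text one character at a time and
    // collects whatever sits between an opening <td ...> and its </td>.
    public static List<string> ScanTdValues(string text)
    {
        var values = new List<string>();
        int i = 0;
        while (i < text.Length)
        {
            // Keep moving until the characters at i spell "<td".
            if (!StartsWithAt(text, i, "<td"))
            {
                i++;
                continue;
            }

            // Skip the rest of the opening tag, attributes included.
            while (i < text.Length && text[i] != '>') i++;
            if (i == text.Length) break;   // opening tag never closed
            i++;                           // step past '>'

            // Build the cell content character by character.
            var content = new StringBuilder();
            while (i < text.Length && !StartsWithAt(text, i, "</td>"))
            {
                content.Append(text[i]);
                i++;
            }
            if (i == text.Length) break;   // no closing tag: discard the cell

            values.Add(content.ToString());
            i += "</td>".Length;
        }
        return values;
    }

    // True when token occurs in text starting exactly at position pos.
    private static bool StartsWithAt(string text, int pos, string token)
    {
        return string.CompareOrdinal(text, pos, token, 0, token.Length) == 0;
    }
}
```

Building the content with a StringBuilder follows the "character by character" suggestion literally; taking a Substring between the two positions would give the same result with less work.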