ansaurus

Question

Answer 1

A:

Assuming your string is valid XHTML, you can use an XML parser to get the content you want. There's a simple example here that shows how to use XmlTextReader to parse XML content. The example reads from a file, but you can change it to read from a string:

new XmlTextReader(new StringReader(someString));

Specifically, keep track of the td element nodes: the text node that follows each one contains the value you want.
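
As a rough sketch of that idea (not a definitive implementation), the loop below collects the text inside every td element. The class name TdReader and the method name ReadTdValues are made up for illustration. It assumes the input is well-formed XML with a single root element and no HTML-only entities such as &nbsp;, which an XML parser would reject without a DTD.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class TdReader
{
    // Hypothetical helper: returns the text found inside each td element.
    // Assumes the input is well-formed XML (valid XHTML).
    public static List<string> ReadTdValues(string xhtml)
    {
        var values = new List<string>();

        using (var reader = new XmlTextReader(new StringReader(xhtml)))
        {
            bool insideTd = false;

            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "td")
                {
                    // An empty element (<td/>) has no text node to wait for.
                    insideTd = !reader.IsEmptyElement;
                }
                else if (reader.NodeType == XmlNodeType.EndElement && reader.Name == "td")
                {
                    insideTd = false;
                }
                else if (insideTd && reader.NodeType == XmlNodeType.Text)
                {
                    // Text node inside a td: this is the value we are after.
                    values.Add(reader.Value);
                }
            }
        }

        return values;
    }
}
```

Note that a td containing nested markup (for example a b element) yields one entry per text node, so you may want to concatenate them if that matters for your data.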

casablanca 2010-10-23 19:51:25

Answer 2

+1 A:

Just use String.IndexOf(string, int) to find a "<td", again to find the next ">", and again to find "</td>". Then use String.Substring to pull out a value. Put this in a loop.

    public static List<string> ParseTds(string input)
    {
        List<string> results = new List<string>();

        int index = 0;

        while (true)
        {
            string next = ParseTd(input, ref index);

            if (next == null)
                return results;

            results.Add(next);
        }
    }

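    // Returns the content of the next td element at or after index,
    // or null when there are no more. Does not handle nested tables.
    private static string ParseTd(string input, ref int index)
    {
        // Start of the opening tag; attributes may follow "<td".
        int tdIndex = input.IndexOf("<td", index);
        if (tdIndex == -1)
            return null;
        // End of the opening tag.
        int gtIndex = input.IndexOf(">", tdIndex);
        if (gtIndex == -1)
            return null;
        // Start of the closing tag.
        int endIndex = input.IndexOf("</td>", gtIndex);
        if (endIndex == -1)
            return null;

        // The next call resumes searching from the closing tag.
        index = endIndex;

        return input.Substring(gtIndex + 1, endIndex - gtIndex - 1);
    }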

arx 2010-10-23 19:55:16

A very nice answer and easy to understand.

Queops 2010-10-23 20:32:40

.. Thank you! ..

arx 2010-10-23 20:40:07

Answer 3

A:

Use a loop to load each non-empty line from the file into a string.
Process the string character by character.
 Check for the characters indicating the beginning of a td tag.
  Use a substring function, or just build a new string character by character, to get all the content until the </td> tag begins.
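
The steps above could be sketched roughly as follows. The names TdScanner, ScanLine and ScanLines are made up for illustration, and the sketch assumes each td opens and closes on the same line; a cell that spans several lines is skipped.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class TdScanner
{
    // Hypothetical helper: scans one line character by character.
    // Assumes every td opens and closes on this same line.
    public static List<string> ScanLine(string line)
    {
        var cells = new List<string>();
        int i = 0;

        while (i < line.Length)
        {
            // Check for the characters that begin a td tag.
            if (string.CompareOrdinal(line, i, "<td", 0, 3) != 0)
            {
                i++;
                continue;
            }

            // Skip the rest of the opening tag, including any attributes.
            while (i < line.Length && line[i] != '>')
                i++;
            i++; // step past '>'

            // Build the cell content one character at a time until </td>.
            var content = new StringBuilder();
            while (i < line.Length && string.CompareOrdinal(line, i, "</td>", 0, 5) != 0)
            {
                content.Append(line[i]);
                i++;
            }

            if (i >= line.Length)
                break; // no closing tag on this line; drop the partial cell

            cells.Add(content.ToString());
            i += 5; // step past "</td>"
        }

        return cells;
    }

    // Reads line by line, skipping empty lines, and collects all cells.
    public static List<string> ScanLines(TextReader reader)
    {
        var cells = new List<string>();
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim().Length == 0)
                continue;

            cells.AddRange(ScanLine(line));
        }

        return cells;
    }
}
```

To read from a file, pass a StreamReader (for example from File.OpenText) as the reader; a StringReader works the same way for a string already in memory.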

sca 2010-10-23 19:57:45

Parsing big string (HTML code)
