I need to parse the values of a select element in an HTML file. I have this HTML file:

    <html>
    <head></head>
    <body>
        <select id="region" name="region">
            <option value="0" selected>Všetky regiony</option>
            <optgroup>Banskobystrický kraj</optgroup>
            <option value="k_1">Banskobystrický kraj</option>
            <option value="1">Banská Bystrica</option>
            <option value="3">Banská Štiavnica</option>
            <option value="18">Brezno</option>
            <option value="22">Detva</option>
            <option value="58">Dudince</option>
        </select>
    </body>
    </html>

I need to get each option's value and its text into a dictionary. I load the file in a WebBrowser component and try to get the select tag by its ID, "region".

        webBrowser1.Url = new Uri("file://\\C:\\1.html");

        if (webBrowser1.Document != null)
        {
            HtmlElement elems = webBrowser1.Document.GetElementById("region");
        }

But the elems object is null, and I don't know why. Any advice?

EDIT: The problem was resolved with Html Agility Pack. Thanks, everybody. I was stupid; I should have listened to your advice about Html Agility Pack in the first place.

A: 

You can do it with the HtmlAgilityPack. There are many examples of using it to parse HTML, which you can find via a Google search. Here are a few:

http://htmlagilitypack.codeplex.com/wikipage?title=Examples&referringTitle=Home

http://stackoverflow.com/questions/846994/how-to-use-html-agility-pack
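
As a rough illustration of the library route, the sketch below loads the file from the question with Html Agility Pack and collects each option's value and text into a dictionary. It assumes the HtmlAgilityPack package is referenced and that the file sits at C:\1.html; the `Program` class and the `options` variable are names made up for this example. Some older releases of the library treat `option` as an empty element, which can leave `InnerText` blank, so the sketch removes that flag first; treat that line as a precaution, not something every version needs.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Precaution (assumption about your library version): older releases
        // flag <option> as an empty element, so its text is not captured.
        HtmlNode.ElementsFlags.Remove("option");

        var doc = new HtmlDocument();
        doc.Load(@"C:\1.html"); // path taken from the question

        // Maps each option's value attribute to its visible text.
        var options = new Dictionary<string, string>();

        // SelectNodes returns null when nothing matches, so guard against it.
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//select[@id='region']//option");
        if (nodes != null)
        {
            foreach (HtmlNode node in nodes)
            {
                string value = node.GetAttributeValue("value", "");
                string text = HtmlEntity.DeEntitize(node.InnerText).Trim();
                options[value] = text;
            }
        }

        foreach (KeyValuePair<string, string> pair in options)
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

If the Slovak characters come out garbled, the `Load` overload that takes an explicit `Encoding` may help.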

UPDATE:

While I think using the library is a better choice, you can do it with the WebBrowser control in the following manner:

    webBrowser1.DocumentCompleted += 
          new WebBrowserDocumentCompletedEventHandler(ParseOptions);

    webBrowser1.Url = new Uri("C:\\1.html", UriKind.Absolute);

    private void ParseOptions(object sender,
        WebBrowserDocumentCompletedEventArgs e)
    {
        HtmlElement elems = webBrowser1.Document.GetElementById("region");
    }

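Notice that the parsing is done in the DocumentCompleted event handler. Setting Url only starts the navigation, which runs asynchronously, so the document has not finished loading when the next statement executes. That is why GetElementById returned null in your code.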
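
To go from the select element to the dictionary the question asks for, something like the following might work. It is only a sketch: the `RegionForm` class and the `regions` field are hypothetical names, and the file path is the one from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical form that hosts the WebBrowser control from the question.
public class RegionForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser { Dock = DockStyle.Fill };

    // Maps each option's value attribute to its visible text.
    private readonly Dictionary<string, string> regions = new Dictionary<string, string>();

    public RegionForm()
    {
        Controls.Add(webBrowser1);
        // Subscribe before navigating so the event is not missed.
        webBrowser1.DocumentCompleted += ParseOptions;
        webBrowser1.Url = new Uri(@"C:\1.html", UriKind.Absolute);
    }

    private void ParseOptions(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        HtmlElement select = webBrowser1.Document.GetElementById("region");
        if (select == null)
        {
            return; // the page has no element with that ID
        }

        regions.Clear();
        foreach (HtmlElement option in select.GetElementsByTagName("option"))
        {
            regions[option.GetAttribute("value")] = option.InnerText;
        }
    }

    [STAThread]
    private static void Main()
    {
        Application.Run(new RegionForm());
    }
}
```

The dictionary is filled only once the handler has run, so any code that reads `regions` should be triggered from the handler or later.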

Garett
I don't want to use a third-party library
jminarik
@Tom: And yet you instantiate a WebBrowser instance? There is no standard framework way of parsing HTML, for good reason. You have no choice but to use a third party library, WebBrowser being Microsoft's implementation.
James Dunne
Thanks for the advice; with Html Agility Pack it works well.
jminarik
@jminarik: Glad to hear that it worked. If you would like to accept this as an answer, see the following for instructions: http://meta.stackoverflow.com/questions/5234/how-does-accepting-an-answer-work
Garett
A: 

Html Agility Pack is a great HTML parser.

Pieter
Thanks for the advice. I use Html Agility Pack. It looks good and works fine.
jminarik