Hi everyone,

I'm working in C#. I'm trying to extract the first img tag from an HTML string (which is actually post data).

This is my code:

    private string GrabImage(string htmlContent)
    {
        String firstImage;

        HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
        htmlDoc.LoadHtml(htmlContent);
        HtmlAgilityPack.HtmlNode imageNode = htmlDoc.DocumentNode.SelectSingleNode("//img");
        if (imageNode != null)
        {
            return firstImage = imageNode.ToString();
        }
        else
            return firstImage = " ";
    }

But htmlDoc ends up null. Should I use the HtmlDocument type even when I'm parsing the HTML from a string?

P.S. By the way, is this the correct way to grab the first img tag from my HTML string?

A: 

For the P.S. part, make sure you return the imageNode's HTML text (for example, its OuterHtml property), not the name of the object, which is what ToString() gives you here.

I'll try to add more about the htmlDoc part once I'm at a computer with the Agility Pack available.

BenMaddox
Thanks BenMaddox!
skhan
Please see my other answer for more information.
BenMaddox
A: 

Using the HTML you provided, I made this console application.

    static void Main(string[] args)
    {         

        var image = GrabImage("<h2>How to learn Photoshop</h2><p> Its <a href=\"/mysite.aspx\">link</a></p><br /> <img src=\"image.jpg\" alt=\"image\"/>");
        Console.WriteLine(image);
        Console.ReadLine();
    }

    private static string GrabImage(string htmlContent)
    {
        HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
        htmlDoc.LoadHtml(htmlContent);

        // SelectSingleNode returns null when no <img> element is found.
        HtmlAgilityPack.HtmlNode imageNode = htmlDoc.DocumentNode.SelectSingleNode("//img");
        if (imageNode != null)
        {
            // OuterHtml is already a string containing the tag's markup.
            return imageNode.OuterHtml;
        }

        return " ";
    }

I'm unable to reproduce the problem you were describing. Could you show where you call the GrabImage method?
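
In case it is useful, here is a minimal sketch of a more defensive variant. The class name ImageGrabber and both method names are made up for illustration, and it assumes the HtmlAgilityPack NuGet package is referenced. It skips parsing when the input string is null or empty (one possible source of trouble if the post data never arrives), and it shows reading only the src attribute as an alternative to returning the whole tag.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

// Hypothetical helper class, not part of the original post.
public static class ImageGrabber
{
    // Returns the markup of the first <img> element,
    // or null when the input is null/empty or contains no image.
    public static string GrabFirstImageHtml(string htmlContent)
    {
        // Nothing to parse, so do not call LoadHtml at all.
        if (string.IsNullOrEmpty(htmlContent))
            return null;

        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(htmlContent);

        // SelectSingleNode returns null when the XPath matches nothing.
        HtmlNode imageNode = htmlDoc.DocumentNode.SelectSingleNode("//img");
        return imageNode == null ? null : imageNode.OuterHtml;
    }

    // Returns only the src attribute of the first <img>,
    // or an empty string when there is no image or no src attribute.
    public static string GrabFirstImageSrc(string htmlContent)
    {
        if (string.IsNullOrEmpty(htmlContent))
            return "";

        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(htmlContent);

        HtmlNode imageNode = htmlDoc.DocumentNode.SelectSingleNode("//img");
        return imageNode == null ? "" : imageNode.GetAttributeValue("src", "");
    }
}
```

Returning null from the first method lets the caller tell "no image found" apart from an actual tag, which a " " placeholder makes harder. Whether that suits your page depends on how the result is used.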

BenMaddox
BenMaddox, I've figured out what I was doing wrong. Thanks for your help.
skhan
Was the solution related to my answer?
BenMaddox