Hi!

I'm trying to scrape some information from a website but can't find a solution that works for me. Every code sample I find on the Internet generates at least one error.

Even the example code on the library's homepage generates errors for me.

My code:

         HtmlDocument doc = new HtmlDocument();
         doc.Load("https://www.flashback.org/u479804");
         foreach(HtmlNode link in doc.DocumentElement.SelectNodes("//a[@href]"))
         {
            HtmlAttribute att = link["href"];
            att.Value = FixLink(att);
         }
         doc.Save("file.htm");

Generates the following error:

'HtmlDocument' is an ambiguous reference between 'System.Windows.Forms.HtmlDocument' and 'HtmlAgilityPack.HtmlDocument' C:*\Form1.cs

Edit: My entire code is located here: http://beta.yapaste.com/55

All help is much appreciated!

+1  A: 

Use HtmlAgilityPack.HtmlDocument:

         HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();

The compiler is getting confused because two of the namespaces you have imported with using contain classes called HtmlDocument - the HTML Agility Pack namespace, and the Windows Forms namespace. You can get around this by specifying which class you want to use explicitly.

Lucas Jones
Then I get another error: 'HtmlAgilityPack.HtmlDocument' does not contain a definition for 'DocumentElement' and no extension method 'DocumentElement' accepting a first argument of type 'HtmlAgilityPack.HtmlDocument' could be found (are you missing a using directive or an assembly reference?)
Victor
Lucas Jones
DocumentNode gives me more errors than DocumentElement
Victor
Hmmm... I'm not sure then. There doesn't seem to be anything wrong with the code you've pasted in... maybe there is an error elsewhere?
Lucas Jones
Is there any alternative way I could grab information from a website using C#?
Victor
The library you're using is the best way I know to do what you want to do... you could try creating a new project, and doing only the minimum necessary to get the sample code working. Or, if it includes a full sample app, looking to see if there are any differences between your code and its.
Lucas Jones
A: 

The classes in the two namespaces System.Windows.Forms and HtmlAgilityPack are conflicting. Use fully-qualified type names or use namespace aliases.
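To illustrate the alias option, here is a minimal sketch. It assumes a Windows Forms project that references the HtmlAgilityPack assembly; the class name AliasDemo and the inline HTML string are made up for the example. A using alias is resolved before types pulled in by ordinary using directives, so the bare name HtmlDocument stops being ambiguous in that file.

```csharp
using System;
using System.Windows.Forms;                         // also defines an HtmlDocument class
using HtmlDocument = HtmlAgilityPack.HtmlDocument;  // alias: bare HtmlDocument now means the HAP class

// AliasDemo is a hypothetical class name used only for this sketch.
static class AliasDemo
{
    [STAThread]
    static void Main()
    {
        // Resolves to HtmlAgilityPack.HtmlDocument because of the alias above.
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml("<p>hello</p>");

        // Show the parsed text in a Windows Forms message box.
        MessageBox.Show(doc.DocumentNode.InnerText);
    }
}
```

If you would rather keep both classes visible, a namespace alias such as `using HAP = HtmlAgilityPack;` lets you write `HAP.HtmlDocument` instead of the full name.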

James Dunne
That did not help me much. Could you elaborate a little on what I should do?
Victor
A: 

I have written a couple of articles that explain how to use HtmlAgilityPack. You might find them useful to get started:

I don't know if they have fixed it now, but that snippet on the homepage of the site didn't use to work; I think it was written for an earlier version of the library. The snippet also doesn't define FixLink(), so it wouldn't compile even if it matched the current API.
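For reference, this is roughly how I would expect the homepage snippet to look against a recent version of the library. Treat it as a sketch rather than the official sample: FixLink() here is a hypothetical stand-in I made up (it just prefixes root-relative links with https://example.com), and the HTML is an inline string so the example does not depend on a file or a network call. The changes from the original are DocumentNode instead of DocumentElement, link.Attributes["href"] instead of link["href"], and a null check because SelectNodes returns null when nothing matches.

```csharp
using System;
using HtmlAgilityPack; // assumes the HtmlAgilityPack assembly is referenced

class FixLinksDemo
{
    // Hypothetical replacement for the undefined FixLink():
    // turns root-relative links into absolute ones.
    static string FixLink(string href)
    {
        return href.StartsWith("/") ? "https://example.com" + href : href;
    }

    static void Main()
    {
        var doc = new HtmlAgilityPack.HtmlDocument();

        // LoadHtml parses a string. Load(path) reads a local file,
        // and HtmlWeb.Load(url) is the call that fetches a URL.
        doc.LoadHtml("<html><body><a href=\"/u479804\">profile</a><a>no href</a></body></html>");

        // SelectNodes returns null when no node matches, so guard before iterating.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (HtmlNode link in links)
            {
                HtmlAttribute att = link.Attributes["href"];
                att.Value = FixLink(att.Value);
            }
        }

        // Print the modified markup; doc.Save("file.htm") would write it to disk instead.
        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

Note that doc.Load() with a URL string, as in the question, treats the argument as a file path. To download a page, `new HtmlWeb().Load(url)` returns an HtmlAgilityPack.HtmlDocument directly.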

I would recommend getting the latest beta version of the library because it has extra extensions for performing LINQ queries against the document, which can save you from confusing XPath queries later on.
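As a rough idea of what the LINQ style looks like, here is a small sketch. I am assuming the Descendants() method on HtmlNode is the kind of extension meant here; the class name and the inline HTML are invented for the example.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // assumes the HtmlAgilityPack assembly is referenced

class LinqDemo
{
    static void Main()
    {
        var doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml("<a href=\"/a\">A</a><a>none</a><a href=\"/b\">B</a>");

        // Equivalent in spirit to the XPath query //a[@href]:
        // take every <a> element, read its href (empty string if missing),
        // and keep only the non-empty values.
        var hrefs = doc.DocumentNode
            .Descendants("a")
            .Select(a => a.GetAttributeValue("href", ""))
            .Where(href => href.Length > 0);

        foreach (string href in hrefs)
        {
            Console.WriteLine(href);
        }
    }
}
```

Unlike SelectNodes, Descendants yields an empty sequence rather than null when nothing matches, so no null check is needed before the loop.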

I haven't seen it used in a Windows Forms app before but it looks like you will have to use fully-qualified type names like:

         HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();

As for the actual task you are trying to perform, it seems like you want to take a URL, inject a username and ID into it and then... not sure? It looks like you are trying both to save the file to disk and to set the HTML as the contents of a Form, which I don't think you can do.

rtpHarry