Hi,
I want to crawl other companies' websites (for example, car listing sites) and extract read-only information into my local database. Then I want to display this collected information on my own website. Purely from a technology perspective, is there a .NET tool, program, or library already out there that is generic enough for my purpose, or do I have to write it from scratch?
To do it effectively, I may need a WCF job that mines data on a constant basis and refreshes the database, which in turn provides data to the website. I sketch roughly what I have in mind below.
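This is a minimal sketch of the kind of polling worker I mean, assuming a plain loop hosted in a Windows Service or console app rather than WCF specifically; `CrawlSite` and `SaveToDatabase` are hypothetical placeholders for my own logic:

```csharp
using System;
using System.Threading;

class CrawlerWorker
{
    // Poll each target site on a fixed interval and refresh the local database.
    static void Main()
    {
        string[] targetSites = { "http://example.com/cars" }; // placeholder URLs

        while (true)
        {
            foreach (string url in targetSites)
            {
                string html = CrawlSite(url);  // hypothetical: fetch the page HTML
                SaveToDatabase(url, html);     // hypothetical: persist extracted data
            }
            Thread.Sleep(TimeSpan.FromMinutes(30)); // wait before the next pass
        }
    }

    static string CrawlSite(string url) { /* fetch and return HTML */ return ""; }
    static void SaveToDatabase(string url, string html) { /* persist results */ }
}
```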
Also, is there a way to mask my calls to those websites? Would I create "traffic burden" for my target websites? Would it impact their functionality if I am just harmlessly crawling them?
How do I make my requests look "human" rather than coming from a crawler?
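For example, I assume I could set a browser-like User-Agent header and pause between requests; here is a minimal sketch using HttpWebRequest (the URL is a placeholder):

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading;

class HumanLikeRequest
{
    static string Fetch(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        // Present a browser-like User-Agent instead of the default .NET one.
        request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36";
        request.Accept = "text/html,application/xhtml+xml";

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        string html = Fetch("http://example.com/cars"); // placeholder URL
        Console.WriteLine(html.Length);
        Thread.Sleep(5000); // pause between requests to avoid hammering the site
    }
}
```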
Are there code examples out there showing how to use a library that parses the DOM tree?
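For instance, HtmlAgilityPack is one .NET library I have seen mentioned for this; this is a minimal sketch of loading a page and querying nodes with XPath (the URL and XPath expression are placeholders for the real markup):

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

class DomParseExample
{
    static void Main()
    {
        var web = new HtmlWeb();
        HtmlDocument doc = web.Load("http://example.com/cars"); // placeholder URL

        // Select all anchor elements with an href attribute via XPath.
        var links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null) // SelectNodes returns null when nothing matches
        {
            foreach (HtmlNode link in links)
                Console.WriteLine(link.GetAttributeValue("href", ""));
        }
    }
}
```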
Can I send a request to a specific site and get the response back as a DOM using the WebBrowser control?
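From what I understand, the WinForms WebBrowser control exposes the parsed page through its Document property once the DocumentCompleted event fires; this is a minimal sketch of what I am picturing, assuming it runs on an STA thread with a message loop (the URL is a placeholder):

```csharp
using System;
using System.Windows.Forms;

class WebBrowserDomExample
{
    [STAThread] // the WebBrowser control requires a single-threaded apartment
    static void Main()
    {
        var browser = new WebBrowser { ScriptErrorsSuppressed = true };

        browser.DocumentCompleted += (sender, e) =>
        {
            // The DOM is available via browser.Document once loading finishes.
            HtmlDocument dom = browser.Document;
            foreach (HtmlElement link in dom.GetElementsByTagName("a"))
                Console.WriteLine(link.GetAttribute("href"));
            Application.ExitThread(); // stop the message loop when done
        };

        browser.Navigate("http://example.com/cars"); // placeholder URL
        Application.Run(); // pump Windows messages so the control can load the page
    }
}
```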