I am looking to develop a web scraper (in C# Windows Forms). What I am trying to accomplish is as follows:
- Get the URL from the user.
- Load the web page in the IE UI control (the embedded browser) in WinForms.
- Allow the user to select a short, contiguous piece of text (not exceeding 50 characters) from the loaded web page.
- When the user chooses to save the location (the HTML DOM location) of that text, persist it to the DB, so that on subsequent visits the user can use that location to fetch whatever data is there.
Assume that the loaded website is a price-listing site and the quoted rate keeps changing. The idea is to persist the DOM hierarchy so that we can traverse it next time.
I was able to do this when all the HTML elements had id attributes. When an element's id is null, I am not able to accomplish this.
Could someone suggest a workable approach (with a bare-minimum code snippet, if possible)?
Links to online resources would also be helpful.
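To make the idea concrete, here is a rough sketch of the kind of thing I have in mind for elements without an id: store the child index at each level from the root down to the selected element, then follow those indexes on the next visit. This is only an illustration, not working code from my project. `Node` and `DomPath` are hypothetical names; with the WebBrowser control, `HtmlElement.Parent` and `HtmlElement.Children` would play the roles of `Node.Parent` and `Node.Children`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a DOM element (assumption, not a real API).
class Node
{
    public string Tag;
    public Node Parent;
    public List<Node> Children = new List<Node>();

    public Node(string tag) { Tag = tag; }

    // Attaches a child and returns it, so a small tree is easy to build.
    public Node Add(Node child)
    {
        child.Parent = this;
        Children.Add(child);
        return child;
    }
}

static class DomPath
{
    // Walks from the element up to the root and records, at each level,
    // the element's index among its parent's children, e.g. "1/0/1".
    public static string Build(Node element)
    {
        var indexes = new List<string>();
        for (Node n = element; n.Parent != null; n = n.Parent)
            indexes.Insert(0, n.Parent.Children.IndexOf(n).ToString());
        return string.Join("/", indexes.ToArray());
    }

    // Follows a stored path down from the root. Returns null when the
    // page structure no longer has a child at one of the stored indexes.
    public static Node Resolve(Node root, string path)
    {
        Node current = root;
        if (path.Length == 0) return current;
        foreach (string part in path.Split('/'))
        {
            int index = int.Parse(part);
            if (index >= current.Children.Count) return null;
            current = current.Children[index];
        }
        return current;
    }
}
```

The string returned by `Build` is what would go into the DB. I realise a purely index-based path will break as soon as the site changes its layout, so I would also welcome suggestions for something more robust.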
thanks,
vijay