Hey guys, I want to be able to retrieve dynamic data (share prices) from a web page. I started out by retrieving the HTML code before I realised that, as it is live data, the HTML code will be of little use. Although I am looking to capture specific data, all I wish to do is process a web page that I specify and get back the text of that page rather than the HTML code. Basically a copy and paste of the entire page would be great. Any ideas would be really appreciated!
Well, the HTML contains the text of the website, so you "just" need to parse the HTML.
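As a rough illustration of what "just parse the HTML" could look like, here is a minimal VB.NET sketch that strips tags with regular expressions and decodes entities. The module and function names (HtmlTextSketch, HtmlToText) and the sample markup are made up for this example. Regex stripping is a crude shortcut rather than real HTML parsing, and WebUtility.HtmlDecode assumes .NET 4.0 or later.

```vb
Imports System
Imports System.Net
Imports System.Text.RegularExpressions

Module HtmlTextSketch
    ' Crude tag stripper: good enough for a quick look, not a real HTML parser.
    Public Function HtmlToText(ByVal html As String) As String
        Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        ' Drop script and style blocks, whose contents are not visible text.
        Dim noScripts As String = Regex.Replace(html, "<(script|style)\b[^>]*>.*?</\1>", " ", options)
        ' Replace every remaining tag with a space.
        Dim noTags As String = Regex.Replace(noScripts, "<[^>]+>", " ")
        ' Decode entities such as &amp; and then collapse runs of whitespace.
        Dim decoded As String = WebUtility.HtmlDecode(noTags)
        Return Regex.Replace(decoded, "\s+", " ").Trim()
    End Function

    Sub Main()
        ' Made-up sample markup; real pages will be messier.
        Dim sample As String = "<p>ACME &amp; Co: <b>12.34</b></p>"
        Console.WriteLine(HtmlToText(sample)) ' prints "ACME & Co: 12.34"
    End Sub
End Module
```

For a live page, the html argument could come from WebClient.DownloadString, but that only returns the markup the server sends, not anything filled in later by JavaScript.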
EDIT: If the data is not in the HTML but loaded dynamically, the situation is different. As far as I can see, you have two options:
- Find out how the data is loaded (i.e. read the JavaScript on the page). If it is updated via some web service, you could query the same web service in your program.
- Use a web browser to load the page and then read the dynamic HTML tree it builds. Maybe the WPF WebBrowser control can help you with this, but I'm not sure since I've never done this myself.
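The first option could be sketched roughly like this in VB.NET, assuming you have already found the address the page's JavaScript calls. The feed URL below is purely a placeholder, and the module and function names are invented for the example. What comes back (JSON, XML, CSV) depends entirely on the site.

```vb
Imports System
Imports System.Net

Module QuoteFeedSketch
    ' Downloads the raw response from whatever endpoint the page's script uses.
    Public Function DownloadQuoteData(ByVal feedUrl As String) As String
        Using client As New WebClient()
            Return client.DownloadString(feedUrl)
        End Using
    End Function

    Sub Main()
        ' Placeholder address: replace it with the one found in the page's JavaScript.
        Dim raw As String = DownloadQuoteData("http://example.com/quotes?symbol=MSFT")
        Console.WriteLine(raw)
    End Sub
End Module
```

The response still has to be parsed, but that is usually much easier than digging values out of a full HTML page.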
Is it possible to find this same data provided in a ready-to-consume format rather than scraping HTML for it? It seems like there are probably public web services for stock quotes.
For example: A quick search for "Stock price webservice" turned up http://www.webservicex.net/stockquote.asmx, an ASMX web service that is easy to consume in .NET.
In your Visual Studio project you should be able to add a reference to this service via the "Add Web Reference" command; the dialog you're given varies depending on whether your project is targeting .NET 2.0 or .NET 3.0/3.5.
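I added a reference to the service, named it StockPriceProxy, and wrapped the call like this:

    ' Returns the service's response for the given ticker symbol.
    Public Function GetQuote(ByVal symbol As String) As String
        ' StockQuote is the proxy class generated by "Add Web Reference".
        Using quoteService As New StockPriceProxy.StockQuote()
            Return quoteService.GetQuote(symbol)
        End Using
    End Function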
'Screen Scraping' by parsing HTML is so early 2000s... what I would do is read up on Amazon's Mechanical Turk. You can develop a queued architecture where you submit URLs to this Mechanical Turk service. The service would automatically distribute these bits of work to users who would then do the dirty task of copying and pasting out the valuable stock quote information you require. Users around the world would anxiously await delivery of the next URL to their Mechanical Turk inbox... pining for the opportunity to copy/paste out another share price for your application. Sure, it might take a few minutes to update your prices, but hey, they would be HAND parsed by REAL people around the globe! Just think of the possibilities!