Not every website exposes its data cleanly through XML feeds, APIs, etc.
How could I go about extracting information from the HTML of such a website? For example:
...
<div>
<div>
<span id="important-data">information here</span>
</div>
</div>
...
I come from a Java background and have worked with Apache XMLBeans. Is there anything similar for parsing HTML, when I know the structure of the page and the data sits inside a known tag?
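To make it concrete, here is a rough sketch of the kind of thing I am after. It uses only the HTML parser that ships with the JDK (`javax.swing.text.html.parser.ParserDelegator`) and pulls the text out of the element with a given `id`. The class name `ImportantDataExtractor` and the method `extractById` are names I made up for illustration, and I am assuming the page is available as a `String`. I don't know whether this is the recommended approach, which is partly why I am asking.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper, named for illustration only.
class ImportantDataExtractor {

    /**
     * Returns the trimmed text of the first element whose id attribute
     * equals targetId, or null if no such element is found.
     */
    static String extractById(String html, String targetId) {
        StringBuilder text = new StringBuilder();
        boolean[] found = {false};

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private HTML.Tag target; // tag of the matched element while we are inside it
            private int depth;       // nesting depth of same-named tags inside the match

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (target != null) {
                    // Already inside the match: only track nested tags of the same kind.
                    if (tag.equals(target)) {
                        depth++;
                    }
                    return;
                }
                Object id = attrs.getAttribute(HTML.Attribute.ID);
                if (!found[0] && id != null && targetId.equals(id.toString())) {
                    found[0] = true;
                    target = tag;
                    depth = 1;
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                // Leave the match once its own closing tag is reached.
                if (target != null && tag.equals(target) && --depth == 0) {
                    target = null;
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                // Collect text only while inside the matched element.
                if (target != null) {
                    text.append(data);
                }
            }
        };

        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found[0] ? text.toString().trim() : null;
    }

    public static void main(String[] args) {
        String html = "<div><div><span id=\"important-data\">information here</span></div></div>";
        System.out.println(extractById(html, "important-data"));
    }
}
```

This works through callbacks rather than a bound object model, so it is quite different from what XMLBeans gives me. Something that lets me query the document directly by id or tag would be closer to what I am used to.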
Thanks