I am trying to use YQL to extract a portion of HTML from a series of web pages. The pages themselves have slightly different structures (so a Yahoo Pipes "Fetch Page" with its "Cut content" feature does not work well), but the fragment I am interested in always has the same class attribute.
If I have an HTML page like this:
<html>
<body>
<div class="foo">
<p>Wolf</p>
<ul>
<li>Dog</li>
<li>Cat</li>
</ul>
</div>
</body>
</html>
and use a YQL expression like this:
SELECT * FROM html
WHERE url="http://example.com/containing-the-fragment-above"
AND xpath="//div[@class='foo']"
what I get back are the DOM elements (apparently unordered?), whereas what I want is the HTML content itself. I've also tried SELECT content, but that only selects the textual content. I want the HTML markup. Is this possible?
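
To make the goal concrete, here is a minimal local sketch of the output I am after. It is plain Python, not YQL. It only works because the sample page above happens to be well-formed XML, which real pages often are not, and the names `PAGE` and `outer_html` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The sample page from above; it parses as XML only because it is well-formed.
PAGE = """<html>
<body>
<div class="foo">
<p>Wolf</p>
<ul>
<li>Dog</li>
<li>Cat</li>
</ul>
</div>
</body>
</html>"""


def outer_html(page, class_name):
    """Return the first <div> whose class attribute equals class_name,
    serialized back to markup with its children in document order."""
    root = ET.fromstring(page)
    # Same idea as the XPath in the YQL query: //div[@class='foo']
    element = root.find(f".//div[@class='{class_name}']")
    if element is None:
        return None
    # method="html" emits markup rather than text only; strip() drops the
    # trailing whitespace that follows the element in the source.
    return ET.tostring(element, encoding="unicode", method="html").strip()


fragment = outer_html(PAGE, "foo")
print(fragment)
```

In other words, I would like the query to hand back the whole `<div class="foo">` fragment as markup, with `Dog` still before `Cat`, rather than a restructured set of elements or the text alone.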