Hi
I have to parse a big HTML file, and I'm only interested in a small section (a table). So I thought about using an XSLT to simplify/transform the HTML into something simpler that I could then easily process.
The problem I'm having is that the stylesheet is not finding my table. So I don't know if it's even possible to parse HTML using an XSL stylesheet.
By the way, the HTML file looks like this (schematic, missing tags):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html id="ctl00_htmlDocumento" xmlns="http://www.w3.org/1999/xhtml" lang="es-ES" xml:lang="es-ES">
<div> some content </div>
<div class="NON_IMPORTANT"></div>
<div class="IMPORTANT_FATHER">
<div class="IMPORTANT">
<table>
HERE IS THE DATA IM LOOKING FOR
</table>
</div>
</div>
As requested, here is my XSL:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="tbody">
tbody found, lets process it
<xsl:for-each select="tr">
new tr found, lets process it
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
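To make the problem reproducible without the full page, here is a minimal sketch, assuming the transform is run through Java's built-in javax.xml.transform API. The class name XsltRepro and the cut-down sample document are hypothetical: the sample keeps the xmlns declaration and the div nesting from the real page, adds a tbody/tr/td that the schematic above does not show, and drops the DOCTYPE so the parser does not try to fetch the DTD. With this input the "tbody found" text never appears in the output; only the cell text comes through.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltRepro {
    // Cut-down stand-in for the real page (hypothetical content, DOCTYPE omitted).
    static final String SAMPLE_XHTML =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body>"
        + "<div class=\"IMPORTANT_FATHER\"><div class=\"IMPORTANT\">"
        + "<table><tbody><tr><td>DATA</td></tr></tbody></table>"
        + "</div></div></body></html>";

    // Same stylesheet as in the post, collapsed onto fewer lines.
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        + "<xsl:template match=\"tbody\">tbody found, lets process it"
        + "<xsl:for-each select=\"tr\">new tr found, lets process it</xsl:for-each>"
        + "</xsl:template></xsl:stylesheet>";

    // Applies the stylesheet to the XML string and returns the serialized result.
    static String transform(String xml, String xslt) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String result = transform(SAMPLE_XHTML, STYLESHEET);
        System.out.println(result);
        // Reports whether the tbody template produced any output.
        System.out.println("template fired: " + result.contains("tbody found"));
    }
}
```

Running main prints the transformed output followed by a line saying whether the tbody template fired, which is the quickest way I can think of to show the behaviour on a small input.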
The full HTML is quite big, so I don't know how to present it here... I've validated the document in Oxygen, and it says it's valid.
Thanks in advance. Gonso