I would like to parse an HTML document, replace the action attribute of every form, and add some hidden fields using XSL. Can someone show some example XSL that does this?
You can start from http://www.w3schools.com/xsl/
But be aware that XSLT generally requires well-formed XML as input, and HTML isn't always well-formed.
What you need first is well-formed HTML (at least Transitional), although XHTML is the better choice. Some XSLT processors accept malformed HTML, but that is the exception rather than the rule.
To try the example below you can download this small Microsoft command line app.
Quick and dirty XSLT example for what you need (example-xslt.xsl):
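<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Default rule: copy each element with its attributes, then recurse -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- Forms whose action is 'foo': override the action and add a hidden field -->
  <xsl:template match="form[@action='foo']">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Replaces the action attribute copied just above -->
      <xsl:attribute name="action">non-foo</xsl:attribute>
      <input type="hidden" name="my-hidden-prop" value="hide-foo-here"/>
      <!-- Process all child nodes so text inside the form is kept too -->
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>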
And the corresponding XML example (example.xml):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="example-xslt.xsl"?>
<html>
  <head></head>
  <body>
    <form action="foo">
    </form>
    <form action="other">
    </form>
  </body>
</html>
Thinking of gurin's answer: one possible XSLT-based pathway for HTML is to use tidy to convert it to XHTML, apply the XSLT to the XHTML, and use xsl:output[@method="html"] to get HTML back out. The @doctype-system and @doctype-public attributes let you provide a doctype declaration in the output file as well.
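As a rough sketch of that last step (untested against any particular processor): the stylesheet below assumes tidy has produced XHTML whose elements sit in the http://www.w3.org/1999/xhtml namespace. It strips that namespace, because the html output method only applies its HTML serialization rules to elements that have no namespace. The HTML 4.01 Transitional identifiers are just one possible choice of doctype; substitute whichever you need.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Serialize as HTML and emit a doctype declaration
       (the HTML 4.01 Transitional identifiers are an assumption) -->
  <xsl:output method="html"
      doctype-public="-//W3C//DTD HTML 4.01 Transitional//EN"
      doctype-system="http://www.w3.org/TR/html4/loose.dtd"/>

  <!-- Rebuild each element under its local name, which drops the
       XHTML namespace so the html output method treats it as HTML -->
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>

  <!-- Copy attributes, text and comments through unchanged -->
  <xsl:template match="@*|text()|comment()">
    <xsl:copy/>
  </xsl:template>

</xsl:stylesheet>
```

Any form-specific templates would then be added alongside these two rules.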
I don't have any sample files for shahbhat, but the general approach is straightforward from an XSLT point of view: start with an identity transform and add in templates for the action attributes to override them in the way you want. To add hidden fields, I suspect the easiest way would be to create a template explicitly for the form element as an identity transform, but with additional elements inside it that are output as well. I think Fernando Miguélez has just posted an example.
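For what it's worth, here is a minimal sketch of that approach, assuming the input elements are in no namespace. The new action value /new-handler and the hidden field named token are made-up placeholders, not anything from the question.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transform: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override the action attribute of every form
       (/new-handler is a hypothetical value) -->
  <xsl:template match="form/@action">
    <xsl:attribute name="action">/new-handler</xsl:attribute>
  </xsl:template>

  <!-- Copy each form, inserting a hidden field before its original content
       (the field name and value are hypothetical) -->
  <xsl:template match="form">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <input type="hidden" name="token" value="example"/>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Note that a form with no action attribute in the input is left without one, since the second template only fires on attributes that already exist. If every form must end up with an action, emit the xsl:attribute directly inside the form template instead.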