I want to load a site/URL, but I do not need the images. How do I exclude images from loading when I use curl?
+1
A:
A call like curl <url> downloads only the HTML document itself. curl does not parse the page, so it never requests the images the page references. If you also want to remove the img tags from the downloaded HTML, you can apply a simple XSLT with xmlstarlet.
This is the XSLT (a variation of an example I found at http://www.usingxml.com/Transforms/XslIdentity), saved as delimg.xslt:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <!-- Remove img tags: an empty template emits nothing for each match -->
  <xsl:template match="img" />
  <!-- Identity transform: copy everything else unchanged -->
  <xsl:template match="/ | @* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
Then call xmlstarlet with the --html option, which tells it to parse the input as HTML rather than XML:
curl <url> | xmlstarlet tr --html delimg.xslt > output.html
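If xmlstarlet is not installed, a much cruder approximation can be sketched with sed. This is a different technique from the XSLT above: it is a regular expression, not an HTML parser. It assumes that every img tag is written in lowercase, sits on a single line, and has no ">" inside an attribute value. The function name strip_img_tags and the sample markup are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: delete <img ...> tags from HTML read on stdin.
# Assumptions: tags are lowercase, do not span several lines, and
# contain no '>' inside attribute values. Anything else is left as-is.
strip_img_tags() {
    sed -E 's/<img[^>]*>//g'
}

# Offline demonstration on a small, made-up sample document.
sample='<html><body><p>Hello <img src="a.png" alt="a"> world</p><img src="b.png"/></body></html>'
printf '%s\n' "$sample" | strip_img_tags
```

With the function defined in the current shell, it can take the place of the xmlstarlet step: curl <url> | strip_img_tags > output.html. For real-world pages the XSLT approach is the more robust of the two, because it works on the parsed document rather than on raw text.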
vanje
2010-10-02 12:16:36