I am using the Clojure/Ring/Compojure-0.4/Enlive stack to build a web application.
Are there functions in this stack that would either strip HTML or HTML-encode (i.e. <a> to &lt;a&gt;) user-supplied strings in order to prevent XSS attacks?
Update: I knew there had to be more than that... ring.util.codec from ring-core has functions called url-encode and url-decode, which work like so:
user> (require '[ring.util.codec :as c])
nil
user> (c/url-encode "<a>")
"%3Ca%3E"
user> (c/url-decode "%3Ca%3E")
"<a>"
These are wrappers around java.net.URLEncoder and java.net.URLDecoder. The same namespace provides functions for dealing with Base64 encoding, based on a class from Apache Commons.
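For comparison, the underlying Java classes can be called directly through interop. This is only a sketch: the "UTF-8" charset argument is an assumption, not necessarily what ring.util.codec passes. Note that this is percent-encoding for URLs, which is a different thing from HTML entity escaping.

```clojure
(import '[java.net URLEncoder URLDecoder])

;; Percent-encode a string for use in a URL (charset chosen here as an assumption).
(URLEncoder/encode "<a>" "UTF-8")
;; => "%3Ca%3E"

;; Decode it back to the original string.
(URLDecoder/decode "%3Ca%3E" "UTF-8")
;; => "<a>"
```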
Original answer follows.
I'm not sure whether there is a public function to do this, but Enlive has two private functions called xml-str and attr-str which do this:
(defn- xml-str
  "Like clojure.core/str but escapes < > and &."
  [x]
  (-> x str (.replace "&" "&amp;") (.replace "<" "&lt;") (.replace ">" "&gt;")))
(attr-str also escapes ".)
You could get at that function with @#'net.cgrand.enlive-html/xml-str (Clojure doesn't tend to make things really private...) or just copy it to your own namespace.
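To illustrate the @#' trick without depending on Enlive, here is a minimal sketch that uses two made-up namespaces: demo.lib stands in for net.cgrand.enlive-html and demo.app for your own code. Both names are hypothetical.

```clojure
;; Hypothetical library namespace standing in for net.cgrand.enlive-html.
(ns demo.lib)

(defn- xml-str
  "Like clojure.core/str but escapes < > and &."
  [x]
  (-> x str (.replace "&" "&amp;") (.replace "<" "&lt;") (.replace ">" "&gt;")))

;; Hypothetical application namespace.
(ns demo.app)

;; #'demo.lib/xml-str refers to the var itself, which works even though
;; the var is private; @ dereferences it to obtain the function.
(def xml-str @#'demo.lib/xml-str)

(xml-str "<a href=\"x\">Tom & Jerry</a>")
;; => "&lt;a href=\"x\"&gt;Tom &amp; Jerry&lt;/a&gt;"
```

The double quotes pass through untouched, which is why a separate function is needed for attribute values.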
hiccup.core/escape-html in hiccup does it. That function used to be in Compojure itself (since all of the functionality in hiccup used to be part of Compojure). It's a simple enough function that you could easily write it yourself, though.
(defn escape-html
  "Change special characters into HTML character entities."
  [text]
  (.. #^String (as-str text)
      (replace "&" "&amp;")
      (replace "<" "&lt;")
      (replace ">" "&gt;")
      (replace "\"" "&quot;")))
There's also clojure.contrib.string/escape, which takes a map of char -> string escape sequences and a string and escapes it for you.
user> (clojure.contrib.string/escape {\< "&lt;" \> "&gt;"} "<div>foo</div>")
"&lt;div&gt;foo&lt;/div&gt;"
This strikes me as not as useful as it could be, because you might want to escape multi-character sequences and this won't let you. But it might work for your HTML-escaping needs.
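On Clojure versions that ship clojure.string, a similar function is available there as clojure.string/escape, with the string as the first argument and the character map second. The sketch below builds a small HTML escaper on top of it. The names html-escapes and escape-html-chars are hypothetical, and the set of escaped characters is an assumption about what you need.

```clojure
(require '[clojure.string :as str])

;; Map from single characters to their replacement strings.
(def html-escapes
  {\& "&amp;"
   \< "&lt;"
   \> "&gt;"
   \" "&quot;"
   \' "&#39;"})

(defn escape-html-chars
  "Coerces s to a string and replaces each character found in html-escapes."
  [s]
  (str/escape (str s) html-escapes))

(escape-html-chars "<a href=\"x\">Tom & Jerry</a>")
;; => "&lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;"
```

Because the replacement works one character at a time, the & inside an already-produced entity is never escaped a second time.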
And then there are many Java libraries for this, of course. You could use StringEscapeUtils from Apache Commons:
(org.apache.commons.lang3.StringEscapeUtils/escapeHtml4 some-string)
This strikes me as a bit heavyweight for this purpose though.
It turns out Enlive does escape HTML by default if you use net.cgrand.enlive-html/content to put text into an HTML element.
(sniptest "<p class=\"c\"></p>" [:.c] (content "<script></script>"))
"<p class=\"c\">&lt;script&gt;&lt;/script&gt;</p>"