I need to dynamically construct an XPath query for an element attribute, where the attribute value is provided by the user. I'm unsure how to go about cleaning or sanitizing this value to prevent the XPath equivalent of a SQL injection attack. For example (in PHP):
<?php
function xPathQuery($attr) {
$xml = simplexml_load_file('example.xml');
return $xml->xpath("//myElement[@content='{$attr}']");
}
xPathQuery('This should work fine');
# //myElement[@content='This should work fine']
xPathQuery('As should "this"');
# //myElement[@content='As should "this"']
xPathQuery('This\'ll cause problems');
# //myElement[@content='This'll cause problems']
xPathQuery('\']/../privateElement[@content=\'private data');
# //myElement[@content='']/../privateElement[@content='private data']
The last one in particular is reminiscent to the SQL injection attacks of yore.
Now, I know for a fact there will be attributes containing single quotes and attributes containing double quotes. Since these are provided as an argument to a function, what would be the ideal way to sanitize the input for these?