Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

php - Cleaning/sanitizing xpath attributes

I need to dynamically construct an XPath query for an element attribute, where the attribute value is provided by the user. I'm unsure how to go about cleaning or sanitizing this value to prevent the XPath equivalent of a SQL injection attack. For example (in PHP):

<?php
function xPathQuery($attr) {
    $xml = simplexml_load_file('example.xml');
    return $xml->xpath("//myElement[@content='{$attr}']");
}

xPathQuery('This should work fine');
# //myElement[@content='This should work fine']

xPathQuery('As should "this"');
# //myElement[@content='As should "this"']

xPathQuery('This\'ll cause problems');
# //myElement[@content='This'll cause problems']

xPathQuery('\']/../privateElement[@content=\'private data');
# //myElement[@content='']/../privateElement[@content='private data']

The last one in particular is reminiscent of the SQL injection attacks of yore.

Now, I know for a fact there will be attributes containing single quotes and attributes containing double quotes. Since these are provided as an argument to a function, what would be the ideal way to sanitize the input for these?

1 Reply

XPath does actually include a way of doing this safely: it permits variable references of the form $varname in expressions. The library on which PHP's SimpleXML is based (libxml2) provides an interface for supplying variables, but this is not exposed by the xpath function in your example.

As a demonstration of how simple this can be:

>>> from lxml import etree
>>> n = etree.fromstring('<n a=\'He said "I&apos;m here"\'/>')
>>> n.xpath("@a=$maybeunsafe", maybeunsafe='He said "I\'m here"')
True

That's using lxml, a Python wrapper for the same underlying library as SimpleXML, with a similar xpath function. Booleans, numbers, and node-sets can also be passed directly.

If switching to a more capable XPath interface is not an option, a workaround for an externally supplied string would be something along these lines (feel free to adapt it to PHP):

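def safe_xpath_string(strvar):
    # Return strvar as an XPath string expression, ready to paste into a query.
    if "'" in strvar:
        # Split on single quotes and rejoin via concat(), emitting each ' as "'".
        return "concat('" + "',\"'\",'".join(strvar.split("'")) + "')"
    # No single quote present: a plain single-quoted literal is enough.
    return "'" + strvar + "'"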

The return value can be inserted directly into your expression string. Here is how it behaves:

>>> print(safe_xpath_string("basic"))
'basic'
>>> print(safe_xpath_string('He said "I\'m here"'))
concat('He said "I',"'",'m here"')

Note that XPath 1.0 string literals have no escape syntax: escaping in the form &apos; only works inside an XML document, and generic XML serialisation routines are not applicable either. However, the XPath concat function can be used to create a string with both types of quotes in any context.
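To make the concat approach concrete, here is a minimal sketch that runs the four inputs from the question through the quoting helper. Note that build_query is a hypothetical stand-in for the question's xPathQuery: it only builds the expression string and does not evaluate it against a document.

```python
def safe_xpath_string(value):
    """Return value as an XPath 1.0 string expression (literal or concat())."""
    if "'" not in value:
        return "'" + value + "'"
    # Each piece is single-quoted; every ' between pieces is emitted as "'".
    return "concat('" + "',\"'\",'".join(value.split("'")) + "')"


def build_query(attr):
    # Hypothetical helper mirroring xPathQuery() from the question;
    # it only builds the expression string and does not evaluate it.
    return "//myElement[@content=" + safe_xpath_string(attr) + "]"


for attr in (
    "This should work fine",
    'As should "this"',
    "This'll cause problems",
    "']/../privateElement[@content='private data",
):
    print(build_query(attr))
```

With the last input, the would-be injection comes out as //myElement[@content=concat('',"'",']/../privateElement[@content=',"'",'private data')]. The brackets and slashes now sit inside quoted concat() arguments, so the predicate compares @content against the whole user-supplied string instead of letting it alter the path.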

PHP variant:

function safe_xpath_string($value)
{
    $quote = "'";
    if (FALSE === strpos($value, $quote))
        return $quote.$value.$quote;
    else
        return sprintf("concat('%s')", implode("', \"'\", '", explode($quote, $value)));
}
