Friday, June 25, 2010

Querying XML with Ampersands

XML does not allow an unescaped & in the text of a node. For example,

John Smith & John Doe as the text of a node is not well-formed XML. Hence the ampersand & is usually escaped as the entity "&amp;" when the string is stored in XML. Likewise, less than (<) and greater than (>) are escaped as &lt; and &gt; respectively.
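The escaping described above can be sketched in a few lines. This is only an illustration, assuming Python's standard library (xml.sax.saxutils.escape and xml.etree.ElementTree); the <name> element wrapped around the text is a hypothetical choice for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "John Smith & John Doe"

# A bare ampersand makes the document ill-formed, so parsing fails.
# (<name> is a hypothetical element chosen for this sketch.)
try:
    ET.fromstring("<name>" + raw + "</name>")
    parsed_raw = True
except ET.ParseError:
    parsed_raw = False

# escape() replaces &, < and > with &amp;, &lt; and &gt;.
escaped = escape(raw)
node = ET.fromstring("<name>" + escaped + "</name>")

print(parsed_raw)  # False
print(escaped)     # John Smith &amp; John Doe
print(node.text)   # John Smith & John Doe
```

Note that the parser hands back the decoded text: the entity exists only in the serialized XML, not in the node's value.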

Now, if we need to search for this node with the XPath contains() function, we do not need to escape the search string, because the XML parser decodes the entity and the node's text value holds a plain &. For example, suppose we want the name of a book whose content node contains "This is a & test", which is stored as "This is a &amp; test" in the following XML:



<root>
  <book>
    <name>Test Book</name>
    <content>This is a &amp; test</content>
  </book>
</root>



Our XPath query should look like:
root/book[content[contains(.,"This is a & test")]]/name
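To see why the unescaped search string matches, here is a minimal sketch assuming Python's standard xml.etree.ElementTree. Its XPath subset does not include contains(), so the substring test is done in plain Python instead of in the query; the helper name book_names_containing is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<root>
  <book>
    <name>Test Book</name>
    <content>This is a &amp; test</content>
  </book>
</root>"""

root = ET.fromstring(XML)


def book_names_containing(root, needle):
    """Return names of books whose <content> text contains needle.

    Hypothetical helper: stands in for the contains() predicate,
    which ElementTree's limited XPath subset does not provide.
    """
    names = []
    for book in root.findall("book"):
        content = book.findtext("content", default="")
        if needle in content:
            names.append(book.findtext("name"))
    return names


# The parser has already decoded &amp; to &, so the plain string matches.
print(book_names_containing(root, "This is a & test"))      # ['Test Book']
# The escaped form does not occur in the decoded text.
print(book_names_containing(root, "This is a &amp; test"))  # []
```

An engine with full XPath 1.0 support (lxml, for example) should be able to evaluate the query above directly. One caveat: if the XPath expression is itself written inside an XML document, such as an XSLT stylesheet, the & in the expression has to be escaped there, because that document is parsed as XML too.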
