Solved

xpath test on "

Posted on 2004-03-30
Last Modified: 2008-02-26
hi,

I used the XPath expression:
//Object1/@name[contains(.,'&quot;')]
to test whether the name attribute of Object1 contains "
It does not return the first Object1.

However, if I use the expression:
//Object1/@name[contains(.,'"')]
it returns the correct object.

Why?

<test>
<Object1 name="&quot;AA&quot; BB">11</Object1>
<Object2 name="&quot;AA&quot;">22</Object2>
<Object1 name = "CC">33</Object1>
</test>
Question by:danclemson

Expert Comment

by:dfiala13
ID: 10721048
It's your parser: it isn't unescaping the value in the literal. The .NET parser (which I tested with) performed as you expected:

//Object1/@name[contains(.,'&quot;')] worked and found the match
while

//Object1/@name[contains(.,'"')]
threw a parser error

Accepted Solution

by: rdcpro (earned 125 total points)
ID: 10721347
No, it's not a matter of unescaping.  I'll bet he's using DOM XPath methods.

If he's using the XPath in a SelectSingleNode() or SelectNodes() call, then the behavior he sees is as expected.

If he's using XSLT, then the XPath expression

//Object1/@name[contains(.,'"')]

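is not well-formed when it sits inside a double-quoted select attribute, because the bare double quote ends the attribute value early. Hence your parser throwing the exception. All conforming parsers will throw this exception, including the one danclemson is using. But he's probably using the XPath in SelectSingleNode, where the XPath does not get parsed as XML!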
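
A minimal sketch of that well-formedness point, using Python's standard xml.etree.ElementTree as a stand-in for whatever XML parser loads the stylesheet (an assumption; it is not the parser anyone in this thread used). The names TEMPLATE and is_well_formed are made up for the illustration. The same select attribute is written once with a bare double quote and once with &quot;:

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element "stylesheet"; {} is where the quote character goes.
TEMPLATE = (
    '<xsl:value-of xmlns:xsl="http://www.w3.org/1999/XSL/Transform" '
    "select=\"@*[contains(., '{}')]\"/>"
)

raw = TEMPLATE.format('"')           # bare double quote inside the attribute
escaped = TEMPLATE.format("&quot;")  # same XPath, quote written as an entity


def is_well_formed(xml_text):
    """Return True if the XML parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


raw_ok = is_well_formed(raw)
escaped_ok = is_well_formed(escaped)
print("bare quote well-formed:", raw_ok)
print("&quot; well-formed:", escaped_ok)

# What the XPath engine would be handed after XML parsing of the escaped form.
select_value = ET.fromstring(escaped).get("select")
print(select_value)
```

The last line prints the select value as the XML parser delivers it, which is the string an XSLT processor would then compile as XPath.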

Here's why...

The XML parser reads this:

<Object1 name="&quot;AA&quot; BB">11</Object1>

and then it expands the entity in the name attribute, so that the value, in memory, is:

"AA" BB

Once the attribute has been parsed, the entity isn't there any more. Instead, the *actual* character represented by the entity is there. This isn't a problem, because the attribute has already been parsed, and all knowledge of the delimiters originally used in the attribute is lost.
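
As a rough illustration of that expansion, here is the sample document from the question run through Python's standard xml.etree.ElementTree (chosen only for convenience; any conforming XML parser should behave the same way on this point). DOC and names are illustrative names:

```python
import xml.etree.ElementTree as ET

# The sample document from the question, unchanged.
DOC = """<test>
<Object1 name="&quot;AA&quot; BB">11</Object1>
<Object2 name="&quot;AA&quot;">22</Object2>
<Object1 name = "CC">33</Object1>
</test>"""

root = ET.fromstring(DOC)

# Attribute values as they exist in memory after parsing.
names = [child.get("name") for child in root]

for value in names:
    # Show the value, whether it holds a real quote, and whether any
    # trace of the entity reference text survives.
    print(repr(value), '"' in value, "&quot;" in value)
```

The in-memory values carry real double-quote characters, and the text &quot; no longer appears anywhere in them.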

Now you need an XPath expression to select this attribute.  If you use XSLT, then the stylesheet holding the XPath expression must be well-formed XML, so your XPath expression will contain a &quot; instead of a double-quote character, because a bare double quote inside a double-quoted attribute value is not well-formed.  So in my test, I used:

<xsl:value-of select="@*[contains(., '&quot;')]"/>

but when you're using selectSingleNode(), as long as you don't violate your language's rules for nesting of quotes and apostrophes, you DON'T use the entity reference.  The XPath in the selectSingleNode is never parsed by the XML parser!

selectSingleNode("@*[contains(., '\"')]")
or possibly, in a language that escapes a quote by doubling it,
selectSingleNode("@*[contains(., '""')]")
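
A small sketch of the same idea outside MSXML. It assumes Python's xml.etree.ElementTree, whose findall() accepts only a limited XPath subset (no contains()), so an exact-match attribute predicate stands in for the contains() test; it is not selectSingleNode itself. The path is an ordinary program string, so the only escaping involved is the host language's own:

```python
import xml.etree.ElementTree as ET

DOC = """<test>
<Object1 name="&quot;AA&quot; BB">11</Object1>
<Object2 name="&quot;AA&quot;">22</Object2>
<Object1 name = "CC">33</Object1>
</test>"""

root = ET.fromstring(DOC)

# Path given directly to the API: \" is Python string escaping only,
# so the XPath literal contains real double-quote characters.
literal_hits = root.findall(".//Object2[@name='\"AA\"']")

# Here &quot; is never XML-parsed; it stays as six ordinary characters
# and is compared as-is against the attribute value.
entity_hits = root.findall(".//Object2[@name='&quot;AA&quot;']")

print([e.text for e in literal_hits])
print([e.text for e in entity_hits])
```

Only the path with the literal quote characters selects Object2; the &quot; form matches nothing because no XML parser ever turns it back into a quote.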

Regards,
Mike Sharp

Author Comment

by:danclemson
ID: 10723720
Hi,

Thanks for the reply.
I think both of you are right.
I was using the XMLSPY XPath evaluation feature.
I don't know what the underlying implementation of this XPath evaluation function is, but it seems it's not unescaping the entity reference.

Expert Comment

by:rdcpro
ID: 10728422
In my version of XML Spy (4.4), it finds the node with

//Object1/@name[contains(.,'&quot;')]

and not

//Object1/@name[contains(.,'"')]

But XML Spy has some peculiarities.  I believe they're treating it as an XSLT XPath, because that's what you'd be doing with Spy.  I'm surprised you don't get it with the first expression and you do with the second. What version are you using?

I created an XPath evaluator some years ago that worked through the web.  That's when I discovered that the selectSingleNode method does not parse the XPath as "XML", and entities are not expanded.  

Regards,
Mike Sharp