• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 236

math errors in msxml

I'm going crazy with math errors (approximations?) in msxml (Windows 7)...

Take this XML:



Damn... why does $o.selectnodes("/xml/number/text()[.</xml/now/text()-/xml/difference/text()]").length return 4 (instead of 3)?
1 Solution
Geert Bormans (Information Architect) commented:
This is not a Microsoft parser issue; it is an XPath 1.0 issue.
XPath 1.0 mandates that numbers are calculated using the double type (IEEE 754 double precision),
so the loss in precision is predictable, and all XPath processors I tested this with return 4.

You can fix this by using XPath 2.0 (although there is no Microsoft implementation) and xs:decimal.
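
To make the double-versus-decimal point concrete, here is a minimal Python sketch (Python floats are the same IEEE 754 doubles that XPath 1.0 uses). The three values are hypothetical, picked only because they expose the rounding; they are not the numbers from the original XML, which is not shown in this thread. Python's Decimal stands in for xs:decimal here as an analogy, not as the XPath 2.0 implementation.

```python
from decimal import Decimal

# Hypothetical values (assumption): chosen only because they expose the rounding.
number, now, difference = "0.7", "0.8", "0.1"

# Binary doubles, as in XPath 1.0: 0.8 - 0.1 rounds to slightly more than 0.7,
# so the "less than" test unexpectedly succeeds.
as_double = float(number) < float(now) - float(difference)

# Exact decimal arithmetic, comparable in spirit to xs:decimal in XPath 2.0.
as_decimal = Decimal(number) < Decimal(now) - Decimal(difference)

print(as_double, as_decimal)
```

With doubles the comparison matches one node too many, which is the same kind of off-by-one count described in the question.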

I also want to note that you are counting text() nodes, not number elements,
you should avoid using text() in most of the cases
I recommend this XPath instead of yours (though it does not fix your issue)
/xml/number[number(.) < (number(/xml/now) - number(/xml/difference))]
lucavilla (Author) commented:
Thanks Geert

About number(): I thought that, since text() works, text() would be faster because number() has to do an (unneeded) conversion. Are you sure that using number() would be better?

Actually I need to extensively track my XML elements with timestamps (like "130102.123452") and to make extractions specifying time differences like for example all the timestamps that are "< now - 10 seconds".
How would you solve this need?
Geert Bormans (Information Architect) commented:
As soon as you do numeric calculations, there is an implicit cast to numbers. By making it explicit, you help the processor. It no longer needs the logic to infer that you want to use it as a number. You tell the processor you want to use it as a number. In theory using number() is faster, though your processor might optimize and then it does not matter. So I consider that self-preservation: make it obvious to yourself that you need it to be a number.

text() is a different discussion. number() is a (casting) function. text() is a node test. That is a different thing. It belongs in the box with node() and comment() etc...
/xml/number/text()[$some-condition] selects a text node, not the element number
but no one guarantees that there is only one text() node in there. So as a best practice, avoid text(). If you need casting to a string, use string(), and use text() only when you need to explicitly address a specific text() node, e.g. following-sibling::text()[1] if you need to test whether the next node is a text node. Excessive use of text() is on the top 5 list of common XPath/XSLT mistakes.

There is no relation in my previous comment between dropping text() and adding number(). I hope I clarified that now

Working around the precision issues
- use format-number() to force an exact number of decimals to your numbers (now it is a string)
- use substring-before(., '.') and substring-after() to get the integer and decimal part as comparable integers.
- do the calculation on the integers at full precision
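
The split-and-recombine workaround can be sketched in Python as follows. The helper name to_scaled_int and the fixed width of 9 decimals are assumptions made for illustration, and the sketch expects non-negative values with at least one digit before the decimal point; in XPath 1.0 the same idea would be built from substring-before(), substring-after() and padding.

```python
def to_scaled_int(value, decimals=9):
    """Turn a decimal string such as '130102.123452' into an integer scaled by 10**decimals."""
    whole, _, fraction = value.strip().partition(".")
    # Pad (or cut) the fractional part so every value has the same number of decimals.
    fraction = fraction.ljust(decimals, "0")[:decimals]
    return int(whole) * 10 ** decimals + int(fraction)

# Hypothetical values (assumption), reused from a case where doubles go wrong.
number = to_scaled_int("0.7")
now = to_scaled_int("0.8")
difference = to_scaled_int("0.1")

# Integer arithmetic is exact, so 0.7 is correctly NOT less than 0.8 - 0.1.
print(number < now - difference)
```

Because every value is scaled by the same power of ten, differences and comparisons on the integers give the same answers as exact decimal arithmetic would.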

lucavilla (Author) commented:
Thanks Gertone, you clarified some things for me.

A question: supposing we want to take the number inside <now></now>, excluding any text inside any children (if any), what would be the best msxml command?

I tried the following without success so I'm still a little confused:

Maybe I must not use number() here because msxml can only extract text (with ".text")?
And is the filter "[.]" meant to avoid including any children?
lucavilla (Author) commented:
Another question: I'm seeing that I get correct results if I remove the decimal point from the timestamps, even if I add the leading "20" for the year (so that I have timestamps like "20131228154710").

What are the limits of the double type? Do I still have to worry about imprecision if I use timestamps of this length without decimals? (I would only perform date differences or comparisons.)
Geert Bormans (Information Architect) commented:
[ ] indicates a predicate
. is short for self::node()
[.] doesn't do anything in general
I told you to avoid text(), it does not do anything in this case and can only break your code
number() casts a node to a number. You can use it in tests in such an XPath, but I don't think it is a good idea to wrap the whole expression inside a selectSingleNode in number(). Why cast to a number if what you need is a node that you want to get the .text from?
should be

As a short option you can indeed just drop the '.' (use translate() for that)
make sure that you use format-number() first to ensure the number of decimals is aligned if you do that

I think you will be safe. double uses 64 bits for a number, 53 of them for the significand, so integers up to 2^53 (9007199254740992, roughly 15 to 16 digits) are represented exactly. Your 14-digit timestamps fit within that.
(I quickly checked wikipedia and some java and .net references and they all seem to agree that the maximum for double type is 1.79769313486231570E+308)
Of course there is an XPath processor implementation layered on top of your type system, so you'd better test some first
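
A quick way to check these limits is the Python sketch below, which relies on Python floats being IEEE 754 doubles. It only exercises the number type; it says nothing about how a particular XPath processor parses or formats numbers, so testing in msxml itself is still advisable. The name LIMIT is just an illustrative constant.

```python
import sys

LIMIT = 2 ** 53  # doubles carry a 53-bit significand

timestamp = 20131228154710  # the 14-digit timestamp quoted in the thread
# Round-tripping through a double must give back exactly the same integer.
fits_exactly = int(float(timestamp)) == timestamp and timestamp < LIMIT

# Above 2**53, neighbouring integers start to collapse onto the same double.
collapses = float(LIMIT + 1) == float(LIMIT)

print(fits_exactly, collapses, sys.float_info.max)
```

The largest finite double is far bigger than any timestamp; the limit that matters for exact comparisons is 2^53, not the maximum value.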
lucavilla (Author) commented:
Problem: "supposing we want to take the number inside <now></now>, excluding any text inside any children (if any), what would be the best msxml command?"

I tested with this:


Your solution $oDoc.SelectSingleNode("/xml/now").text  returns "130102.123456123foobar" that is not what I want.

while $oDoc.SelectSingleNode("/xml/now/text()").text  returns "130102.123456123" that is what I want.

I saw that even when I want to write/overwrite that number, "text()" is necessary to avoid losing the "<abc>foobar</abc>" forever!
Therefore even the right command to write is $oDoc.SelectSingleNode("/xml/now/text()").text  =  123456.112233444

About the "[.]" you were right: it seems to not have any effect...
Geert Bormans (Information Architect) commented:
You are throwing new XML into the equation
(note that I optimize the XPath based on the XML I see, not on all possible XMLs that could occur, unless you give me that information)

By using selectSingleNode and text() you only get the first node(). By choosing the higher up element (now) you get stringified content of now... safer in the earlier XML

I wanted to point out a risk, of course you can find examples that break my suggestion
(by changing requirements after the development, you start to sound like a real customer by the way ;-)

I was raising the warning bell on the text() node.
Your Xpath only selects the first text() node in your now element
If you are sure that the first actual text node in the now element is ALWAYS the one you are after, then OK, you have found a good use case for text(). Note that whatever is between </abc> and </now> is another text node, not found by your XPath
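
To see where the separate text nodes sit in mixed content, here is a small Python sketch using the standard library's ElementTree. This is an analogy, not msxml: ElementTree exposes the text before the first child as .text and the text after a child's closing tag as that child's .tail. The document is hypothetical, modelled on the values quoted in this thread with an extra trailing text node added for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document (assumption): the thread's values plus a trailing text node.
now = ET.fromstring("<now>130102.123456123<abc>foobar</abc> trailing</now>")

first_text = now.text       # text node before <abc>, what /xml/now/text() picks first
child_text = now[0].text    # text inside <abc>
after_child = now[0].tail   # text node between </abc> and </now>

# All descendant text joined, comparable to the string value of the element.
string_value = "".join(now.itertext())

print(repr(first_text), repr(child_text), repr(after_child), repr(string_value))
```

The first text node and the string value are different things, and a second text node can follow the child element without being noticed by a query that only takes the first one.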

Whilst you think your example breaks my point, it actually reinforces it

Now you have proven that other nodes might exist inside the <now> element, try some of those
<now> <abc>foobar</abc>130102.123456123</now>




You could claim that the only realistic one is the first. And you might find out it works as you please... OK, please note that that one works because of a flaw in the msxml parser, it would break with any other XML parser... and it would definitely break if your XML had a DTD or a schema (are you sure it will never have one?)

So, if you are certain that text() works for you, and you are comfortable it will not break in any of your use cases, please use it. If it breaks at one point, I hope I gave you some things to think about why that could happen

If I were to do this task, and I were to go for the safe route, I would select all child nodes of the now element, combine all the text nodes into one and use that. For putting the nodes back, I would reconstruct from the child element nodes found
[caveat: my colleagues often tell me I am taking too much control, but they admit I am the one that needs the least time fixing production glitches :-) ]
lucavilla (Author) commented:
Hehe Gertone, you always have an eagle eye, but I didn't change requirements; I just copied and pasted the question I posted on 2013-12-28 at 15:40:41, where I asked for a solution excluding any subchildren of the element "now" (if any).

Anyway, I understand your point that text(), while having the advantage (for me, in general use) of avoiding the text nodes of subchildren, is limited to the first text node directly under the element.

So... is there a final solution in msxml + XPath 1 that avoids the subchildren's text nodes while including all the text nodes directly under the element?
Geert Bormans (Information Architect) commented:
try selectNodes on the text() nodes instead of selectSingleNode,
at least you get all of them
and use your .net code to join all the text nodes from the node array into one
lucavilla (Author) commented:
It seems that selectNodes doesn't support the ".text" property.
It gives an error...   while with selectSingleNode it works...
Geert Bormans (Information Architect) commented:
selectNodes returns a list of nodes, not a single node, so you need to iterate over the list, get the .text of each node, and concatenate the results into one string.
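
The iterate-and-concatenate idea can be sketched in Python with the standard library's ElementTree, as an analogy for looping over the msxml node list. The helper name own_text and the sample document are hypothetical; the sample deliberately splits the number around a child element to show that only the element's own text nodes are joined.

```python
import xml.etree.ElementTree as ET

def own_text(element):
    """Join the text nodes that are direct children of element, skipping descendants' text."""
    parts = [element.text or ""]          # text before the first child element
    for child in element:
        parts.append(child.tail or "")    # text following each child's closing tag
    return "".join(parts)

# Hypothetical document (assumption): the number is split around a child element.
now = ET.fromstring("<now>130102<abc>foobar</abc>.123456123</now>")
print(own_text(now))
```

The text inside the child element is left out, while every text node directly under the element contributes to the result, which is the combination asked for above.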
lucavillaAuthor Commented:
