Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XPathNodeIterator.Count returns one less that the actual count w/ XPath2SelectNodes #59

Open
Todo18 opened this issue Oct 7, 2022 · 15 comments

Comments

@Todo18
Copy link

Todo18 commented Oct 7, 2022

We've swapped MS's built-in XPath 1.0 processor with the XPath2.Net library and are conducting a couple of (unit) tests, and have noticed something weird about the Count property (of the XPathNodeIterator). Logically, and according to MSDN (and the reference implementation by MS), Count returns the index/position of the last node in the set. Positions start counting at 1 according to the XPath specs, so this interpretation of Count makes sense when you have for example 3 nodes with indices 1 to 3.

However, when swapping libs, Count suddenly returns 2 for a set with 3 nodes, 0 for a set with 1 node, and -1 for the empty set. Could it be that indices start at 0 in the implementation, or are we missing something obvious?

Regardless, thanks for this amazing library!

@StefH
Copy link
Owner

StefH commented Oct 16, 2022

@Todo18
Can you provide a unit-test for this?

@Todo18
Copy link
Author

Todo18 commented Oct 18, 2022

Sure. Here's a sample test:

using System.Diagnostics;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using System.Xml.XPath;
using Wmhelp.XPath2;

namespace Test.Functional
{
    [TestClass]
    public class Compliancy
    {
        [TestMethod]
        public void XPath2SelectNodes_Count_Returns_Number_Of_Items_In_Expression()
        {
            var xIn = Resources.Provider.LoadXmlDocument("count_xml.xml");
            var navigator = xIn.CreateNavigator();

            var brandSet = navigator.XPath2SelectNodes("/report/brand");
            var brandCount = (int)navigator.XPath2Evaluate("count(/report/brand)");

            var highVolumeBrandSet = navigator.XPath2SelectNodes("/report/brand[units > 20000]");
            var highVolumeBrandCount = (int)navigator.XPath2Evaluate("count(/report/brand[units > 20000])");

            Debug.Assert(brandCount == 5);  // These are just here to assert the correct (expected) value
            Assert.AreEqual(brandCount, brandSet.Count);

            Debug.Assert(highVolumeBrandCount == 2);  // These are just here to assert the correct (expected) value
            Assert.AreEqual(highVolumeBrandCount, highVolumeBrandSet.Count);
        }
    }
}

We tested this with the following input file (count_xml.xml):

<?xml version="1.0" encoding="utf-8"?>
<!-- chocolate.xml -->
<report month="8" year="2006">
    <title>Chocolate bar sales</title>
    <brand>
        <name>Lindt</name>
        <units>27408</units>
    </brand>
    <brand>
        <name>Callebaut</name>
        <units>8203</units>
    </brand>
    <brand>
        <name>Valrhona</name>
        <units>22101</units>
    </brand>
    <brand>
        <name>Perugina</name>
        <units>14336</units>
    </brand>
    <brand>
        <name>Ghirardelli</name>
        <units>19268</units>
    </brand>
</report>

When we run the test, if returns 1 less (4 resp. 1) than the expected number of items in the node set of each expression (5 resp. 2), and one less than the count-function (which returns the correct results). I hope I'm being clear. If not, let me know. :)

@StefH
Copy link
Owner

StefH commented Nov 28, 2022

@StefH
Copy link
Owner

StefH commented Dec 5, 2022

@Todo18
Can you please take a look at the unit tests and check the difference with your code?

@Todo18
Copy link
Author

Todo18 commented Dec 8, 2022

Apologies, I'm short on time these days. I will inspect our integration and let you know my findings.

@StefH
Copy link
Owner

StefH commented Dec 21, 2022

@Todo18 Did you have time to check?

@Todo18
Copy link
Author

Todo18 commented Dec 22, 2022

I'm checking it today. :)

@Todo18
Copy link
Author

Todo18 commented Dec 22, 2022

I've taken the exact same code that you provided above, together with the v1.1.3 Nuget from the built-in Nuget manager in VS, and it fails the test. I have no idea what's going on. I'll look into it a bit more later today.

@Todo18
Copy link
Author

Todo18 commented Jan 18, 2023

So I've had some more time to check. Before diving further into (proprietary) code, could you please try and rephrase your unit tests to conform to ours, and check the (new) results, because there seems to be a discrepancy between brandSet.Count (our code) and brandSet.Should().HaveCount(5) (your code, which translates to: brandSet.Cast<object>().Count(), which iterates the collection, and thus bypasses the Count property of the XPathNoteIterator altogether).

@StefH
Copy link
Owner

StefH commented Jan 18, 2023

@Todo18
Copy link
Author

Todo18 commented Jan 19, 2023

To my knowledge, they are not the same: Count (in our code) is a property of XPathNoteIterator (see here), while HaveCount calls the Count method of Enumerable (see here). The former returns the index of the last node of the set, while the latter iterates over the set, and never calls the property of XPathNoteIterator (which is the point of the unit test).

To illustrate my point, I've added a second assertion to our unit tests, which succeeds, while the original assertion still fails.

Assert.AreEqual(brandCount, brandSet.Cast<object>().Count()); // Succeeds
Assert.AreEqual(brandCount, brandSet.Count); // Fails

@StefH
Copy link
Owner

StefH commented Jan 20, 2023

According to the documentation: https://learn.microsoft.com/en-us/dotnet/api/system.xml.xpath.xpathnodeiterator.count?view=net-7.0

Count => Gets the index of the last node in the selected set of nodes.
So it's not the number of nodes, it's the index, which is 4 in this case because we have 5 items.

#60

@StefH StefH closed this as completed Jan 20, 2023
@Todo18
Copy link
Author

Todo18 commented Jan 20, 2023

True, but positions start counting at 1 according to the XPath specs, so this interpretation of Count makes sense when you have for example 3 nodes with indices 1 to 3. (See here, for example: https://learn.microsoft.com/en-us/dotnet/api/system.xml.xpath.xpathnodeiterator.currentposition?view=net-7.0, or: https://www.w3.org/TR/2010/REC-xpath20-20101214/#eval_context)

So the indexing needs some tinkering, I guess.

Fiddle to illustrate: https://dotnetfiddle.net/QuErKT

@StefH StefH reopened this Jan 20, 2023
@StefH
Copy link
Owner

StefH commented Jan 22, 2023

But CurrentPosition is not the same as Count.

@Todo18
Copy link
Author

Todo18 commented Jan 22, 2023

I printed the first item of the set, not the last. It was simply to illustrate the numbering of indices behind the scenes. But you're right, I've adjusted the fiddle to illustrate my point better. ;)

https://dotnetfiddle.net/YdcE3V

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants