Comments (29)
> but Matt and Ralf agreed on the original decision in gh-19186, so there is some discussion there.

It was a little puzzling to me what that was about, but I think it's #19186 (comment).
> But why is this not fixed at the array API side?
The short summary of the problem is:
- All array libraries except numpy have only 0-D arrays, no "array scalars".
- The "array scalars" thing in numpy is kind of a historical design issue/error (Travis agreed with that too); they were deemed necessary in 2004, but that has proven not to be the case. They're here to stay in numpy 2.0 though - we briefly considered attempting to remove them, but it'd be a very large effort that no one wanted to take on.
- The array API standard hence only has "arrays" (including 0-D arrays) and "Python scalars" (instances of the builtins `bool`, `int`, `float`, `complex`). There are no "array scalars".
- In SciPy, the behavior across functions and submodules is inconsistent when a size-1 result is returned. In `stats` it's (almost?) always a numpy scalar, in `ndimage` a 0-D array.
For the introduction of array types / array API standard support, we have so far aimed for: no user-visible changes for numpy arrays, and always 0-D arrays (since that's the only option) for all other array types.
Now for this issue, there are three situations for comparing `a_0D` and `a_scalar`, I think:

1. `xp_assert_equal(a_0D, a_0D)`
2. `xp_assert_equal(a_scalar, a_scalar)`
3. `xp_assert_equal(a_0D, a_scalar)` (or the reverse; behavior should be symmetric)

(1) and (2) should clearly be `True`, hence this is a valid bug report. The only one where reasonable people may disagree, I think, is (3). Since numpy scalars are heavily used in `stats`, making (3) return `True` seems fine to me. Requiring a keyword for that also seems okay - I don't have a strong opinion there.
from scipy.
I will submit a PR that adds an argument specifying whether 0D arrays are OK results or not.
It will also improve the error message when the result is a 0D array and that is not considered OK.
from scipy.
Here's the comment Matt wrote in the code:
# We want to follow the conventions of the `xp` library. Libraries like
# NumPy, for which `np.asarray(0)[()]` returns a scalar, tend to return
# a scalar even when a 0D array might be more appropriate:
# import numpy as np
# np.mean([1, 2, 3]) # scalar, not 0d array
# np.asarray(0)*2 # scalar, not 0d array
# np.sin(np.asarray(0)) # scalar, not 0d array
# Libraries like CuPy, for which `cp.asarray(0)[()]` returns a 0D array,
# tend to return a 0D array in scenarios like those above.
# Therefore, regardless of whether the developer provides a scalar or 0D
# array for `desired`, we would typically want the type of `actual` to be
# the type of `desired[()]`. If the developer wants to override this
# behavior, they can set `check_shape=False`.
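As a quick sanity check of the convention the comment describes, here is a small NumPy-only snippet (CuPy, mentioned above, is not exercised here):

```python
import numpy as np

# NumPy collapses size-1 results to array scalars rather than 0D arrays:
r1 = np.mean([1, 2, 3])   # returns a scalar, not a 0d array
r2 = np.asarray(0) * 2    # scalar again
r3 = np.asarray(0)[()]    # indexing a 0d array with () yields a scalar

# np.generic is the base class of all NumPy array scalars
print(isinstance(r1, np.generic),
      isinstance(r2, np.generic),
      isinstance(r3, np.generic))  # True True True
```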
from scipy.
Yeah, go ahead and add an option to override that behavior. I know that conventions differ in other subpackages, but it was intentional.
from scipy.
Scalars are tricky, yes. However, having `whatever_assert_equal(a, b)` throw a cryptic error when `a == b` is extremely confusing.
from scipy.
For developer convenience, we can add a more explicit message if the result is a 0d array instead of a scalar, if you'd like. And the message could recommend passing a keyword argument that changes the check if 0d array is indeed the desired result.
Note that you can see in the traceback that it's intentionally making `desired = desired[()]`.
from scipy.
Yes. And actually check that `expected == actual` (under a try-except if needed).
from scipy.
I'm not sure I follow. The default is to check the type, dtype, shape, etc., before checking the numerical values. This is reporting that the type is incorrect, so it fails before it gets to the value check. Perhaps we could have tested all these aspects separately and then reported everything that is wrong (e.g. value is correct but type is wrong), but we didn't go to the trouble since it is a private function. I think that's how the public NumPy functions with the `strict` checks work, too. Once you fix the type, you would see whether something else is wrong. If you don't want this behavior, use `check_shape=False`.
from scipy.
Things are either equal or they aren't.
from scipy.
> Things are either equal or they aren't.

A NumPy array has various parts to it though, and some of the parts can be equal while others aren't. The new assertions are more strict in that they check for equality of dtypes and shapes in places where the old assertions didn't.
Perhaps you would prefer a longer name like `assert_array_and_shape_equal`? Or to split the checks into two separate assertions? I think both of those options seem worse than one assertion called `assert_equal` with a kwarg to toggle behaviour, though.
from scipy.
No idea what "parts" you mean here. No renaming is going to change the fact that the OP example has exactly the same object.
from scipy.
Okay, I thought you were replying to Matt's comment about the dtype and shape checks. Sure, it probably makes sense for the same objects to be asserted equal with the default kwargs.
If we want the 0D array / scalar check that Matt implemented to be widespread, though, it sounds like we would either need to pass a kwarg in most uses of the assertion, or append `[()]` to most of the `desired` values.
from scipy.
I think OP's point is that he expects `xp_assert_equal(a, a)` not to raise an exception, because `a is a`. This sounds reasonable as long as the data type provides such equality semantics (e.g. `nan == nan` is false, and I believe nans are handled specially by the underlying library).
from scipy.
This is the black hole JavaScript fell into with `==` and `===`. I don't think we need to repeat their mistake. `assert_equal_###` is about the values being equal, not the metadata about things. `42 == 42.0` must return `True` regardless of the library.
If we need to enforce dtype, shape and the values, we need to find another name for it in my opinion, `assert_same_array_props` or whatever.
from scipy.
> xp_assert_equal(a, a) not to raise an exception because a is a

Yes, but that's not the whole story. It's just a limiting case where the result is clearly absurd. One cannot rely on `a is b` in general, because at least `a is a.copy()` is False.
I'm calling the current behavior a bug because `xp_assert_*` treats 0D arrays differently from 1D: `xp_assert_equal(np.array(42), np.array(21)*2)` should be as True as `xp_assert_equal(np.array([42]), np.array([21])*2)`.
`==` is just an implementation detail: my knee-jerk implementation would rely on `==` returning a scalar, not numpy 0D arrays. Maybe it's too brittle, and the assertion needs to explicitly branch on `len(shape) == 0`. At any rate, the fact that `xp.isscalar` is buggy and inconsistent across the xp libraries simply means `xp_assert_*` should handle it internally.
EDIT: And actually -- again IMO -- enforcing the dtype is fine. `array(42)` not being equal to `array(42.0)` is fine by default, and needing to explicitly set a flag to say "don't check the dtype" is fine IMO. These are the "parts" of an array, like Lucas said above. What is not fine is requiring a special flag for otherwise identical 0D arrays.
from scipy.
The flag should be "check also the type"; in other words, it should be opt-in, because value comparisons are always true in Python regardless of the dtype.
from scipy.
By the way, `xp_assert_equal` behaves differently from `numpy.testing.assert_equal` in that the latter considers `+0.0` not equal to `-0.0`, while the former considers them equal. I have not tested whether they treat `nan`s with different bit patterns as being equal.
While this is just a matter of specification, I do find it sometimes convenient to be able to distinguish `+0.0` from `-0.0` in a test case. But maybe `assert_identical` is a more appropriate name than `assert_equal` for testing signed zero (and bit patterns).
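The signed-zero difference is easy to demonstrate: `==` cannot tell the two zeros apart, but `numpy.testing.assert_equal` does distinguish them:

```python
import numpy as np

print(0.0 == -0.0)       # True: plain equality ignores the sign bit
print(np.signbit(-0.0))  # True: the sign bit is still observable

# numpy.testing.assert_equal treats +0.0 and -0.0 as different:
try:
    np.testing.assert_equal(0.0, -0.0)
except AssertionError:
    print("assert_equal raised for +0.0 vs -0.0")
```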
from scipy.
Out of interest, what was the original example that prompted this thread, Evgeni? I think I agree that the error on `(a, a)` is an abuse of the semantics of "equal". I'm not yet convinced that this would cause a problem in real tests (where the solution isn't that we actually want to return the opposite of scalar / 0D array).
from scipy.
$ python dev.py test -s ndimage |grep "Types do not" |wc -l
123
in the branch https://github.com/ev-br/scipy/tree/ndimage_array_api_tests.
from scipy.
So yeah, Matt has been sprinkling lines like `pvalue = pvalue[()] if pvalue.ndim == 0 else pvalue` everywhere while we convert, so that we better align with NumPy on returning scalars. Would we not want to do that in `ndimage` too?
I think there is an argument that some level of abuse of semantics is justified if it helps us make this behaviour more consistent, without having to set extra kwargs in the assertions. Maybe we can get the best of both worlds by changing the names of the assertions too?
from scipy.
> pvalue = pvalue[()] if pvalue.ndim == 0 else pvalue

There might be very good reasons to do that in stats tests, but here it's irrelevant. Note that the OP does not have scalars, it only has 0D arrays. [1] The very fact that the OP has scalars in the error message is some implementation detail of `xp_assert_*` leaking through.
[1] `np.array(42)` is a 0D array; `np.int64(42)` is a numpy scalar (aka array scalar). Indexing the former with `[()]` returns the latter. Not all `xp` libraries have array scalars, but all have 0D arrays.
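The footnote can be checked directly in NumPy:

```python
import numpy as np

a0 = np.array(42)  # a 0D array
s = a0[()]         # indexing with () extracts the array scalar

print(type(a0).__name__, a0.ndim)  # ndarray 0
print(type(s).__name__)            # e.g. int64 (platform default int)
print(isinstance(s, np.generic),   # True: s is an array scalar
      isinstance(s, np.ndarray))   # False: s is not an array
```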
> Would we not want to do that in ndimage too?

Don't know. If you see a reason to change some ndimage functions, a case needs to be made in the usual way, considering usage, impact, backwards compatibility, etc. I'm sure there was a very good reason in stats statistical tests.
Note however that array API adoption by itself might be a reason to take a closer look at the user API, but that's about it. It cannot be the sole reason to change the user API.
But anyhow, even if we change something in ndimage, that's unrelated to keeping the existing test coverage of the existing API.

> Maybe we can get the best of both worlds by changing the names of the assertions too?

I seriously hope you're not suggesting `xp_assert_equal_unless_0D`, which is what the OP boils down to.
from scipy.
> Note however that array API adoption by itself might be a reason to take a closer look at the user API

Yeah, that's it. I can't really provide any more to the conversation at this point, but Matt and Ralf agreed on the original decision in gh-19186, so there is some discussion there.
from scipy.
Regardless of the resolution, the arrays should be unpacked via `ndarray.item`.
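For reference, `ndarray.item()` unpacks a size-1 array into a plain Python scalar:

```python
import numpy as np

# item() converts a size-1 ndarray to the corresponding builtin type:
v = np.array(42).item()
print(type(v), v)  # <class 'int'> 42
```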
Packing scalars into arrays to do comparisons sounds really weird unless we adopt the "everything is an array" mantra, which is not the case for at least NumPy. Some memories from the past: https://stackoverflow.com/questions/40659212/
But why is this not fixed at the array API side?
from scipy.
> Since numpy scalars are heavily used in stats, making (3) return True seems fine to me.

Note that this wasn't quite the use case Matt had in mind. The idea is that `xp_assert_equal(a_np_0D, {a_np_0D, a_np_scalar})` should return `False`, enforcing `res` to be a scalar. It sounds like that is the behaviour that should be moved to a kwarg. Then, if it turns out that we are using the kwarg almost everywhere (i.e. actually wanting a 0D NumPy array is the exception), maybe we could reconsider the default. But the `xp_assert_equal(a_0D, a_0D)` counterexample is hard to argue with.
from scipy.
> So yeah, Matt has been sprinkling lines like `pvalue = pvalue[()] if pvalue.ndim == 0 else pvalue` everywhere while we convert so that we better align with NumPy on returning scalars.

I consider further blurring the line between scalars and 0d arrays to be actively detrimental. We should get better at this, not worse.
from scipy.
Aside from the lack of trivial self-identity, we should also fix the lack of commutativity:

>>> a = np.array(42)
>>> xp_assert_equal(a, 42)  # AssertionError: Result is a NumPy 0d array....
>>> xp_assert_equal(42, a)  # AttributeError: 'int' object has no attribute 'dtype'
# further
>>> xp_assert_equal(np.int64(42), a)  # passes...
>>> xp_assert_equal(np.int64(42), a, allow_0d=True)  # AssertionError: Types do not match...

The last one in particular makes no sense - something additional is "allowed" according to the keyword, yet the assertion fails, while the same thing does pass with `allow_0d=False`.
I already proposed this in #19989, but: replace `allow_0d=False` with `check_0d=True`, and then:
- do not ever error on the content/type of the argument (not appropriate for `assert_equal`-style functions)
- fail if there's a mismatch between scalar and 0d (for default `check_0d=True`); allow mismatch (if values are equal) for `check_0d=False`
- make the function fully symmetrical (IMO required for `assert_*`)

This would solve the issue in the OP, my problems in #19989, and the bugs related to mismatched argument order. IOW:

>>> a = np.array(42)
>>> xp_assert_equal(a, a)  # passes
>>> xp_assert_equal(a, 42)  # AssertionError
>>> xp_assert_equal(42, a)  # AssertionError
>>> xp_assert_equal(np.int64(42), a)  # AssertionError
>>> xp_assert_equal(a, np.int64(42))  # AssertionError
>>> xp_assert_equal(np.int64(42), a, check_0d=False)  # passes
>>> xp_assert_equal(a, np.int64(42), check_0d=False)  # passes
>>> xp_assert_equal(a, 42, check_0d=False)  # passes
>>> xp_assert_equal(42, a, check_0d=False)  # passes
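A minimal sketch of the proposed `check_0d` semantics (a hypothetical helper, not the actual SciPy implementation; the dtype/shape checks of the real assertion are omitted, and only the size-1 case is handled):

```python
import numpy as np

def assert_equal_sketch(actual, desired, check_0d=True):
    # Hypothetical sketch of the proposed semantics: symmetric, never
    # errors on the *type* of an argument, and only distinguishes 0D
    # arrays from scalars when check_0d=True.
    def is_0d_array(x):
        return isinstance(x, np.ndarray) and x.ndim == 0

    if check_0d and is_0d_array(actual) != is_0d_array(desired):
        raise AssertionError("0d-array vs scalar mismatch")
    # Size-1 value comparison; bool() works whether `==` yields a
    # scalar or a 0D array.
    if not bool(np.asarray(actual) == np.asarray(desired)):
        raise AssertionError("values differ")

a = np.array(42)
assert_equal_sketch(a, a)                             # passes
assert_equal_sketch(np.int64(42), a, check_0d=False)  # passes
try:
    assert_equal_sketch(42, a)  # scalar vs 0D array, check_0d=True
except AssertionError as e:
    print("raised:", e)
```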
from scipy.
Implementing the above leads to:
======== 264 failed, 51934 passed, 4665 skipped, 11738 deselected, 163 xfailed, 13 xpassed in 636.99s (0:10:36) ========
due to existing tests counting on not distinguishing between scalars and 0d arrays. I fixed those things in #21026 (draft for discussion).
from scipy.
I discussed this (and #21026) with @mdhaber yesterday; here's an RFC that writes things up (hopefully) somewhat comprehensively & coherently, along with a proposed solution that tries to make everyone somewhat happy: #21044
from scipy.
Let's close this in favour of the RFC, where the discussion here has been consolidated.
from scipy.