Failures for new `pytest-fail-slow` check in Windows CI jobs about scipy HOT 13 OPEN

rgommers commented on September 27, 2024

Failures for new `pytest-fail-slow` check in Windows CI jobs

from scipy.

Comments (13)

mdhaber commented on September 27, 2024

> This extra test didn't fail the runtime check in the PR so any fails will be sporadic.

No, it would have failed, but CI wasn't run on gh-20757 after the relevant fail-slow PR (gh-20672) merged.

> I think the trial policy to have tests run under a second is going to create noise from sporadic fails that'll take extra work to quieten.

Noted again.

But we still haven't seen real noise yet. Some good has already come out of it: the fail-slow marker revealed that COBYQA was unnecessarily slow. Here, it only came up because the two PRs were happening in parallel. To lessen the burden of applying the mark to the test, I'll go ahead and submit a PR and ping you.

I'd be fine removing pytest-fail-slow if we really do have trouble with it, though.

mdhaber commented on September 27, 2024

I think it's just new. That test didn't register in other recent CI runs (e.g. gh-20805), but gh-20757 merged 9 hours ago and CI wasn't run in that PR after gh-20672 merged. Now in gh-20807:

================================== FAILURES ===================================
______________ TestDifferentialEvolutionSolver.test_strategy_fn _______________
[gw0] win32 -- Python 3.12.3 C:\hostedtoolcache\windows\Python\3.12.3\x64\python.exe
Test passed but took too long to run: Duration 1.9525856999998723s > 1.0s
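For anyone collecting these failures from CI logs, here is a minimal sketch that pulls the measured duration and the limit out of a line like the one above. The regular expression only assumes the `Duration <x>s > <y>s` shape seen in the logs in this thread; `parse_slow_failure` is a hypothetical helper name, not part of pytest-fail-slow.

```python
import re

# Matches the "Duration <seconds>s > <limit>s" part of the failure message.
LINE_RE = re.compile(
    r"Duration (?P<duration>\d+(?:\.\d+)?)s > (?P<limit>\d+(?:\.\d+)?)s"
)


def parse_slow_failure(line):
    """Return (duration, limit, duration / limit), or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    duration = float(match.group("duration"))
    limit = float(match.group("limit"))
    return duration, limit, duration / limit


line = "Test passed but took too long to run: Duration 1.9525856999998723s > 1.0s"
print(parse_slow_failure(line))
```

The ratio in the last position gives a rough idea of how much headroom a per-test exception would need.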

andyfaff commented on September 27, 2024

I'll look into that particular test tomorrow.

andyfaff commented on September 27, 2024

In #20757, TestDifferentialEvolutionSolver.test_strategy_fn was amended slightly to check that a user-provided strategy worked with updating='deferred'. I think the test should be run in both the fast and slow test suites.

Considering the other changes in the PR, the added test now runs faster (508 ms) than it would have before the PR (755 ms). So it's not a performance issue in the code; it's just that an extra check has been added. We could cut the number of iterations to make it run faster, but for me this is in the noise. That it takes 2s in CI is negligible. This extra test didn't fail the runtime check in the PR so any fails will be sporadic.

It's now extra work to mark this test as pytest.mark.fail_slow. I think the trial policy to have tests run under a second is going to create noise from sporadic fails that'll take extra work to quieten.
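For readers unfamiliar with the marker, a minimal sketch of what such an exception looks like, assuming pytest-fail-slow's `fail_slow` marker, which as far as I know takes a duration and overrides the limit given by the `--fail-slow` command-line option for that one test. The 5-second value, the test name, and the test body are placeholders, not what SciPy uses.

```python
import pytest


# Hypothetical per-test limit of 5 seconds; the real value would be chosen
# from observed CI timings with some buffer.
@pytest.mark.fail_slow(5)
def test_strategy_fn_placeholder():
    # Placeholder body standing in for the real differential-evolution test.
    assert sum(range(10)) == 45
```

The test still has to pass on its own; the marker only changes how long it may take before a passing run is reported as a failure.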

mdhaber commented on September 27, 2024

Ran across one that was missing from the original PR and fixed it in gh-20824. A few others that I might need to adjust are:

scipy/stats/tests/test_fit.py::TestFit::test_truncweibull_min
scipy/sparse/tests/test_base.py::Test64Bit::test_no_64
scipy/stats/tests/test_fast_gen_inversion.py::test_error_extreme_params
scipy/interpolate/tests/test_bsplines.py::TestSmoothingSpline::test_weighted_smoothing_spline
scipy/sparse/tests/test_base.py::TestCSC::test_slicing_3
scipy/stats/tests/test_resampling.py::TestPermutationTest::test_finite_precision_statistic
scipy/optimize/tests/test_optimize.py::TestOptimizeResultAttributes::test_attributes_present
scipy/optimize/tests/test_constraint_conversion.py::TestNewToOld::test_multiple_constraint_objects
scipy/sparse/tests/test_base.py::Test64Bit::test_resiliency_random

But I've been monitoring the variation between runs and most tests are reasonably consistent. For instance, tests on the list that took significantly more time on a recent CI run compared to the last run of gh-20672 are:

Tests taking significantly more time:
18.02s -> 28.94s linalg/tests/test_extending.py::test_cython
1.28s -> 2.02s sparse/linalg/_isolve/tests/test_iterative.py::test_precond_inverse[poisson2d]
0.99s -> 1.79s optimize/tests/test__dual_annealing.py::TestDualAnnealing::test_from_docstring
0.91s -> 1.18s optimize/tests/test__dual_annealing.py::TestDualAnnealing::test_bounds_class
0.62s -> 0.83s spatial/tests/test_qhull.py::TestVoronoi::test_incremental[random-3d-chunk-1]
0.43s -> 0.54s optimize/tests/test_constraint_conversion.py::TestNewToOld::test_multiple_constraint_objects
0.4s -> 0.51s linalg/tests/test_matfuncs.py::TestExpM::test_gh18086
0.35s -> 0.49s sparse/tests/test_base.py::Test64Bit::test_resiliency_all_64[TestCSC-test_slicing_3]
0.33s -> 0.45s sparse/tests/test_base.py::Test64Bit::test_resiliency_limit_10[TestDOK-test_add_sub]
0.29s -> 0.43s optimize/tests/test__shgo.py::TestShgoArguments::test_21_1_jac_true
0.28s -> 0.39s special/tests/test_support_alternative_backends.py::test_support_alternative_backends[f_name_n_args3-numpy]
0.26s -> 0.38s sparse/linalg/_eigen/tests/test_svds.py::Test_SVDS_ARPACK::test_small_sigma_sparse[complex-shape1]
0.29s -> 0.37s special/tests/test_support_alternative_backends.py::test_support_alternative_backends[f_name_n_args6-numpy]
0.28s -> 0.37s sparse/linalg/_eigen/arpack/tests/test_arpack.py::test_hermitian_modes
0.28s -> 0.36s special/tests/test_support_alternative_backends.py::test_support_alternative_backends[f_name_n_args1-numpy]
0.26s -> 0.35s stats/tests/test_continuous_basic.py::test_cont_basic[500-burr-arg7]
0.25s -> 0.35s sparse/tests/test_base.py::Test64Bit::test_no_64[TestDOK-test_setdiag_comprehensive]
0.26s -> 0.34s interpolate/tests/test_rgi.py::TestRegularGridInterpolator::test_list_input[quintic]
0.25s -> 0.33s sparse/tests/test_base.py::Test64Bit::test_resiliency_random[TestCSR-test_add_sub]
0.25s -> 0.33s interpolate/tests/test_interpolate.py::TestNdPPoly::test_simple_4d

and all of the ones that take more than 0.55s have pytest.mark.fail_slow exceptions with plenty of buffer.
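A rough sketch of that kind of comparison, using a few rows copied from the list above. The 0.55s cutoff is the one mentioned in this comment; `needs_attention` and the dictionary layout are illustrative assumptions, not the script actually used to produce the list.

```python
# (old_seconds, new_seconds) for a handful of the tests listed above.
timings = {
    "linalg/tests/test_extending.py::test_cython": (18.02, 28.94),
    "sparse/linalg/_isolve/tests/test_iterative.py::test_precond_inverse[poisson2d]": (1.28, 2.02),
    "optimize/tests/test__dual_annealing.py::TestDualAnnealing::test_from_docstring": (0.99, 1.79),
    "optimize/tests/test__dual_annealing.py::TestDualAnnealing::test_bounds_class": (0.91, 1.18),
    "spatial/tests/test_qhull.py::TestVoronoi::test_incremental[random-3d-chunk-1]": (0.62, 0.83),
    "optimize/tests/test_constraint_conversion.py::TestNewToOld::test_multiple_constraint_objects": (0.43, 0.54),
}


def needs_attention(timings, threshold=0.55):
    """Return (name, old, new, slowdown) for tests whose new time exceeds threshold.

    Rows are sorted by slowdown ratio (new / old), largest first.
    """
    rows = [
        (name, old, new, new / old)
        for name, (old, new) in timings.items()
        if new > threshold
    ]
    return sorted(rows, key=lambda row: row[3], reverse=True)


for name, old, new, slowdown in needs_attention(timings):
    print(f"{old:.2f}s -> {new:.2f}s (x{slowdown:.2f}) {name}")
```

Anything this returns is a candidate for a `fail_slow` exception if it does not already have one.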

mdhaber commented on September 27, 2024

@rgommers to try to prevent any more failures due to variation in times, do you mind if I just double the limits of everything I've put exceptions on so far? From what I've seen so far, I think the tests above and the ones on which I've already put fail_slow exceptions are more at risk of spurious failures than tests that aren't on my radar. I'd prefer to just whack all these moles once and for all.

rgommers commented on September 27, 2024

Yes, that sounds like a good idea.

rgommers commented on September 27, 2024

From this log for scipy\sparse\linalg\tests\test_expm_multiply.py::TestExpmActionSimple::test_scaled_expm_multiply:

_______________ TestExpmActionSimple.test_scaled_expm_multiply ________________
[gw0] win32 -- Python 3.12.3 C:\hostedtoolcache\windows\Python\3.12.3\x64\python.exe
Test passed but took too long to run: Duration 1.026920699999664s > 1.

rgommers commented on September 27, 2024

A timeout failure for scipy/_lib/tests/test__util.py::test_pool from this CI log:

__________________________________ test_pool __________________________________
[gw0] win32 -- Python 3.12.4 C:\hostedtoolcache\windows\Python\3.12.4\x64\python.exe
Test passed but took too long to run: Duration 2.63190110000005s > 1.0s

lucascolley commented on September 27, 2024

scipy\linalg\tests\test_lapack.py::test_gejsv_general[1-1-0-1-3-float64-size0] from this CI log:

_________________ test_gejsv_general[1-1-0-1-3-float64-size0] _________________
[gw1] win32 -- Python 3.12.4 C:\hostedtoolcache\windows\Python\3.12.4\x64\python.exe
Test passed but took too long to run: Duration 1.0877726999997321s > 1.0s

lucascolley commented on September 27, 2024

From this CI log:

scipy\stats\tests\test_sampling.py::test_random_state[NumericalInverseHermite-kwargs4]:

_____________ test_random_state[NumericalInverseHermite-kwargs4] ______________
[gw1] win32 -- Python 3.12.4 C:\hostedtoolcache\windows\Python\3.12.4\x64\python.exe
Test passed but took too long to run: Duration 1.2308337999993455s > 1.0s

scipy\stats\tests\test_sampling.py::TestQRVS::test_QRVS_shape_consistency[3-d_out2-None-size_out0-qrng2-NumericalInversePolynomial]:

_ TestQRVS.test_QRVS_shape_consistency[3-d_out2-None-size_out0-qrng2-NumericalInversePolynomial] _
[gw1] win32 -- Python 3.12.4 C:\hostedtoolcache\windows\Python\3.12.4\x64\python.exe
Test passed but took too long to run: Duration 1.267359999999826s > 1.0s

lucascolley commented on September 27, 2024

From this CI log:

scipy\optimize\tests\test__differential_evolution.py::TestDifferentialEvolutionSolver::test_immediate_updating:

___________ TestDifferentialEvolutionSolver.test_immediate_updating ___________
[gw0] win32 -- Python 3.12.4 C:\hostedtoolcache\windows\Python\3.12.4\x64\python.exe
Test passed but took too long to run: Duration 1.3675756000002366s > 1.0s

lucascolley commented on September 27, 2024

From this CI log:

scipy\special\tests\test_extending.py::test_cython

_________________________________ test_cython _________________________________
[gw0] win32 -- Python 3.12.4 C:\hostedtoolcache\windows\Python\3.12.4\x64\python.exe
Test passed but took too long to run: Duration 51.31201609999971s > 40s
