There are various potential improvements we could make to our CI matrix to improve tes

Explore ways to maximize coverage while minimizing cost of the CI test matrix

rapidsai commented on May 28, 2024

Explore ways to maximize coverage while minimizing cost of the CI test matrix

Comments (3)

jameslamb commented on May 28, 2024

Capturing some miscellaneous thoughts from interacting with the matrices on rapidsai/shared-workflows#166.

In general, I think whatever decisions we make about the support matrix should be:

encouraged via shared variables in configuration
enforced in shared-workflows CI

idea 1: matrices in workflow configs should make more use of shared variables

For example, it seems that we only do conda builds with one Python version, want to cover one minor version per CUDA major version, and want to cover the previous and current CUDA major versions (code link).

Today that looks like:

export MATRIX="
- { CUDA_VER: '11.8.0', LINUX_VER: 'ubuntu22.04', ARCH: 'amd64', PY_VER: '3.10' }
- { CUDA_VER: '11.8.0', LINUX_VER: 'ubuntu22.04', ARCH: 'arm64', PY_VER: '3.10' }
- { CUDA_VER: '12.0.1', LINUX_VER: 'ubuntu22.04', ARCH: 'amd64', PY_VER: '3.10' }
- { CUDA_VER: '12.0.1', LINUX_VER: 'ubuntu22.04', ARCH: 'arm64', PY_VER: '3.10' }
"

The intention there would be clearer, in my opinion, like this:

BUILD_OS="ubuntu22.04"
CUDA_PREVIOUS="11.8.0"
CUDA_CURRENT="12.0.1"
PYTHON_VERSION="3.10"

export MATRIX="
- { CUDA_VER: '${CUDA_PREVIOUS}', LINUX_VER: '${BUILD_OS}', ARCH: 'amd64', PY_VER: '${PYTHON_VERSION}' }
- { CUDA_VER: '${CUDA_PREVIOUS}', LINUX_VER: '${BUILD_OS}', ARCH: 'arm64', PY_VER: '${PYTHON_VERSION}' }
- { CUDA_VER: '${CUDA_CURRENT}',  LINUX_VER: '${BUILD_OS}', ARCH: 'amd64', PY_VER: '${PYTHON_VERSION}' }
- { CUDA_VER: '${CUDA_CURRENT}',  LINUX_VER: '${BUILD_OS}', ARCH: 'arm64', PY_VER: '${PYTHON_VERSION}' }
"

That'd make the patterns more obvious and reduce the diffs for changes like updating CUDA or Python version.
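
The same idea can be sketched outside of the workflow file. This is only a minimal Python illustration, not how shared-workflows builds its matrices today: `SHARED` and `render_matrix` are hypothetical names, and the values are copied from the snippet above. Each version is written once and the entries are generated from it.

```python
from itertools import product

# Shared values, copied from the example above.
# The dictionary and its layout are hypothetical, for illustration only.
SHARED = {
    "BUILD_OS": "ubuntu22.04",
    "CUDA_PREVIOUS": "11.8.0",
    "CUDA_CURRENT": "12.0.1",
    "PYTHON_VERSION": "3.10",
}


def render_matrix(shared, arches=("amd64", "arm64")):
    """Expand the shared variables into one entry per (CUDA, ARCH) pair."""
    cuda_versions = [shared["CUDA_PREVIOUS"], shared["CUDA_CURRENT"]]
    return [
        {
            "CUDA_VER": cuda,
            "LINUX_VER": shared["BUILD_OS"],
            "ARCH": arch,
            "PY_VER": shared["PYTHON_VERSION"],
        }
        for cuda, arch in product(cuda_versions, arches)
    ]


matrix = render_matrix(SHARED)
for entry in matrix:
    print(entry)
```

With this shape, bumping CUDA or Python is a one-line change to `SHARED`.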

The post-processing of those variables that's already done with yq / jq (code link) could also resolve duplicates. Then, for other matrices, you wouldn't have to think about the difference between the states "we're only supporting 1 minor version of CUDA, like 12.0.1" and "we're supporting 2 minor versions, like 12.0.1 and 12.2.2": pointing two variables at the same version would just collapse into a single set of jobs.
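
To make the duplicate-resolution idea concrete, here is a small Python sketch. It is an assumption about what "resolving duplicates" could mean, not the existing yq / jq post-processing. The names `dedupe_matrix`, `CUDA_MINOR_A`, and `CUDA_MINOR_B` are hypothetical. (jq's built-in `unique` filter does something similar, though it also sorts the array.)

```python
import json


def dedupe_matrix(entries):
    """Drop repeated matrix entries, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for entry in entries:
        # Serialize with sorted keys so key order does not affect equality.
        key = json.dumps(entry, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique


# Hypothetical case: two "minor version" variables that currently
# point at the same CUDA version.
CUDA_MINOR_A = "12.0.1"
CUDA_MINOR_B = "12.0.1"

entries = [
    {"CUDA_VER": cuda, "LINUX_VER": "ubuntu22.04", "ARCH": "amd64", "PY_VER": "3.10"}
    for cuda in (CUDA_MINOR_A, CUDA_MINOR_B)
]
deduped = dedupe_matrix(entries)
print(len(entries), "->", len(deduped))
```

If `CUDA_MINOR_B` later moves to 12.2.2, the same matrix definition produces two jobs without any other change.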

idea 2: desirable properties of matrices should be enforced in CI

Here are some constraints I can imagine being desirable:

"no identical matrix configurations"
"no more than n total arm64 jobs"
"at least 1 job for each of the operating systems we support"
"at least 1 test job on PRs per CUDA {major}.{minor} that we support"

Instead of relying on code comments or convention, I think it's worth considering whether those constraints could be enforced in shared-workflows CI.

I'm imagining here a little script that reads in the matrix configuration, renders the full matrices, and then asserts all the conditions we want to be true and raises a big loud error if any are violated.
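
A rough sketch of what such a script might look like, in Python. Everything here is an assumption for illustration: `check_matrix` and its parameters are hypothetical, the matrix is hard-coded rather than read from the real workflow files, and the check does not distinguish PR jobs from nightly jobs.

```python
import json
from collections import Counter


def check_matrix(entries, *, max_arm64, required_os, required_cuda_minors):
    """Return a list of human-readable constraint violations (empty if none)."""
    problems = []

    # "no identical matrix configurations"
    counts = Counter(json.dumps(e, sort_keys=True) for e in entries)
    for key, n in counts.items():
        if n > 1:
            problems.append(f"duplicate configuration ({n}x): {key}")

    # "no more than n total arm64 jobs"
    arm64_jobs = sum(1 for e in entries if e["ARCH"] == "arm64")
    if arm64_jobs > max_arm64:
        problems.append(f"{arm64_jobs} arm64 jobs exceeds limit of {max_arm64}")

    # "at least 1 job for each of the operating systems we support"
    missing_os = set(required_os) - {e["LINUX_VER"] for e in entries}
    for os_name in sorted(missing_os):
        problems.append(f"no job for operating system {os_name}")

    # "at least 1 test job per CUDA {major}.{minor} that we support"
    # (this sketch ignores the "on PRs" part of that constraint)
    covered = {".".join(e["CUDA_VER"].split(".")[:2]) for e in entries}
    for minor in sorted(set(required_cuda_minors) - covered):
        problems.append(f"no job for CUDA {minor}")

    return problems


# Matrix copied from the conda-build example earlier in this thread.
matrix = [
    {"CUDA_VER": "11.8.0", "LINUX_VER": "ubuntu22.04", "ARCH": "amd64", "PY_VER": "3.10"},
    {"CUDA_VER": "11.8.0", "LINUX_VER": "ubuntu22.04", "ARCH": "arm64", "PY_VER": "3.10"},
    {"CUDA_VER": "12.0.1", "LINUX_VER": "ubuntu22.04", "ARCH": "amd64", "PY_VER": "3.10"},
    {"CUDA_VER": "12.0.1", "LINUX_VER": "ubuntu22.04", "ARCH": "arm64", "PY_VER": "3.10"},
]

problems = check_matrix(
    matrix,
    max_arm64=2,  # hypothetical limit
    required_os=["ubuntu22.04"],
    required_cuda_minors=["11.8", "12.0"],
)
if problems:
    # The "big loud error": a non-zero exit fails the CI job.
    raise SystemExit("matrix check failed:\n" + "\n".join(problems))
print("matrix check passed")
```

Collecting every violation before exiting means a single CI run reports all of the problems at once, instead of one per push.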

jameslamb commented on May 28, 2024

One other passing thought... right now all wheels are tested against only the latest driver supported on the NVIDIA-hosted runners.

https://github.com/rapidsai/shared-workflows/blob/91799a905608f4c57d7fe65b92ce9a261249e635/.github/workflows/wheels-test.yaml#L69-L80

We should consider how much, if any, coverage of older drivers we want when testing wheels.

vyasr commented on May 28, 2024

Instead of relying on code comments or convention, I think it's worth considering whether those constraints could be enforced in shared-workflows CI.
I'm imagining here a little script that reads in the matrix configuration, renders the full matrices, and then asserts all the conditions we want to be true and raises a big loud error if any are violated.

I really like this idea in theory. I don't know how well it'll work in practice, but I definitely think it's worth experimenting with.
