Align conda and wheel building workflows #44

Open
7 tasks
vyasr opened this issue Apr 18, 2024 · 3 comments

Comments

@vyasr
Contributor

vyasr commented Apr 18, 2024

Historically our conda and wheel GHA workflow scripts have looked fairly different for a number of reasons. However, with #33 many of the fundamental distinctions will no longer exist because wheels will also have separate build steps for C++ and Python builds. As a result, we should invest in aligning our workflows as much as possible so as to reduce maintenance costs going forward. Some changes that we ought to make:

  • We should parallelize test jobs across different Python packages. Wheels generally already do this, while all conda tests typically occur within a single job (in some repos there is a partial split, usually based on criteria specific to each repo e.g. cuml dask tests or cudf tests that aren't part of the cudf package).
  • We should standardize handling of pure Python packages. Wheels already do this, while conda packages do not. See #43 ("Properly support building pure Python packages") for a more detailed writeup.
  • We should automatically append the CUDA version to the names of the artifacts that the gha-tools upload. We already do this for conda, but not for wheels. As part of #33 ("Support dynamic linking between RAPIDS wheels") we will be rearchitecting the wheel pipelines to have one job for building the C++ wheel and one for all the Python wheels, matching the conda packages more closely (this becomes feasible after #33 because the Python builds will be very fast if they don't have to rebuild the C++). Once that is the case, we can also upload/download all the wheels the same way we do for conda packages (a single tarball for all wheels instead of one per wheel), so we can get rid of support for RAPIDS_PY_WHEEL_NAME. In the PRs for #33 we're currently abusing RAPIDS_PY_WHEEL_NAME to carry the CUDA version, so we need to start appending the CUDA version for wheels before we can get rid of that variable.
  • rapids-download-conda-from-s3 automates choosing the output directory, while rapids-download-wheels-from-s3 requires that the caller specify it. We should update the wheel tool to automate that too.
  • Update conda jobs to include conda in the name. Currently wheel jobs are named e.g. build_wheel_*, whereas conda jobs are just build_cpp.sh etc. That is a holdover from a time when conda packages were the only thing we produced.
  • The rapids-wheels-anaconda tool will need to be modified to support upload of cpp wheels.
  • We should standardize the names of the CI jobs across repos, conda/wheel build types, and pr.yaml/build.yaml/test.yaml. See also kvikio#369 ("add wheel output").

I will update this list as more ideas come to mind.
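
To make two of the items above concrete (appending the CUDA version to artifact names, and letting the wheel download tool pick its own output directory), here is a minimal sketch. Both function names and the naming scheme are hypothetical and only for illustration; they are not the names or conventions that gha-tools currently uses.

```shell
# Hypothetical helper: compose an artifact name that always carries the
# CUDA major version, for conda and wheel artifacts alike.
#   $1: package type ("conda" or "wheel")
#   $2: build kind ("cpp" or "python")
#   $3: full CUDA version, e.g. "12.2.2"
artifact_name() {
  pkg_type="$1"
  build_kind="$2"
  cuda_version="$3"
  # Strip everything from the first dot onward: "12.2.2" -> "12"
  cuda_major="${cuda_version%%.*}"
  echo "${pkg_type}_${build_kind}_cu${cuda_major}"
}

# Hypothetical helper: use the caller's directory when one is given,
# otherwise create a fresh temporary directory and print its path.
default_download_dir() {
  if [ -n "${1:-}" ]; then
    mkdir -p "$1"
    echo "$1"
  else
    mktemp -d
  fi
}
```

With something like this, `artifact_name wheel cpp 12.2.2` prints `wheel_cpp_cu12`, and a download tool could call `default_download_dir "${user_supplied_dir:-}"` so that callers no longer have to specify the directory.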

@vyasr
Contributor Author

vyasr commented Apr 19, 2024

A separate but related topic that may make sense to address in parallel is making some subsets of the scripts reusable. Some work was previously done to make the CI test scripts usable in other environments (e.g. rapidsai/raft#2165). So while we align conda/wheels CI and look for ways to maximize shared code (e.g. both calling the same test_* scripts), it might also make sense to check whether the build and test scripts could be used in other environments, such as devcontainers. However, in an ideal world build scripts would be trivial (i.e. just cmake -S . -B build && cmake --build build, or python -m pip install ...), so there might not be much to do for build scripts (and test scripts have already been handled as mentioned above).
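
As a rough sketch of what such a trivial, environment-agnostic build script could look like, the snippet below wraps the two cmake commands in a function that CI jobs and a devcontainer could both call. The `run` and `build_cpp` function names and the `DRY_RUN` variable are assumptions made up for this example, not part of any existing RAPIDS script.

```shell
# Hypothetical wrapper: print the command instead of executing it when
# DRY_RUN=1, which makes the script easy to inspect in any environment.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical shared C++ build step.
#   $1: source directory (default ".")
#   $2: build directory (default "build")
build_cpp() {
  src_dir="${1:-.}"
  build_dir="${2:-build}"
  # Configure, then build only if configuration succeeded.
  run cmake -S "${src_dir}" -B "${build_dir}" &&
    run cmake --build "${build_dir}"
}
```

Running `DRY_RUN=1 build_cpp` prints the configure and build commands without invoking cmake. Keeping the script this small means there is little left that is specific to conda, wheels, or a particular CI system.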

@jameslamb
Member

The rapids-wheels-anaconda tool will need to be modified to support upload of cpp wheels.

Put up a proposal for this in rapidsai/gha-tools#105

@sisodia1701

This depends on the C++ wheels work, so the team will start on it once that work is done.
Moving to backlog.
