Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make rapids-configure-conda-channels idempotent #104

Merged

Conversation

jameslamb
Copy link
Member

While testing some RAPIDS projects, I've found it helpful to do something like this to test conda builds.

docker run \
    --rm \
    -v $(pwd):/opt/rmm \
    -w /opt/rmm \
    -it rapidsai/ci-conda:cuda12.2.2-ubuntu22.04-py3.9-amd64 \
    bash

# build a conda package
ci/build_cpp.sh

(example: rapidsai/build-planning#54 (comment))

In that interactive case, it's useful to be able to modify the source of the project and directly run that ci/build_* script again.

Many of the RAPIDS projects' build scripts include a call to rapids-configure-conda-channels near the beginning. (GitHub search. That interrupts this workflow because that script is not idempotent... it raises an exception if asked to remove a channel that isn't currently in the conda configuration.

CondaKeyError: 'channels': 'rapidsai' is not in the 'channels' key of the config file

This proposes changing the behavior of that tool:

  • before: try to remove {channel} from the conda configuration, raise an error if {channel} isn't present there
  • after: if {channel} exists in the conda configuration, remove it, otherwise warn that it already doesn't exist there

Notes for Reviewers

I can think of 2 reasons we might decide not to merge this:

  • having this be a loud error means typos will be caught at CI time... e.g. trying to remove a channel called rapidz (note the z) would result in a CI failure on main but just a log message on this branch
  • this branch's changes make a fairly simple script more complex, which makes it harder to understand

I think these risks are worthwhile in exchange for removing some friction from local development on RAPIDS projects. I'm especially not too worried about the typo problem, because the channels this tool references are string literals within the tool, not coming through other configuration that's customized by projects.

How I tested this

in a container, using one of the images we use when building wheels in CI (click me)

From the root of this repo, on this branch.

docker run \
    --rm \
    -v $(pwd):/opt/gha-tools \
    -w /opt/gha-tools \
    -it rapidsai/ci-conda:cuda12.2.2-ubuntu22.04-py3.9-amd64 \
    bash

Checked the current set of configured channels.

conda config --get channels

That printed the following.

--add channels 'nvidia'   # lowest priority
--add channels 'conda-forge'
--add channels 'pytorch'
--add channels 'dask/label/dev'
--add channels 'rapidsai-nightly'
--add channels 'rapidsai'   # highest priority

Ran the script.

./tools/rapids-configure-conda-channels

Saw this in its logs

[rapids-is-release-build] is not release build

Checked the channels again.

conda config --get channels

As expected, rapidsai was removed.

--add channels 'nvidia'   # lowest priority
--add channels 'conda-forge'
--add channels 'pytorch'
--add channels 'dask/label/dev'
--add channels 'rapidsai-nightly'   # highest priority

Ran the script another 5 times, just to prove that it was really idempotent.

for i in {1..5}; do
    ./tools/rapids-configure-conda-channels
done

All of those succeeded, and the expected warning was issued.

[rapids-is-release-build] is not release build
[rapids-configure-conda-channels] channel 'rapidsai' not found via 'conda config --get channels'
[rapids-is-release-build] is not release build
[rapids-configure-conda-channels] channel 'rapidsai' not found via 'conda config --get channels'
[rapids-is-release-build] is not release build
[rapids-configure-conda-channels] channel 'rapidsai' not found via 'conda config --get channels'
[rapids-is-release-build] is not release build
[rapids-configure-conda-channels] channel 'rapidsai' not found via 'conda config --get channels'
[rapids-is-release-build] is not release build
[rapids-configure-conda-channels] channel 'rapidsai' not found via 'conda config --get channels'

Repeating this same process on main, the second call fails like this:

CondaKeyError: 'channels': 'rapidsai' is not in the 'channels' key of the config file

if [[ "${in_config}" == "true" ]]; then
conda config --system --remove channels "${channel_id}"
else
echo "[rapids-configure-conda-channels] channel '${channel_id}' not found via 'conda config --get channels'"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposing adding this so we have a chance of catching the "there was a typo in the channel name" case.

In theory this should never show up in CI logs unless there's a problem, since this script should only be run once during CI jobs.

Copy link
Member

@ajschmidt8 ajschmidt8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about simplifying all of this into just:

conda config --system --remove channels rapidsai || true

The || true part will prevent it from exiting with a non-zero code if the channel doesn't exist.

And the standard conda error that you shared below should still print:

CondaKeyError: 'channels': 'rapidsai' is not in the 'channels' key of the config file

@jameslamb
Copy link
Member Author

jameslamb commented Apr 30, 2024

Thanks for reviewing!

I'm proposing not || true-ing it because that'd mean these types of issues wouldn't cause an immediate failure:

  • conda config no longer recognizing the arguments --system or --remove
  • channels itself not being a valid key in conda config anymore
  • conda executable not being found on PATH

I'll defer to you. If you think the added complexity in the current state of this PR isn't worth the protection against those types of issues, I'll revert out all the jq stuff and add || true instead.

@jameslamb jameslamb added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels May 1, 2024
@vyasr
Copy link
Contributor

vyasr commented May 7, 2024

@ajschmidt8 thoughts? Just noticed that this PR has been sitting for a bit.

Copy link
Member

@ajschmidt8 ajschmidt8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved with some optional comments/suggestions

tools/rapids-configure-conda-channels Outdated Show resolved Hide resolved
@ajschmidt8
Copy link
Member

also, i noticed this script is missing our standard set -euo pipefail. not sure if it's worth adding in this PR.

@jameslamb
Copy link
Member Author

also, i noticed this script is missing our standard set -euo pipefail. not sure if it's worth adding in this PR.

I just added this in f1464ce

Tested my new changes following the "How I tested this" steps in the description of the PR, all seemed to still work the expected way.

I'll wait til you have another chance to review before I merge this @ajschmidt8

@jameslamb
Copy link
Member Author

Thanks again for the thorough review and for teaching me some new jq things @ajschmidt8 !

@jameslamb jameslamb merged commit 90067c5 into rapidsai:main May 14, 2024
1 check passed
@jameslamb jameslamb deleted the idempotent-configure-conda-channels branch May 14, 2024 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improves an existing functionality non-breaking Introduces a non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants