Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat clobber warnings as errors for conda builds #54

Open
jameslamb opened this issue Apr 23, 2024 · 11 comments
Open

Treat clobber warnings as errors for conda builds #54

jameslamb opened this issue Apr 23, 2024 · 11 comments
Assignees

Comments

@jameslamb
Copy link
Member

(Apr 23, 2024) moved from an internal tracking board. The description here is @ajschmidt8 's original write-up of the issue.

Problem

Some of our conda builds suffer from ClobberWarnings (e.g. here and here).

Ignoring these clobber warnings can sometimes result in packaging issues. Therefore, we'd like to begin treating these clobber warnings as errors.

Ideally, this setting should go in the .condarc file of our CI images (i.e. here) so that it applies to every RAPIDS repository.

However, since many repositories already suffer from clobber warnings, this would break CI for a lot of people.

Therefore, we need to plan the rollout carefully.

Solution

To roll out this new configuration setting safely, we should do the following:

For each of the repositories in the section below:

a. Open a PR and add the following line to any build scripts (e.g. ci/build_{cpp,python}.sh): conda config --set path_conflict prevent
b. Resolve any build issues that result from this change
c. If build issues occur, fix them and merge the PR (leaving in the new conda config line, that will be cleaned up later)
d. If no build issues occur, simply close the PR
e. Link the PR in this issue (do not link this issue in the PR, since this issue is in a private repository)

Once all of the repositories are successfully using the new conda config setting, add the setting to https://github.com/rapidsai/ci-imgs/blob/main/context/condarc.tmpl

Go back and clean up all of the extraneous conda config --set path_conflict prevent lines in each repository

Repositories

Look at the repos at this search query.

@jameslamb
Copy link
Member Author

Adding an example for reference on how to investigate the contents of these files.

# choose a conda package from https://anaconda.org/rapidsai/librmm/files
RMM_CONDA_PACKAGE=https://anaconda.org/rapidsai/librmm/24.04.00/download/linux-64/librmm-24.04.00-cuda12_240410_g8f19c9c3_0.tar.bz2

# download it
wget \
    "${RMM_CONDA_PACKAGE}" \
    -O librmm.tar.bz2

# list its contents, filtering down for paths mentioned in the clobber errors
tar jtf ./librmm.tar.bz2 \
| grep -E 'fmt|spdlog'

For a recent version of librmm, this shows the following:

fmt and spdlog headers and static libraries (click me)
include/spdlog/details/windows_include.h
include/spdlog/fwd.h
include/spdlog/version.h
include/spdlog/formatter.h
include/spdlog/fmt/xchar.h
include/spdlog/fmt/ranges.h
include/spdlog/fmt/chrono.h
include/spdlog/fmt/ostr.h
include/spdlog/fmt/compile.h
include/spdlog/details/console_globals.h
include/spdlog/fmt/std.h
include/spdlog/details/periodic_worker-inl.h
include/spdlog/cfg/helpers.h
include/spdlog/sinks/sink-inl.h
include/spdlog/details/synchronous_factory.h
include/spdlog/sinks/sink.h
include/spdlog/details/log_msg_buffer.h
include/spdlog/details/null_mutex.h
include/spdlog/cfg/env.h
include/spdlog/fmt/fmt.h
include/spdlog/sinks/basic_file_sink-inl.h
include/spdlog/details/log_msg-inl.h
include/spdlog/details/log_msg.h
include/spdlog/details/backtracer.h
include/spdlog/sinks/stdout_color_sinks-inl.h
include/spdlog/sinks/null_sink.h
include/spdlog/cfg/argv.h
include/spdlog/sinks/ostream_sink.h
include/spdlog/sinks/base_sink.h
include/spdlog/sinks/callback_sink.h
include/spdlog/sinks/stdout_color_sinks.h
include/spdlog/details/log_msg_buffer-inl.h
include/spdlog/stopwatch.h
include/spdlog/details/file_helper.h
include/spdlog/details/periodic_worker.h
include/spdlog/sinks/base_sink-inl.h
include/spdlog/sinks/udp_sink.h
include/spdlog/sinks/basic_file_sink.h
include/spdlog/details/backtracer-inl.h
include/spdlog/sinks/msvc_sink.h
include/spdlog/common-inl.h
include/spdlog/sinks/ringbuffer_sink.h
include/spdlog/sinks/tcp_sink.h
include/spdlog/async_logger.h
include/spdlog/details/udp_client.h
include/spdlog/sinks/stdout_sinks.h
include/spdlog/sinks/dist_sink.h
include/spdlog/async_logger-inl.h
include/spdlog/sinks/rotating_file_sink.h
include/spdlog/sinks/wincolor_sink.h
include/spdlog/spdlog-inl.h
include/spdlog/details/udp_client-windows.h
include/spdlog/sinks/dup_filter_sink.h
include/spdlog/cfg/helpers-inl.h
include/spdlog/details/thread_pool.h
include/spdlog/details/circular_q.h
include/spdlog/details/thread_pool-inl.h
include/spdlog/async.h
include/spdlog/pattern_formatter.h
include/spdlog/sinks/syslog_sink.h
include/spdlog/sinks/mongo_sink.h
include/spdlog/details/os.h
include/spdlog/sinks/ansicolor_sink.h
include/spdlog/details/tcp_client.h
include/spdlog/details/registry.h
include/spdlog/sinks/stdout_sinks-inl.h
include/spdlog/sinks/kafka_sink.h
include/spdlog/details/mpmc_blocking_q.h
include/spdlog/sinks/systemd_sink.h
include/spdlog/details/tcp_client-windows.h
include/spdlog/details/fmt_helper.h
include/spdlog/sinks/ansicolor_sink-inl.h
include/spdlog/sinks/android_sink.h
include/spdlog/sinks/rotating_file_sink-inl.h
include/spdlog/details/file_helper-inl.h
include/spdlog/sinks/wincolor_sink-inl.h
include/spdlog/tweakme.h
include/fmt/ostream.h
include/spdlog/sinks/hourly_file_sink.h
include/spdlog/logger-inl.h
include/spdlog/fmt/bin_to_hex.h
include/fmt/args.h
include/spdlog/sinks/daily_file_sink.h
include/spdlog/details/registry-inl.h
include/spdlog/sinks/win_eventlog_sink.h
include/fmt/xchar.h
include/spdlog/sinks/qt_sinks.h
include/spdlog/spdlog.h
include/fmt/os.h
include/fmt/std.h
include/spdlog/common.h
include/spdlog/logger.h
include/spdlog/details/os-inl.h
include/fmt/compile.h
include/fmt/printf.h
include/fmt/ranges.h
include/fmt/color.h
include/spdlog/pattern_formatter-inl.h
include/fmt/chrono.h
include/fmt/format-inl.h
include/fmt/core.h
include/fmt/format.h
lib/libfmt.a
lib/libspdlog.a
lib/pkgconfig/spdlog.pc
lib/pkgconfig/fmt.pc
lib/cmake/fmt/fmt-targets-release.cmake
lib/cmake/spdlog/spdlogConfigTargets-release.cmake
lib/cmake/fmt/fmt-config.cmake
lib/cmake/spdlog/spdlogConfig.cmake
lib/cmake/fmt/fmt-config-version.cmake
lib/cmake/spdlog/spdlogConfigVersion.cmake
lib/cmake/fmt/fmt-targets.cmake
lib/cmake/spdlog/spdlogConfigTargets.cmake

@jameslamb
Copy link
Member Author

jameslamb commented Apr 23, 2024

Open PRs:

@jameslamb
Copy link
Member Author

Short Summary

It looks to me like the root cause of the fmt and spdlog clobbering warnings specifically is a combination of the following:

And I think there are 2 classes of fixes we could pursue.

  • short-term: upgrade fmt and spdlog across RAPIDS and remove the patches we're carrying around in rapids-cmake for them
  • long-term: ensure builds of RAPIDS packages never place downloaded sources at likely-to-cause-conflicts paths like include/fmt and include/spdlog

Details

I'm able to reproduce the errors about clobbering fmt and spdlog by running the following on the branch from rapidsai/rmm#1508.

docker run \
    --rm \
    -v $(pwd):/opt/rmm \
    -w /opt/rmm \
    -it rapidsai/ci-conda:cuda12.2.2-ubuntu22.04-py3.9-amd64 \
    bash

# do a 'librmm' conda build
ci/build_cpp.sh

I removed the fmt and spdlog patches in my fork of rapids-cmake and ran rmm's conda builds again, found that that resolved the clobber errors. (link to successful builds).

image

(link to diff on my rapids-cmake fork)

We could immediately resolve these fmt and spdlog clobber issues specifically by upgrading to the latest versions of those libraries and removing the patches in rapids-cmake.

But separately, we should try to permanently fix this so the clobbering issues don't re-appear in similar situations in the future.

If we want to preserve the ability to substitute in sources with not-yet-upstreamed patches in our conda builds, I can think of 2 options:

  • modify rapids-cmake (or the way that projects invoke it) such that it writes those sources to project-specific paths like include/rmm/vendored/fmt/*
  • publish a rapids-patched-sources or something package on conda-forge which contains copies of the patched sources we want projects to build with, and which is included as a host: dependency for RAPIDS packages

Downloading to a project-specific path would be fine for headers and source files, but I'm not sure how to handle other types of files that we're getting clobbering warnings for like:

  • pkg-config scripts (e.g. 'lib/pkgconfig/fmt.pc')
  • <PackageNames>Config.cmake scripts used by find_package() (e.g. 'lib/cmake/spdlog/spdlogConfig.cmake')

(rapidsai/rmm#1508 (comment))

Note I'm saying "project-specific" here because every RAPIDS package using rapids-cmake and depending on something it decides to download is going to end up pulling these files in.

For example, look at the conda builds on rapidsai/cuml#5821. It's getting 3 packages all trying to install the same fmt headers to the same places: fmt itself, librmm, and libraft-headers-only.

This transaction has incompatible packages due to a shared path.
  packages: conda-forge/linux-64::fmt-10.2.1-h00ab1b0_0,
                    rapidsai-nightly/linux-64::librmm-24.04.00a38-cuda11_240326_ga98931b9_38,
                    rapidsai-nightly/linux-64::libraft-headers-only-24.04.00a93-cuda11_240326_g9637b3c2_93
  path: 'include/fmt/args.h'

(build link)

So if rapids_cpm_find() were to write to the same path regardless of project, e.g. include/rapids/, that'd fix conflicts with the fmt package but still leave a risk of conflicts between multiple RAPIDS packages.

Notes

There are other clobber warnings, just starting with rmm because it's the furthest upstream across RAPIDS projects.

@jameslamb
Copy link
Member Author

jameslamb commented Apr 23, 2024

Linking these related things (thanks @vyasr for the pointer).

There is already a rapids_core_dependencies package that we're publishing building to which could be used avoid every RAPIDS conda package re-vendoring the same things:

@robertmaynard
Copy link

Linking these related things (thanks @vyasr for the pointer).

There is already a rapids_core_dependencies package that we're publishing to avoid every RAPIDS conda package re-vendoring the same things:

We don't publish rapids_core_dependencies currently, it was proposed but never done.

@jameslamb
Copy link
Member Author

Ah ok. I missed the "as CI artifacts" part of the description in rapidsai/rapids-cmake#414. Thank you, edited my original post.

@jameslamb
Copy link
Member Author

I just put up a PR to build the latest version of spdlog on conda-forge: conda-forge/spdlog-feedstock#61.

That'll be helpful whenever we decide to upgrade and drop the patch in rapids-cmake (link).

@jameslamb
Copy link
Member Author

jameslamb commented Apr 23, 2024

I've looked across all the open PRs that add this setting:

conda config --set path_conflict prevent

And see the following common issues.

1: fmt and spdlog headers packaged in librmm

This transaction has incompatible packages due to a shared path.
  packages: conda-forge/linux-aarch64::fmt-10.2.1-h2a328a1_0, file:///tmp/conda-bld-output/linux-aarch64::librmm-24.06.00a16-cuda12_240419_g9dfd9070_16
  path: 'include/fmt/chrono.h'

2: librmm headers packaged in libraft-headers-only

Example:

This transaction has incompatible packages due to a shared path.
  packages: rapidsai-nightly/linux-64::librmm-24.04.00a38-cuda11_240326_ga98931b9_38, rapidsai-nightly/linux-64::libraft-headers-only-24.04.00a93-cuda11_240326_g9637b3c2_93
  path: 'include/rmm/mr/device/arena_memory_resource.hpp'

references:

3: CCCL (cub, libcudacxx, thrust) packages in librmm and libraft-headers-only

This transaction has incompatible packages due to a shared path.
 packages: rapidsai-nightly/linux-64::librmm-24.04.00a38-cuda11_240326_ga98931b9_38, rapidsai-nightly/linux-64::libraft-headers-only-24.04.00a93-cuda11_240326_g9637b3c2_93
 path: 'include/rapids/cub/agent/agent_radix_sort_upsweep.cuh'

4: (CUDA 11.x only) getting the same lib{component}.so from conda-forge/cudatoolkit and nvidia/{component}

This transaction has incompatible packages due to a shared path.
  packages: nvidia/linux-64::cuda-nvtx-11.8.86-0, conda-forge/linux-64::cudatoolkit-11.8.0-h4ba93d1_13
  path: 'lib/libnvToolsExt.so'

5: (CUDA 11.x only) getting the same lib{component}.so from nvidia/{component} and nvidia/{component}-dev

This transaction has incompatible packages due to a shared path.
  packages: nvidia/linux-64::libcusparse-11.7.5.86-0, nvidia/linux-64::libcusparse-dev-11.7.5.86-0, conda-forge/linux-64::cudatoolkit-11.8.0-h4ba93d1_13
  path: 'lib/libcusparse.so.11'

rapids-bot bot pushed a commit to rapidsai/cuxfilter that referenced this issue Apr 26, 2024
Contributes to rapidsai/build-planning#54

Ensures that conda builds will fail in the presence of multiple packages with identical paths

Authors:
  - Jake Awe (https://github.com/AyodeAwe)
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Ajay Thorve (https://github.com/AjayThorve)
  - Ray Douglass (https://github.com/raydouglass)

URL: #583
rapids-bot bot pushed a commit to rapidsai/raft that referenced this issue May 2, 2024
Contributes to rapidsai/build-planning#54.

The `libraft-headers` and `libraft-headers-only` conda packages are bundling `rmm`'s headers. I believe that's because the `librmm` conda package isn't available in the `libraft*` conda build environment, and as a result it's getting `rmm` via CPM (thanks to `rapids-cmake`).

As a result, this project and any that depend on it are seeing warnings like the following in conda builds where `conda`'s `path_conflict` setting is set to `warn` or `prevent` (like #2245):

```text
This transaction has incompatible packages due to a shared path.
  packages: rapidsai-nightly/linux-64::librmm-24.04.00a38-cuda11_240326_ga98931b9_38, rapidsai-nightly/linux-64::libraft-headers-only-24.04.00a93-cuda11_240326_g9637b3c2_93
  path: 'include/rmm/mr/device/arena_memory_resource.hpp'
```

To fix that, this proposes the following changes:

* make `librmm` a `host` dependency of the following conda packages: `libraft-headers-only`, `libraft-headers`

### Benefits of this change

* slightly smaller `libraft-headers` and `libraft-headers-only` conda packages
* reduced risk of runtime and installation issues caused by file clobbering

## Notes for reviewers

### History of changes to the `librmm` dependency for `libraft-headers`:

* both `run` and `host`: #508
* both `run` and `host`, but ignoring its `run_exports`: #1264
* just `run`, but ignoring its `run_exports`: #2102

In particular, #2102 referred to the `host` dependency on `librmm` as "extraneous" but from a packaging perspective, I don't think it is. `librmm` being in `host` means it'll be present in the build environment, which means its headers will be *found* instead of *downloaded*, and therefore not packaging into the `libraft*` conda packages.

### How I tested this

Built all the `raft` conda packages locally from `branch-24.06` and confirmed that they contain `rmm` headers. Then again from this branch and confirmed they were gone.

```shell
docker run \
    --rm \
    --env-file "${PWD}/aws-creds.env" \
    -v $(pwd):/opt/work \
    -w /opt/work \
    -it rapidsai/ci-conda:cuda12.2.2-ubuntu22.04-py3.10-amd64 \
    bash

CI="true" \
  ci/build_cpp.sh

# On 'branch-24.06', this showed the rmm headers being packaged.
# On this branch, they're omitted.
tar --bzip2 -tf \
  /tmp/conda-bld-output/linux-64/libraft-headers-only-24.06.00a50-cuda12_240430_g1e0e2283_50.tar.bz2 \
| grep 'include/rmm' \
| wc -l
```

Also checked the CI logs from `conda-cpp-build` jobs here. On other recent PRs, I see CPM downloading `rmm` ...

```text
-- CPM: Adding package [email protected] (branch-24.06)
```

... and all the `rmm` headers getting installed as part of the `libraft-headers` package

```text
-- Installing: /opt/conda/conda-bld/_h_env_placehold_placehold_..._placeho/include/rmm/cuda_stream.hpp
```

([build link](https://github.com/rapidsai/raft/actions/runs/8904352932))

Here, I see `librmm` coming through via the conda package requirements ...

```text
The following NEW packages will be INSTALLED:
    ...
    librmm:                      24.06.00a17-cuda12_240430_g26fa9ecb_17 rapidsai-nightly
```

... and being used instead of downloads via CPM ...

```text
-- CPM: Using local package [email protected]
```

... and no installation of the `rmm` headers as part of building any `libraft` packages.

([build link](https://github.com/rapidsai/raft/actions/runs/8910675575/job/24470450187?pr=2284))

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Ray Douglass (https://github.com/raydouglass)

URL: #2284
@jameslamb
Copy link
Member Author

I'm pausing on this for now to help out with some of the other initiatives (like #31 and #33).

I believe rapidsai/raft#2284 fixed the issues with raft vendoring rmm headers. Think the right long-term solution for the CCCL + fmt + spdlog stuff is to distribute rapids-core-dependencies with those things in it, which I'd love to help with (cc @robertmaynard) but probably couldn't get done before we enter burndown for 24.06.

@vyasr
Copy link
Contributor

vyasr commented May 7, 2024

Thanks for the update James! I agree that those two are more immediate priorities, but the progress you've already made on here is great.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants