[BUG] libcudf conda packages are shipping dependencies in the package #13230

Open
bdice opened this issue Apr 26, 2023 · 3 comments
Assignees
Labels
bug Something isn't working CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code.

Comments

@bdice
Contributor

bdice commented Apr 26, 2023

Describe the bug
Currently, libcudf conda packages ship libraries and headers that belong to libcudf's dependencies.

libcudf's conda package is 548 MB (all reported sizes are unzipped). It currently ships libraries/headers from:

  • kvikio (< 1 MB)
  • libcudacxx (5.2 MB)
  • nvbench (4.1 MB)
  • nvcomp (46 MB)
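
To put those numbers in proportion, here is a small Python sketch that uses only the sizes reported above. kvikio is listed as "< 1 MB", so the sketch treats it as an upper bound of 1 MB, which is an assumption.

```python
# Unzipped sizes in MB, taken from the list above.
# kvikio is reported as "< 1 MB"; 1.0 is used as an upper bound (assumption).
TOTAL_MB = 548.0
repackaged_mb = {
    "kvikio": 1.0,
    "libcudacxx": 5.2,
    "nvbench": 4.1,
    "nvcomp": 46.0,
}


def share_of_total(size_mb, total_mb=TOTAL_MB):
    """Return size_mb as a percentage of total_mb."""
    return 100.0 * size_mb / total_mb


repackaged_total = sum(repackaged_mb.values())

# Print the largest contributors first, then the combined total.
for name, size in sorted(repackaged_mb.items(), key=lambda kv: -kv[1]):
    print(f"{name:<12}{size:>6.1f} MB  {share_of_total(size):5.1f}%")
print(f"{'total':<12}{repackaged_total:>6.1f} MB  {share_of_total(repackaged_total):5.1f}%")
```

Under that assumption, the four dependencies add up to at most about 56 MB, roughly 10% of the package, with nvcomp alone accounting for a little over 8%.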

Expected behavior
libcudf conda packages should not ship their dependencies but should instead list other conda packages as dependencies. This will keep files that originate outside libcudf out of libcudf packages.
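
One way to check this expectation is to compare the file list of a libcudf package against the file lists of its dependencies' own packages and report any overlap. The following is a minimal Python sketch of that idea, not an existing RAPIDS tool. The function name `find_clobbered_files` and all example paths are hypothetical; real file lists would have to come from each package's manifest.

```python
def find_clobbered_files(package_files, dependency_files):
    """Map each dependency name to the sorted paths that the package also ships.

    package_files: iterable of relative paths shipped by the package under review.
    dependency_files: dict of dependency name -> iterable of relative paths
        shipped by that dependency's own conda package.
    """
    shipped = set(package_files)
    overlaps = {}
    for dep_name, files in dependency_files.items():
        common = sorted(shipped & set(files))
        if common:
            overlaps[dep_name] = common
    return overlaps


# Hypothetical file lists for illustration only.
libcudf_files = [
    "include/cudf/example.hpp",
    "include/kvikio/example.hpp",
    "lib/libcudf.so",
]
libkvikio_files = ["include/kvikio/example.hpp"]

overlaps = find_clobbered_files(libcudf_files, {"libkvikio": libkvikio_files})
print(overlaps)
```

An empty result would mean that the package ships nothing that one of the listed dependencies also provides.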

Specific proposals for resolution

kvikio (easy)
kvikio should be easy to fix. RAPIDS already produces conda packages of kvikio, so we just need to use those rather than letting kvikio be found via CPM and repackaged in libcudf.

✔️ Resolved in: #13231

libcudacxx (medium difficulty)
libcudacxx is part of CCCL, along with Thrust and CUB. Historically, librmm has shipped most of the CCCL headers in include/rapids because CPM finds them during the librmm build and they get repackaged there, which is not ideal. libcudacxx is the exception: librmm does not ship it as part of its RAPIDS-vendored CCCL because librmm did not need it, only libcudf did. I suspect we're repackaging libcudacxx in many RAPIDS packages right now. The fix here is to adopt rapids-core-dependencies across all of RAPIDS, which should be mostly ready for use, if I recall correctly.

nvbench (very difficult)
Ideally, nvbench should be its own conda package, but we often require patched versions. Therefore, we should consider moving nvbench headers to a special subdirectory include/rapids as we do for CCCL (see rapidsai/rapids-cmake#98). We should also do a more granular review to decide if (and where) we want to be shipping files like bin/nvbench-ctl or lib/objects-Release/nvbench.main/main.cu.o.
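
As a starting point for that granular review, the packaged nvbench files could be grouped by top-level directory so that binaries, object files, and headers can be assessed separately. This is a minimal Python sketch under stated assumptions: `group_by_top_level` is a hypothetical helper, and the header path in the example is a placeholder. Only the `bin/` and `lib/` paths are taken from the text above.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_top_level(paths):
    """Group relative package paths by their first path component."""
    groups = defaultdict(list)
    for path in paths:
        top = PurePosixPath(path).parts[0]
        groups[top].append(path)
    return {top: sorted(files) for top, files in groups.items()}


# The first two paths are the ones named above; the header path is a
# hypothetical placeholder.
nvbench_files = [
    "bin/nvbench-ctl",
    "lib/objects-Release/nvbench.main/main.cu.o",
    "include/nvbench/example.cuh",
]

for top, files in sorted(group_by_top_level(nvbench_files).items()):
    print(top)
    for f in files:
        print("  " + f)
```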

nvcomp (medium difficulty)
nvcomp should ship in a separate conda package (probably on either the rapidsai or nvidia channel), which we then make a dependency of libcudf. It is the largest component being repackaged in libcudf (46 MB of 548 MB, a little over 8% of libcudf's size).

✔️ Resolved in: #13566

cc: @vuule (with whom I discussed this)

@bdice bdice added bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. CMake CMake build issue conda labels Apr 26, 2023
@bdice bdice self-assigned this Apr 26, 2023
@jakirkham
Member

cc @madsbk

rapids-bot bot pushed a commit that referenced this issue May 9, 2023
…dency. (#13231)

Provides a partial solution to #13230. During conda builds of `libcudf`, `libkvikio` is currently being fetched by CPM rather than supplied by a conda dependency. This means `libkvikio` headers are being shipped as part of `libcudf`'s packages, which is not desirable.

I also added `libcufile[-dev]` as an explicit dependency of libcudf. Note that this is only available on linux64 (amd64), not aarch64 (arm64). We should always make the cuFile library available at build time for conda packages on amd64.

The impacts of this change are:
- `libcudf` conda packages will no longer ship `libkvikio` headers (those headers will instead be supplied by `libkvikio` packages). This reduces the size of libcudf and prevents clobbering files from `libkvikio`.
- `libcudf` will have a dependency on `libcufile` from the `nvidia` channel on `linux64` (but not `aarch64`, since libcufile packages currently do not exist for `aarch64`).

This change does not impact builds outside of conda-build.

Authors:
  - Bradley Dice (https://github.com/bdice)
  - Vukasin Milovanovic (https://github.com/vuule)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Vukasin Milovanovic (https://github.com/vuule)

URL: #13231
@jakirkham
Member

KvikIO also uses nvCOMP and could benefit from using a package as well. Filed an issue (rapidsai/kvikio#242) to track that.

@bdice bdice mentioned this issue Jun 14, 2023
3 tasks
rapids-bot bot pushed a commit that referenced this issue Jun 26, 2023
This PR uses conda-forge packages of `nvcomp` rather than fetching a tarball. This means that the nvcomp binary should not be shipped in the libcudf conda package, but is instead listed as a dependency. This will reduce libcudf's conda package size.

Addresses part of #13230.

Authors:
  - Bradley Dice (https://github.com/bdice)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)
  - https://github.com/jakirkham

URL: #13566
rapids-bot bot pushed a commit to rapidsai/ucxx that referenced this issue Jul 5, 2023
Currently conda builds are fetching `rmm` from CPM, rather than using conda packages. This PR aims to use conda for as many dependencies as possible to avoid repackaging and clobbering issues.

Resolves #20.

Related to #22, #54, similar to rapidsai/cudf#13230

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - https://github.com/jakirkham
  - Peter Andreas Entschev (https://github.com/pentschev)
  - Ray Douglass (https://github.com/raydouglass)

URL: #64
@vyasr
Contributor

vyasr commented May 15, 2024

@bdice should we move this issue to rapidsai/build-planning#54 for a more centralized discussion?

Projects
Status: In Progress