Switch Conda packages to .conda (instead of .tar.bz2) #98

Open
jakirkham opened this issue Aug 28, 2024 · 6 comments · May be fixed by rapidsai/ci-imgs#176

@jakirkham
Member

jakirkham commented Aug 28, 2024

Currently RAPIDS builds and publishes packages using the .tar.bz2 format. The newer .conda format revamps this and makes a few important changes:

  1. Use a top-level uncompressed .zip file
  2. Means all .conda packages can be renamed to .zip and extracted
  3. As .zip files allow random access, specific items can be retrieved
  4. Metadata is placed in the top-level
    1. Contains info about the .conda metadata itself
    2. Package metadata (how it was built, dependencies needed, etc.)
    3. Compressed package contents (currently using Zstd)
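To make the layout above concrete, here is a minimal sketch using only the Python standard library. It builds a stand-in .conda file and reads it back as a plain .zip. The member names (metadata.json, info-*.tar.zst, pkg-*.tar.zst) follow the .conda layout as I understand it; the package name `demo-1.0-0`, the helper names, and the placeholder payload bytes are made up for illustration, so this is not a real package.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def describe_conda_package(path):
    """Return (member names, parsed metadata.json) for a .conda file."""
    with zipfile.ZipFile(path) as zf:
        names = zf.namelist()
        # metadata.json sits at the top level of the zip, so it can be
        # read directly without touching the compressed payloads.
        metadata = json.loads(zf.read("metadata.json"))
    return names, metadata


def make_demo_package(directory):
    """Build a stand-in .conda file (hypothetical package demo-1.0-0).

    The .tar.zst members hold placeholder bytes, not real Zstd archives.
    """
    path = Path(directory) / "demo-1.0-0.conda"
    # ZIP_STORED: the outer container itself is not compressed.
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr("metadata.json", json.dumps({"conda_pkg_format_version": 2}))
        zf.writestr("info-demo-1.0-0.tar.zst", b"placeholder")
        zf.writestr("pkg-demo-1.0-0.tar.zst", b"placeholder")
    return path


with tempfile.TemporaryDirectory() as tmp:
    names, metadata = describe_conda_package(make_demo_package(tmp))
    print(names)
    print(metadata)
```

Pointing `describe_conda_package` at a real .conda file should work the same way. Unpacking the inner .tar.zst members would need a Zstd library, which the standard library did not provide at the time of writing.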

This can help with solve times (no need to decompress the .tar.bz2 to find metadata first). It can also help with download sizes & times (we noticed a ~30% reduction in the size of the legacy cudatoolkit package when conda-forge made this transition).


To make the change, we would simply need to update our condarc at build time to include

conda_build:
  pkg_format: '2'

This can also be done with the following command:

conda config --set conda_build.pkg_format '2'

Also, as pointed out by James in a review comment ( rapidsai/ci-imgs#176 (review) ), there are a few places where .tar.bz2 shows up in the RAPIDS org. Of these, some were:

  • Data files (so not Conda packages)
  • Archived repos (so not active)
  • Dormant repos (haven't seen activity in years; candidates for archival)
  • Forks of upstream repos (in some cases these overlap with the above categories)

Decided to skip the above. In the event we want to revitalize one of these projects, likely this is one of many changes that will be needed.

Of the remainder saw:

  • Infrastructure tools
  • Shared CI content
  • One-off projects

Made a best effort to update these. With some one-off projects, they don't run CI yet; so, we can likely move ahead without those
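For anyone rechecking a repo, a search like the one described above could be sketched roughly as follows. This is only an illustration, not the tooling actually used for the audit: the function name `find_tar_bz2_references` and the demo file `ci/upload.sh` are hypothetical, and it assumes a local checkout of the repo.

```python
import tempfile
from pathlib import Path


def find_tar_bz2_references(root, needle=".tar.bz2"):
    """Yield (relative path, line number, line) for text lines containing `needle`."""
    root = Path(root)
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Skip directories and anything under a .git directory.
        if not path.is_file() or ".git" in rel.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                yield rel.as_posix(), lineno, line.strip()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical CI script that still lists packages by the old extension.
    script = Path(tmp) / "ci" / "upload.sh"
    script.parent.mkdir()
    script.write_text("set -e\nls conda-bld/linux-64/*.tar.bz2\n", encoding="utf-8")
    hits = list(find_tar_bz2_references(tmp))
    print(hits)
```

Hits still need a manual look, since some will be data files rather than Conda packages, as noted above.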

@jakirkham jakirkham added the epic label Aug 28, 2024
@jakirkham
Member Author

It is worth noting that XGBoost Conda packages in RAPIDS are already built in the .conda format, so we are already making some use of these today.

@jakirkham
Member Author

Think this is all we need: rapidsai/ci-imgs#176

Would be good if others can confirm though

@jameslamb
Member

> Think this is all we need

We'll also need to update any places where *.tar.bz2 or similar is being used to list conda packages. See rapidsai/ci-imgs#176 (review)
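As a rough sketch of what that update could look like in a Python helper: match both extensions rather than only `*.tar.bz2`, so listings keep working during and after the switch. The function name `list_conda_packages` and the demo file names are hypothetical and are not taken from any RAPIDS repo.

```python
import tempfile
from pathlib import Path

# Both the legacy and the newer package extensions.
CONDA_PACKAGE_SUFFIXES = (".conda", ".tar.bz2")


def list_conda_packages(directory):
    """Return the sorted package files in `directory`, in either format."""
    return sorted(
        p
        for p in Path(directory).iterdir()
        if p.is_file() and p.name.endswith(CONDA_PACKAGE_SUFFIXES)
    )


with tempfile.TemporaryDirectory() as tmp:
    # Demo channel directory with one package of each format plus an index file.
    for name in ("a-1.0-0.conda", "b-1.0-0.tar.bz2", "repodata.json"):
        (Path(tmp) / name).touch()
    found = [p.name for p in list_conda_packages(tmp)]
    print(found)
```

The same idea applies to shell globs in CI scripts: anything that lists only `*.tar.bz2` would silently find nothing once builds emit .conda files.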

@jakirkham
Member Author

Have made a best effort to submit PRs where appropriate, and included notes on this above. Would be good if someone else can recheck whether we got everything we deem relevant.

@jameslamb
Member

I think you got everything, and I've reviewed them all: rapidsai/ci-imgs#176 (comment)

@jakirkham
Member Author

jakirkham commented Sep 1, 2024

Think we have what we need in PR: rapidsai/ci-imgs#176

AIUI gpuci-tools should no longer be used. Though James mentioned offline there might still be a few spots where it is used

If we do want to update gpuci-tools, we have PR: rapidsai/gpuci-tools#37

Will double check next week (after the long weekend) if we want to include that PR too

We concluded the other PRs still open don't need to be merged, as they target projects planned for archival.
