Upload the package zip in chunks for GHA #1043

Open · wants to merge 27 commits into main

Conversation

quyykk (Contributor) commented Apr 28, 2023

This fixes microsoft/vcpkg#31072 and microsoft/vcpkg#31132.

It changes the upload to the cache so that it happens in chunks of 450MB instead of all at once, because GitHub rejects uploads bigger than ~500MB (an educated guess) to its cache.

The implementation is a bit hacky, but I haven't found a better solution. Originally it split the file into multiple 450MB chunk files on disk; it now reads 450MB chunks from the file at a time.
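For illustration only, the chunked approach described above looks roughly like the sketch below. `upload_chunk` is a hypothetical helper standing in for the PR's actual curl-based upload path, and the offset/protocol details are assumptions, not the real GitHub cache API.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper standing in for the PR's curl-based upload of one chunk.
// How the offset is communicated to the cache service is not shown here.
static bool upload_chunk(const std::string& url, const char* data, std::size_t size, std::size_t offset)
{
    std::printf("would upload %zu bytes at offset %zu to %s\n", size, offset, url.c_str());
    (void)data;
    return true;
}

// Read the archive in 450MB pieces and upload each piece separately,
// staying under GitHub's (observed) ~500MB per-request cache limit.
static bool upload_in_chunks(const std::string& url, const std::string& file)
{
    constexpr std::size_t chunk_size = 450ull * 1024 * 1024; // 450MB, as in the PR
    std::FILE* f = std::fopen(file.c_str(), "rb");
    if (!f) return false;

    std::vector<char> buffer(chunk_size);
    std::size_t offset = 0;
    bool ok = true;
    while (ok)
    {
        const std::size_t bytes_read = std::fread(buffer.data(), 1, buffer.size(), f);
        if (bytes_read == 0) break; // end of file
        ok = upload_chunk(url, buffer.data(), bytes_read, offset);
        offset += bytes_read;
    }
    std::fclose(f);
    return ok;
}
```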

BillyONeal (Member)

Sorry for the noise, trying to verify that the transition over to GitHub Actions for the PR bot is working...

Outdated review threads on src/vcpkg/archives.cpp (2) and src/vcpkg/base/downloads.cpp (3) were resolved.
quyykk requested a review from BillyONeal on May 23, 2023, 19:59.
autoantwort (Contributor)

> I haven't found a better solution: It splits the file into multiple 450MB chunk files on disk.

You could pass the data via stdin.
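For context, "passing the data via stdin" means the chunk never has to touch the disk: curl reads the request body from its stdin when invoked with `--upload-file -`. Below is a minimal sketch using POSIX `popen`; the helper name and the exact curl invocation are illustrative, not the PR's real code.

```cpp
#include <stdio.h>   // POSIX popen/pclose
#include <cstddef>
#include <string>
#include <vector>

// Write one in-memory chunk to curl's stdin instead of a temporary file.
// `--upload-file -` tells curl to read the upload body from stdin.
static bool upload_chunk_via_stdin(const std::string& url, const std::vector<char>& chunk)
{
    const std::string cmd = "curl -sS --upload-file - \"" + url + "\"";
    FILE* pipe = popen(cmd.c_str(), "w"); // "w" connects to the child's stdin
    if (!pipe) return false;

    const std::size_t written = fwrite(chunk.data(), 1, chunk.size(), pipe);
    const int status = pclose(pipe);
    return written == chunk.size() && status == 0;
}
```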

quyykk (Contributor, Author) commented Jun 4, 2023

> You could pass the data via stdin.

I could but I honestly have no idea how.

autoantwort (Contributor)

> I could but I honestly have no idea how.

There is now #1134 :)

quyykk (Contributor, Author) commented Aug 30, 2023

I've switched it to use stdin instead, and it still works 😄. This PR is ready for review again.

quyykk (Contributor, Author) commented Sep 7, 2023

Seems like the formatting errors are caused by GitHub upgrading their clang-format, and not my fault. 😄

Three more outdated review threads on src/vcpkg/base/downloads.cpp were resolved.
autoantwort (Contributor)

I always build the format target locally as well, so the formatting is always right (as long as you use clang-format 16).

```cpp
                           StringView(buffer.data(), bytes_read));
if (!res.get() || *res.get() != 0 || (code >= 100 && code < 200) || code >= 300)
{
    return msg::format_error(msgCurlFailedToPutHttp,
```
A contributor commented on this snippet:

Theoretically not completely true, but I don't care.

quyykk (Contributor, Author) commented Sep 14, 2023

@BillyONeal friendly ping 😄

DerZade commented Jan 22, 2024

What is the status of this? 🤔

quyykk (Contributor, Author) commented Mar 11, 2024

I fixed the merge conflicts that accumulated. Just needs someone from the team to review 😄

```cpp
base_cmd.string_arg(url);

auto file_ptr = fs.open_for_read(file, VCPKG_LINE_INFO);
std::vector<char> buffer(chunk_size);
```
A contributor commented on this snippet:

I see that the default size is 450 MB, with no limit.
Is there an alternative to reading it into a buffer first just to forward it to another command's stdin?
(Remembering all those Raspi people who struggle to build vcpkg due to low memory...)

quyykk (Contributor, Author) replied:

The original way was to split the file on disk, but I think that's pretty hacky.

But I can decrease the buffer size. I'm not sure what you mean by limit.

Are people really running a GitHub Runner server on a Raspi? lmao 😄

The contributor replied:

> Are people really running a GitHub Runner server on a Raspi lmao

Well, this is only the tool uploading the artifacts. Caching large artifacts is more important when build machine power is low.

quyykk (Contributor, Author) replied:

Ah okay. What buffer size do you think I should use? I can't make it really small, or else the upload will be way slower than it would otherwise be.

The contributor replied:

I don't know. I see the trade-offs and barriers.

  • Can't make curl read chunks directly from (within) a large file.
  • Can't feed (the vcpkg function running) curl with piecewise input, i.e. an IO buffer smaller than the network chunks (a sketch of this follows below).

Changing curl (the tool) is out of scope here.
If the interface remains running curl instead of calling into libcurl, then it would be best to fix the second point.
If this is too intrusive, it might help to give the user a way to change the buffer size, or at least to turn off the buffering in case of trouble.
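One possible shape for that second point, sketched with illustrative names only: keep the network chunk large (one 450MB chunk per curl invocation) but feed it to curl's stdin through a small, fixed-size IO buffer, so peak memory stays low on small machines.

```cpp
#include <stdio.h>   // POSIX popen/pclose
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Stream up to `chunk_size` bytes from `file` into curl's stdin using a small
// reusable IO buffer, so the IO buffer size is independent of the chunk size.
static bool stream_chunk_to_curl(FILE* file, std::size_t chunk_size, const std::string& url)
{
    const std::string cmd = "curl -sS --upload-file - \"" + url + "\"";
    FILE* pipe = popen(cmd.c_str(), "w");
    if (!pipe) return false;

    std::vector<char> io_buffer(1 << 20); // 1 MB IO buffer; could be made configurable
    std::size_t remaining = chunk_size;
    bool ok = true;
    while (ok && remaining != 0)
    {
        const std::size_t want = std::min(remaining, io_buffer.size());
        const std::size_t got = fread(io_buffer.data(), 1, want, file);
        if (got == 0) break; // end of file; the last chunk may be short
        ok = fwrite(io_buffer.data(), 1, got, pipe) == got;
        remaining -= got;
    }
    const int status = pclose(pipe);
    return ok && status == 0;
}
```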

```cpp
std::size_t bytes_read = 0;
for (std::size_t i = 0; i < file_size; i += bytes_read)
{
    bytes_read = file_ptr.read(buffer.data(), sizeof(decltype(buffer)::value_type), chunk_size);
```
A team member commented on this snippet:

I think a whole curl process launch per chunk like this is kind of a problem. I don't see reasonable ways to achieve the effect this PR wants without linking with libcurl.

They added:

I suppose it could still be done like this, but the chunk sizes would have to be bigger than makes sense to hold in a single contiguous memory buffer; there would need to be more than one read/write per curl launch, and so on. That sounds like a lot more work than linking with libcurl.
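For comparison, here is a minimal sketch of what linking with libcurl could buy: a single in-process upload whose body is streamed from the file by a read callback, with no per-chunk process launches and no large staging buffer. The URL and any required headers are omitted, and the helper names are assumptions, not vcpkg code.

```cpp
#include <cstdio>
#include <curl/curl.h>

// Read callback: libcurl pulls the request body straight from the open file,
// so memory use is bounded by libcurl's own small internal buffer.
static size_t read_from_file(char* dest, size_t size, size_t nitems, void* userdata)
{
    return std::fread(dest, size, nitems, static_cast<std::FILE*>(userdata));
}

static bool upload_with_libcurl(const char* url, std::FILE* file, curl_off_t file_size)
{
    // Assumes curl_global_init(CURL_GLOBAL_DEFAULT) was called once at startup.
    CURL* curl = curl_easy_init();
    if (!curl) return false;

    curl_easy_setopt(curl, CURLOPT_URL, url);
    curl_easy_setopt(curl, CURLOPT_UPLOAD, 1L);                   // PUT-style upload
    curl_easy_setopt(curl, CURLOPT_READFUNCTION, read_from_file); // stream body from the file
    curl_easy_setopt(curl, CURLOPT_READDATA, file);
    curl_easy_setopt(curl, CURLOPT_INFILESIZE_LARGE, file_size);

    const CURLcode rc = curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    return rc == CURLE_OK;
}
```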

Neumann-A (Contributor)

#1422 pulls in libcurl.

Neumann-A mentioned this pull request on Jun 10, 2024.