Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drivers: Refactor package as a more generic and performant library #2223

Merged
merged 9 commits into from
May 19, 2022

Conversation

victorges
Copy link
Member

@victorges victorges commented Feb 1, 2022

What does this pull request do? Explain your changes. (required)
This is to make some improvements to the drivers package so that it can also be used
as a library from other projects, specifically the task-runner for VOD tasks. It is currently
using the drivers library for a platform-agnostic way of accessing Object Stores for reading
and writing VOD assets.

The greatest advantage of that is that we can just use the same Object Store API from
livepeer-com to declare where assets should be stored as well as stream recordings,
keeping the whole interface leaner.

Specific updates (required)

  • Make the Size property optional in our File representation. It is optional in the AWS SDK and is not returned in all implementations of it, like previous versions of Filebase, so it was required at some point (not anymore) but I still think it makes sense to keep it.
  • Make the drivers receive an io.Reader instead of the whole buffered file. The reader is compatible with clients that have the whole bytes in memory, but requiring all clients to buffer the entire contents in memory is not ideal. This would have made the VOD tasks much more inefficient, needing to read the entire files in memory before sending to the OS.
  • Use an S3 Uploader for uploading files to S3. This avoids the need of buffering the files before uploading, since the simpler upload SDK requires the file to be an io.ReadSeeker due to hashing purposes. The uploader is the suggested solution in the open issue in their SDK repo.

How did you test each of these updates (required)

  • Fixed and then ran/passed all tests in go-livepeer
  • Used the drivers lib in the task-runner for VOD and already deployed it in production, it's working great
  • Did a bunch of performance tests with Storj folks to come up with the optimal configurations for the S3 uploader (which end up being good both for Storj and for GCP)

Does this pull request close any open issues?
Part of livepeer/task-runner#14

Checklist:

(does it need a changelog update?)

@victorges victorges changed the base branch from master to vg/fix/drivers-file-size February 1, 2022 16:19
@victorges victorges changed the base branch from vg/fix/drivers-file-size to master March 29, 2022 23:13
@victorges victorges force-pushed the vg/feat/drivers-lib branch 3 times, most recently from a33e9b4 to 9f8c762 Compare March 29, 2022 23:25
@victorges victorges changed the title Vg/feat/drivers lib drivers: Export drivers package as a more generic and performant library Mar 29, 2022
@victorges victorges changed the title drivers: Export drivers package as a more generic and performant library drivers: Refactor package as a more generic and performant library Mar 29, 2022
@victorges victorges marked this pull request as ready for review March 29, 2022 23:34
@victorges victorges requested review from yondonfu and leszko and removed request for darkdarkdragon March 29, 2022 23:34
Copy link
Contributor

@leszko leszko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added 2 comments, other than that LGTM 👍

Also remember to:

  • Update CHANGELOG_PENDING.md
  • Consider squashing some commits before merging

drivers/s3.go Outdated Show resolved Hide resolved
drivers/s3.go Outdated Show resolved Hide resolved
The Filebase S3-compatible API does not return a
file size in the response for files greater than 255 bytes.

That caused a panic on s3.go:267 when reading recordings.

With this change, the problem won't happen anymore cause
we don't even try to dereference the ContentLength, but
hold it as a nilable pointer that should be checked wherever
it is used (right now: nowhere).
We want to use this lib for saving large files to ObjectStore
as well for the VOD tasks. Since they can be really large, we
don't want to have to buffer the whole files in memory for
saving. So changing this API to receive a Reader instead makes
more sense.
Avoid the need to buffer the whole input in memory before saving.
Will also be more performant for large files.
Increase the concurrency and the part size so the uploads
are more performant in our production environment.

These are also more compatible with Storj's configuration
which saves files in 64MB chunks on the decentralized network,
so that part size also works well for them. But it still improves
upload performance to gcloud in around 10x as well.
@codecov
Copy link

codecov bot commented May 19, 2022

Codecov Report

Merging #2223 (1f5892b) into master (30b7330) will decrease coverage by 0.06086%.
The diff coverage is 33.33333%.

❗ Current head 1f5892b differs from pull request most recent head 5c28f2f. Consider uploading reports for the commit 5c28f2f to get more accurate results

Impacted file tree graph

@@                 Coverage Diff                 @@
##              master       #2223         +/-   ##
===================================================
- Coverage   54.93891%   54.87805%   -0.06086%     
===================================================
  Files             93          93                 
  Lines          19316       19352         +36     
===================================================
+ Hits           10612       10620          +8     
- Misses          8115        8140         +25     
- Partials         589         592          +3     
Impacted Files Coverage Δ
core/orchestrator.go 78.75723% <0.00000%> (ø)
drivers/gs.go 10.23622% <0.00000%> (-0.16379%) ⬇️
drivers/s3.go 11.08179% <27.27273%> (+0.94093%) ⬆️
drivers/local.go 50.00000% <30.00000%> (-1.11111%) ⬇️
server/segment_rpc.go 78.66667% <50.00000%> (-0.08519%) ⬇️
drivers/drivers.go 27.96610% <66.66667%> (ø)
drivers/overwrite_queue.go 84.48276% <100.00000%> (ø)
drivers/session_mock.go 52.63158% <100.00000%> (ø)
server/broadcast.go 78.24497% <100.00000%> (ø)
server/mediaserver.go 66.13672% <100.00000%> (ø)
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 30b7330...5c28f2f. Read the comment docs.

It returns an error so had to change a few
signatures.
Also add comments with context about how those numbers came up.
@victorges victorges merged commit 13165dd into master May 19, 2022
@victorges victorges deleted the vg/feat/drivers-lib branch May 19, 2022 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants