Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added multipart upload, added optional encryption/decryption. #24

Closed
wants to merge 85 commits into from

Conversation

pkaleta
Copy link

@pkaleta pkaleta commented Mar 15, 2013

Summary

  • Added new multipart_upload command with optional number of threads as well as chunk size arguments;
  • Integrated python-gnupg package and added optional encryption/decryption of files using GnuPG's RSA algo implementation.
  • Added optional --encrypt and --decrypt flags for enabling and disabling encryption respectively. The --encrypt is available for upload commands, whereas --decrypt for retrieval;
  • Cleaned up the code a bit;
  • Updated README.md and added .gitignore;

Testing

Sucessfully uploaded and downloaded multiple files with and without encryption using multipart as well as single chunk upload method. Files were of different sizes ranging from 1KB to 3GB.

@subnetmarco
Copy link

+1

@basak
Copy link
Owner

basak commented Mar 20, 2013

Hi,

Thanks for your work! Having encryption and multipart uploads is really useful.

Some comments:

  1. Please can you add a copyright notice to gpg.py, copy the licence over from glacier.py and confirm here that you wrote gpg.py yourself, that you own it, and are happy to licence it under these terms? I'm hoping to get glacier-cli into Debian soon so I need the copyright and licensing situation to be beyond reproach.
  2. I think 68af3c1 breaks handling of - to mean stdin. Handling for this is built in to argparse, so pulling it out breaks it. I don't mind if you want to reimplement this or go back to using argparse, but I believe that git-annex uses this functionality so I can't break it.
  3. Encryption and decryption must both be disabled by default in order to maintain backwards compatibility and compatibility with other tools.
  4. (Minor) glacier archive retrieve uses --multipart-size, so glacier archive multipart-upload should also use --part-size for consistency. It would be even better if, instead of adding a new subcommand, specifying --multipart-size automatically enabled multipart mode.
  5. (Minor) I'd prefer multipart-upload over multipart_upload. This is because other tools that use subcommands (eg. git) do it this way, so I think users would find this easier to remember.
  6. (Minor) In 389a1b8, you've added a function called most_recent_job but it only parses a date and doesn't do anything more, so the function doesn't bear any relation to the name you've given it. Can you give it a more appropriate name, please?

I appreciate the refactoring, but it's really hard to review when intermingled in the middle of other commits. I've managed to review this OK and I appreciate your work, but it would be great if in the future you could submit a separate pull request for each feature you add, and perhaps one for any refactoring you want to do. That would make it much easier for me to review, I wouldn't have to block everything on issues that only affect one feature and I'd be able accept your changes much faster.

If you could fix these issues then I'll merge straight away. Otherwise if you can just do the first two, I'll do the others and merge when I can.

So this pull request is currently blocked on licensing and breaking the stdin - interpretation feature.

Thanks!

@sakoht
Copy link

sakoht commented Jan 14, 2014

@basak I've pushed up the changes we've made to glacier-cli, and I believe I've addressed all of your concerns from last Spring.

Additionally, I took the extra mulitpart_upload command we previously added, just turned it into an optional flag on upload called --concurrent. Since boto will automatically do multi-part uploads, and the distinctive thing about the flag is that it uses threads rather than doing the upload serially, and also retries individual pieces should they fail.

@sakoht
Copy link

sakoht commented Jan 14, 2014

I should note that we've been able to upload archives as big as 40GB with the new --concurrent option. In every case some piece fails to upload, but leaning on the automatic per-part retry in boto took us from 100% failing to 100% succeeding, building on what you already had here. Thanks.

@sakoht
Copy link

sakoht commented Feb 12, 2014

Hey @basak, just wanted to ping you on this.

It should be 100% supplementary changes, not affecting existing functionality. Can you let us know if it needs anything else?

@sakoht
Copy link

sakoht commented Feb 21, 2014

=> @basak

@aspiers
Copy link

aspiers commented Apr 24, 2015

What's the latest on this?

@pkaleta
Copy link
Author

pkaleta commented Apr 24, 2015

No response from @basak for over a year now. Not sure if he's still maintaining this project. You can try using https://github.com/pkaleta/glacier-cli instead.

@basak
Copy link
Owner

basak commented Apr 24, 2015

Sorry, this is near impossible for me to review:

From my initial review: "I appreciate the refactoring, but it's really hard to review when intermingled in the middle of other commits. I've managed to review this OK and I appreciate your work, but it would be great if in the future you could submit a separate pull request for each feature you add, and perhaps one for any refactoring you want to do."

@aspiers
Copy link

aspiers commented Apr 24, 2015

I entirely sympathesise with @basak's position here - I'm sure this PR is full of many great improvements, but it's so huge that it seems to me like it would be a nightmare to review. @pkaleta is there any chance you could break it into smaller more manageable chunks, please?

@basak
Copy link
Owner

basak commented Apr 24, 2015

I do apologise for taking so long to say this. That is my fault - I should have asked as soon as you (pkaleta) updated your request. When I first saw the update, I thought "OK, I'll try and take this and break it up, figure it out and commit what I can". But it was such a large task I didn't get to it and forgot all about it.

I'd really to get pull requests in small manageable pieces with mistakes/corrections rebased out. Like the Linux kernel does it - it's easiest to review that way.

@aspiers
Copy link

aspiers commented Apr 25, 2015

Hmm, I probably asked the wrong person here - @sakoht is there any chance you could break this into smaller more manageable chunks, please?

@sakoht
Copy link

sakoht commented Apr 26, 2015

Adam I'd love to but I've left the company that needed it, and don't have the bandwidth to jump back currently. Piotr is there (before and after me), and is more familiar with the latest state of things, I believe.

We aimed to have the update be very thorough before submitting. Down side is the size, sadly.

On Apr 25, 2015, at 4:48 AM, Adam Spiers [email protected] wrote:

Hmm, I probably asked the wrong person here - @sakoht is there any chance you could break this into smaller more manageable chunks, please?


Reply to this email directly or view it on GitHub.

@aspiers
Copy link

aspiers commented Apr 26, 2015

Thanks a lot for the update @sakoht!

@pkaleta pkaleta closed this Oct 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants