Have an option to verify the locally pulled files #1368

Closed
1 task done
stmlange opened this issue Apr 24, 2024 · 17 comments
Labels
enhancement New feature or request stale Inactive issues or pull requests

Comments

@stmlange

What is the version of your ORAS CLI

1.1.0

What would you like to be added?

Maybe I'm missing it, but assume I have a local copy of some pulled data. Is there an option to run a SHA-256 check or something similar to verify that the local copy matches the artifacts listed in the manifest?

Why is this needed for ORAS?

The manifest can have multiple digest encodings, which can make it very tricky to manually verify that the local copy is equal to the remote artifact.

Are you willing to submit PRs to contribute to this feature?

  • Yes, I am willing to implement it.
@stmlange stmlange added enhancement New feature or request triage New issues or PRs to be acknowledged by maintainers labels Apr 24, 2024
@qweeah
Contributor

qweeah commented May 7, 2024

@stmlange Can you kindly explain your scenario and why the verification is needed?

@qweeah qweeah removed the triage New issues or PRs to be acknowledged by maintainers label May 7, 2024
@stmlange
Author

stmlange commented May 7, 2024

@qweeah Thanks for reaching out. The main reason I created this ticket is that there doesn't seem to be an (easy) option to verify/check whether the locally pulled artifacts are "equal" to the remote artifact.

Try to answer the question: is what I have locally really what was published to remote?

Consider for example maven/gradle that publish dedicated sha1 and md5 hashsums so one can download the hashsum and verify somehow that the published thing is "correct"/"equal" to the local variant.

With oras one can have multiple digest variants encoded (https://github.com/opencontainers/image-spec/blob/main/descriptor.md#digests).

A digest can be sha256:6c3c624b58dbbcd3c0dd82b4c53f04194d1247c6eebdaab7c610cf7d66709b3b or sha512:401b09eab3c013d4ca54922bb802bec8fd5318192b0a75f201d8b372742... or whatever ORAS supports. Because multiple hash algorithms are supported, it is not trivial to manually check whether the downloaded artifact is actually what was published.

Running oras pull multiple times actually seems to re-download the artifact. So it may be that oras pull quietly checks those digests (and potentially fails if the download was not successful), but re-downloading is a waste of network resources.
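To make the request concrete, here is a minimal sketch of the per-file check being asked for: split an OCI-style digest into its algorithm and hex parts, then hash the local file with the matching tool. The function name verify_oci_digest is hypothetical (it is not an ORAS command), and the sketch assumes GNU coreutils (sha256sum, sha512sum) and only the two registered algorithms.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of ORAS: verify a local file against an
# OCI-style digest of the form "<algorithm>:<hex>".
# Assumes GNU coreutils (sha256sum / sha512sum) are installed.
verify_oci_digest() {
  local file=$1 digest=$2
  local algo=${digest%%:*}   # text before the first ':'
  local want=${digest#*:}    # text after the first ':'
  local tool got

  case "$algo" in
    sha256) tool=sha256sum ;;
    sha512) tool=sha512sum ;;
    *) echo "unsupported digest algorithm: $algo" >&2; return 2 ;;
  esac

  # Hash via stdin so the output is always "<hex>  -".
  got=$("$tool" < "$file" | cut -d ' ' -f 1)
  [ "$got" = "$want" ]
}
```

A call such as verify_oci_digest ./provenance.json "sha256:<hex from the manifest>" returns 0 on a match, 1 on a mismatch, and 2 for an algorithm the sketch does not handle. No network access is involved, so nothing is re-downloaded.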

@qweeah
Contributor

qweeah commented May 8, 2024

Thanks @stmlange for the detailed explanation.
You can use oras manifest fetch to generate a checksum file and then run shasum -c $FILE to check it.

/cc @FeynmanZhou To validate if it's something we should add to ORAS CLI.

@qweeah
Contributor

qweeah commented May 8, 2024

@stmlange Also worth mentioning: if you want to copy an artifact in a trusted way, why not use an OCI image layout?

#  1. copy an artifact to a local folder mcr.microsoft.com/oss/kubernetes/kubectl
> oras cp mcr.microsoft.com/oss/kubernetes/kubectl:v1.28.1 -r --to-oci-layout mcr.microsoft.com/oss/kubernetes/kubectl                                                                                                               
✓ Copied  application/vnd.docker.container.image.v1+json                               1.93/1.93 kB 100.00%  571µs
  └─ sha256:919d96c9446db8f5c6cf76d98abd4c79ccfe9af241f977d87188ef3e9f6f09de
...
Copied [registry] mcr.microsoft.com/oss/kubernetes/kubectl:v1.28.1 => [oci-layout] mcr.microsoft.com/oss/kubernetes/kubectl
Digest: sha256:a01b2873f41c65aa9157baf5ec0e0beaf80e9e84bb7dfa94b081cd230b534418

# 2. cp the OCI image layout folder to some air-gap environment

# 3. pull provenance file from the copied OCI image layout folder, checksum will be verified during the pull
> oras pull --oci-layout mcr.microsoft.com/oss/kubernetes/kubectl@sha256:30019e253ab74eb3e38abae7b8997e8e60c420169044ca9bfaf9665f54ad18bc -o in-toto
✓ Pulled      provenance.json                                                          14.9/14.9 kB 100.00%  717µs
  └─ sha256:f4740e5a3adde42224679263c7b4e76985411cb7a9504615cf1421d8afb078b5
✓ Pulled      application/vnd.oci.image.manifest.v1+json                                 682/682  B 100.00%  608µs
  └─ sha256:30019e253ab74eb3e38abae7b8997e8e60c420169044ca9bfaf9665f54ad18bc
Pulled [oci-layout] mcr.microsoft.com/oss/kubernetes/kubectl@sha256:30019e253ab74eb3e38abae7b8997e8e60c420169044ca9bfaf9665f54ad18bc
Digest: sha256:30019e253ab74eb3e38abae7b8997e8e60c420169044ca9bfaf9665f54ad18bc
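An OCI image layout stores each blob at blobs/<algorithm>/<hex>, so the file name itself is the expected checksum. As a rough sketch of an offline integrity check on a copied layout folder (the function name verify_layout_blobs is hypothetical, and GNU coreutils are assumed), one could re-hash every blob and compare it with its name:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of ORAS: re-hash every blob in an OCI image
# layout directory and compare the result with the blob's file name.
# Assumes GNU coreutils (sha256sum / sha512sum) are installed.
verify_layout_blobs() {
  local layout=$1 algo path rc=0

  for algo in sha256 sha512; do
    for path in "$layout/blobs/$algo"/*; do
      [ -f "$path" ] || continue   # skip a missing directory or unmatched glob
      if [ "$("${algo}sum" < "$path" | cut -d ' ' -f 1)" != "$(basename "$path")" ]; then
        echo "MISMATCH: $path" >&2
        rc=1
      fi
    done
  done
  return "$rc"
}
```

This only confirms that each stored blob matches its own name. It does not check that every descriptor referenced by index.json or a manifest is actually present, and it returns 0 for a layout with no blobs at all.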

@stmlange
Author

stmlange commented May 8, 2024

Indeed with ORAS the manual way would be to download the manifest (e.g. oras manifest fetch).
The reason I filed this issue is that I don't believe you can assume a check via shasum -a 256 -c $FILE or sha256sum will work, as that presumes the digest algorithm is sha256.

As per https://github.com/opencontainers/image-spec/blob/main/descriptor.md#digests, ORAS could also have a digest such as sha512:401b09eab3c013d4ca54922bb802bec8fd5318192b0a75f201d8b372742, i.e. one using sha512.

@qweeah
Contributor

qweeah commented May 8, 2024

I am not an expert on checksum files, but shouldn't the length of the checksum string already imply the algorithm?

@qweeah
Contributor

qweeah commented May 8, 2024

Yes, I tested on my Linux VM, and checksums from different algorithms can coexist in the same checksum file:

> cat a
123
> shasum -a 512 a >> sum
> shasum -a 256 a >> sum
> cat sum
ea2fe56bb8c1fb5ada84963b42ed71b764a74b092d75755173ade06f2f4aada9c00d6c302e185035cbe85fdff31698bca93e8661f0cbcef52cf2ff65864fd742  a
181210f8f9c779c26da1d9b2075bde0127302ee0e3fca38c9a83f5b1dd8e5d3b  a
> shasum -c sum
a: OK
a: OK

@qweeah
Contributor

qweeah commented May 8, 2024

What I mean is that, even though ORAS doesn't support sha512, you can still split the digest at the : and keep only the latter part as the checksum; the shasum utility auto-detects the algorithm from the length of the checksum string.
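The length-based detection described here is a feature of the Perl shasum script. The GNU sha256sum and sha512sum tools each accept only their own checksum length, so a script built on them has to make the choice itself. A minimal sketch of that mapping follows, with algo_for_hex as a hypothetical name; note that length alone is ambiguous in general (SHA-512/256 also yields 64 hex characters), which is one reason the explicit algorithm prefix in OCI digests is the safer thing to rely on.

```shell
#!/usr/bin/env bash
# Hypothetical helper: guess the hash algorithm from the length of a
# lowercase hex checksum string (64 chars = 256 bits, 128 chars = 512 bits).
algo_for_hex() {
  local hex=$1

  case "$hex" in
    ''|*[!0-9a-f]*) echo "not a lowercase hex checksum" >&2; return 1 ;;
  esac

  case ${#hex} in
    64)  echo sha256 ;;
    128) echo sha512 ;;
    *)   echo "unrecognized checksum length: ${#hex}" >&2; return 1 ;;
  esac
}
```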

@stmlange
Author

stmlange commented May 8, 2024

Yes in general the length of the hashed string could be used to determine the algorithm.
As per https://github.com/opencontainers/image-spec/blob/main/descriptor.md#digests, ORAS makes it even a bit easier, as it encodes the algorithm used as a prefix: sha256:..., sha512:....

The ORAS Digest (https://github.com/opencontainers/image-spec/blob/main/descriptor.md#digests) can be more than just sha256:..., sha512:.... How should I tell checksum to verify a multihash+base58:... or sha256+b64u:..., or whatever other algos that are supported by ORAS?

@qweeah
Contributor

qweeah commented May 8, 2024

"How should I tell checksum to verify a multihash+base58:... or sha256+b64u:..., or whatever other algos that are supported by ORAS?"

The demo I gave writes both SHA-256 and SHA-512 checksums into one sum file, and shasum is able to detect the algorithm automatically.

@qweeah
Contributor

qweeah commented May 8, 2024

@stmlange You don't need to use shasum -a 256; just shasum -c is enough, so the checking script won't involve the algorithm. (I have amended my earlier post and removed -a 256 from it.)

@stmlange
Author

stmlange commented May 8, 2024

The problem remains that ORAS can encode the hash as sha256+b64u: in the manifest. There is no guarantee that every digest encoded in the manifest is supported by shasum.

Consider multihash+base58:... or sha256+b64u:..., which can't easily be verified with shasum.
Hence, if we really need to go down the manual validation route, it would be very tedious, as one needs to do different things depending on the digest used in the manifest.

Consider:

$ echo "123" > a
$ shasum -a 256 a >> sum
$ sha256sum a | cut -d ' ' -f 1 | xxd -r -p | base64 >> sum

$ sha256sum -c sum
a: OK
sha256sum: WARNING: 1 line is improperly formatted

@qweeah
Contributor

qweeah commented May 8, 2024

multihash+base58:... and sha256+b64u:... are not registered in OCI spec and not supported (see a related test case of OCI digest library)

@stmlange
Author

stmlange commented May 8, 2024

Ok I see that only sha256:... and sha512:... are actually registered and supported algorithms https://github.com/opencontainers/image-spec/blob/main/descriptor.md#registered-algorithms.

However I still think it is not that easy (I guess sometimes even impossible) to run a shasum with just the manifest.
Assume the example manifest from https://github.com/opencontainers/image-spec/blob/main/manifest.md#example-image-manifest.

It just tells us the "digest", but we don't know the filename.
E.g. try

oras manifest fetch --pretty ..... | grep -o '"digest": "[^"]*' | grep -o '[^:]*$' | shasum -c --

For a shasum to work we need both filename and digest:

$ cat a
123
$ shasum -a 512 a >> sum
$ shasum -a 256 a >> sum
$ cat sum
ea2fe56bb8c1fb5ada84963b42ed71b764a74b092d75755173ade06f2f4aada9c00d6c302e185035cbe85fdff31698bca93e8661f0cbcef52cf2ff65864fd742  a
181210f8f9c779c26da1d9b2075bde0127302ee0e3fca38c9a83f5b1dd8e5d3b  a

In theory one could work around the issue by attaching the filenames to the manifest using annotations, like:

   {
     "mediaType": "application/vnd.oci.image.layer.v1.tar",
     "size": 14189,
     "digest": "sha256:181210f8f9c779c26da1d9b2075bde0127302ee0e3fca38c9a83f5b1dd8e5d3b",
     "annotations": {
       "org.opencontainers.image.title": "blah.blah"
     }
   }

I still think and feel that manual validation is not the way to go :-)

@qweeah
Contributor

qweeah commented May 8, 2024

Generating the checksum file is not easy with v1.1.0 but will be improved in v1.2.0. You can try the main build container, e.g. to generate a checksum file for mcr.microsoft.com/oss/kubernetes/kubectl@sha256:30019e253ab74eb3e38abae7b8997e8e60c420169044ca9bfaf9665f54ad18bc

> docker run ghcr.io/oras-project/oras:main manifest fetch mcr.microsoft.com/oss/kubernetes/kubectl@sha256:30019e253ab74eb3e38abae7b8997e8e60c420169044ca9bfaf9665f54ad18bc --format '{{range .content.layers}}{{if index .annotations "org.opencontainers.image.title"}}{{.digest}} {{index .annotations "org.opencontainers.image.title"}}{{println}}{{end}}{{end}}'
sha256:f4740e5a3adde42224679263c7b4e76985411cb7a9504615cf1421d8afb078b5 provenance.json
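The output above is close to, but not quite, the format that sha256sum -c expects: the algorithm prefix has to be removed and the separator widened to two spaces. As a sketch of that last step (verify_digest_list is a hypothetical name, GNU coreutils are assumed, and only the two registered algorithms are handled), the lines can be grouped by algorithm and passed to the matching tool:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of ORAS: check files in a directory against a
# list of "<algorithm>:<hex> <filename>" lines, such as the output shown above.
# Assumes GNU coreutils (sha256sum / sha512sum) are installed.
verify_digest_list() {
  local list=$1 dir=${2:-.}
  local algo sums rc=0

  # Refuse anything that is not a sha256/sha512 hex digest followed by a name.
  if grep -Evq '^(sha256|sha512):[0-9a-f]+ .+$' "$list"; then
    echo "unsupported or malformed line in $list" >&2
    return 2
  fi

  for algo in sha256 sha512; do
    # Rewrite "<algo>:<hex> <name>" as "<hex>  <name>" for this algorithm only.
    sums=$(sed -n "s/^${algo}:\([0-9a-f]*\) \(.*\)$/\1  \2/p" "$list")
    [ -n "$sums" ] || continue
    (cd "$dir" && printf '%s\n' "$sums" | "${algo}sum" -c -) || rc=1
  done
  return "$rc"
}
```

With the template output saved to a file, for example digests.txt, running verify_digest_list digests.txt in-toto would check the files previously pulled into the in-toto folder. Layers without an org.opencontainers.image.title annotation never reach the list, so they are not covered by this check.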


github-actions bot commented Jul 8, 2024

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

@github-actions github-actions bot added the stale Inactive issues or pull requests label Jul 8, 2024

github-actions bot commented Aug 7, 2024

This issue was closed because it has been stalled for 30 days with no activity.

@github-actions github-actions bot closed this as not planned Aug 7, 2024
2 participants