doc: add proposal for CephFS fscrypt integration #2912

Merged: 1 commit, May 24, 2022

Contributor

@irq0 irq0 commented Mar 1, 2022

Add proposal document covering key management integration
of Ceph CSI and https://github.com/google/fscrypt

Updates: #1563
Signed-off-by: Marcel Lauhoff [email protected]

@mergify mergify bot added ci/skip/e2e skip running e2e CI jobs component/docs Issues and PRs related to documentation labels Mar 1, 2022
@humblec
Collaborator

humblec commented Mar 2, 2022

Cc @jtlayton @kotreshhr: we would appreciate your views on this proposal to integrate CephFS encryption with CSI.

management systems
- `fscrypt` handles key derivation, storage of wrapped keys and metadata

The current CephFS subvolume root will remain untouched with the exception that
Collaborator

Are we referring to the SubVolumeGroup here as the subvolume root, or is it the filesystem?

Contributor Author

Subvolume root refers to the root directory of the subvolume, i.e. what gets mapped into a pod once mounted.

On my test setup this would be:

$ bin/ceph fs subvolume info a csi-vol-4fa8245c-9b00-11ec-bdd8-eed58c1c7c89 csi
{
...
    "path": "/volumes/csi/csi-vol-4fa8245c-9b00-11ec-bdd8-eed58c1c7c89/32a6c9c4-d83a-482b-8abf-e6b0ea676f3a",
...
}

$ sudo mount -t ceph -o name=admin,secret=.. 192.168.122.1:40687:/volumes/csi/csi-vol-4fa8245c-9b00-11ec-bdd8-eed58c1c7c89/32a6c9c4-d83a-482b-8abf-e6b0ea676f3a /subvolume_root 

subdirectory. The root will contain a `/.fscrypt` directory managed by `fscrypt`.

`fscrypt` requires access to a mounted filesystem and therefore the encryption setup
must take place in the `NodeStageVolume` request handler instead of `CreateVolume`.
Collaborator

Don't we need to create a protector key at CreateVolume time and store it?

Contributor Author

The protector keys are stored wrapped in the '/.fscrypt' directory on the volume. We therefore need filesystem access, which we don't have in CreateVolume, hence NodeStageVolume.
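
To illustrate the ordering constraint, here is a minimal Python sketch (not Ceph CSI code) of an idempotent staging step. It assumes the directory layout that appears in the listing later in this thread (`.fscrypt/protectors`, `.fscrypt/policies` and a `ceph-csi-encrypted` data directory). The function name `stage_encrypted_volume` is hypothetical, and a temporary directory stands in for the mounted subvolume root. The real metadata is written by `fscrypt`, which this sketch does not model.

```python
import os
import tempfile

# Directory names taken from the listing in this thread.
METADATA_DIRS = (".fscrypt/protectors", ".fscrypt/policies")
DATA_DIR = "ceph-csi-encrypted"


def stage_encrypted_volume(mounted_root: str) -> bool:
    """Create metadata and data directories under an already mounted root.

    Hypothetical helper. Returns True if anything was created (first
    staging) and False if the layout already existed. It needs a path on
    a mounted filesystem, which is available when staging on a node but
    not when the volume is merely created.
    """
    if not os.path.isdir(mounted_root):
        raise FileNotFoundError(f"{mounted_root} is not a mounted directory")
    created = False
    for rel in (*METADATA_DIRS, DATA_DIR):
        path = os.path.join(mounted_root, rel)
        if not os.path.isdir(path):
            os.makedirs(path)
            created = True
    return created


with tempfile.TemporaryDirectory() as root:  # stand-in for the staged mount
    first = stage_encrypted_volume(root)
    second = stage_encrypted_volume(root)  # staging again changes nothing
    layout = sorted(os.listdir(root))
```

Calling the helper twice shows the property a staging handler would need: the first call sets the volume up, and later calls on the same volume leave it alone.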


## Dependencies

The proposed change is tailored to CephFS and requires CephFS support
Collaborator

@humblec humblec Mar 2, 2022

It looks like one of the two CephFS mentions (around "...and requires...") has to be replaced.

Contributor Author

There is a lot of CephFS in that sentence :). It tries to express that this change is agnostic to CephFS, except for the fact that it is part of Ceph CSI's CephFS integration.

@humblec
Collaborator

humblec commented Mar 2, 2022

@irq0 first of all, this is a well formatted and well written design doc. Thanks for that! 👍
I have a couple of comments on the design or proposal.
Also, can we list the dependencies or prerequisites to support this feature in Ceph CSI?

PS: the diagram is not rendering properly for me, so I may have missed the protector key generation part which I asked about in one comment.

@nixpanic nixpanic added the component/cephfs Issues related to CephFS label Mar 2, 2022
@irq0
Contributor Author

irq0 commented Mar 3, 2022

> @irq0 first of all, this is a well formatted and well written design doc. Thanks for that! +1 I have a couple of comments on the design or proposal. Also, can we list the dependencies or prerequisites to support this feature in Ceph CSI?

Added runtime and build dependencies to the Dependencies section

> PS: the diagram is not rendering properly for me, so I may have missed the protector key generation part which I asked about in one comment.

Odd. Maybe the mermaid live editor works better:

https://mermaid.live/edit#pako:eNqVVE1v4yAQ_SuIS3Ylt72ukqpSm_QQuZWiWuoe7MiiMGmsxIAAb2S1_e8LxgnBdbVbLNnD8N58PDBvmAoGeIpfFZFb9PBUcGTHH9LsTf7s3glawMYZa3RxcfPOiCEIOFWtNJXgaAftO6K6Kne1LituwAYywHwYDVSB0fmKaC23imi4flE36S99XEnQ7e8sQcu7R5RCi1ZKGKB9JnkihQQ1GOIqKLh7uhTNiy-dgtyWPS6f2wmaZ0uUPmZrD3RjGCafHK0JMq0EtLhPkTZCwQgpNJdPgj1C9FTgbKRK6TsUKt_oTsPgOU_Z2Fh1GRTI553H6XgQimWiURTWs8BQ5HAOfyIHq2eABeApXRDhx9VlX8zVaVX__Bet26ODbUoC60_BsOrZ9_hxD7OBkLGMYl_RNmjYTc8E9I6y4S7BJ_do526pAtt31HZM-FzzSKJo44cnrotgD8uIWjEhnLBAiQU6pRgG8n_PUfAv64yj_QfJ0yKnZ3UeT9mB4rDP0-6DbldLKydOcA2qJhWz98ybi1Jgs4UaCjy1JvPXS4EL_mGhjbQ6wT2rbBl4uiF7DQkmjRFZyymeGtXAEbSoiJWo7lEffwFbzbJp

Contributor

@Rakshith-R Rakshith-R left a comment

JFYI: there is a feature request for CephFS to add a metadata capability similar to the one in RBD:
https://tracker.ceph.com/issues/54472

docs/design/proposals/cephfs-fscrypt.md (outdated)
@humblec
Collaborator

humblec commented Mar 14, 2022

Waiting for the CephFS team's input on this. Hopefully we will have it soon. Thanks!

@jtlayton jtlayton left a comment

This all looks reasonable to me. I'm afraid I don't have enough intimate knowledge of Ceph CSI to know what the best method to use is.

- No `/.fscrypt` on the subvolume root

Drawbacks
- `fscryptctl` is a C tool and does not lend itself to be integrated into Ceph CSI

fwiw: fscryptctl is very simple and just calls a bunch of ioctls. You could probably drive those ioctls from Go as well.

Contributor Author

@irq0 irq0 Mar 28, 2022

Yes, it is not a big drawback. Using the policy functions from fscrypt would also be an option. (https://github.com/google/fscrypt/blob/master/metadata/policy.go)
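
As a rough illustration of how little is involved in driving those ioctls directly, the following Python sketch builds the request number for `FS_IOC_SET_ENCRYPTION_POLICY` and packs a `struct fscrypt_policy_v2`. Python is used here only for illustration; the same calls can be made from Go. The constants follow `<linux/fscrypt.h>` as I understand it, and the request encoding assumes the generic Linux `_IOC` layout (x86, arm64); some architectures encode the direction bits differently. The helper names `pack_policy_v2` and `set_policy` are hypothetical, and `set_policy` is not exercised here because it needs a kernel and filesystem with fscrypt support.

```python
import fcntl
import os
import struct

# Generic Linux _IOC encoding; an assumption that holds on x86 and arm64.
_IOC_READ = 2


def _ioc(direction: int, type_char: str, nr: int, size: int) -> int:
    return (direction << 30) | (size << 16) | (ord(type_char) << 8) | nr


# The ioctl is defined as _IOR('f', 19, struct fscrypt_policy_v1), and the
# v1 struct is 12 bytes. The same request number is used for v2 policies.
FS_IOC_SET_ENCRYPTION_POLICY = _ioc(_IOC_READ, "f", 19, 12)

FSCRYPT_POLICY_V2 = 2
FSCRYPT_MODE_AES_256_XTS = 1  # contents encryption
FSCRYPT_MODE_AES_256_CTS = 4  # filenames encryption
FSCRYPT_POLICY_FLAGS_PAD_32 = 0x03


def pack_policy_v2(key_identifier: bytes) -> bytes:
    """Pack struct fscrypt_policy_v2: 4 bytes, 4 reserved, 16-byte key id."""
    if len(key_identifier) != 16:
        raise ValueError("key identifier must be 16 bytes")
    return struct.pack(
        "=4B4x16s",
        FSCRYPT_POLICY_V2,
        FSCRYPT_MODE_AES_256_XTS,
        FSCRYPT_MODE_AES_256_CTS,
        FSCRYPT_POLICY_FLAGS_PAD_32,
        key_identifier,
    )


def set_policy(directory: str, key_identifier: bytes) -> None:
    """Apply a v2 policy to an empty directory (hypothetical helper).

    The key identifier is expected to come from a prior
    FS_IOC_ADD_ENCRYPTION_KEY call, which this sketch omits.
    """
    fd = os.open(directory, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FS_IOC_SET_ENCRYPTION_POLICY, pack_policy_v2(key_identifier))
    finally:
        os.close(fd)


policy = pack_policy_v2(bytes(16))  # all-zero identifier, for illustration only
```

The encryption modes and padding flag above are one common choice, not something this proposal prescribes.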

@humblec
Collaborator

humblec commented Mar 21, 2022

@irq0 can you please correct the CI linter failure?

@humblec humblec requested a review from nixpanic March 21, 2022 11:47
subvolumes.

Due to the way `fscrypt` stores metadata, subvolumes have a regular root
directory containing a `/.fscrypt` directory and a

FWIW, the .fscrypt directory is one of the main reasons I wanted the CSI team to think about this. This directory is not generally an issue with local filesystems like ext4, as admins almost always mount the root of the filesystem, and you can generally ensure that this directory is available to userland applications.

Contrast that with something like NFS or Ceph, where mounting a subdirectory of an export is very common. You could end up in a situation where you've run fscrypt setup on an upper-level directory but then someone mounts a subdirectory of it. The .fscrypt directory won't be available on the client at that point.

The placement of the .fscrypt directory is crucial if you intend to use the fscrypt binary under the hood. I wonder if it may even be better to store the info that's currently in .fscrypt directly in RADOS objects instead, which would sort of sidestep the whole issue. That would mean a major overhaul for fscrypt, however (or a rewrite).

Contributor Author

I'll split the answer into the .fscrypt directory and the RADOS object alternative.

I think we should limit the discussion to the Ceph CSI context.
We basically have two options:

  1. Integrate Ceph CSI with the fscrypt tooling as is (this design doc)
  2. Create another key management option that is specific to CephFS
     (Alternative [ceph-csi-ksm]; Fscrypt Metadata on RADOS below)

  - The .fscrypt directory is necessary for 1.
  - Options 1 and 2 can coexist on the same filesystem, since they are
    userland key management systems that ultimately set an fscrypt
    policy in the kernel.

I'm in favor of 1. Why? It would enable users to manually get a
secret from K8s secrets or Vault, mount a CephFS somewhere and use the
fscrypt tool without changes with that secret to unlock the
encrypted data. No Ceph specific tools required.

The .fscrypt metadata directory and Ceph CSI

To make things a bit more concrete, here is what a freshly created
PV/PVC looks like on an otherwise empty CephFS with an implementation
based on the doc:

# mount -t ceph -o name=admin,secret=$(bin/ceph auth get-key client.admin) \
   $(bin/ceph mon dump --format json | jq -r '.mons[] | .addr' | sed -e 's/\/0//'):/ \
   /mnt
# find /mnt
/mnt
/mnt/volumes
/mnt/volumes/_csi:csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd.meta
/mnt/volumes/csi
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad/.fscrypt
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad/.fscrypt/protectors
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad/.fscrypt/protectors/46ef26343bfaa137
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad/.fscrypt/policies
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad/.fscrypt/policies/451ca267026dd6704b85efdccef305f8
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad/ceph-csi-encrypted
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/.meta

This would add a .fscrypt directory on each subvolume / PVC. To
manually use fscrypt, a user would have to mount
/mnt/volumes/csi/csi-vol-cab93367-afff-11ec-89f2-620077d1dfcd/703b0bb0-0d92-4c98-a1d2-ef943a4260ad

Since Ceph CSI owns the subvolumes below CSI anyway, I don't think
there is a drawback to this.

We could move the .fscrypt directory as high as the filesystem or
the csi subvolume group. Basically having it next to the "meta" files.

Fscrypt Metadata on RADOS

An implementation could be as straightforward as using two objects,
one for protectors and one for policies, each with an object map mapping
a keyid to an fscrypt metadata blob. The entries would directly correspond to the files in
/.fscrypt/{protectors,policies}/$keyid.
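
A minimal Python sketch of that mapping follows. Two plain dictionaries stand in for the object maps of the two RADOS objects, and no Ceph client library is involved. The function name `load_fscrypt_metadata` is hypothetical, the keyids are the ones from the listing above, and the blobs are placeholders rather than real fscrypt metadata.

```python
import os
import tempfile


def load_fscrypt_metadata(fscrypt_dir: str) -> dict:
    """Read .fscrypt/{protectors,policies}/<keyid> into keyid -> blob maps.

    Returns {"protectors": {...}, "policies": {...}}. Each inner dict is a
    stand-in for the object map of one RADOS object.
    """
    objects = {}
    for kind in ("protectors", "policies"):
        kind_dir = os.path.join(fscrypt_dir, kind)
        entries = {}
        for keyid in sorted(os.listdir(kind_dir)):
            with open(os.path.join(kind_dir, keyid), "rb") as handle:
                entries[keyid] = handle.read()
        objects[kind] = entries
    return objects


with tempfile.TemporaryDirectory() as root:
    fscrypt_dir = os.path.join(root, ".fscrypt")
    samples = {
        "protectors": ("46ef26343bfaa137", b"placeholder protector blob"),
        "policies": ("451ca267026dd6704b85efdccef305f8", b"placeholder policy blob"),
    }
    # Build a tiny metadata directory that mirrors the listing above.
    for kind, (keyid, blob) in samples.items():
        os.makedirs(os.path.join(fscrypt_dir, kind))
        with open(os.path.join(fscrypt_dir, kind, keyid), "wb") as handle:
            handle.write(blob)
    omaps = load_fscrypt_metadata(fscrypt_dir)
```

The data model is the easy part, since each file becomes one map entry. Who may add or remove entries is the open question.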

The problem would be access control in the spirit of fscrypt.
/.fscrypt is basically set up like /tmp (sticky). Created by root,
but non-root users are allowed to add/remove their own protectors.

The fscrypt design doc lists the following requirements for a metadata store:

Metadata Requirements

There are a few properties that we want our metadata storage to have.

- For a filesystem that is set up for encryption, a user can create an
  encrypted directory (and its associated metadata) without being root.
- Any user with access to an encrypted directory (via standard UNIX
  permissions) and the correct credentials can unlock the directory,
  regardless of who set up the directory.
- A non-root user cannot delete another user’s metadata, which would
  make the files corresponding to that metadata unreadable (see above
  Threat Model).
- An encrypted directory can be protected with a Protector whose data
  is on another filesystem. This is necessary to protect a folder on a
  USB drive with a user’s login password, for example.

We do NOT require that any user be able to set up encryption on a
filesystem, as this involves making privileged changes to the system.

I'm not familiar enough with how CephFS translates POSIX users to
RADOS auth, but I suspect user access to the fscrypt metadata
objects won't be easy.

I agree, this is a major overhaul or a new CephFS-specific key management solution.

@jtlayton jtlayton Apr 7, 2022

> The problem would be access control in the spirit of fscrypt. /.fscrypt is basically set up like /tmp (sticky). Created by root, but non-root users are allowed to add/remove their own protectors.

Good point. That would be tricky to replicate in bare RADOS with cephx creds. You'd probably have to layer some enforcement on top. Blech.
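
To make the "enforcement on top" idea concrete, here is a small Python sketch of /tmp-like rules applied to a protector map. It is an in-memory model only: the class name `ProtectorStore` and its methods are hypothetical, nothing here talks to RADOS, and the caller-supplied uid is simply trusted. Authenticating that uid is the part that would be hard to do with cephx credentials alone.

```python
class ProtectorStore:
    """In-memory stand-in for a protector map with sticky-directory rules."""

    def __init__(self) -> None:
        self._entries = {}  # keyid -> (owner_uid, blob)

    def add(self, uid: int, keyid: str, blob: bytes) -> None:
        # Anyone may add an entry, as in a world-writable sticky directory.
        if keyid in self._entries:
            raise FileExistsError(keyid)
        self._entries[keyid] = (uid, blob)

    def get(self, keyid: str) -> bytes:
        # Reads are unrestricted; the blobs are wrapped keys.
        return self._entries[keyid][1]

    def remove(self, uid: int, keyid: str) -> None:
        # Only root or the owner may remove an entry.
        owner, _ = self._entries[keyid]
        if uid != 0 and uid != owner:
            raise PermissionError(f"uid {uid} does not own {keyid}")
        del self._entries[keyid]

    def keyids(self) -> list:
        return sorted(self._entries)


store = ProtectorStore()
store.add(uid=1000, keyid="46ef26343bfaa137", blob=b"placeholder")
try:
    store.remove(uid=1001, keyid="46ef26343bfaa137")  # another user
    denied = False
except PermissionError:
    denied = True
store.remove(uid=1000, keyid="46ef26343bfaa137")  # the owner
remaining = store.keyids()
```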

That said...one thing that the local fscrypt filesystem folks don't need to worry about is idmapping. I imagine consistent uid/gid mapping is a must for k8s nodes so I'll assume that's not a problem here.

> We could move the .fscrypt directory as high as the filesystem or the csi subvolume group. Basically having it next to the "meta" files.

The key point here is that the .fscrypt dir has to be consistently reachable, so you need to consistently mount the same cephfs directory. If you're dealing with a pretty flat hierarchy where every tenant is mounting his own subvolume, you should be ok. If, on the other hand, you have a situation where some clients mount at a higher point in the directory tree then things get more iffy.

I imagine Ceph CSI keeps things fairly flat though, so you should be OK.

Having multiple fscrypt directories (one for each subvolume) seems like the best approach, IMO. Allowing access to the keys in there should be safe (since they are just part of the overall KDF), but it'd be better to deny access to it when tenants don't need it.
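
The reachability concern can be sketched as a small path check. This Python example assumes that the tooling looks for `.fscrypt` at the root of whatever directory the client mounted; that is a simplification of how `fscrypt` locates its metadata, and the function name `classify_mount` and its return labels are hypothetical.

```python
import posixpath


def classify_mount(mount_root: str, metadata_root: str) -> str:
    """Classify how a client mounting mount_root sees <metadata_root>/.fscrypt.

    Both arguments are paths inside the CephFS namespace.
    """
    mount_root = posixpath.normpath(mount_root)
    metadata_root = posixpath.normpath(metadata_root)
    if mount_root == metadata_root:
        return "usable"  # .fscrypt sits at the root of the mount
    if posixpath.commonpath([mount_root, metadata_root]) == mount_root:
        return "below-mount-root"  # visible, but not where it is expected
    return "not-visible"  # metadata lives outside the mounted subtree


subvolume = "/volumes/csi/csi-vol-example/uuid-example"  # illustrative path
per_subvolume = classify_mount(subvolume, subvolume)
higher_mount = classify_mount("/volumes/csi", subvolume)
metadata_at_fs_root = classify_mount(subvolume, "/")
```

With one `.fscrypt` directory per subvolume and every client mounting exactly its subvolume, only the first case occurs, which matches the flat layout described above.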

@humblec
Collaborator

humblec commented Mar 24, 2022

@irq0 can you please revisit the comments from Jeff and the linter failures?

@humblec humblec added this to the release-3.7 milestone Apr 6, 2022
@humblec humblec mentioned this pull request Apr 21, 2022
4 tasks
@github-actions

github-actions bot commented May 7, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label May 7, 2022
@Rakshith-R Rakshith-R removed the stale label May 12, 2022
Member

@nixpanic nixpanic left a comment

Thanks for joining today's standup call @irq0!

As discussed during the call, it would be a great start to have this function on RBD volumes with ext4. Once CephFS in the kernel supports this, it can be enabled for CephFS too. In the meantime, including and stabilizing fscrypt within Ceph-CSI can already get started.

Note that our e2e suite uses minikube, and hence the minikube-iso should have a kernel with fscrypt enabled (minikube kernel config).

@mergify mergify bot merged commit b7ec0b2 into ceph:devel May 24, 2022
@irq0 irq0 deleted the PR/fscrypt-proposal branch May 27, 2022 16:36