Enable encryption for ceph-csi using fscrypt #4597

Open
pankaj-mandal opened this issue May 1, 2024 · 5 comments
Labels
dependency/ceph (depends on core Ceph functionality) · question (Further information is requested)

Comments

pankaj-mandal commented May 1, 2024

Describe the bug

I have been trying to enable encryption for ceph-csi. One of the requirements is to enable fscrypt on the Ceph storage. However, the Ceph OSDs are backed by LVM, and fscrypt supports ext4 and a few other filesystems but not LVM, so encryption cannot be enabled on the LVM devices.
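
For reference, by enabling encryption in ceph-csi I mean requesting it in the StorageClass, roughly as in the ceph-csi CephFS examples. A minimal sketch with placeholder clusterID, fsName, and secret names (not my actual values):

# Sketch only: a CephFS StorageClass with per-volume (fscrypt) encryption requested.
# clusterID, fsName, and the secret names/namespace below are placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs-sc-encrypted
provisioner: cephfs.csi.ceph.com
parameters:
  clusterID: <cluster-id>
  fsName: <cephfs-filesystem-name>
  encrypted: "true"
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi-cephfs
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: ceph-csi-cephfs
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi-cephfs
reclaimPolicy: Delete
allowVolumeExpansion: true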

Environment details

  • Image/version of Ceph CSI driver: quay.io/cephcsi/cephcsi:v3.11.0
  • Helm chart version: ceph-csi-cephfs-3.11.0 (helm v3.14.4)
  • Kernel version (uname -r): 6.5.0-1018-gcp
  • Mounter used for mounting PVC (for CephFS: fuse or kernel; for RBD: krbd or rbd-nbd): kernel
  • Kubernetes cluster version: client v1.30.0, server v1.29.2 (Kustomize v5.0.4-0.20230601165947-6ce0bf390ce3)
  • Ceph cluster version: ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)

Steps to reproduce

Steps to reproduce the behavior:

  1. The Ceph cluster is deployed, and CephFS, pools, and OSDs are configured. The cluster is healthy:
ceph status
  cluster:
    id:     038167ca-076f-11ef-b000-c16a24702dee
    health: HEALTH_OK
 
  services:
    mon: 1 daemons, quorum ceph-4-single (age 13h)
    mgr: ceph-4-single.jojgfx(active, since 13h), standbys: ceph-4-single.ntnqjw
    mds: 1/1 daemons up, 1 standby
    osd: 3 osds: 3 up (since 12h), 3 in (since 12h)
 
  data:
    volumes: 1/1 healthy
    pools:   5 pools, 209 pgs
    objects: 30 objects, 783 KiB
    usage:   886 MiB used, 299 GiB / 300 GiB avail
    pgs:     209 active+clean

Ceph-CSI is installed on another node using the Helm charts, and all pods are up and running:

NAMESPACE            NAME                                         READY   STATUS    RESTARTS   AGE
ceph-csi-cephfs      ceph-csi-cephfs-nodeplugin-wvzs5             3/3     Running   0          9h
ceph-csi-cephfs      ceph-csi-cephfs-provisioner-86bf8dfc-46x4c   5/5     Running   0          9h
ceph-csi-cephfs      csi-cephfs-demo-pod                          1/1     Running   0          7h28m
kube-system          coredns-76f75df574-r2dl6                     1/1     Running   0          10h
kube-system          coredns-76f75df574-tgdwv                     1/1     Running   0          10h
kube-system          etcd-kind-control-plane                      1/1     Running   0          10h
kube-system          kindnet-c2gqj                                1/1     Running   0          10h
kube-system          kube-apiserver-kind-control-plane            1/1     Running   0          10h
kube-system          kube-controller-manager-kind-control-plane   1/1     Running   0          10h
kube-system          kube-proxy-nqnxj                             1/1     Running   0          10h
kube-system          kube-scheduler-kind-control-plane            1/1     Running   0          10h
local-path-storage   local-path-provisioner-7577fdbbfb-whtmn      1/1     Running   0          10h

At this point I have encryption set to false in the StorageClass. However, if I enable encryption in the StorageClass, the demo pod fails with an error like the following:

Events:
  Type     Reason       Age               From               Message
  ----     ------       ----              ----               -------
  Normal   Scheduled    11s               default-scheduler  Successfully assigned ceph-csi-cephfs/csi-cephfs-demo-pod to kind-control-plane
  Warning  FailedMount  12s               kubelet            MountVolume.MountDevice failed for volume "pvc-465a3be2-0dfc-4c06-ae2a-b690cdf00ef5" : rpc error: code = Internal desc = panic runtime error: invalid memory address or nil pointer dereference
  Warning  FailedMount  4s (x4 over 11s)  kubelet            MountVolume.MountDevice failed for volume "pvc-465a3be2-0dfc-4c06-ae2a-b690cdf00ef5" : rpc error: code = Internal desc = fscrypt: unsupported state metadata=true kernel_policy=false

Actual results

I guess this is because fscrypt is not enabled on the storage, i.e. on the server side. If I look at the volumes on the server side, I see:

lsblk -f
NAME                                              FSTYPE         FSVER    LABEL           UUID                                   FSAVAIL FSUSE% MOUNTPOINTS
loop0                                                                                                                                  0   100% /snap/core20/2264
loop1                                                                                                                                  0   100% /snap/google-cloud-cli/235
loop2                                                                                                                                  0   100% /snap/lxd/28373
loop3                                                                                                                                  0   100% /snap/snapd/21465
sda                                                                                                                                             
├─sda1                                            ext4           1.0      cloudimg-rootfs 8ed05f8a-f362-4937-bc52-8e21afbc835c      3.6G    62% /var/lib/containers/storage/overlay
│                                                                                                                                               /
├─sda14                                                                                                                                         
└─sda15                                           vfat           FAT32    UEFI            51E2-9280                                98.3M     6% /boot/efi
sdb                                               LVM2_member    LVM2 001                 QgHYur-fuZE-6Wgs-EQCj-fty6-Wj2Y-dgfogt                
└─ceph--e9acd4cb--15c4--4b48--a713--8cdd9e6c595b-osd--block--9de1730b--d96d--4cf6--b8e5--c533247b3f7b
                                                  ceph_bluestore                                                                                
sdc                                               LVM2_member    LVM2 001                 llxtff-KEmh-rhLh-WNys-L8y4-hABO-JDCZXD                
└─ceph--05b9c373--decc--4bea--a33c--f6f2e3b3d311-osd--block--8ccf132c--d14d--487d--b764--447e59118343
                                                  ceph_bluestore                                                                                
sdd                                               LVM2_member    LVM2 001                 2OI3tQ-IQZs-XotL-e7Kt-t2ka-gdFk-x1Gx8Y                
└─ceph--055041f3--c14f--46cc--a772--52c798acb929-osd--block--3663f324--3511--4e67--b68d--e253da711e2b
                                                  ceph_bluestore 

The devices sdb, sdc, and sdd need to be encrypted. However, these LVM devices cannot be encrypted using fscrypt, since fscrypt does not support LVM.

nixpanic (Member) commented May 2, 2024

The Ceph-CSI project provides a CSI driver that a Container Platform like Kubernetes can use to create/delete volumes for application usage. The encryption that Ceph-CSI sets up is client-side, per volume. Ceph-CSI does not manage the Ceph cluster and OSDs. A project like Rook focuses on that.

For your case, you may want to check the Ceph documentation about encryption.
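
To illustrate "per volume": once a StorageClass requests encryption, every PVC created from it gets its own client-side encrypted volume. A minimal sketch, using placeholder names and the StorageClass sketched earlier in this issue:

# Sketch only: a PVC created from an encryption-enabled StorageClass gets its own
# client-side encrypted volume; other volumes in the cluster are unaffected.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-cephfs-encrypted-pvc
  namespace: ceph-csi-cephfs
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-cephfs-sc-encrypted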

nixpanic added the question (Further information is requested) and dependency/ceph (depends on core Ceph functionality) labels on May 2, 2024
pankaj-mandal (Author) commented:

> The Ceph-CSI project provides a CSI driver that a Container Platform like Kubernetes can use to create/delete volumes for application usage. The encryption that Ceph-CSI sets up is client-side, per volume. Ceph-CSI does not manage the Ceph cluster and OSDs. A project like Rook focuses on that.
>
> For your case, you may want to check the Ceph documentation about encryption.

Thanks for the update. I have enabled server-side encryption as per the Ceph documentation you pointed to. I had earlier looked at the examples in the git repo, and it looked like encryption could be done using ceph-csi.

I also noticed that even if I set the key "encrypted" to "false" in the StorageClass, the PVC will not bind; I have to remove that entry completely or comment it out. Likewise, I have to remove (or comment out) the encryptionPassphrase from the secret.yaml. Also, the namespace in the examples is default, but the namespace needed here is ceph-csi-cephfs. I am assuming that with this, and with server-side encryption enabled, there is nothing additional to be done in ceph-csi as far as encryption of data at rest is concerned.
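
For reference, this is roughly the shape of the secret after those adjustments. A sketch with placeholder values; the keys follow the upstream examples/cephfs/secret.yaml:

# Sketch only: example secret adjusted for the ceph-csi-cephfs namespace; values are placeholders.
# encryptionPassphrase is only required when the StorageClass sets encrypted: "true".
apiVersion: v1
kind: Secret
metadata:
  name: csi-cephfs-secret
  namespace: ceph-csi-cephfs   # the upstream examples use "default"
stringData:
  userID: <cephfs-user-id>
  userKey: <cephfs-user-key>
  adminID: <admin-id>
  adminKey: <admin-key>
  # encryptionPassphrase: <passphrase>   # uncomment only when per-volume encryption is enabled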

I did look at using Rook originally but eventually decided to deploy ceph as per ceph documentation. Will try Rook another time.

Madhu-1 (Collaborator) commented May 2, 2024

@pankaj-mandal there are 2 types of encryption:

  • Server-side encryption, where you enable encryption on the Ceph cluster and update the CSI driver to use the chosen connection mode (secure or crc) so that all operations to the Ceph cluster go over the secure port 3300 (see the sketch below).
  • PV encryption, where cephcsi encrypts the CephFS PVCs (still in alpha state and not much tested) and the RBD PVCs it creates.

You need to decide what exact encryption you are looking for.
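
For the first option, the connection mode is a standard Ceph client setting. A rough sketch of carrying it in the ceph.conf that the CSI pods use, assuming msgr2/port 3300 is already enabled on the cluster; the ConfigMap name follows the ceph-csi deploy examples and may differ in your deployment:

# Sketch only: ceph.conf carried in the ceph-config ConfigMap mounted by the CSI pods,
# with the client connection mode forced to "secure" instead of the default "crc secure".
apiVersion: v1
kind: ConfigMap
metadata:
  name: ceph-config
  namespace: ceph-csi-cephfs
data:
  ceph.conf: |
    [global]
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
    ms_client_mode = secure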

pankaj-mandal (Author) commented:

> @pankaj-mandal there are 2 types of encryption:
>
>   • Server-side encryption, where you enable encryption on the Ceph cluster and update the CSI driver to use the chosen connection mode (secure or crc) so that all operations to the Ceph cluster go over the secure port 3300.
>   • PV encryption, where cephcsi encrypts the CephFS PVCs (still in alpha state and not much tested) and the RBD PVCs it creates.
>
> You need to decide what exact encryption you are looking for.

This is what I did:

ceph-volume lvm prepare --data /dev/sdb --dmcrypt
ceph-volume lvm activate 0 <osd-uuid>

I repeated the above for the other --data devices and <osd-uuid> values, and that has enabled encryption for the OSDs in my Ceph cluster. On the client side I removed the encryptionPassphrase and encrypted entries from the secret and StorageClass. It seems to work, although I would like a way to see that the files on the disk are encrypted.

nixpanic (Member) commented:
The way you have set up encryption is on the OSD side, where the Ceph cluster stores its objects for the files and RBD images. By inspecting the contents of the logical volume, you have access to the unencrypted objects; it is just not trivial to select and combine the objects that make up a single file. The format is Ceph-specific and not meant for humans to interact with directly.
