charts: add allowPrivilegeEscalation: true to containerSecurityContext to nodeplugin daemonset #2993

Merged
merged 2 commits into ceph:devel from the fix/container-security-context branch on Apr 22, 2022

Conversation

@losil (Contributor) commented Apr 6, 2022

Describe what this PR does

When running a Kubernetes cluster with a single privileged PodSecurityPolicy that allows everything, the nodeplugin daemonset can fail to start. To be precise, the problem is the defaultAllowPrivilegeEscalation: false configuration in the PSP. Containers of the nodeplugin daemonset won't start when they have privileged: true but no allowPrivilegeEscalation in their container securityContext.

Kubernetes will not create the pods when this mismatch exists, reporting "cannot set allowPrivilegeEscalation to false and privileged to true":

Warning  FailedCreate  7s (x3 over 14s)  daemonset-controller  (combined from similar events): Error creating: Pod "ceph-csi-nodeplugin-rbd-7qm9f" is invalid: [spec.containers[0].securityContext: Invalid value: core.SecurityContext{Capabilities:(*core.Capabilities)(nil), Privileged:(*bool)(0xc04f579a0c), SELinuxOptions:(*core.SELinuxOptions)(nil), WindowsOptions:(*core.WindowsSecurityContextOptions)(nil), RunAsUser:(*int64)(nil), RunAsGroup:(*int64)(nil), RunAsNonRoot:(*bool)(nil), ReadOnlyRootFilesystem:(*bool)(nil), AllowPrivilegeEscalation:(*bool)(0xc04d20ce50), ProcMount:(*core.ProcMountType)(nil), SeccompProfile:(*core.SeccompProfile)(nil)}: cannot set `allowPrivilegeEscalation` to false and `privileged` to true, spec.containers[2].securityContext: Invalid value: core.SecurityContext{Capabilities:(*core.Capabilities)(nil), Privileged:(*bool)(0xc04f579a0f), SELinuxOptions:(*core.SELinuxOptions)(nil), WindowsOptions:(*core.WindowsSecurityContextOptions)(nil), RunAsUser:(*int64)(nil), RunAsGroup:(*int64)(nil), RunAsNonRoot:(*bool)(nil), ReadOnlyRootFilesystem:(*bool)(nil), AllowPrivilegeEscalation:(*bool)(0xc04d20ce50), ProcMount:(*core.ProcMountType)(nil), SeccompProfile:(*core.SeccompProfile)(nil)}: cannot set `allowPrivilegeEscalation` to false and `privileged` to true]

The default PodSecurityPolicy for every workload in the k8s cluster is the following:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
  name: system-unrestricted-psp
spec:
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - '*'
  defaultAllowPrivilegeEscalation: false
  fsGroup:
    rule: RunAsAny
  hostIPC: true
  hostNetwork: true
  hostPID: true
  hostPorts:
  - max: 65535
    min: 0
  privileged: true
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  volumes:
  - '*'
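
With this policy in place, the PSP admission plugin fills in defaultAllowPrivilegeEscalation: false for any container that does not set allowPrivilegeEscalation itself, which is exactly what then conflicts with privileged: true. A minimal sketch of the problematic container securityContext (illustrative only, not copied from the chart):

  # Before the fix: allowPrivilegeEscalation is left unset, so the PSP
  # default (false) gets injected, and the API server then rejects the
  # combination with privileged: true.
  securityContext:
    privileged: true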

The daemonsets run once the allowPrivilegeEscalation parameter is added to the container securityContext config. Specifically, this is needed for the following containers:

  • driver-registrar
  • liveness-prometheus

Updating the Helm chart templates for ceph-csi-rbd and ceph-csi-cephfs solves the problem when such a global privileged PodSecurityPolicy with defaultAllowPrivilegeEscalation: false is in use. For example:

{{- if .Values.nodeplugin.httpMetrics.enabled }}
        - name: liveness-prometheus
          securityContext:
            privileged: true
            allowPrivilegeEscalation: true
          image: "{{ .Values.nodeplugin.plugin.image.repository }}:{{ .Values.nodeplugin.plugin.image.tag }}"
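
The driver-registrar container gets the analogous addition. A sketch of what that could look like (the registrar image values path shown here is illustrative and may not match the chart's actual keys):

        - name: driver-registrar
          securityContext:
            privileged: true
            allowPrivilegeEscalation: true
          # values path below is illustrative only
          image: "{{ .Values.nodeplugin.registrar.image.repository }}:{{ .Values.nodeplugin.registrar.image.tag }}"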

Is there anything that requires special attention?

Do you have any questions? No

Is the change backward compatible? Yes

Are there concerns around backward compatibility? No

Provide any external context for the change, if any.

For example:

https://kubernetes.io/docs/tasks/configure-pod-container/security-context/

Related issues

Mention any GitHub issues relevant to this PR. Adding the line below
will help auto-close the issue once the PR is merged.

None

Future concerns

List items that are not part of the PR and do not impact its
functionality, but are work items that can be taken up subsequently.

None


Available bot commands

These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)
  • /retest all: run this in case the CentOS CI failed to start/report any test
    progress or results

@yati1998 (Contributor) commented Apr 6, 2022

@losil please check the DCO failure. We have specific guidelines for the commit messages.

@losil force-pushed the fix/container-security-context branch from fdf49dc to c46fdcf on April 6, 2022 06:28
@losil changed the title from "add allowPrivilegeEscalation: true to containerSecurityContext to nodeplugin daemonset" to "charts: add allowPrivilegeEscalation: true to containerSecurityContext to nodeplugin daemonset" on Apr 6, 2022
@humblec (Collaborator) commented Apr 6, 2022

@losil as you know, PSP (PodSecurityPolicy) is getting deprecated upstream. Regardless, we could have this in the chart 👍

@losil (Contributor, Author) commented Apr 6, 2022

@losil as you know, PSP (PodSecurityPolicy) is getting deprecated upstream. Regardless, we could have this in the chart 👍

@humblec of course I am aware of the PSP deprecation and honestly look forward to getting rid of them.
But since we are still running our clusters on k8s 1.21.10, we still rely on PSPs.

@humblec (Collaborator) commented Apr 6, 2022

Yeah, please correct the commit lint error @losil. Then it should be good to go.

@losil force-pushed the fix/container-security-context branch from c46fdcf to 1f8ff5a on April 6, 2022 11:09
@Madhu-1 (Collaborator) commented Apr 6, 2022

@losil Please check https://github.com/ceph/ceph-csi/blob/devel/docs/development-guide.md#commit-messages for the standard followed for commit messages.

It should be something like the following:

helm: allowPrivilegeEscalation: true in containerSecurityContext

When running the kubernetes cluster with one single privileged
PodSecurityPolicy which is allowing everything the nodeplugin
daemonset can fail to start. To be precise the problem is the
defaultAllowPrivilegeEscalation: false configuration in the PSP.
Containers of the nodeplugin daemonset won't start when they
have privileged: true but no allowPrivilegeEscalation in their
container securityContext.

Kubernetes will not schedule if this mismatch exists cannot set 
allowPrivilegeEscalation to false and privileged to true:

Signed-off-by: Silvan Loser <[email protected]>

@losil force-pushed the fix/container-security-context branch from 1f8ff5a to 413d283 on April 6, 2022 15:23
@losil (Contributor, Author) commented Apr 6, 2022

@losil Please check https://github.com/ceph/ceph-csi/blob/devel/docs/development-guide.md#commit-messages for the standard followed for commit messages.

It should be something like the following:

helm: allowPrivilegeEscalation: true in containerSecurityContext

When running the kubernetes cluster with one single privileged
PodSecurityPolicy which is allowing everything the nodeplugin
daemonset can fail to start. To be precise the problem is the
defaultAllowPrivilegeEscalation: false configuration in the PSP.
Containers of the nodeplugin daemonset won't start when they
have privileged: true but no allowPrivilegeEscalation in their
container securityContext.

Kubernetes will not schedule if this mismatch exists cannot set 
allowPrivilegeEscalation to false and privileged to true:

Signed-off-by: Silvan Loser <[email protected]>

Thanks for the hint!

@humblec (Collaborator) commented Apr 7, 2022

/retest ci/centos/k8s-e2e-external-storage/1.21

@humblec (Collaborator) commented Apr 7, 2022

/retest ci/centos/mini-e2e-helm/k8s-1.21

humblec previously approved these changes Apr 7, 2022
yati1998 previously approved these changes Apr 7, 2022
@humblec (Collaborator) commented Apr 7, 2022

@Mergifyio rebase

mergify bot (Contributor) commented Apr 7, 2022

rebase

✅ Branch has been successfully rebased

@Rakshith-R force-pushed the fix/container-security-context branch from 413d283 to eae27dc on April 7, 2022 10:47
@humblec (Collaborator) commented Apr 11, 2022

@Mergifyio rebase

@humblec (Collaborator) commented Apr 11, 2022

@Mergifyio refresh

mergify bot (Contributor) commented Apr 11, 2022

rebase

✅ Branch has been successfully rebased

mergify bot (Contributor) commented Apr 11, 2022

refresh

✅ Pull request refreshed

@Rakshith-R force-pushed the fix/container-security-context branch from eae27dc to 03dae1b on April 11, 2022 06:42
@Madhu-1 (Collaborator) left a comment

@losil Sorry for the late review on this one.

The same changes are required in the files below (see the sketch after the list):

deploy/cephfs/kubernetes/csi-cephfsplugin.yaml
deploy/rbd/kubernetes/csi-rbdplugin.yaml
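
A rough sketch of the equivalent edit in those static manifests (the surrounding fields and container ordering in the actual files may differ):

  # same securityContext addition, applied directly in the deploy manifests
  - name: driver-registrar
    securityContext:
      privileged: true
      allowPrivilegeEscalation: true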

mergify bot dismissed stale reviews from humblec and yati1998 on April 12, 2022 05:47

Pull request has been modified.

Madhu-1 previously approved these changes Apr 12, 2022
@Madhu-1 (Collaborator) commented Apr 12, 2022

@losil can you please change Signed-off-by: Silvan Loser <[email protected]> to Signed-off-by: Silvan Loser <[email protected]> in your 2nd commit?

@losil force-pushed the fix/container-security-context branch from bce1ff2 to bff4f29 on April 12, 2022 07:56
mergify bot dismissed Madhu-1's stale review on April 12, 2022 07:57

Pull request has been modified.

@ceph-csi-bot (Collaborator) commented:

@losil "ci/centos/mini-e2e-helm/k8s-1.23" test failed. Logs are available at location for debugging

@ceph-csi-bot (Collaborator) commented:

/retest ci/centos/mini-e2e/k8s-1.23

@ceph-csi-bot (Collaborator) commented:

@losil "ci/centos/mini-e2e/k8s-1.23" test failed. Logs are available at location for debugging

@ceph-csi-bot (Collaborator) commented:

@Mergifyio requeue

mergify bot (Contributor) commented Apr 22, 2022

requeue

✅ The queue state of this pull request has been cleaned. It can be re-embarked automatically

@ceph-csi-bot (Collaborator) commented:

/retest ci/centos/mini-e2e-helm/k8s-1.21

@ceph-csi-bot (Collaborator) commented:

@losil "ci/centos/mini-e2e-helm/k8s-1.21" test failed. Logs are available at location for debugging

@ceph-csi-bot (Collaborator) commented:

/retest ci/centos/k8s-e2e-external-storage/1.22

@ceph-csi-bot (Collaborator) commented:

@losil "ci/centos/k8s-e2e-external-storage/1.22" test failed. Logs are available at location for debugging

@ceph-csi-bot (Collaborator) commented:

@Mergifyio requeue

mergify bot (Contributor) commented Apr 22, 2022

requeue

☑️ This pull request is already queued

@ceph-csi-bot (Collaborator) commented:

/retest ci/centos/mini-e2e-helm/k8s-1.21

@ceph-csi-bot (Collaborator) commented:

@losil "ci/centos/mini-e2e-helm/k8s-1.21" test failed. Logs are available at location for debugging

@ceph-csi-bot (Collaborator) commented:

/retest ci/centos/mini-e2e-helm/k8s-1.22

@ceph-csi-bot (Collaborator) commented:

@losil "ci/centos/mini-e2e-helm/k8s-1.22" test failed. Logs are available at location for debugging

@ceph-csi-bot (Collaborator) commented:

@Mergifyio requeue

mergify bot (Contributor) commented Apr 22, 2022

requeue

☑️ This pull request is already queued

@ceph-csi-bot (Collaborator) commented:

/retest ci/centos/mini-e2e-helm/k8s-1.22

@ceph-csi-bot (Collaborator) commented:

@losil "ci/centos/mini-e2e-helm/k8s-1.22" test failed. Logs are available at location for debugging

@ceph-csi-bot (Collaborator) commented:

@Mergifyio requeue

mergify bot (Contributor) commented Apr 22, 2022

requeue

☑️ This pull request is already queued

@ceph-csi-bot (Collaborator) commented:

/retest ci/centos/mini-e2e-helm/k8s-1.22

@ceph-csi-bot (Collaborator) commented:

@losil "ci/centos/mini-e2e-helm/k8s-1.22" test failed. Logs are available at location for debugging

@ceph-csi-bot (Collaborator) commented:

@Mergifyio requeue

mergify bot (Contributor) commented Apr 22, 2022

requeue

☑️ This pull request is already queued

mergify bot merged commit f2e0fa2 into ceph:devel on Apr 22, 2022
@Madhu-1 (Collaborator) commented Apr 26, 2022

@Mergifyio refresh

mergify bot (Contributor) commented Apr 26, 2022

refresh

✅ Pull request refreshed

Labels: ci/retry/e2e (label to retry e2e retesting on approved PRs)
Projects: none
Linked issues: none
Participants: 5