
[BUG] Using a PVC as cache path, the second mount pod is stuck in Init:0/1 state #906

Open
chenmiao1991 opened this issue Mar 21, 2024 · 4 comments
Labels
kind/bug Something isn't working

Comments

@chenmiao1991

What happened:

When I tried use-pvc-as-cache-path, the second mount pod could not reach the Running state.

What you expected to happen:

Each JuiceFS app mount pod should have its own RBD cache block device and be in the Running state.

How to reproduce it (as minimally and precisely as possible):

  • I created an RBD PVC with this YAML (a sketch of the assumed StorageClass is shown after these steps):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: juicefs-pv-rbd
  namespace: kube-system
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 512Mi
  storageClassName: ceph-rbd-pool         <- PVC created with the RBD storage class
  • Then I created the JuiceFS apps using that PVC as the cache:
apiVersion: v1
kind: Secret
metadata:
  name: juicefs-secret
type: Opaque
stringData:
  name: xx
  metaurl: redis://xx
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: juicefs-pv
  labels:
    juicefs-name: ten-pb-fs
spec:
  capacity:
    storage: 10Pi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: csi.juicefs.com
    volumeHandle: juicefs-pv
    fsType: juicefs
    nodePublishSecretRef:
      name: juicefs-secret
      namespace: default
    volumeAttributes:
      juicefs/mount-cache-pvc: "juicefs-pv-rbd"     <- Used here
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: juicefs-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  storageClassName: ""
  resources:
    requests:
      storage: 10Pi
  selector:
    matchLabels:
      juicefs-name: ten-pb-fs
---
apiVersion: v1
kind: Pod
metadata:
  name: juicefs-app
  namespace: default
spec:
  containers:
  - args:
    - -c
    - while true; do sleep 5; done
    command:
    - /bin/sh
    image: centos
    name: app
    volumeMounts:
    - mountPath: /data
      name: data
    resources:
      requests:
        cpu: 10m
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: juicefs-pvc      <- shares the JuiceFS PVC
---
apiVersion: v1
kind: Pod
metadata:
  name: juicefs-app2
  namespace: default
spec:
  containers:
  - args:
    - -c
    - while true; do sleep 5; done
    command:
    - /bin/sh
    image: centos
    name: app
    volumeMounts:
    - mountPath: /data
      name: data
    resources:
      requests:
        cpu: 10m
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: juicefs-pvc        <- shares the JuiceFS PVC
  • The second app's mount pod is stuck in the Init:0/1 state with these warnings:
Events:
  Type     Reason       Age                  From     Message
  ----     ------       ----                 ----     -------
  Warning  FailedAttachVolume  2m27s  attachdetach-controller  Multi-Attach error for volume "pvc-d744c8d6-145b-4006-9ba8-bf42fd4ad632" Volume is already used by pod(s) node-12-juicefs-pv-crxnpz
  Warning  FailedMount         24s    kubelet                  Unable to attach or mount volumes: unmounted volumes=[cachedir-pvc-0], unattached volumes=[jfs-root-dir kube-api-access-zh5j2 cachedir-pvc-0 jfs-dir updatedb]: timed out waiting for the condition
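
For reference, a minimal sketch of the RBD StorageClass assumed above (the report only references ceph-rbd-pool, it does not show it); the provisioner follows the standard Ceph CSI RBD driver, and clusterID, pool, and secret names are placeholders that must match your Ceph CSI deployment:

# Hypothetical StorageClass backing the juicefs-pv-rbd PVC above; all
# cluster-specific values are placeholders for this sketch.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd-pool
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: <ceph-cluster-id>
  pool: <rbd-pool-name>
  imageFeatures: layering
  csi.storage.k8s.io/fstype: ext4
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi
reclaimPolicy: Delete

A filesystem-mode RBD volume provisioned this way still only supports ReadWriteOnce, which is what produces the Multi-Attach error shown above once two mount pods on different nodes request the same cache PVC.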

Anything else we need to know?

How can each JuiceFS app mount pod be given its own RBD cache block device?
Any suggestions?

Environment:

  • JuiceFS CSI Driver version (which image tag did your CSI Driver use):
    v0.18.1

  • Kubernetes version (e.g. kubectl version):
    v1.23.13

  • Object storage (cloud provider and region):
    ceph 14

  • Metadata engine info (version, cloud provider managed or self maintained):
    self maintained.

  • Network connectivity (JuiceFS to metadata engine, JuiceFS to object storage):

  • Others:

@showjason
Contributor

Do juicefs-app1 and juicefs-app2 run on the same node or on different nodes? If different nodes, the issue is very likely caused by the RWO access mode.

@chenmiao1991
Author

Do juicefs-app1 and juicefs-app2 run on the same node or on different nodes? If different nodes, the issue is very likely caused by the RWO access mode.

@showjason They run on different nodes. How can this be solved when using a block-device PVC? The examples I have seen all use cloud vendors' block devices.

@showjason
Contributor

@chenmiao1991 As far as I know, Ceph RBD doesn't support RWX, only ROX. A dedicated-cache-cluster may be one way to address your issue, or you could use NFS instead of block storage.
@zxh326 Sorry, do you have any better ideas?
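
A minimal sketch of the RWX alternative mentioned above, assuming an RWX-capable StorageClass (for example CephFS) named cephfs exists in the cluster; the class name is an assumption:

# Hypothetical RWX cache PVC. Because CephFS supports ReadWriteMany,
# mount pods on different nodes can attach it at the same time, which
# avoids the Multi-Attach error hit with the RWO RBD PVC.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: juicefs-cache-rwx
  namespace: kube-system
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 512Mi
  storageClassName: cephfs

The JuiceFS PV would then reference it with juicefs/mount-cache-pvc: "juicefs-cache-rwx" instead of juicefs-pv-rbd.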

@chenmiao1991
Author

chenmiao1991 commented Mar 26, 2024

@showjason We adopted the JuiceFS distributed file system precisely to replace file systems like NFS, so using NFS for the cache brings us back to where we started.

@showjason @zxh326 Perhaps each juicefs-csi-node DaemonSet pod could automatically mount its own RBD volume as cache, and the applications on that node could then share that RBD cache. I would prefer not to use the hostPath mode, as it is inconvenient for batch creation and deletion of RBD volumes.
