Pod startup failure #84

Open
RomanOrlovskiy opened this issue Apr 12, 2023 · 4 comments
@RomanOrlovskiy

Describe the bug
The pod fails to start during the initial deployment using the latest v3.4.0 Helm chart. Could this be related to the Kubernetes version?

Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
.....
  Normal   Created    3m7s (x2 over 3m19s)   kubelet            Created container kubernetes-secret-generator
  Normal   Started    3m7s (x2 over 3m19s)   kubelet            Started container kubernetes-secret-generator
  Normal   Pulled     3m7s                   kubelet            Successfully pulled image "quay.io/mittwald/kubernetes-secret-generator:latest" in 74.165988ms (74.182482ms including waiting)
  Warning  Unhealthy  2m55s (x8 over 3m13s)  kubelet            Readiness probe failed: Get "http://10.8.11.223:8080/readyz": dial tcp 10.8.11.223:8080: connect: connection refused
  Warning  Unhealthy  2m55s (x6 over 3m13s)  kubelet            Liveness probe failed: Get "http://10.8.11.223:8080/healthz": dial tcp 10.8.11.223:8080: connect: connection refused
  Normal   Killing    2m55s (x2 over 3m7s)   kubelet            Container kubernetes-secret-generator failed liveness probe, will be restarted
  Normal   Pulling    2m54s (x3 over 3m19s)  kubelet            Pulling image "quay.io/mittwald/kubernetes-secret-generator:latest"

These are the only logs available in the pod:

{"level":"info","ts":1681322039.958661,"logger":"cmd","msg":"Operator Version: 0.0.1"}
{"level":"info","ts":1681322039.9587452,"logger":"cmd","msg":"Go Version: go1.15.15"}
{"level":"info","ts":1681322039.9587672,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1681322039.9587784,"logger":"cmd","msg":"Version of operator-sdk: v0.16.0"}
{"level":"info","ts":1681322039.9592156,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1681322049.27793,"logger":"leader","msg":"Found existing lock with my name. I was likely restarted."}
{"level":"info","ts":1681322049.2779632,"logger":"leader","msg":"Continuing as the leader."}

To Reproduce
Just a basic installation using Helm; a sketch of the commands is included below.

values.yaml:

installCRDs: true
useMetricsService: true
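
For reference, a minimal sketch of the install steps, assuming the chart is pulled from the mittwald Helm repository; the repository URL, release name, and namespace below are illustrative:

# Add the mittwald chart repository (URL assumed) and refresh the index
helm repo add mittwald https://helm.mittwald.de
helm repo update

# Install the chart with the values.yaml above; "secret-generator" is an
# illustrative release name and namespace
helm install secret-generator mittwald/kubernetes-secret-generator \
  --namespace secret-generator --create-namespace \
  -f values.yaml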

Environment:

  • Kubernetes version: EKS 1.25
  • kubernetes-secret-generator version: v3.1.0, v3.4.0, latest.
kubectl version
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.0", GitCommit:"a866cbe2e5bbaa01cfd5e969aa3e033f3282a8a2", GitTreeState:"clean", BuildDate:"2022-08-23T17:36:43Z", GoVersion:"go1.19", Compiler:"gc", Platform:"darwin/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25+", GitVersion:"v1.25.6-eks-48e63af", GitCommit:"9f22d4ae876173884749c0701f01340879ab3f95", GitTreeState:"clean", BuildDate:"2023-01-24T19:19:02Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}
@jan-kantert

We have seen this startup failure too, but only transiently. It made no sense at first, since "we did not change anything" (tm).

It later turned out that this happened because one of our apiservices became unavailable (in our case linkerd-tap, because its pods ran into an issue). You can check with kubectl get apiservices.apiregistration.k8s.io. This did not affect any other workload on the cluster. I honestly do not understand why it causes the secret-generator to hang; it definitely should not.
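
For example (the grep filter is a rough way to surface entries that are not reporting Available=True):

# list all aggregated API services and their availability
kubectl get apiservices.apiregistration.k8s.io

# rough filter: show only API services that are not Available=True
kubectl get apiservices.apiregistration.k8s.io | grep -v True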

Ideas?

@jan-kantert

We looked into this issue some more. It seems to be a bug in the (old) version of operator-sdk; an update would likely fix it.

@vmartino

vmartino commented Feb 7, 2024

Is there a workaround for this issue?

@jan-kantert

Workaround: fix all of your webhooks ;-). For us, this only happens when other webhooks are broken.
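
A rough way to check for broken webhooks (the resource names are standard kubectl; the namespace and service names below are placeholders):

# list all admission webhooks registered in the cluster
kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations

# for each webhook, confirm the backing service has ready endpoints
kubectl -n <webhook-namespace> get endpoints <webhook-service>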
