
Handle local externalTrafficPolicy correctly #54

Merged

Conversation

tommasopozzetti (Contributor)

LoadBalancer services can specify an extra property called `externalTrafficPolicy` which defines how kube-proxy sets up load balancing on the nodes. If this property is set to `Local`, kube-proxy only forwards traffic for externalIPs or LoadBalancer IPs to pods backing the service that are running on the node on which the packet arrived. If no such pods are running on that node, the traffic gets dropped.

This PR introduces a vrrp_script for services that set this property to `Local`, which health-checks the Kubernetes-provided health endpoint for the service. This endpoint returns 200 if at least one pod backing the service is running on the node, or 503 if none is. With this healthcheck script in place, keepalived will deem the node unhealthy if the check fails, thus ensuring the VIP moves to a different owner that does have at least one pod backing the service running on it. This way no traffic is lost and the `Local` `externalTrafficPolicy` works as intended.

This PR requires the keepalived image to contain the /usr/bin/curl binary, which is used to perform the healthcheck.

This PR should address and fix #13
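
For a concrete picture of the mechanism, the resulting keepalived configuration would look roughly like the sketch below. This is only an illustration: the script name, node port (30123), interface, VRID, priority, and VIP are hypothetical placeholders, not the exact output of the operator's template.

    vrrp_script chk_local_endpoints {
        # kube-proxy answers 200 on the service's healthCheckNodePort when this node
        # has at least one local endpoint for the service, and 503 otherwise;
        # --fail makes curl exit non-zero on 503 so keepalived sees the check as failed.
        script "/usr/bin/curl --fail --silent --output /dev/null http://localhost:30123/health"
        interval 2
        fall 2
        rise 2
    }

    vrrp_instance lb_vip {
        state BACKUP
        interface eth0
        virtual_router_id 51
        priority 100
        virtual_ipaddress {
            192.0.2.10
        }
        # With no weight set on the script, a failing check puts this instance into
        # FAULT state and releases the VIP, so another member of the keepalived
        # group (one that does have local endpoints) takes it over.
        track_script {
            chk_local_endpoints
        }
    }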

@tommasopozzetti (Contributor, Author)

This PR modifies a file that is also slightly modified by #53. If both are of interest, after the first one is merged the second should be adjusted to reflect the changes. It should be a trivial change, but I'm noting it here to avoid forgetting.

@raffaelespazzoli (Collaborator)

I have two concerns about this PR:

  1. it's unclear what we expect to happen when the target pods are not running on the keepalived group nodes.
  2. it seems like we are assuming that all pods expose the /health endpoint. This will not be true. The healthcheck should just check for TCP connectivity.

@tommasopozzetti (Contributor, Author)

@raffaelespazzoli, let me try to address the two concerns:

  1. If no target pod is running on any of the keepalived group nodes, then the VIP will not be owned by any of them and traffic toward that VIP will not reach a backend pod. Notice, however, that this is exactly the expected behavior when externalTrafficPolicy is set to Local. It is, in fact, the responsibility of the user, when setting this option, to ensure that the nodes receiving the traffic do have backing pods running on them; otherwise delivering the traffic locally is simply impossible. Also notice that this is the same behavior exhibited even without this PR: if externalTrafficPolicy is set to Local, kube-proxy only sets up load balancing towards pods on the node itself, so if no pods are running on the node, traffic is lost. The issue right now is that, without this PR, traffic is lost even when a backing pod is running on one of the keepalived group nodes but the VIP is owned by another node. This PR aims at making keepalived aware of the situation so the VIP moves to a node that does have backing pods (if at least one exists).

  2. The healthCheckNodePort field of the service is a port that Kubernetes sets up only for services with a Local externalTrafficPolicy, and it does not directly expose the backing pods. It exposes a healthcheck service provided by kube-proxy, which always answers requests to the /health endpoint, whether a backing pod is running on that specific node or not. The HTTP response code is 200 if at least one backing pod is running on the given node, or 503 if none is. This is an example of such a response, and a sketch of how to look up and probe this port by hand follows it:

     $ curl "http://localhost:<healthCheckNodePort>/health"
     {
         "service": {
             "namespace": "<namespace>",
             "name": "<service-name>"
         },
         "localEndpoints": 1
     }
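
To check both points above by hand on a node, something along these lines works; the port value 32456 is a hypothetical example, and the final 200 assumes the node currently has a local endpoint (on a node with none, the same command would print 503):

    $ kubectl -n <namespace> get service <service-name> \
          -o jsonpath='{.spec.externalTrafficPolicy} {.spec.healthCheckNodePort}'
    Local 32456
    $ curl --silent --output /dev/null --write-out '%{http_code}\n' \
          "http://localhost:32456/health"
    200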
    

@raffaelespazzoli (Collaborator)

OK, perfect. Then I think we should incorporate this PR. Let's merge this one first since it's simple. curl should be on board the keepalived image.

Successfully merging this pull request may close these issues: Handling of Local externalTrafficPolicy.