
fix and document aggregate cluster behavior #13134

Open · markdroth opened this issue Sep 16, 2020 · 9 comments

@markdroth (Contributor) commented:

The aggregate cluster documentation describes how load balancing works with an aggregate cluster, but it doesn't describe how any of the cluster's other behavior works.
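
For context, here's a minimal sketch of what an aggregate cluster definition looks like (the cluster names are made up and the underlying cluster definitions are omitted); the questions below are about which of the two layers each of the other features should be read from:

```yaml
clusters:
- name: aggregate_cluster
  connect_timeout: 0.25s
  lb_policy: CLUSTER_PROVIDED
  cluster_type:
    name: envoy.clusters.aggregate
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.clusters.aggregate.v3.ClusterConfig
      # Underlying clusters, in priority order; both are defined elsewhere in the config.
      clusters:
      - primary_cluster
      - secondary_cluster
```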

Here are a few off-the-cuff examples (not a comprehensive list):

  • What happens for circuit breakers? Does the circuit breaking config in the aggregate cluster get used, or does it somehow delegate to the circuit breaking configs in the clusters being delegated to?
  • How does load reporting work? Does the load get reported as belonging to the underlying clusters or the aggregate cluster?
  • How does outlier detection work?

@yxue, can you please update the documentation to explain how these sorts of things are supposed to work? Ideally, the doc should cover the behavior for every feature configurable via the Cluster xDS resource.

Thanks!

@markdroth added the triage (Issue requires triage) label on Sep 16, 2020
@mattklein123 added the area/aggregate_cluster, area/docs, and help wanted (Needs help!) labels and removed the triage (Issue requires triage) label on Sep 16, 2020
@yxue (Contributor) commented Sep 16, 2020:

Sure, I will update the document. Please assign the issue to me. Thanks!

@markdroth (Contributor, Author) commented:

Also, how are EDS drops supposed to work in an aggregate cluster? Drops are top-level parameters of the EDS resource (they apply to the whole cluster), but what happens if an aggregate cluster points to an EDS cluster? Or worse, what happens if the aggregate cluster points to multiple EDS clusters, each with its own drop config?

Note that Envoy does not currently implement EDS drop support, but gRPC does, so we need to understand how this is supposed to work.
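
For reference, here's a rough sketch of where the drop config lives in the EDS resource (the cluster name, drop category, and numbers are made up); each EDS cluster behind the aggregate cluster could carry its own, different policy:

```yaml
# ClusterLoadAssignment for one of the underlying EDS clusters.
cluster_name: eds_cluster_1
policy:
  drop_overloads:
  - category: throttle
    drop_percentage:
      numerator: 10
      denominator: HUNDRED
endpoints:
- lb_endpoints:
  - endpoint:
      address:
        socket_address: {address: "10.0.0.1", port_value: 8080}
# A second EDS cluster referenced by the same aggregate cluster could carry a
# completely different drop_overloads policy.
```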

@markdroth (Contributor, Author) commented:

Similar question for the overprovisioning factor.

@markdroth (Contributor, Author) commented:

gRPC is about to implement support for aggregate clusters, and I want to make sure we agree on the expected semantics, so that we don't wind up making a decision that is incompatible with what Envoy does or may do in the future.

I've put together the following doc proposing what I think the semantics should be:

https://docs.google.com/document/d/17fc0rAktQpc5oMkerOLHSTnXwX1xWAqW7vZ6_LL47NY/edit#

Already discussed with @yxue, @htuch, @ejona86, and @dfawley.

I'd appreciate feedback from @mattklein123 and @snowp, as well as from any other @envoyproxy/api-shepherds who may be interested.

@markdroth (Contributor, Author) commented:

https://docs.google.com/document/d/17fc0rAktQpc5oMkerOLHSTnXwX1xWAqW7vZ6_LL47NY/edit# has now been approved as specifying the expected behavior of aggregate clusters.

This issue should now cover the following:

  • Updating the aggregate cluster tests to cover every behavior described in that doc.
  • Fixing any behavior that the tests show does not currently match what is specified in the doc.
  • Updating the aggregate cluster documentation to specify the behavior in all of these cases.

@markdroth changed the title from "improve documentation of aggregate cluster behavior" to "fix and document aggregate cluster behavior" on Oct 14, 2020
@markdroth (Contributor, Author) commented:

See discussion in #17412 about the max_retries circuit breaker. In general, we do want circuit breakers to come from the underlying clusters instead of from the aggregate cluster, as per our previous discussions, but the max_retries circuit breaker will need to come from the aggregate cluster instead. I've updated the design doc mentioned above to reflect this.
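
For anyone following along, here's a rough sketch of what that split would look like in config (names made up, cluster definitions abridged): max_retries is read from the aggregate cluster, while the other thresholds come from each underlying cluster.

```yaml
- name: aggregate_cluster
  lb_policy: CLUSTER_PROVIDED
  circuit_breakers:
    thresholds:
    - priority: DEFAULT
      max_retries: 3            # read from the aggregate cluster (see #17412)
  cluster_type:
    name: envoy.clusters.aggregate
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.clusters.aggregate.v3.ClusterConfig
      clusters: [primary_cluster, secondary_cluster]
- name: primary_cluster
  type: EDS
  circuit_breakers:
    thresholds:
    - priority: DEFAULT
      max_requests: 1000        # connection/request limits read from the underlying cluster
```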

@markdroth (Contributor, Author) commented:

@yanavlasov has just pointed out to me that we got this slightly wrong.

Our previous discussions in this issue assumed that the core idea of an aggregate cluster was that it concatenated the priorities from the underlying clusters into a single list, so that it could use a single LB policy defined at the aggregate cluster layer to choose a priority from that combined list. This made sense as a way to use the normal logic for choosing a priority, since in Envoy priorities are normally handled by the LB policy. Therefore, we assumed that aggregate clusters always used the LB policy defined in the aggregate cluster rather than the ones defined in the underlying clusters.

However, Yan pointed out to me that aggregate clusters don't actually define the LB policy in the aggregate cluster; instead, the aggregate cluster uses a special cluster-provided LB policy that first chooses the underlying cluster and then delegates to the LB policy of the underlying cluster. This is actually the documented behavior, and it looks like the Envoy implementation here is indeed doing a two-level pick.

I think this means that we really do want basically all of the configuration to come from the underlying cluster, with the sole exception of the max_retries circuit breaker, as per discussion in #17412 -- and it seems fine to say that that particular circuit breaker simply won't work for aggregate clusters.

Note that the current Envoy implementation has the aggregate cluster pick the underlying cluster only when the LB pick is requested. However, since we also want features that are unrelated to the LB pick (e.g., circuit breaking, outlier detection, and EDS drops (see #31345)) to be configured at the underlying cluster level, Envoy will probably need to decouple this from the LB pick somehow. It may need a layer of indirection whereby it first asks the cluster for its configuration and then uses that configuration for all of the above features plus the LB pick.
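
To illustrate the two-level pick with a made-up config sketch (underlying cluster definitions abridged): the aggregate cluster's cluster-provided LB first picks one of the underlying clusters, and the chosen cluster's own lb_policy then picks the host, so the per-cluster lb_policy settings below are what actually take effect.

```yaml
- name: aggregate_cluster
  lb_policy: CLUSTER_PROVIDED     # step 1: pick an underlying cluster
  cluster_type:
    name: envoy.clusters.aggregate
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.clusters.aggregate.v3.ClusterConfig
      clusters: [primary_cluster, secondary_cluster]
- name: primary_cluster
  type: EDS
  lb_policy: ROUND_ROBIN          # step 2: picks the host when primary_cluster is chosen
- name: secondary_cluster
  type: EDS
  lb_policy: LEAST_REQUEST        # step 2: picks the host when secondary_cluster is chosen
```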

@markdroth (Contributor, Author) commented:

A couple of other things to note from a conversation with @yanavlasov:

  • Apparently, Envoy currently does not support an aggregate cluster pointing to another aggregate cluster. gRPC does support this, so we should probably add it to Envoy too (since that's easier than removing support from gRPC). Note that when we do this, we need to be careful about config validation; in gRPC, we impose a recursion depth limit of 16, and we deal with diamond dependencies in the graph by simply ignoring any node we've already seen (see the sketch after this list).
  • Envoy expects the aggregate cluster to be configured with a CLUSTER_PROVIDED LB policy. It's not clear what its behavior would be if an aggregate cluster were configured with a different LB policy, but we should probably define this. Note that we currently need to define this in gRPC, because we incorrectly implemented aggregate clusters to use the LB policy in the aggregate cluster instead of the policy in the underlying cluster, which means there are probably existing control planes that set the LB policy in the aggregate cluster to something other than CLUSTER_PROVIDED, and we don't want to break them when we fix our implementation. We're probably going to change gRPC to simply ignore the LB policy config in the aggregate cluster -- i.e., it will basically always use the cluster-provided implementation. So we could decide to do the same thing in Envoy to be consistent.
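
Here's the kind of aggregate-over-aggregate diamond topology mentioned in the first bullet (all names made up; as noted, Envoy does not currently accept this):

```yaml
# aggregate_a -> {aggregate_b, shared_cluster}; aggregate_b -> {shared_cluster}
# gRPC caps the recursion depth at 16 and skips clusters it has already visited,
# so shared_cluster only contributes its endpoints once.
- name: aggregate_a
  lb_policy: CLUSTER_PROVIDED
  cluster_type:
    name: envoy.clusters.aggregate
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.clusters.aggregate.v3.ClusterConfig
      clusters: [aggregate_b, shared_cluster]
- name: aggregate_b
  lb_policy: CLUSTER_PROVIDED
  cluster_type:
    name: envoy.clusters.aggregate
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.clusters.aggregate.v3.ClusterConfig
      clusters: [shared_cluster]
- name: shared_cluster
  type: EDS
```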

@markdroth (Contributor, Author) commented:

FYI, I've written a design for fixing the aggregate cluster behavior in gRPC in grpc/proposal#405.
