
[POC] Add standby/active multi-cluster support #301

Closed
wants to merge 5 commits

Conversation

phvalguima

This PR adds support for active/standby multi-clustering in PostgreSQL. It uses Patroni's standby_cluster option to bootstrap one of the clusters as a follower of the primary cluster.

It adds two new relations and two new actions:

  • async-primary relation: represents the leader cluster; it receives the data from the standby clusters and generates the configuration for the leader
  • async-replica relation: used by each standby cluster to connect to the leader unit
  • promote-standby-cluster action: informs one of the clusters that it should be promoted to primary
  • demote-primary-cluster action: likewise, cleans up the state and demotes the primary cluster

The UX is described as follows:

  1. Deploy two models in the same k8s cluster:
juju add-model psql-1
juju deploy ./postgresql.charm --resource postgresql-image=ghcr.io/canonical/charmed-postgresql@sha256:a6aa592506aa4cda85b63f66e1c9d079088ca7c9d84ed4bba9442dea36ec3f17

juju add-model psql-2
juju deploy ./postgresql.charm --resource postgresql-image=ghcr.io/canonical/charmed-postgresql@sha256:a6aa592506aa4cda85b63f66e1c9d079088ca7c9d84ed4bba9442dea36ec3f17

Then, configure async replication as follows:

juju switch psql-1
juju offer postgresql-k8s:async-primary async-primary  # async-primary is the relation provided by the leader

juju switch psql-2
juju consume admin/psql-1.async-primary  # consume the primary relation

Finally, set the relation and run the promotion action:

juju relate postgresql-k8s:async-replica async-primary  # Both units are now related, where postgresql-k8s in model psql-2 is the standby-leader
juju run -m psql-1 postgresql-k8s/0 promote-standby-cluster  # move postgresql-k8s in model psql-1 to be the leader cluster

Once the models settle, it is possible to check the status from within one of the postgresql units.

For example, the following status can be seen in the standby's Patroni (set PATRONI_KUBERNETES_NAMESPACE to the standby model's name):

  $ PATRONI_KUBERNETES_LABELS='{application: patroni, cluster-name: patroni-postgresql-k8s}' \
    PATRONI_KUBERNETES_NAMESPACE=psql-2 \
    PATRONI_KUBERNETES_USE_ENDPOINTS=true \
    PATRONI_NAME=postgresql-k8s-0 \
    PATRONI_REPLICATION_USERNAME=replication \
    PATRONI_SCOPE=patroni-postgresql-k8s \
    PATRONI_SUPERUSER_USERNAME=operator \
      patronictl -c /var/lib/postgresql/data/patroni.yml list

Role should be "Standby leader" and State should be "Running".
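
A quicker check, assuming Patroni's REST API is listening on its default port 8008: the /standby-leader endpoint returns HTTP 200 only on the node that currently holds the standby leader lock.

  $ curl -i http://localhost:8008/standby-leader  # expect HTTP 200 on the standby leader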

@phvalguima
Author

The async-replica/async-primary relation pair is similar in concept to the primary/secondary relation endpoints in: https://ubuntu.com/ceph/docs/setting-up-multi-site

@phvalguima phvalguima changed the title Add standby/active multi-cluster support [POC] Add standby/active multi-cluster support Oct 24, 2023
@@ -0,0 +1,291 @@
# Copyright 2022 Canonical Ltd.
Contributor
Suggested change
# Copyright 2022 Canonical Ltd.
# Copyright 2023 Canonical Ltd.

Comment on lines 233 to 234
# If this is a standby-leader, then execute switchover logic
# TODO
Contributor
@dragomirp Oct 25, 2023

Looking at the Patroni docs:

  Automatic promotion is not possible, because DC2 will never be able to figure out the state of DC1.

  You should not use pg_ctl promote in this scenario; you need to "manually promote" the healthy cluster by removing the standby_cluster section from the dynamic configuration (https://patroni.readthedocs.io/en/latest/dynamic_configuration.html#dynamic-configuration).

It sounds to me that we need to remove the new relation (with --force?) to promote the replica cluster, and we can't really promote and demote with just actions.
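
For reference, a minimal sketch of that "manual promote", run from a unit of the standby cluster; it assumes that setting a key to null with patronictl edit-config drops the standby_cluster section from the dynamic configuration:

  $ patronictl -c /var/lib/postgresql/data/patroni.yml edit-config --force -s standby_cluster=null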

Author

Hi @dragomirp, the way I see the failover/switchover happening is the following:

# First, offer async-primary from both models:
juju switch psql-1
juju offer postgresql-k8s:async-primary async-primary

juju switch psql-2
juju offer postgresql-k8s:async-primary async-primary

Then, in each model, we consume the other model's async-primary offer and relate to it:

juju consume admin/psql-2.async-primary 
juju relate -m psql-1 postgresql-k8s:async-replica async-primary

juju consume admin/psql-1.async-primary 
juju relate -m psql-2 postgresql-k8s:async-replica async-primary

Once that setup is done, the postgresql apps know that async replication is available, but will not apply the actual configuration. That only happens once we run the promote-standby-cluster action on one of the models.
At that moment, the cluster in the model where the action ran should take over and become the primary. The remaining clusters will continue as replicas.

At switchover, the user must initiate the process. That should be:

juju run -m <model-with-old-primary> postgresql-k8s/leader demote-primary-cluster
juju run -m <mode-with-new-primary> postgresql-k8s/leader promote-standby-cluster

The demote should not succeed if the target unit still sees a cluster marked as "primary" connected on its async-replica relation.

In case of failover, I think you are correct: depending on the state of the primary cluster, we may end up with Juju still knowing about the primary cluster and keeping its databag (with the primary key set), even though the cluster itself is gone. Indeed, we would need to pull the relation out first, possibly with --force, and then promote one of the replica clusters as leader.
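
As a sketch, assuming psql-1 hosted the failed primary and psql-2 a surviving standby, the failover path would look roughly like:

juju remove-relation -m psql-2 postgresql-k8s:async-replica async-primary --force  # drop the stale relation to the dead primary
juju run -m psql-2 postgresql-k8s/leader promote-standby-cluster  # promote the surviving cluster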

Author

But in any case, I don't think we should do automatic failover between clusters, as this is async replication. It should be a conscious decision from the user.

Member
@marceloneppel left a comment

Nice. Thanks @phvalguima!

On

if not self.is_primary and (

we need to skip the logic from the if block also for the standby_leader (as we do for the primary), to avoid a wrong status message (after scaling up the standby cluster).

@phvalguima
Author

Based on the first feedback from our discussions:

  • Switch the naming from "primary" to "active cluster" (TODO)
  • Test with clusters of 2+ nodes in both primary and standby
  • Clarify switchover procedure

I just finished the testing with clusters composed of at least 2 units. I noticed two issues: (1) #306 - I was wrongly using Endpoints instead of Services, and (2) missing --trust in my tests.

Regarding @dragomirp's question: I can run a switchover between clusters with some manual steps.

First, I deploy the environment with the following steps:

juju add-model psql-1
juju deploy ./postgresql-k8s_ubuntu-22.04-amd64.charm --resource postgresql-image=<image> -n2 --trust
juju offer postgresql-k8s:async-primary async-primary  # async-primary is the relation provided by the leader

juju add-model psql-2
juju deploy ./postgresql-k8s_ubuntu-22.04-amd64.charm --resource postgresql-image=<image> -n2 --trust

juju consume admin/psql-1.async-primary
juju relate postgresql-k8s:async-replica async-primary
juju run -m psql-1 postgresql-k8s/leader promote-standby-cluster 

At the end of the process above, cluster psql-2 will be the standby cluster and should have the following topology. Note that the standby cluster's members are not displayed in the patronictl list or patronictl topology output on the primary cluster.

$ patronictl -c /var/lib/postgresql/data/patroni.yml list
+ Cluster: patroni-postgresql-k8s -----------------------------+----------------+---------+----+-----------+
| Member           | Host                                      | Role           | State   | TL | Lag in MB |
+------------------+-------------------------------------------+----------------+---------+----+-----------+
| postgresql-k8s-0 | postgresql-k8s-0.postgresql-k8s-endpoints | Replica        | running |  2 |         0 |
| postgresql-k8s-1 | postgresql-k8s-1.postgresql-k8s-endpoints | Standby Leader | running |  2 |           |
+------------------+-------------------------------------------+----------------+---------+----+-----------+

Then, the switchover can be done manually on the standby cluster as follows (a sketch of steps 2-4 follows this list):

  1. Stop Patroni in all standby units
  2. Delete the patroni-postgresql-k8s-config resource from Kubernetes
  3. Remove the following lines from patroni.yml:
.....
        effective_cache_size: 18733MB
-    standby_cluster:
-      host: 10.152.183.194
-      port: 5432
-      create_replica_methods: ["basebackup"]
  
  pg_hba:
  4. Restart Patroni
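
A sketch of steps 2-4, assuming the dynamic configuration is stored in the <scope>-config Endpoints object (scope patroni-postgresql-k8s and namespace psql-2, as in the patronictl environment above):

kubectl -n psql-2 delete endpoints patroni-postgresql-k8s-config
# on each standby unit: edit /var/lib/postgresql/data/patroni.yml, drop the standby_cluster
# block shown above, then restart Patroni (the exact service name depends on the charm workload)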

The standby cluster will be promoted to leader:

patronictl -c /var/lib/postgresql/data/patroni.yml list
+ Cluster: patroni-postgresql-k8s -----------------------------+--------------+---------+----+-----------+
| Member           | Host                                      | Role         | State   | TL | Lag in MB |
+------------------+-------------------------------------------+--------------+---------+----+-----------+
| postgresql-k8s-0 | postgresql-k8s-0.postgresql-k8s-endpoints | Leader       | running |  2 |           |
| postgresql-k8s-1 | postgresql-k8s-1.postgresql-k8s-endpoints | Sync Standby | running |  2 |         0 |
+------------------+-------------------------------------------+--------------+---------+----+-----------+

These are just very early results. We need to test this type of switchover on clusters under stress.

… will stop their services before moving on and reconfiguring
@marceloneppel
Member

Superseded by #368.

@marceloneppel marceloneppel deleted the experimental-standby-cluster branch January 23, 2024 17:40