floyd -> jimmy switchover

Well in advance

Announce downtime on blog.
Announce on Twitter/X/Y/Z.

A couple of hours in advance

Stop any cron jobs that access the database.
Set a banner message on MB.

T=0

Open four shells in advance, one each to jimmy, hendrix, floyd, and aretha (ideally within a screen session on each server).

It is assumed that all host commands are run as root from within /root/docker-server-configs.

Bring all services down on the gateway.
Enable Consul maintenance mode on the primary postgres service on floyd:

./scripts/set_service_maintenance.sh enable postgres-floyd 65400

Stop the haproxy/pgbouncer containers on jimmy and hendrix:

docker stop pgbouncer-master pgbouncer-slave pgbouncer-any haproxy-postgres

Stop the barman "receive-wal" process on aretha:

docker exec barman sv down cron
docker exec barman sudo -u barman barman receive-wal --stop floyd

Perform the switchover on jimmy, inside a screen session:

docker exec -it postgres-jimmy /bin/bash

sudo -u postgres repmgr -f /etc/repmgr.conf --force-rewind --siblings-follow --dry-run standby switchover
sudo -u postgres repmgr -f /etc/repmgr.conf --force-rewind --siblings-follow standby switchover
sed -i 's/floyd/jimmy/g' /etc/repmgr.conf
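The dry run and the real switchover can be chained so the real command only runs when the dry run exits 0. This is a minimal sketch: run_gated is a hypothetical helper name, not part of repmgr, and it assumes the wrapped command accepts a trailing --dry-run option. Reading the dry-run output by eye before proceeding is still the safer habit.

```shell
# run_gated CMD...: run CMD with --dry-run appended; run CMD for real
# only if the dry run exited 0. (Hypothetical helper, for illustration.)
run_gated() {
    if "$@" --dry-run; then
        "$@"
    else
        echo "dry run failed; not proceeding" >&2
        return 1
    fi
}
```

Inside the postgres-jimmy container this might be called as: run_gated sudo -u postgres repmgr -f /etc/repmgr.conf --force-rewind --siblings-follow standby switchover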

Note: haproxy's health checks take 10 seconds to detect the new master. This interval is controlled by recover_check_interval in /etc/pg_cluster/config.json. Wait at least that long before continuing.
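Instead of sleeping for a fixed time, the wait can be written as a bounded polling loop. This is a sketch only: wait_until is a hypothetical helper, and the probe command is left to the operator because this procedure does not name one.

```shell
# wait_until TIMEOUT INTERVAL CMD...: retry CMD every INTERVAL seconds
# until it succeeds, giving up once TIMEOUT seconds have elapsed.
# TIMEOUT and INTERVAL must be whole numbers of seconds.
wait_until() {
    timeout_s=$1
    interval_s=$2
    shift 2
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep "$interval_s"
        elapsed=$((elapsed + interval_s))
    done
}
```

For example, wait_until 30 2 my_probe would retry a hypothetical my_probe command every 2 seconds for up to 30 seconds. A plain sleep 10 remains the simplest way to follow the note above.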

Update repmgr.conf on hendrix to point to the new master:

docker exec postgres-hendrix sed -i 's/floyd/jimmy/g' /etc/repmgr.conf
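Before running sed -i, it can be useful to preview which lines the substitution would change. A minimal sketch, assuming nothing about the contents of repmgr.conf; preview_subst is a hypothetical helper name.

```shell
# preview_subst FILE: print only the lines that 's/floyd/jimmy/g' would
# change, with the change applied, without modifying FILE.
preview_subst() {
    # -n suppresses sed's default output; the trailing p flag prints
    # only lines on which a substitution actually happened.
    sed -n 's/floyd/jimmy/gp' "$1"
}
```

The same preview can be run directly against the container with: docker exec postgres-hendrix sed -n 's/floyd/jimmy/gp' /etc/repmgr.conf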

On aretha, update the barman configuration and restart its cron services:

docker exec -it barman /bin/bash

mv /etc/barman.d/jimmy.conf.backup /etc/barman.d/jimmy.conf
mv /etc/barman.d/floyd.conf /etc/barman.d/floyd.conf.backup
sudo -u barman barman cron
sv up cron

Stop the postgres instance on floyd (unregistering it from repmgr first):

docker exec postgres-floyd sudo -u postgres repmgr -f /etc/repmgr.conf standby unregister
docker exec postgres-floyd sudo -u postgres stop_postgres.sh
docker stop postgres-floyd

Restart the haproxy/pgbouncer containers on jimmy and hendrix:

docker start pgbouncer-master pgbouncer-slave pgbouncer-any haproxy-postgres

Check that the master and slave pgbouncer services point to their respective roles on jimmy and hendrix:

psql -h localhost -p 65436 -U postgres -d template1 -c 'SELECT pg_is_in_recovery();'  # should return 'f'
psql -h localhost -p 65437 -U postgres -d template1 -c 'SELECT pg_is_in_recovery();'  # should return 't'
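The two checks above can also be scripted so that a mismatch produces a non-zero exit status. This is a sketch under the same assumptions as the commands above (ports 65436 and 65437 on localhost, user postgres); expect_recovery is a hypothetical helper name.

```shell
# expect_recovery PORT EXPECTED: succeed if pg_is_in_recovery() queried
# through PORT returns EXPECTED ('f' for the master, 't' for a standby).
expect_recovery() {
    port=$1
    expected=$2
    # -A (unaligned) and -t (tuples only) make psql print just the value.
    actual=$(psql -h localhost -p "$port" -U postgres -d template1 \
        -Atc 'SELECT pg_is_in_recovery();')
    if [ "$actual" = "$expected" ]; then
        echo "port $port: ok ($actual)"
    else
        echo "port $port: expected '$expected', got '$actual'" >&2
        return 1
    fi
}
```

Usage: expect_recovery 65436 f && expect_recovery 65437 t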

Bring all services back up on the gateway.
Restart any cron jobs that were previously stopped.