Effective PostgreSQL Cluster Configuration & Management Using PGO v5.1
Modern day production ready Postgres solutions require quite a bit of sophistication and automation. Changes need to be applied in a uniform and safe way. DevOps and SRE teams need to be in control system updates while limiting disruption to their users.
With the release of PGO v5.1, we are excited to announce enhancements in each of these areas. Not only does PGO v5.1 now automatically rollout all PostgreSQL configuration changes, but it allows you to protect your running databases against outages due to voluntary disruptions. Additionally, manual switchover and failover support means you can fully control your current cluster topology via your PostgresCluster spec.
PGO v5.1 delivers on what it means to be a production-ready PostgreSQL solution by providing the key features needed to configure, manage and protect your production PostgreSQL databases. Let’s take a look at what each of these new features looks like in action.
Rolling Database Restarts
Prior to PGO v5.1, if you changed a PostgreSQL configuration setting that required database restart, then it was necessary to manually trigger a rolling restart to apply that change. As of PGO v5.1, however, this step is no longer required. PGO will now detect and safely rollout configuration changes requiring a database restart.
To see this in action, update your PostgresCluster spec with a configuration
change that will require a database restart. For this example we will update the
shared_buffers
to 256MB
:
$ kubectl patch postgrescluster hippo --type=merge -p '{"spec":{"patroni":{"dynamicConfiguration":
{"postgresql":{"parameters":{"shared_buffers": "256MB"}}}}}}'
postgrescluster.postgres-operator.crunchydata.com/hippo patched
At this point, not only will PGO apply the desired configuration change, but it will also detect that a database restart is required. It will then restart all PostgreSQL instances in the cluster to automatically rollout the change.
Looking at the logs for any PostgreSQL instance in the cluster, you should see the following:
$ kubectl logs hippo-instance1-jl7v-0 -c database | grep shared_buffers
2022-04-03 18:23:31,347 INFO: Changed shared_buffers from 16384 to 256MB (restart might be required)
And by looking at the PostgreSQL logs you can see that the database was restarted:
$ kubectl exec hippo-instance1-jl7v-0 -c database -- bash -c 'cat /pgdata/pg13/log/*.log' | tail -n 5
...
2022-04-03 18:33:09.234 UTC [3188] LOG: database system was shut down at 2022-04-03 18:33:08 UTC
2022-04-03 18:33:09.276 UTC [3181] LOG: database system is ready to accept connections
You can also see that the setting has been applied via psql
:
$ kubectl exec hippo-instance1-jl7v-0 -c database -- psql -t -c 'show shared_buffers;'
256MB
This therefore demonstrates how PGO v5.1 makes PostgreSQL configuration easier than ever before.
Pod Disruption Budgets
Voluntary disruptions are often a reality in any Kubernetes environment. For instance, someone might drain a node containing a running PostgreSQL database. And when this occurs, it is important that all production databases be protected from an unexpected outage to the fullest extent possible.
We are therefore excited to announce support for protection against voluntary cluster disruptions in PGO v5.1. By simply defining more than once replica for an instance set or a PgBouncer deployment, PGO will now automatically create a Pod Disruption Budget (PDB) that will ensure at least one Pod for that instance set or PgBouncer deployment remains available when a voluntary disruption occurs. For instance, you can see this in action by defining two replicas for a PgBouncer deployment:
$ kubectl patch postgrescluster hippo --type='json' \
-p='[{"op":"replace","path":"/spec/pro
xy/pgBouncer/replicas","value":2}]'
postgrescluster.postgres-operator.crunchydata.com/hippo patched
$ kubectl get pdb
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
hippo-pgbouncer 1 N/A 1 4s
You can also set the minAvailable
field for any instance set or PgBouncer
deployment. This will cause PGO to reconcile a PDB that enforces that the
defined number of Pods is always available.
For example, the following will ensure at least two Pods are available for an instance set with five replicas:
spec:
instances:
- name: instance1
replicas: 3
minAvailable: 2
Once the above spec is applied, PGO will reconcile a corresponding PDB:
$ kubectl get pdb
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
hippo-set-instance1 2 N/A 1 110s
For more information about voluntary disruptions and PDBs, please see the following:
Manual Switchovers & Failovers
While manual switchovers and/or failover are not typically required during normal operations (since the HA system will automatically failover to a replica when needed), there are certain circumstances where this functionality can be beneficial. For instance, you might want to move the primary to an instance that is running on a specific node prior to performing maintenance. Or, you might simply want to test various failover scenarios in your environment. Whatever the use case might be, PGO now has you covered.
With PGO v5.1, it is now possible to manually switchover and failover for your PostgreSQL clusters, all via the PostgresCluster spec. For instance, let's say you want to switchover to an instance called hippo-instance1-t97d within your cluster. First, you would add the following to your PostgresCluster spec to define the instance that you want to switchover to:
spec:
patroni:
switchover:
enabled: true
targetInstance: hippo-instance1-t97d
Then, annotate PostgresCluster to trigger the actual switchover:
$ kubectl annotate -n postgres-operator postgrescluster hippo \
postgres-operator.crunchydata.com/trigger-switchover="$(date)"
postgrescluster.postgres-operator.crunchydata.com/hippo annotated
Once PGO detects the annotation, the switchover will occur. This means instance
hippo-instance1-t97d
will now be promoted to primary. You can verify this by
looking for the Pod in the cluster with a value of master for the
postgres-operator.crunchydata.com/role label.
$ kubectl get pods \
--selector=postgres-operator.crunchydata.com/role=master,postgres-operator.crunchydata.com/cluster=hippo
NAME READY STATUS RESTARTS AGE hippo-instance1-t97d-0 4/4
Running 0 26h
As you can see, Pod hippo-instance1-t97d-0
now has the master role, which
means the switchover was successful.
Next Steps
Thank you for taking the time to explore a few of the new features included in PGO v5.1! There is a lot more to get excited about in this release, so I encourage you to see the PGO documentation for more details. For information about upgrading to PGO v5.1, please see the Upgrade Documentation. If you are installing PGO for the first time, please be sure to check out the quickstart.