Logical Replication on Standbys in Postgres 16
Postgres 16 is hot off the press with the beta release last week. I am really excited about the new feature that allows logical replication from standbys, allowing users to:
- create logical decoding from a read-only standby
- reduce the workload on the primary server
- have new ways to achieve high-availability for applications that require data synchronization across multiple systems or for auditing purposes
A second relevant and exciting new feature coming in 16 is that replication slots on a given standby persist the promotion of that standby to a primary. This means that in the event of primary server failure and promotion of a standby to primary, the replication slots will persist and the former-standby subscribers will not be affected.
These two together give PostgreSQL a huge boost in performance for big data operations. Applications moving data around in physical locations and folks doing data warehousing, analytics, data integration, and business intelligence. I’m going to walk through an example schema and database setup and offer some sample settings and code for creating logical replication from a standby.
Background on replication
At a high level PostgreSQL supports two main types of replication - streaming/physical and logical. The Write-Ahead Log can stream the entire set of physical files through a connection and represents the entire database on disk. Logical replication offers a more fine-grained approach, where you can specify individual database objects, such as tables or even specific rows, to be replicated to a remote server. You can read more about logical replication basics in Data To Go: Postgres Logical Replication.
A standby server in PostgreSQL is created by taking a base backup of the primary
server and continuously applying all changes made on the primary. A hot standby
is a standby server that can be promoted to become the primary server.
PostgreSQL saves modifications in WAL (Write-Ahead Log) records and replicates
them from the primary server to the standby servers. The configuration parameter
hot_standby_feeedback
prevents catalog rows from being deleted too early from
the primary, if a standby needs them.
Development of logical decoding from standbys started in 2018. The process involved significant effort due to many intricate details. To not miss the point here, this has been 5 years in the making and we’re ecstatic to see it land for Postgres 16.
Example setup for replication from a standby
Let’s say we have three different PostgreSQL servers, managing inventory for a global logistics company with warehouses distributed across the globe. In the Primary we have warehouse and inventory information, the standby is a physical copy high availability machine, and a third server that is getting specific changes for reporting purposes.
Primary
In your primary instance, you need users with replication privileges. For this example I created a user for streaming the changes to the standby, and another to publish changes to the subscribers.
CREATE ROLE repuser WITH REPLICATION LOGIN ENCRYPTED PASSWORD 'somestrongpassword';
CREATE ROLE pubuser WITH REPLICATION LOGIN ENCRYPTED PASSWORD 'differentstrongpassword';
Next I create a physical slot to copy changes from the primary to the standby. Strictly speaking this is not required, but in practice it really is. Without a physical replication slot, a restart of either node or a connection loss would disrupt the replication process:
SELECT pg_create_physical_replication_slot('hot_standby_1');
We're interested only in the inventory for the Salt Lake City warehouse, whose
code is SLC
. On the primary we will create a publication
inventory_requirements
for the tables the inventory
table depends on, and
another publication named inventory_slc_pub
:
CREATE PUBLICATION inventory_requirements_pub
FOR TABLE regions, countries, warehouses, products;
CREATE PUBLICATION inventory_slc_pub
FOR TABLE inventory WHERE (warehouse = 'SLC1');
GRANT SELECT ON TABLE regions, countries, warehouses, products, inventory
TO pubuser;
Standby
At this point I can create my standby instance. We’re going to
initialize our standby with pgBackRest
(pg_basebackup
can be used as well). Once your standby data directory has been
restored, you'll need to edit its postgresql.conf
and make sure it has some
parameters
(as documented here):
# Provide feedback to the primary or upstream standby about
# queries currently executing on this standby
hot_standby_feedback = on
# Use the physical replication slot we created previously
primary_slot_name = 'hot_standby_1'
hot_standby = on
archive_mode = on
# If level is changed to below logical, slots will be dropped
wal_level = logical
# standby streams changes from the primary
primary_conn_info = 'host=127.0.0.1 port=5432 user=repuser password=somestrongpassword'
max_wal_senders = 10 # max number of walsender processes
max_replication_slots = 10 # max number of replication slots
# If an upstream standby server is promoted to become the new
# primary, downstream servers will continue to stream from
# the new primary
recovery_target_timeline = 'latest'
Connecting to this standby will confirm it's in read-only mode:
SELECT pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
At this point we have:
- A primary instance replicating through a physical slot to a working standby
with
hot_standby_feedback = on
- Publications named
inventory_requirements_pub
andinventory_slc_pub
Logical replica from standby
Now we can go to our reporting
PostgreSQL instance and subscribe to changes
from the standby. Before Postgres 16 this operation would fail.
One major advantage of logical replication is that you can subscribe to changes from PostgreSQL servers of different versions! That gives you a lot of flexibility when working with applications using different versions of PostgreSQL.
CREATE SUBSCRIPTION inventory_requirements_sub
CONNECTION 'dbname=inventory host=127.0.0.1 port=5434 user=pubuser password=differentstrongpassword'
PUBLICATION inventory_requirements_pub;
CREATE SUBSCRIPTION inventory_slc_sub
CONNECTION 'dbname=inventory host=127.0.0.1 port=5434 user=pubuser password=differentstrongpassword'
PUBLICATION inventory_slc_pub;
If your primary server is idle, this operation will hang. This is because the
standby is waiting for information from the primary. You can speed up the
creation of this information by calling the new function
pg_log_standby_snapshot()
on the primary. In this example, we'd call it twice
because we're creating two subscriptions.
SELECT pg_log_standby_snapshot();
pg_log_standby_snapshot
-------------------------
0/23000180
That allows the replica to continue, and produces a message like this, and tells us a replication slot was created on the standby. New and improved in Postgres 16!
NOTICE: created replication slot "inventory_requirements_sub" on publisher
CREATE SUBSCRIPTION
We can verify that in the pg_stat_replication
system view on the standby.
SELECT pid, application_name, state, sync_state FROM pg_stat_replication;
pid | application_name | state | sync_state
-------+----------------------------+-----------+------------
23265 | inventory_slc_sub | streaming | async
23251 | inventory_requirements_sub | streaming | async
(2 rows)
As soon as the changes are replicated to the standby, they will then be copied
downstream to the reporting server, and we'll see those changes there. Notice
that only the SLC1
records replicate over.
SELECT * FROM inventory ORDER BY product_id;
warehouse | product_id | quantity
-----------+------------+----------
SLC1 | 11 | 7
SLC1 | 13 | 13
SLC1 | 15 | 18
SLC1 | 22 | 15
SLC1 | 24 | 20
From here I can create multiple logical replicas if I need, for different locations and warehouses.
Replication slots persisting failover
As we saw previously, when we created subscriptions that connected to the standby, it created replication slots on a non-writable standby. What happens if our primary has a catastrophic failure and our standby gets promoted? Will the different warehouses stop receiving changes because the standby we were getting changes from can no longer talk to the primary?
In Postgres 16 the replication slots persist after a failover, which is very exciting! Our former standby gets promoted to become the primary, the replication slots failover are persisted, and our subscribers continue to receive changes as if nothing had happened!
SELECT pg_promote();
pg_promote
------------
t
(1 row)
SELECT pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)
UPDATE inventory SET quantity = 2
WHERE warehouse = 'SLC1' and product_id = 11;
UPDATE 1
In our warehouse reporting PostgreSQL server, subscribed to our now-primary
server, the changes to the inventory
table appeared without any other action
being taken (make sure that the tables you're logically replicating have a
REPLICA IDENTITY
set):
SELECT * FROM inventory WHERE warehouse = 'SLC1' order by product_id;
warehouse | product_id | quantity
-----------+------------+----------
SLC1 | 11 | 2
SLC1 | 13 | 13
SLC1 | 15 | 18
SLC1 | 22 | 15
SLC1 | 24 | 20
Conclusion
Did I mention I was excited?
As Postgres continues to be the database people turn to, it makes sense that supporting richer options in flowing data continue to arrive in Postgres. Logical replication from standbys continues to build on what first arrived several years ago and has gotten many enhancements since. Thanks to the amazing work of the PostgreSQL community, in PG 16 standby servers users will be able to:
- Create logical replication slots
- Launch logical decoding
- Subscribe to changes on a standby
- Persist logical replication slots after a failover
A standby server will be able to serve as source for logical replication subscribers with very few changes:
- Enable
hot_standby_feedback = on
- Use a physical replication slot to replicate from upstream to the standby
- Run
pg_log_standby_snapshot()
on the primary if a subscriber stalls waiting for changes while creating a subscription to a standby.
Documentation for this new feature is still being written and improved upon, and I'll use what I learned to submit improvements. We’re excited for you to download, test and offer feedback too.
Image credit: AI-generated image by photographer (and my son <3) Rafael Mello.