Using TimescaleDB with PGO, the open source Postgres Operator
One of the many reasons "the answer is Postgres" is due to its extensibility.
The ability to extend Postgres has given rise to an ecosystem of Postgres extensions that change the behavior of the database to support a wide range of interesting capabilities. At Crunchy Data we are big fans of PostGIS, the geospatial extender for Postgres.
Another extension we are asked about often is TimescaleDB.
TimescaleDB is an open-source extension designed to make SQL scalable for time-series data. Timescale, Inc., the company behind TimescaleDB, provides an Apache 2 edition of TimescaleDB that is packaged as a Postgres extension that provides automated partitioning across time and space.
We are often asked about the potential to deploy the Apache 2 edition of TimescaleDB as an extension within our Crunchy PostgreSQL for Kubernetes using PGO, the open source Postgres Operator. We announced that we added the Apache 2 edition of TimescaleDB to PGO 4.7, and we have brought TimescaleDB into PGO v5.
Let us look at how you can deploy the TimescaleDB extension as part of an HA Postgres cluster native to Kubernetes using the PGO Postgres Operator.
Deploying TimescaleDB on Kubernetes with PGO
For brevity, we will assume that you have already deployed PGO, the Postgres Operator from Crunchy Data in your Kubernetes environment. Read more on how you can quickly get started with PGO:
https://access.crunchydata.com/documentation/postgres-operator/v5/quickstart/
Since TimescaleDB is packaged in the PostgreSQL container, you can customize your Postgres configuration to get TimescaleDB up and running. We can do this before starting up the Postgres container by creating a manifest that looks similar to this:
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: hippo
spec:
image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-13.5-0
postgresVersion: 13
instances:
- name: instance1
dataVolumeClaimSpec:
accessModes:
- 'ReadWriteOnce'
resources:
requests:
storage: 1Gi
backups:
pgbackrest:
image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.36-0
repos:
- name: repo1
volume:
volumeClaimSpec:
accessModes:
- 'ReadWriteOnce'
resources:
requests:
storage: 1Gi
patroni:
dynamicConfiguration:
postgresql:
parameters:
shared_preload_libraries: timescaledb
The above loads the TimescaleDB shared library along with the other standard
shared libraries loaded with PGO. You can apply the manifest with
kubectl apply
.
Once the cluster is initialized,
connect to the Postgres cluster
as a superuser.
You may want to create your own account that can be a superuser, or configure access to the postgres
user.
For the purposes of this exercise, we can connect to the database using
kubectl exec
:
kubectl exec -it -n postgres-operator -c database \
$(kubectl get pods -n postgres-operator --selector='postgres-operator.crunchydata.com/cluster=hippo,postgres-operator.crunchydata.com/role=master' -o name) \
-- psql
First, confirm that the TimescaleDB library was loaded:
SHOW shared_preload_libraries ;
should yield something similar to:
shared_preload_libraries
-----------------------------
pgaudit,pgaudit,timescaledb
This setup has the added benefit that the TimescaleDB library will be loaded on any replicas added to this Postgres cluster.
Now, add the TimescaleDB extension to the database:
CREATE EXTENSION timescaledb;
To test that TimescaleDB is successfully installed, go ahead and create a "hypertable" and start inserting data:
CREATE TABLE hippos (
observed_at timestamptz NOT NULL,
total int NOT NULL
);
SELECT create_hypertable('hippos', 'observed_at');
INSERT INTO hippos
SELECT ts, (random() * 100)::int
FROM generate_series(CURRENT_TIMESTAMP - '1 year'::interval, CURRENT_TIMESTAMP, '1 minute'::interval) ts;
This is just a small data set so you can observe that the TimescaleDB set up
works. Try using a smaller increment (e.g. '5 seconds'::interval
) to generate
a larger data set!
Next Steps
At Crunchy Data, we want to provide users with the benefits of trusted open source Postgres, including the powerful ecosystem of Postgres extensions. TimescaleDB provides interesting functionality for managing time series data in Postgres. By enabling the easy deployment of TimeScaleDB through PGO, you can easily deploy TimescaleDB to manage your time series data in Postgres natively in Kubernetes.
(Interested in seeing PGO in action? Join us for a free webinar on Wednesday, Dec 15th. There is also a paid, in-depth training available on November 29th.)