Introduction

When using PostgreSQL for the Delft-FEWS Central Database, you can create a HA database cluster using repmgr. This can be used as a protection against a database failure.

The setup consists of three PostgreSQL database servers:

primaryThe currently active database
standbyChanges on the primary node will be replicated to this node
witnessThe witness is not participating in the replication, but is used to prevent a split-brain situation

The repmgr daemon proces (repmgrd) monitors the primary node. When it is no longer reachable it will perform a fail-over, and promote the standby node to act as the primary. 

Setup

All nodes

To enable replication connections between the nodes, do the following on all nodes:

  • Create a file .pgpass in the home directory of user postgres with the following content:
    *:*:*:repmgr:secret 
  • Add a line like the following to pg_hba.conf:
    host    replication  all   0.0.0.0/0   scram-sha-256 

Primary

Take these steps to create the primary node:

  1. Install the PostgreSQL software on the server (as root):
    dnf install postgresql15-server repmgr-15*
  2. Initialize the PostgreSQL instance (as root):
    /usr/pgsql-15/bin/postgresql-15-setup initdb 
  3. Enable and start the service (as root):
    systemctl enable --now postgresql-15.service 
  4. Create the FEWS database, see Create central database schema and user - 2022.01 and later
  5. Configure PostgreSQL for use with FEWS, see PostgreSQL configuration - 2021.02 and later
  6. Configure PostgreSQL replication settings in postgresql.conf:
    max_wal_senders = 10 
    wal_level = 'replica' 
    hot_standby = on 
    archive_mode = on 
    archive_command = '/bin/true' 
    shared_preload_libraries = 'repmgr' 
  7. Setup repmgr
    1. Create the repmgr user and database as a superuser using psql (as user postgres):
      create user repmgr with superuser with password 'secret'; 
      create database repmgr with owner repmgr; 
    2. Edit the repmgr settings /etc/repmgr/15/repmgr.conf (as root):
      node_id=1 
      node_name=c-fews19691 
      conninfo='host=c-fews19691 user=repmgr dbname=repmgr connect_timeout=2' 
      data_directory='/var/lib/pgsql/15/data/' 
      failover=automatic 
      promote_command='/usr/pgsql-15/bin/repmgr standby promote -f /etc/repmgr/15/repmgr.conf --log-to-file' 
      follow_command='/usr/pgsql-15/bin/repmgr standby follow -f /etc/repmgr/15/repmgr.conf --log-to-file --upstream-node-id=%n' 
    3. Register the primary node (as user postgres):
      /usr/pgsql-15/bin/repmgr -f /etc/repmgr/15/repmgr.conf primary register 
    4. Enable and start the service (as root):
      systemctl enable --now repmgr-15.service 

Standby

Take these steps to create the standby node:

  1. Install the PostgreSQL software on the server (as root):
    dnf install postgresql15-server repmgr-15*
  2. Create the postgres data directory (/var/lib/pgsql/15/data), owned by user postgres
  3. Create the fews tablespace directories, owned by user postgres
  4. Do not initialize the PostgreSQL instance!
  5. Setup repmgr
    1. Edit the repmgr settings /etc/repmgr/15/repmgr.conf (as root):
      node_id=2 
      node_name=c-fews27556 
      conninfo='host=c-fews27556 user=repmgr dbname=repmgr connect_timeout=2' 
      data_directory='/var/lib/pgsql/15/data/' 
      failover=automatic 
      promote_command='/usr/pgsql-15/bin/repmgr standby promote -f /etc/repmgr/15/repmgr.conf --log-to-file' 
      follow_command='/usr/pgsql-15/bin/repmgr standby follow -f /etc/repmgr/15/repmgr.conf --log-to-file --upstream-node-id=%n' 
    2. Clone the primary node:
      /usr/pgsql-15/bin/repmgr -h c-fews19691 -U repmgr -d repmgr -f /etc/repmgr/15/repmgr.conf --verbose standby clone 
    3. Enable and start the postgres service:
      systemctl enable --now postgresql-15.service
    4. Register the standby node (as user postgres):
      /usr/pgsql-15/bin/repmgr -f /etc/repmgr/15/repmgr.conf standby register 
    5. Enable and start the repmgr service (as root):
      systemctl enable --now repmgr-15.service 

Witness

For the witness node, take these steps:

  1. Install the PostgreSQL software on the server (as root):
    dnf install postgresql15-server repmgr-15*
  2. Initialize the PostgreSQL instance (as root):
    /usr/pgsql-15/bin/postgresql-15-setup initdb 
  3. Enable and start the service (as root):
    systemctl enable --now postgresql-15.service 
  4. Setup repmgr
    1. Create the repmgr user and database as a superuser using psql (as user postgres):
      create user repmgr with superuser with password 'secret'; 
      create database repmgr with owner repmgr; 
    2. Edit the repmgr settings /etc/repmgr/15/repmgr.conf (as root):
      node_id=3 
      node_name=c-fews27231 
      conninfo='host=c-fews27231 user=repmgr dbname=repmgr connect_timeout=2' 
      data_directory='/var/lib/pgsql/15/data/' 
      Register the primary node (as user postgres):
      /usr/pgsql-15/bin/repmgr -f /etc/repmgr/15/repmgr.conf witness register -h c-fews19691 -F  
    3. Enable and start the service (as root):
      systemctl enable --now repmgr-15.service 

Check cluster status

The status of the cluster can be checked with this command:
/usr/pgsql-15/bin/repmgr -f /etc/repmgr/15/repmgr.conf cluster show 

The output should be something like this:

postgres@c-fews19691 ~ $ /usr/pgsql-15/bin/repmgr -f /etc/repmgr/15/repmgr.conf cluster show
 ID | Name        | Role    | Status    | Upstream    | Location | Priority | Timeline | Connection string
----+-------------+---------+-----------+-------------+----------+----------+----------+--------------------------------------------------------------
 1  | c-fews19691 | primary | * running |             | default  | 100      | 1        | host=c-fews19691 user=repmgr dbname=repmgr connect_timeout=2
 2  | c-fews27556 | standby |   running | c-fews19691 | default  | 100      | 1        | host=c-fews27556 user=repmgr dbname=repmgr connect_timeout=2
 3  | c-fews27231 | witness | * running | c-fews19691 | default  | 0        | n/a      | host=c-fews27231 user=repmgr dbname=repmgr connect_timeout=2

More information

See the following web pages for more information:

How to Automate PostgreSQL Replication and Failover with repmgr

repmgr documentation

  • No labels