Hi Everyone,
For context : We’re trying to integrate PostgreSQL with Patroni & HAProxy along with zookeeper & keepalived.
HAProxy version 2.6.0-a1efc04 2022/05/31
psql (PostgreSQL) 16.4
patroni 4.0.2
PostgreSQL, HAProxy, Patroni, Keepalived, zookeeper installed on two servers (10.0.0.1, 10.0.0.2) and zookeeper is installed on one additional server (10.0.0.3) to maintain quorum.
Below are the configuration files of patroni and haproxy along with other information
patronictl -c /etc/patroni.yml list
- Cluster: postgres (7424658285489500758) -----±—±----------+
| Member | Host | Role | State | TL | Lag in MB |
±-------±--------------±--------±----------±—±----------+
| node1 | 10.0.0.1| Leader | running | 12 | |
| node2 | 10.0.0.2 | Replica | streaming | 12 | 0 |
±-------±--------------±--------±----------±—±----------+
Curl works on both patroni nodes on 8008 port.
curl http://10.0.0.1:8008/patroni
{“state”: “running”, “postmaster_start_time”: “2024-10-15 06:59:13.591938-07:00”, “role”: “replica”, “server_version”: 160004, “xlog”: {“received_location”: 184549376, “replayed_location”: 184550096, “replayed_timestamp”: null, “paused”: false}, “timeline”: 12, “replication_state”: “streaming”, “dcs_last_seen”: 1729005724, “database_system_identifier”: “7424658285489500758”, “patroni”: {“version”: “4.0.2”, “scope”: “postgres”, “name”: “node2”}}
haproxy.cfg file
global
log /dev/log local0
log /dev/log local1 notice
maxconn 2000
user haproxy
group haproxy
daemon
defaults
log global
option tcplog
timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms
frontend postgres_front
bind *:5433
default_backend postgres_back
backend postgres_back
option httpchk
http-check send meth GET uri /patroni
http-check expect status 200
server-template pg 2 10.0.0.15432 check inter 2000 rise 2 fall 3
server node1 10.0.0.1:5432 check
server node2 10.0.0.2:5432 check
option http-server-close
option httpchk GET /patroni
patroni.yml file from one postgresql server, a bit different on the second node.
scope: postgres
namespace: /db/
name: node1 # Unique name for the primary node
restapi:
listen: 10.0.0.1:8008 # Node IP address
connect_address: 10.0.0.1:8008
zookeeper:
hosts: 10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181 # Zookeeper nodes
bootstrap:
dcs:
ttl: 30
loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
use_pg_rewind: true
postgresql:
user: postgres
listen: ‘*’ # Listen on all interfaces
connect_address: 10.0.0.1:5432 # Primary server IP
data_dir: /casdb/postgres/data/ # Data directory
bin_dir: /usr/pgsql-16/bin # PostgreSQL binary directory
authentication:
replication:
username: repuser
password: xxxxxxx
superuser:
username: postgres
password: xxxxxxxx
parameters: # PostgreSQL parameters can be configured here
max_connections: 100
shared_buffers: 256MB
effective_cache_size: 768MB
maintenance_work_mem: 64MB
wal_level: replica
archive_mode: on
archive_command: ‘test ! -f /casdb/postgres/walarch/%f && cp %p /casdb/postgres/walarch/%f’ # Updated archive command
synchronous_commit: on
hot_standby: on
primary_conninfo: ‘’ # Not needed for primary
restore_command: ‘’ # Not needed for primary
So far zookeeper, postgresql, patroni running fine and when I start the HAProxy service seeing the below error.
[root@testserver1~]# haproxy -f /etc/haproxy/haproxy.cfg -d
Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result FAILED
Total: 3 (2 usable), will use epoll.
Available filters :
[CACHE] cache
[COMP] compression
[FCGI] fcgi-app
[SPOE] spoe
[TRACE] trace
Using epoll() as the polling mechanism.
[WARNING] (136627) : Server postgres_back/pg1 is DOWN, reason: Layer7 invalid response, check duration: 2ms. 3 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] (136627) : Server postgres_back/pg2 is DOWN, reason: Layer7 invalid response, check duration: 2ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] (136627) : Server postgres_back/node1 is DOWN, reason: Layer7 invalid response, check duration: 2ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] (136627) : Server postgres_back/node2 is DOWN, reason: Layer7 invalid response, check duration: 4ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] (136627) : backend ‘postgres_back’ has no server available!
Broadcast message from systemd-journald@testserver1(Tue 2024-10-15 07:28:34 PDT):
haproxy[136627]: backend postgres_back has no server available!
Broadcast message from systemd-journald@testserver1 (Tue 2024-10-15 07:28:34 PDT):
haproxy[136627]: backend postgres_back has no server available!
Message from syslogd@testserver1 Oct 15 07:28:34 …
haproxy[136627]:backend postgres_back has no server available!
Message from syslogd@testserver1 at Oct 15 07:28:34 …
haproxy[136627]:backend postgres_back has no server available!
Could someone share thoughts in case if I’m missing any thing here.