HAProxy backend postgres_back has no server available

Hi Everyone,

For context : We’re trying to integrate PostgreSQL with Patroni & HAProxy along with zookeeper & keepalived.

HAProxy version 2.6.0-a1efc04 2022/05/31
psql (PostgreSQL) 16.4
patroni 4.0.2

PostgreSQL, HAProxy, Patroni, Keepalived, zookeeper installed on two servers (10.0.0.1, 10.0.0.2) and zookeeper is installed on one additional server (10.0.0.3) to maintain quorum.

Below are the configuration files of patroni and haproxy along with other information

patronictl -c /etc/patroni.yml list

  • Cluster: postgres (7424658285489500758) -----±—±----------+
    | Member | Host | Role | State | TL | Lag in MB |
    ±-------±--------------±--------±----------±—±----------+
    | node1 | 10.0.0.1| Leader | running | 12 | |
    | node2 | 10.0.0.2 | Replica | streaming | 12 | 0 |
    ±-------±--------------±--------±----------±—±----------+

Curl works on both patroni nodes on 8008 port.

curl http://10.0.0.1:8008/patroni
{“state”: “running”, “postmaster_start_time”: “2024-10-15 06:59:13.591938-07:00”, “role”: “replica”, “server_version”: 160004, “xlog”: {“received_location”: 184549376, “replayed_location”: 184550096, “replayed_timestamp”: null, “paused”: false}, “timeline”: 12, “replication_state”: “streaming”, “dcs_last_seen”: 1729005724, “database_system_identifier”: “7424658285489500758”, “patroni”: {“version”: “4.0.2”, “scope”: “postgres”, “name”: “node2”}}

haproxy.cfg file

global
log /dev/log local0
log /dev/log local1 notice
maxconn 2000
user haproxy
group haproxy
daemon

defaults
log global
option tcplog
timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms

frontend postgres_front
bind *:5433
default_backend postgres_back

backend postgres_back
option httpchk
http-check send meth GET uri /patroni
http-check expect status 200

server-template pg 2 10.0.0.15432 check inter 2000 rise 2 fall 3
server node1 10.0.0.1:5432 check
server node2 10.0.0.2:5432 check

option http-server-close
option httpchk GET /patroni

patroni.yml file from one postgresql server, a bit different on the second node.

scope: postgres
namespace: /db/
name: node1 # Unique name for the primary node

restapi:
listen: 10.0.0.1:8008 # Node IP address
connect_address: 10.0.0.1:8008

zookeeper:
hosts: 10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181 # Zookeeper nodes

bootstrap:
dcs:
ttl: 30
loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
use_pg_rewind: true

postgresql:
user: postgres
listen: ‘*’ # Listen on all interfaces
connect_address: 10.0.0.1:5432 # Primary server IP
data_dir: /casdb/postgres/data/ # Data directory
bin_dir: /usr/pgsql-16/bin # PostgreSQL binary directory
authentication:
replication:
username: repuser
password: xxxxxxx
superuser:
username: postgres
password: xxxxxxxx
parameters: # PostgreSQL parameters can be configured here
max_connections: 100
shared_buffers: 256MB
effective_cache_size: 768MB
maintenance_work_mem: 64MB
wal_level: replica
archive_mode: on
archive_command: ‘test ! -f /casdb/postgres/walarch/%f && cp %p /casdb/postgres/walarch/%f’ # Updated archive command
synchronous_commit: on
hot_standby: on
primary_conninfo: ‘’ # Not needed for primary
restore_command: ‘’ # Not needed for primary

So far zookeeper, postgresql, patroni running fine and when I start the HAProxy service seeing the below error.

[root@testserver1~]# haproxy -f /etc/haproxy/haproxy.cfg -d
Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result FAILED
Total: 3 (2 usable), will use epoll.

Available filters :
[CACHE] cache
[COMP] compression
[FCGI] fcgi-app
[SPOE] spoe
[TRACE] trace
Using epoll() as the polling mechanism.
[WARNING] (136627) : Server postgres_back/pg1 is DOWN, reason: Layer7 invalid response, check duration: 2ms. 3 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] (136627) : Server postgres_back/pg2 is DOWN, reason: Layer7 invalid response, check duration: 2ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] (136627) : Server postgres_back/node1 is DOWN, reason: Layer7 invalid response, check duration: 2ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] (136627) : Server postgres_back/node2 is DOWN, reason: Layer7 invalid response, check duration: 4ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] (136627) : backend ‘postgres_back’ has no server available!

Broadcast message from systemd-journald@testserver1(Tue 2024-10-15 07:28:34 PDT):

haproxy[136627]: backend postgres_back has no server available!

Broadcast message from systemd-journald@testserver1 (Tue 2024-10-15 07:28:34 PDT):

haproxy[136627]: backend postgres_back has no server available!

Message from syslogd@testserver1 Oct 15 07:28:34 …
haproxy[136627]:backend postgres_back has no server available!

Message from syslogd@testserver1 at Oct 15 07:28:34 …
haproxy[136627]:backend postgres_back has no server available!

Could someone share thoughts in case if I’m missing any thing here.

Your server lines are missing the port for the check. You are using option httpchk but with a missing “check port 8008” on those lines, the check will be sent to postgres, not to patroni.