1.8.4-1deb90d all max counters now at high numbers


#1

I upgraded from 1.7.9 to 1.8.4 and now find that the stats show max counters at unbelievably high numbers.
Even after a ‘clear counters all’, they quickly climb back to similarly high figures.

E.g. ‘Max Session Rate’ and ‘Max Sessions’ for frontends and backends are in the millions, which cannot be true. In 1.7.9 they were normally in the hundreds.

Could this be a bug?


#2

Can you try 1.8.5 (just released)? It has the following fix, which I believe is related:

commit f4bae5e29b (BUG/MINOR: listener: Don’t decrease actconn twice when a new session is rejected)


#3

1.8.5 didn’t fix the weirdly high max connection rate numbers compared to 1.7.9 :confused:


#4

@capflam any idea about this?


#5

I’ll take a look. For now, no idea.


#6

Just to be sure, did you start HAProxy with several threads? Could you also provide the output of the “show info” and “show stat” CLI commands?

I’m a bit puzzled for now because “Max sessions” (smax) and “Max sessions rate” (rate_max) depend on different counters. You also mentioned “millions” vs “hundreds”, so this is not an overflow (and, by the way, rates are never decremented). So, for now, I have no clue.

Finally, could you provide the output of the “haproxy -vv” command?


#7

Yes using 6 processes, info from proc 1:

Name: HAProxy
Version: 1.8.5
Release_date: 2018/03/23
Nbthread: 1
Nbproc: 6
Process_num: 1
Pid: 27284
Uptime: 6d 0h25m29s
Uptime_sec: 519929
Memmax_MB: 0
PoolAlloc_MB: 9
PoolUsed_MB: 5
PoolFailed: 0
Ulimit-n: 327894
Maxsock: 327894
Maxconn: 163840
Hard_maxconn: 163840
CurrConns: 533
CumConns: 14114321
CumReq: 34066026
MaxSslConns: 0
CurrSslConns: 522
CumSslConns: 13894178
Maxpipes: 0
PipesUsed: 0
PipesFree: 0
ConnRate: 85
ConnRateLimit: 0
MaxConnRate: 33554432
SessRate: 85
SessRateLimit: 0
MaxSessRate: 32764
SslRate: 83
SslRateLimit: 0
MaxSslRate: 8048
SslFrontendKeyRate: 25
SslFrontendMaxKeyRate: 59
SslFrontendSessionReuse_pct: 70
SslBackendKeyRate: 0
SslBackendMaxKeyRate: 0
SslCacheLookups: 11403938
SslCacheMisses: 1671904
CompressBpsIn: 0
CompressBpsOut: 0
CompressBpsRateLim: 0
ZlibMemUsage: 0
MaxZlibMemUsage: 0
Tasks: 686
Run_queue: 1
Idle_pct: 84
node: hapA
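
One way to see that these figures cannot be real: a peak per-second rate should never exceed the cumulative count it is measured from. Below is a minimal Python sketch of that cross-check, using four of the fields from the “show info” output above. The helper names (`parse_show_info`, `implausible_peaks`) and the pairing of fields are my own assumptions for illustration, not part of HAProxy, and the check is only a heuristic.

```python
# Four fields copied from the "show info" output above.
SAMPLE = """\
CumConns: 14114321
CumSslConns: 13894178
MaxConnRate: 33554432
MaxSslRate: 8048
"""


def parse_show_info(text):
    """Turn 'Name: value' lines into a dict of strings."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def implausible_peaks(info):
    """Flag peak rates that exceed the cumulative counter they relate to.

    The (peak, total) pairing is an assumption made for this sketch.
    """
    checks = [("MaxConnRate", "CumConns"), ("MaxSslRate", "CumSslConns")]
    flagged = []
    for peak_key, total_key in checks:
        if peak_key in info and total_key in info:
            peak, total = int(info[peak_key]), int(info[total_key])
            if peak > total:
                flagged.append((peak_key, peak, total_key, total))
    return flagged


if __name__ == "__main__":
    for peak_key, peak, total_key, total in implausible_peaks(parse_show_info(SAMPLE)):
        print(f"{peak_key}={peak} exceeds {total_key}={total}")
```

With the values above, only MaxConnRate is flagged: it claims more connections in one second than the process has accepted in six days of uptime.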

Stat from proc 1:

    pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq,dresp,ereq,econ,eresp,wretr,wredis,status,weight,act,bck,chkfail,chkdown,lastchg,downtime,qlimit,pid,iid,sid,throttle,lbtot,tracked,type,rate,rate_lim,rate_max,check_status,check_code,check_duration,hrsp_1xx,hrsp_2xx,hrsp_3xx,hrsp_4xx,hrsp_5xx,hrsp_other,hanafail,req_rate,req_rate_max,req_tot,cli_abrt,srv_abrt,comp_in,comp_out,comp_byp,comp_rsp,lastsess,last_chk,last_agt,qtime,ctime,rtime,ttime,agent_status,agent_code,agent_duration,check_desc,agent_desc,check_rise,check_fall,check_health,agent_rise,agent_fall,agent_health,addr,cookie,mode,algo,conn_rate,conn_rate_max,conn_tot,intercepted,dcon,dses,
    stats1,FRONTEND,,,1,19259712,32,8,3285,185414,0,0,0,,,,,OPEN,,,,,,,,,1,2,0,,,,0,0,0,3284456480,,,,0,5,0,3,0,0,,0,4225440,8,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,http,,0,21390792,8,5,0,0,
    stats1,BACKEND,0,0,0,0,4,0,3285,185414,0,0,,0,0,0,0,UP,0,0,0,,0,520076,,,1,2,0,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,0,0,0,0,0,0,0,7,,,0,0,0,0,,,,,,,,,,,,,,http,roundrobin,,,,,,,
    fe-web,FRONTEND,,,492,18595296,32768,639587,3642563621,37798592196,10671,0,185340,,,,,OPEN,,,,,,,,,1,27,0,,,,0,69,500,3284456480,,,,0,819220,164857,197397,106,853,,145,4225440,1182473,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,http,,75,21080440,658791,0,0,0,
    be-web,fep1,0,0,3,5245653,,92530,306966633,3676023375,,0,,0,1,0,0,UP,1,1,0,0,0,520076,0,,1,28,1,,17221,,2,14,,12614064,L4OK,,0,0,77106,15179,186,2,0,,,,,457,1,,,,,0,,,0,0,213,1042,,,,Layer4 check passed,,2,3,4,,,,62.243.41.97:80,,http,,,,,,,,
    be-web,fep2,0,0,4,5245653,,96043,329539229,3732712007,,0,,0,2,0,0,UP,1,1,0,0,0,520076,0,,1,28,2,,17428,,2,5,,12649824,L4OK,,0,0,80175,15658,151,7,0,,,,,465,1,,,,,0,,,0,1,195,936,,,,Layer4 check passed,,2,3,4,,,,62.243.41.98:80,,http,,,,,,,,
    be-web,fep3,0,0,5,5245653,,100665,375063727,3862172891,,0,,0,8,0,0,UP,1,1,0,0,0,520076,0,,1,28,3,,18372,,2,7,,12685584,L4OK,,0,0,82530,17886,160,12,0,,,,,506,0,,,,,0,,,0,0,167,939,,,,Layer4 check passed,,2,3,4,,,,62.243.41.99:80,,http,,,,,,,,
    be-web,fep4,0,0,5,5245653,,101679,365195999,4063682665,,0,,0,4,0,0,UP,1,1,0,0,0,520076,0,,1,28,4,,18546,,2,5,,12721344,L4OK,,0,0,84205,17278,132,4,0,,,,,526,0,,,,,0,,,0,1,165,916,,,,Layer4 check passed,,2,3,4,,,,62.243.41.100:80,,http,,,,,,,,
    be-web,fep5,0,0,5,5245653,,95127,318293895,3676708755,,0,,0,3,0,0,UP,1,1,0,0,0,520076,0,,1,28,5,,17925,,2,14,,12757104,L4OK,,0,0,80222,14682,162,0,0,,,,,502,0,,,,,0,,,0,1,171,887,,,,Layer4 check passed,,2,3,4,,,,62.243.41.101:80,,http,,,,,,,,
    be-web,fep6,0,0,6,5245653,,99698,370914269,3791911151,,0,,0,1,0,0,UP,1,1,0,0,0,520076,0,,1,28,6,,18247,,2,10,,12792864,L4OK,,0,0,83065,16369,153,6,0,,,,,522,4,,,,,0,,,0,0,186,1029,,,,Layer4 check passed,,2,3,4,,,,62.243.41.102:80,,http,,,,,,,,
    be-web,fep7,0,0,6,5245653,,99507,365312472,3715952798,,0,,0,2,0,0,UP,1,1,0,0,0,520076,0,,1,28,7,,19102,,2,18,,12828624,L4OK,,0,0,81597,17664,164,5,0,,,,,476,0,,,,,0,,,0,1,167,903,,,,Layer4 check passed,,2,3,4,,,,62.243.41.103:80,,http,,,,,,,,
    be-web,fep8,0,0,5,5245653,,99349,422091772,3605632900,,0,,0,2,0,0,UP,1,1,0,0,0,520076,0,,1,28,8,,19150,,2,9,,12864384,L4OK,,0,0,82680,16457,136,1,0,,,,,511,5,,,,,0,,,0,1,201,990,,,,Layer4 check passed,,2,3,4,,,,62.243.41.104:80,,http,,,,,,,,
    be-web,fep9,0,0,7,5245653,,98693,399218589,3827516565,,0,,0,6,0,0,UP,1,1,0,0,0,520076,0,,1,28,9,,19447,,2,8,,12900144,L4OK,,0,0,82932,15524,150,2,0,,,,,504,0,,,,,0,,,0,0,192,1017,,,,Layer4 check passed,,2,3,4,,,,62.243.41.105:80,,http,,,,,,,,
    be-web,fep10,0,0,3,5245653,,103150,378328402,3805450664,,0,,0,4,0,0,UP,1,1,0,0,0,520076,0,,1,28,10,,19190,,2,21,,12935904,L4OK,,0,0,84733,18160,155,4,0,,,,,536,2,,,,,0,,,0,0,157,926,,,,Layer4 check passed,,2,3,4,,,,62.243.41.106:80,,http,,,,,,,,
    be-web,BACKEND,0,0,49,20705376,3277,986625,3631212473,37757763771,0,0,,0,33,0,0,UP,10,10,0,,0,520076,0,,1,28,0,,184628,,1,116,,20705392,,,,0,819220,164857,1549,106,853,,,,986585,5175,13,0,0,0,0,0,,,0,0,168,970,,,,,,,,,,,,,,http,leastconn,,,,,,,
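
The same kind of cross-check can be applied to the “show stat” CSV: neither smax nor rate_max should be larger than stot for the same row. The following is a rough Python sketch, assuming a trimmed-down copy of the CSV above (only a few columns and rows are kept, with values copied from the output). The function name `implausible_rows` is hypothetical, and the check is a heuristic that assumes the max counters and stot were not reset at different times.

```python
import csv
import io

# Trimmed copy of the "show stat" output above: selected columns and rows only.
SAMPLE = """\
pxname,svname,scur,smax,slim,stot,rate,rate_lim,rate_max
stats1,FRONTEND,1,19259712,32,8,0,0,3284456480
stats1,BACKEND,0,0,4,0,0,,0
fe-web,FRONTEND,492,18595296,32768,639587,69,500,3284456480
be-web,fep1,3,5245653,,92530,14,,12614064
"""


def implausible_rows(csv_text):
    """Return (pxname, svname, field, peak, stot) for every peak above stot."""
    lines = [ln.strip() for ln in csv_text.splitlines() if ln.strip()]
    if not lines:
        return []
    # Real "show stat" output prefixes the header with "# "; drop it if present.
    lines[0] = lines[0].lstrip("# ")
    flagged = []
    for row in csv.DictReader(io.StringIO("\n".join(lines))):
        stot = int(row["stot"] or 0)
        for field in ("smax", "rate_max"):
            peak = int(row[field] or 0)  # empty CSV fields count as 0
            if peak > stot:
                flagged.append((row["pxname"], row["svname"], field, peak, stot))
    return flagged


if __name__ == "__main__":
    for pxname, svname, field, peak, stot in implausible_rows(SAMPLE):
        print(f"{pxname}/{svname}: {field}={peak} but stot={stot}")
```

Every row except stats1/BACKEND is flagged here, for both smax and rate_max, which matches the impression that all max counters are affected.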

haproxy version output:

# haproxy -vv
HA-Proxy version 1.8.5 2018/03/23
Copyright 2000-2018 Willy Tarreau <willy@haproxy.org>

Build options :
  TARGET  = linux2628
  CPU     = x86-64
  CC      = gcc
  CFLAGS  = -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -fno-strict-overflow -Wno-unused-label
  OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with network namespace support.
Built with zlib version : 1.2.3
Running on zlib version : 1.2.3
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with PCRE version : 7.8 2008-09-05
Running on PCRE version : 7.8 2008-09-05
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with multi-threading support.
Encrypted password support via crypt(3): yes
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with OpenSSL version : OpenSSL 1.0.2n  7 Dec 2017
Running on OpenSSL version : OpenSSL 1.0.2n  7 Dec 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
        [TRACE] trace
        [COMP] compression
        [SPOE] spoe

#8

You talked about processes, so I presume you’re starting several processes and not threads (nbproc instead of nbthread parameter in the global section). If I’m right, this is not a thread-safety issue. So I need more info…

Could you share your configuration? What is your system date? And more generally, what is your system (OS, hardware, VM…)?

I have no idea for now, but I suspect a problem in update_freq_ctr or tv_update_date.


#9

Correct, not threads but nbproc > 1. All stats looked realistic prior to 1.8. Now Max rate as well as Max sessions are unrealistically high. E.g. Max sessions for every real server in the web backend below is now the same value, 5245653, and the same number is seen for real servers in the other backends. So it seems to be a bug, IMHO.

Not sure what you mean by ‘system date’ (time is of course synced by NTP).
The systems are Linux KVM guests running CentOS 6.9 with 4 vCPUs, 4 GB.

Config snippet:

global
   maxconn 163840
   ca-base /etc/haproxy/ssl
   crt-base /etc/haproxy/ssl
   group haproxy
   user root
   chroot /opt/haproxy-jail
   daemon
   pidfile /var/run/haproxy.pid
   log localhost local0 info

   nbproc 6
   stats socket /var/lib/haproxy/stats.web user haproxy group haproxy mode 660 level admin process 1
   stats socket /var/lib/haproxy/stats.pop user haproxy group haproxy mode 660 level admin process 2
   stats socket /var/lib/haproxy/stats.imap user haproxy group haproxy mode 660 level admin process 3
   stats socket /var/lib/haproxy/stats.dmta user haproxy group haproxy mode 660 level admin process 4
   stats socket /var/lib/haproxy/stats.momi user haproxy group haproxy mode 660 level admin process 5
   stats socket /var/lib/haproxy/stats.momxdiv user haproxy group haproxy mode 660 level admin process 6

   tune.ssl.default-dh-param 2048      # set DH params to 2048 bits
   ssl-default-bind-ciphers    EECDH+AES:EDH+AES:-SHA1:EECDH+AES256:EDH+AES256:AES256-SHA:!aNULL:!eNULL:!EXP:!LOW:!MD5:!RC4
   ssl-default-server-ciphers  EECDH+AES:EDH+AES:-SHA1:EECDH+AES256:EDH+AES256:AES256-SHA:!aNULL:!eNULL:!EXP:!LOW:!MD5:!RC4
   ssl-default-bind-options    no-sslv3 no-tls-tickets
   ssl-server-verify none

   spread-checks 3

mailers sysadminmailer
   mailer relay "${SMTPRELAY}"

peers happeers
   peer hapA ${PEER1}:2020
   peer hapB ${PEER2}:2020

defaults
   maxconn 32768
   email-alert mailers sysadminmailer
   email-alert from "${SYSADMINEMAIL}"
   email-alert to "${SYSADMINEMAIL}"
   mode http
   rate-limit sessions 500
   option  dontlog-normal
   option  persist
   option  redispatch
   option  contstats
   retries 3 
   timeout connect 10s
   timeout client 30s
   timeout server 30s
   timeout http-keep-alive 5s
   timeout http-request 5s
   timeout queue 30s
   timeout tarpit 1m
   timeout check 60s
   backlog 10000
   source 0.0.0.0 usesrc clientip
   balance leastconn

frontend fe-web
   bind-process 1
   bind ipv4@${VIP1}:80 transparent mss 1460
   #bind ipv4@${VIP1}:443 ssl crt ${CERT1} mss 1460 alpn h2,http/1.1
   bind ipv4@${VIP1}:443 ssl crt ${CERT1} mss 1460
   bind ipv4@${VIP2}:80 transparent mss 1460
   #bind ipv4@${VIP2}:443 ssl crt ${CERT2} mss 1460 alpn h2,http/1.1
   bind ipv4@${VIP2}:443 ssl crt ${CERT2} mss 1460

   # deny various often-hit unknown URLs here instead of in the backend servers
   acl denyurls path_beg,url_dec -i /Microsoft-Server-ActiveSync
   acl denyurls path_reg  apple-touch-icon.+png$
   acl denyurls path_beg,url_dec -i /favicon.ico
   http-request deny if denyurls
   #
   acl sslport dst_port 443
   reqadd SSLENCRYPTED:\ true if sslport
   reqadd SSLENCRYPTED:\ false if !sslport
   # set HSTS response header on sslport requests
   http-response set-header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload;" if sslport
   #
   default_backend be-web

backend be-web
   bind-process 1
   stick-table type string len 64 size 100k expire 60m peers happeers
   stick store-response res.cook(token)
   stick match req.cook(token)
   default-server inter 30s downinter 60s rise 2
   server fep1 ipv4@${FEP1}:80 check
   server fep2 ipv4@${FEP2}:80 check
   server fep3 ipv4@${FEP3}:80 check
   server fep4 ipv4@${FEP4}:80 check
   server fep5 ipv4@${FEP5}:80 check
   server fep6 ipv4@${FEP6}:80 check
   server fep7 ipv4@${FEP7}:80 check
   server fep8 ipv4@${FEP8}:80 check
   server fep9 ipv4@${FEP9}:80 check
   server fep10 ipv4@${FEP10}:80 check

#10

Hi @stefws ,

Sorry for the delay. I think I finally found the bug. I have a little less hair now but it should be good. It was a problem with macro expansion in hathreads.h with gcc < 4.7.

Could you check the following patch, please? It must be applied on haproxy-1.8.


#11

@capflam thanks, the stats look so much better with the patch applied to 1.8.5 :slight_smile:


#12

Cool! Thanks for your feedback.