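I am trying to use haproxy in TCP mode in front of a mongo server. On the haproxy machine I have a mongo client for testing.
Connecting directly from the haproxy machine to the mongo server works 100% of the time.
When I connect from the haproxy machine to the mongo server through haproxy, about 25% of the time it fails to negotiate a proper mongo connection. The mongo client says: recv(): message len 1347703880 is too large. Max is 48000000
This does not appear to be a problem with the mongo client or the mongo server, since the direct connection works 100% of the time.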
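One way to read the odd length in that error: every MongoDB wire protocol message starts with a little-endian int32 `messageLength`, so the number the shell prints is just the first four bytes it received. The sketch below re-encodes the reported value to show which bytes those would have been. The helper name `length_as_bytes` is made up for illustration, and the idea that the shell logged the raw prefix unchanged is an assumption, not something the logs confirm.

```python
import struct


def length_as_bytes(message_len: int) -> bytes:
    """Re-encode a reported message length as the 4 little-endian bytes
    that would have produced it (hypothetical helper, illustration only)."""
    return struct.pack("<i", message_len)


if __name__ == "__main__":
    # Value taken from the shell error in the failing run.
    print(length_as_bytes(1347703880))  # prints b'HTTP'
```

If that reading is right, the failing connections received something that begins with the ASCII text `HTTP` rather than a mongo reply, which would point at an HTTP responder somewhere on the path. That is an interpretation of the number only; it does not by itself say which component sent those bytes.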
Servers in the scenario:
    10.5.198.10   haproxy and mongo client for testing
    10.5.20.20    mongo server running port 17010
Version info: haproxy machine and mongo client
    OS: Debian Jessie SMP Debian 3.16.7-ckt20-1+deb8u3 (2016-01-17) x86_64 GNU/Linux

    bluebrick@ip-10-5-198-10:~$ mongo --version
    MongoDB shell version: 2.4.10

    root@ip-10-5-198-10:~/tests/pmongo# haproxy -vv
    HA-Proxy version 1.6.3 2015/12/25
    Copyright 2000-2015 Willy Tarreau <[email protected]>

    Build options :
      TARGET  = linux2628
      CPU     = generic
      CC      = gcc
      CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
      OPTIONS =

    Default settings :
      maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

    Encrypted password support via crypt(3): yes
    Built without compression support (neither USE_ZLIB nor USE_SLZ are set)
    Compression algorithms supported : identity("identity")
    Built without OpenSSL support (USE_OPENSSL not set)
    Built without PCRE support (using libc's regex instead)
    Built without Lua support
    Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND

    Available polling systems :
          epoll : pref=300,  test result OK
           poll : pref=200,  test result OK
         select : pref=150,  test result OK
    Total: 3 (3 usable), will use epoll.
Version info: mongo server
    Server OS: Ubuntu trusty 14.04.1-Ubuntu SMP Tue Sep 1 09:32:55 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

    [email protected]:/home/bluebrick# mongod --version
    db version v3.2.1
    git version: a14d55980c2cdc565d4704a7e3ad37e4e535c1b2
    OpenSSL version: OpenSSL 1.0.1f 6 Jan 2014
    allocator: tcmalloc
    modules: none
    build environment:
        distmod: ubuntu1404
        distarch: x86_64
        target_arch: x86_64
haproxy conf file
    root@ip-10-5-198-10:~/tests/pmongo# cat conf.10
    ######################################################################
    global
    ######################################################################
        maxconn 2048
        log /dev/log local0
        log /dev/log local1 debug
        chroot /var/lib/haproxy
        user haproxy
        group haproxy
        debug

    ######################################################################
    defaults
    ######################################################################
        log global
        mode tcp
        option tcplog
        timeout connect 5000
        timeout client 50000
        timeout server 50000

    ######################################################################
    # frontent
    ######################################################################
    frontend fe_20_20_mongo_27010_tcp
        bind 10.5.198.10:27010
        mode tcp
        option tcplog
        use_backend be_20_20_mongo_27010_tcp

    ######################################################################
    # backend
    ######################################################################
    backend be_20_20_mongo_27010_tcp
        mode tcp
        option tcplog
        option tcpka
        server node1 10.5.20.20:27010

    ##################################################
    ##################################################
When I connect to mongo directly, bypassing haproxy, it looks like this:
    bluebrick@ip-10-5-198-10:~$ mongo 10.5.20.20:27010 -verbose
    MongoDB shell version: 2.4.10
    Sat Feb 27 13:12:46.776 versionArrayTest passed
    connecting to: 10.5.20.20:27010/test
    Sat Feb 27 13:12:46.798 creating new connection to:10.5.20.20:27010
    Sat Feb 27 13:12:46.799 BackgroundJob starting: ConnectBG
    Sat Feb 27 13:12:46.803 connected connection!
    Server has startup warnings:
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten]
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten]
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten] ** We suggest setting it to 'never'
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten]
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten] ** We suggest setting it to 'never'
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten]
    rs0:SECONDARY> quit()
    bluebrick@ip-10-5-198-10:~$
When I run the haproxy server, it looks like this:
    root@ip-10-5-198-10:~/tests/pmongo# haproxy -d -f conf.10
    Available polling systems :
          epoll : pref=300,  test result OK
           poll : pref=200,  test result OK
         select : pref=150,  test result FAILED
    Total: 3 (2 usable), will use epoll.
    Using epoll() as the polling mechanism.
    00000000:fe_20_20_mongo_27010_tcp.accept(0004)=0006 from [10.5.198.10:43177]
    00000000:be_20_20_mongo_27010_tcp.srvcls[0006:0007]
    00000000:be_20_20_mongo_27010_tcp.clicls[0006:0007]
    00000000:be_20_20_mongo_27010_tcp.closed[0006:0007]
    00000001:fe_20_20_mongo_27010_tcp.accept(0004)=0006 from [10.5.198.10:43206]
    00000001:be_20_20_mongo_27010_tcp.srvcls[0006:0007]
    00000001:be_20_20_mongo_27010_tcp.clicls[0006:0007]
    00000001:be_20_20_mongo_27010_tcp.closed[0006:0007]
When I connect to mongo through haproxy and it works, it looks like this:
    bluebrick@ip-10-5-198-10:~$ mongo 10.5.198.10:27010 -verbose
    MongoDB shell version: 2.4.10
    Sat Feb 27 13:04:00.655 versionArrayTest passed
    connecting to: 10.5.198.10:27010/test
    Sat Feb 27 13:04:00.678 creating new connection to:10.5.198.10:27010
    Sat Feb 27 13:04:00.678 BackgroundJob starting: ConnectBG
    Sat Feb 27 13:04:00.678 connected connection!
    Server has startup warnings:
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten]
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten]
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten] ** We suggest setting it to 'never'
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten]
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten] ** We suggest setting it to 'never'
    2016-02-27T12:48:57.313-0500 I CONTROL [initandlisten]
    rs0:SECONDARY> quit()
When I connect to mongo through haproxy and it fails, it looks like this:
    bluebrick@ip-10-5-198-10:~$ mongo 10.5.198.10:27010 -verbose
    MongoDB shell version: 2.4.10
    Sat Feb 27 13:04:03.900 versionArrayTest passed
    connecting to: 10.5.198.10:27010/test
    Sat Feb 27 13:04:03.922 creating new connection to:10.5.198.10:27010
    Sat Feb 27 13:04:03.922 BackgroundJob starting: ConnectBG
    Sat Feb 27 13:04:03.922 connected connection!
    Sat Feb 27 13:04:03.923 recv(): message len 1347703880 is too large. Max is 48000000
    Sat Feb 27 13:04:03.923 DBClientCursor::init call() failed
    Sat Feb 27 13:04:03.923 User Assertion: 10276:DBClientBase::findN: transport error: 10.5.198.10:27010 ns: admin.$cmd query: { whatsmyuri: 1 }
    Sat Feb 27 13:04:03.923 Error: DBClientBase::findN: transport error: 10.5.198.10:27010 ns: admin.$cmd query: { whatsmyuri: 1 } at src/mongo/shell/mongo.js:147
    Sat Feb 27 13:04:03.923 User Assertion: 12513:connect failed
    Sat Feb 27 13:04:03.923 freeing 1 uncollected N5mongo20DBClientWithCommandsE objects
    exception: connect failed
    bluebrick@ip-10-5-198-10:~$
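To put a firmer number on the "about 25%" figure, the connection attempt can be repeated in a loop and the failures counted. This is a rough sketch rather than the procedure actually used here. `mongo_connects` and `failure_rate` are made-up names, and the check assumes the mongo shell exits with a non-zero status when the connection fails, as the "exception: connect failed" run above suggests.

```python
import subprocess
from typing import Callable


def mongo_connects(target: str = "10.5.198.10:27010") -> bool:
    """Run the mongo shell once against target (the haproxy frontend here).

    Assumption: a failed connection makes the shell exit non-zero.
    """
    result = subprocess.run(
        ["mongo", target, "--eval", "db.version()"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def failure_rate(attempt: Callable[[], bool], attempts: int = 100) -> float:
    """Call attempt() the given number of times; return the fraction that failed."""
    failures = sum(1 for _ in range(attempts) if not attempt())
    return failures / attempts


# Example (needs the mongo shell on PATH and haproxy running):
#     print(failure_rate(mongo_connects, attempts=200))
```

Running the same loop against 10.5.20.20:27010 gives a baseline for the direct path to compare with.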
Looking at the mongo server logs, a good connection looks like this:
    2016-02-27T12:53:14.944-0500 D STORAGE [WTJournalFlusher] flushed journal
    2016-02-27T12:53:14.966-0500 I NETWORK [initandlisten] connection accepted from 10.5.198.10:36447 #30 (9 connections now open)
    2016-02-27T12:53:14.966-0500 D COMMAND [conn30] run command admin.$cmd { whatsmyuri: 1 }
    2016-02-27T12:53:14.966-0500 I COMMAND [conn30] command admin.$cmd command: whatsmyuri { whatsmyuri: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:66 locks:{} protocol:op_query 0ms
    2016-02-27T12:53:14.968-0500 D COMMAND [conn30] run command admin.$cmd { getLog: "startupWarnings" }
    2016-02-27T12:53:14.968-0500 D COMMAND [conn30] command: getLog
    2016-02-27T12:53:14.968-0500 I COMMAND [conn30] command admin.$cmd command: getLog { getLog: "startupWarnings" } keyUpdates:0 writeConflicts:0 numYields:0 reslen:949 locks:{} protocol:op_query 0ms
    2016-02-27T12:53:14.981-0500 D COMMAND [conn30] run command admin.$cmd { replSetGetStatus: 1.0, forShell: 1.0 }
    2016-02-27T12:53:14.981-0500 D COMMAND [conn30] command: replSetGetStatus
    2016-02-27T12:53:14.981-0500 I COMMAND [conn30] command admin.$cmd command: replSetGetStatus { replSetGetStatus: 1.0, forShell: 1.0 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:964 locks:{} protocol:op_query 0ms
Looking at the mongo server logs, a failed connection:
Nothing is written to the mongo server logs in this scenario.