MariaDB Galera cluster not syncing easily

We are trying to debug a problem with a MariaDB cluster.

We are running MariaDB 10.0.19 on Amazon EC2 c4.large instances; the operating system is Ubuntu 14.04 (Trusty).

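Three machines are clustered together and replicate fine (we can run create database foo; on one machine and see the database appear on the others, and so on). However, when we try to restore a database from a dump while all three machines are running and in sync, we get an error: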

    $ du -sh *.sql
    2.7G    sqldump.sql
    $ cat sqldump.sql | sudo mysql
    ERROR 1047 (08S01) at line 4361: WSREP has not yet prepared node for application use

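The error seems to be related to how long the import takes. If we run service mysql stop on two of the three nodes in the cluster and run the SQL import against the remaining node, it works fine. We can then start the other machines in the cluster one by one and the data is copied to them via SST, so this looks like a problem with the Galera configuration.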

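This does not only happen during large MySQL imports: it also happens during routine use with small transactions. Even so, a large import is our most reliable way to reproduce the problem.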

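System memory usage during the import is not particularly high, and neither is CPU usage. Network traffic is far below the capacity of the machines' links, and in our tests there was no traffic other than the SSH connections.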

Can anyone help us understand what might be causing this problem?

Here are more details about the machines in the cluster and the MariaDB configuration:

Ubuntu:

    $ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 14.04.2 LTS
    Release:        14.04
    Codename:       trusty

Kernel:

    $ uname -a
    Linux servername.domain 3.13.0-53-generic #89-Ubuntu SMP Wed May 20 10:34:39 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

MySQL configuration (the wsrep_cluster_address IP addresses, domains, etc. are intentionally obfuscated):

    $ find /etc/mysql/ -name "*.cnf" -exec cat {} \; | egrep -v "^#" | grep -v "^$"
    [mysqld]
    server-id = 965424531
    bind-address = *
    max_connections = 500
    max_connect_errors = 1000000
    innodb_buffer_pool_size = 2635M
    log_bin = /var/lib/mysql/mysql/mysql-bin
    expire_logs_days = 7
    sync_binlog = 1
    binlog_format = MIXED
    log-slave-updates = 1
    slow_query_log = 1
    slow_query_log_file = /var/log/mysql/mysql-slow.log
    [mysqld]
    innodb_use_native_aio = 0
    innodb_flush_method = O_DSYNC
    [client]
    [mysqld]
    [mysqld_safe]
    syslog
    [mariadb]
    [client]
    port = 3306
    socket = /var/run/mysqld/mysqld.sock
    [mysqld_safe]
    socket = /var/run/mysqld/mysqld.sock
    nice = 0
    [mysqld]
    user = mysql
    pid-file = /var/run/mysqld/mysqld.pid
    socket = /var/run/mysqld/mysqld.sock
    port = 3306
    basedir = /usr
    datadir = /var/lib/mysql
    tmpdir = /tmp
    lc-messages-dir = /usr/share/mysql
    skip-external-locking
    bind-address = 127.0.0.1
    key_buffer = 16M
    max_allowed_packet = 16M
    thread_stack = 192K
    thread_cache_size = 8
    myisam-recover = BACKUP
    query_cache_limit = 1M
    query_cache_size = 16M
    log_error = /var/log/mysql/error.log
    expire_logs_days = 10
    max_binlog_size = 100M
    [mysqldump]
    quick
    quote-names
    max_allowed_packet = 16M
    [isamchk]
    key_buffer = 16M
    !includedir /etc/mysql/conf.d/
    [mysqld]
    wsrep_provider=/usr/lib/galera/libgalera_smm.so
    wsrep_debug=ON
    wsrep_cluster_name="clustername"
    wsrep_cluster_address="gcomm://10.0.XX,10.0.XX"
    wsrep_sst_method=xtrabackup-v2
    wsrep_sst_auth=sstuser:sstpassword
    wsrep_node_address="10.0.1.10"
    wsrep_node_name="servername.domain"
    binlog_format=ROW
    wsrep_on=ON
    default_storage_engine=InnoDB
    innodb_autoinc_lock_mode=2
    innodb_doublewrite=1
    query_cache_size=0
    innodb_log_file_size = 256M

Cluster status:

    $ sudo mysql
    Welcome to the MariaDB monitor. Commands end with ; or \g.
    Your MariaDB connection id is 288
    Server version: 10.0.19-MariaDB-1~trusty-wsrep-log mariadb.org binary distribution, wsrep_25.10.r4144
    Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    MariaDB [(none)]> show status like '%wsrep%';
    +------------------------------+---------------------------------------------------+
    | Variable_name                | Value                                             |
    +------------------------------+---------------------------------------------------+
    | wsrep_local_state_uuid       | e856afdc-18af-11e5-a3a6-efccde439ba4              |
    | wsrep_protocol_version       | 7                                                 |
    | wsrep_last_committed         | 45764                                             |
    | wsrep_replicated             | 2031                                              |
    | wsrep_replicated_bytes       | 1527494811                                        |
    | wsrep_repl_keys              | 9973524                                           |
    | wsrep_repl_keys_bytes        | 79839767                                          |
    | wsrep_repl_data_bytes        | 1447525060                                        |
    | wsrep_repl_other_bytes       | 0                                                 |
    | wsrep_received               | 1478                                              |
    | wsrep_received_bytes         | 13040                                             |
    | wsrep_local_commits          | 1750                                              |
    | wsrep_local_cert_failures    | 0                                                 |
    | wsrep_local_replays          | 0                                                 |
    | wsrep_local_send_queue       | 0                                                 |
    | wsrep_local_send_queue_max   | 2                                                 |
    | wsrep_local_send_queue_min   | 0                                                 |
    | wsrep_local_send_queue_avg   | 0.001140                                          |
    | wsrep_local_recv_queue       | 0                                                 |
    | wsrep_local_recv_queue_max   | 7                                                 |
    | wsrep_local_recv_queue_min   | 0                                                 |
    | wsrep_local_recv_queue_avg   | 0.043302                                          |
    | wsrep_local_cached_downto    | 45564                                             |
    | wsrep_flow_control_paused_ns | 3956186469                                        |
    | wsrep_flow_control_paused    | 0.005006                                          |
    | wsrep_flow_control_sent      | 0                                                 |
    | wsrep_flow_control_recv      | 41                                                |
    | wsrep_cert_deps_distance     | 4.487445                                          |
    | wsrep_apply_oooe             | 0.000000                                          |
    | wsrep_apply_oool             | 0.000000                                          |
    | wsrep_apply_window           | 1.000000                                          |
    | wsrep_commit_oooe            | 0.000000                                          |
    | wsrep_commit_oool            | 0.000000                                          |
    | wsrep_commit_window          | 1.000000                                          |
    | wsrep_local_state            | 4                                                 |
    | wsrep_local_state_comment    | Synced                                            |
    | wsrep_cert_index_size        | 11438                                             |
    | wsrep_causal_reads           | 0                                                 |
    | wsrep_cert_interval          | 0.000000                                          |
    | wsrep_incoming_addresses     | ,,                                                |
    | wsrep_evs_delayed            |                                                   |
    | wsrep_evs_evict_list         |                                                   |
    | wsrep_evs_repl_latency       | 0.00059098/0.000958534/0.00469729/0.000375612/732 |
    | wsrep_evs_state              | OPERATIONAL                                       |
    | wsrep_gcomm_uuid             | 8bcfefe4-25f7-11e5-be32-062acc002ed5              |
    | wsrep_cluster_conf_id        | 88                                                |
    | wsrep_cluster_size           | 3                                                 |
    | wsrep_cluster_state_uuid     | e856afdc-18af-11e5-a3a6-efccde439ba4              |
    | wsrep_cluster_status         | Primary                                           |
    | wsrep_connected              | ON                                                |
    | wsrep_local_bf_aborts        | 0                                                 |
    | wsrep_local_index            | 2                                                 |
    | wsrep_provider_name          | Galera                                            |
    | wsrep_provider_vendor        | Codership Oy <[email protected]>                 |
    | wsrep_provider_version       | 3.9(rXXXX)                                        |
    | wsrep_ready                  | ON                                                |
    | wsrep_thread_count           | 2                                                 |
    +------------------------------+---------------------------------------------------+
    57 rows in set (0.00 sec)

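First of all, huge transactions and LOAD DATA INFILE are still a known limitation on a Galera cluster. If you have to run them, it is recommended to split them into smaller transactions, on the order of 5k-10k or less; YMMV.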
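One way to keep each transaction small is to wrap a stream of single-row statements in explicit transactions of a fixed size. The sketch below is only an illustration: batch_statements is a hypothetical helper, not part of MySQL or Galera, and it assumes the input holds exactly one DML statement per line with no DDL or LOCK TABLES (for instance a data-only dump produced with something like mysqldump --skip-extended-insert --no-create-info --skip-add-locks --skip-disable-keys). DDL would commit implicitly and defeat the batching.

```shell
#!/usr/bin/env bash
# batch_statements N: read SQL statements from stdin, one per line, and wrap
# every N of them in START TRANSACTION / COMMIT.
# Hypothetical helper; assumes DML only and N >= 1.
batch_statements() {
    awk -v n="$1" '
        (NR - 1) % n == 0 { print "START TRANSACTION;" }   # open a new batch
        { print }                                           # pass the statement through
        NR % n == 0       { print "COMMIT;" }               # close a full batch
        END { if (NR % n != 0) print "COMMIT;" }            # close a trailing partial batch
    '
}

# Example (not run here): import in batches of 5000 statements.
#   batch_statements 5000 < data_only_dump.sql | sudo mysql dbname
```

The file name data_only_dump.sql and the database name dbname are placeholders. Because everything is streamed through one mysql session, session settings at the top of the dump stay in effect for all batches.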

Try increasing wsrep-max-ws-size.

Set innodb_flush_log_at_trx_commit=0 on all nodes.

Based on https://mariadb.com/kb/en/mariadb/mariadb-galera-cluster-known-limitations/

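“Transaction size. While Galera does not explicitly limit the transaction size, a writeset is processed as a single memory-resident buffer and, as a result, extremely large transactions (e.g. LOAD DATA) may adversely affect node performance. To avoid that, the wsrep_max_ws_rows and wsrep_max_ws_size system variables limit transaction rows to 128K and the transaction size to 1Gb by default. If necessary, users may want to increase those limits. Future versions will add support for transaction fragmentation.”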
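Before raising anything, it helps to see what the limits currently are on each node. This is a minimal sketch, assuming the mysql client on the node can log in without extra options (the question uses sudo mysql, so it may need to run as root). show_ws_limits is a hypothetical name.

```shell
#!/usr/bin/env bash
# Print the Galera writeset limits of the local node as name=value lines.
show_ws_limits() {
    # -N: omit column headers, -B: tab-separated batch output
    mysql -N -B -e "SHOW GLOBAL VARIABLES LIKE 'wsrep_max_ws_%'" |
        awk -F '\t' '{ printf "%s=%s\n", $1, $2 }'
}

# Raising the size limit at runtime might look like this (not run here; the
# value is only an example, and this assumes the variable is dynamic in
# your version):
#   mysql -e "SET GLOBAL wsrep_max_ws_size = 2147483647"
```

A change made with SET GLOBAL does not survive a restart, so a value that helps would also need to go into the [mysqld] section of the configuration on every node.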

It is not clear what your SQL actually contains, but large transactions may be hitting these limits.

The max_allowed_packet on the machine where mysqldump was run may have been larger than the one you have configured.
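A rough way to test this is to measure the longest line in the dump, since mysqldump normally writes each INSERT statement on a single line, and compare it with the 16M configured for max_allowed_packet. The helper name below is hypothetical, and line length is only a proxy for statement size (a hand-edited dump may spread a statement over several lines).

```shell
#!/usr/bin/env bash
# Print the length in bytes of the longest line in a file.
# LC_ALL=C makes awk count bytes rather than multibyte characters.
longest_line_bytes() {
    LC_ALL=C awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }' "$1"
}

# Example (not run here): compare against max_allowed_packet = 16M.
#   limit=$((16 * 1024 * 1024))
#   if [ "$(longest_line_bytes sqldump.sql)" -gt "$limit" ]; then
#       echo "at least one line is larger than 16M"
#   fi
```

Some awk implementations cap the length of a single line, so on a dump with very long lines it is worth confirming the result with a second tool.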

You should post the relevant events from your MySQL error log, as they will provide meaningful information.
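To pull the relevant events out of the log, something like the following may help. It assumes the log lives at /var/log/mysql/error.log, as set by log_error in the posted configuration; wsrep_log_tail is a hypothetical name.

```shell
#!/usr/bin/env bash
# Print the last COUNT lines (default 100) of a log file that mention WSREP,
# matching case-insensitively.
wsrep_log_tail() {
    local file="$1" count="${2:-100}"
    grep -i 'wsrep' "$file" | tail -n "$count"
}

# Example (not run here), on each node right after the import fails:
#   wsrep_log_tail /var/log/mysql/error.log 200
```

Since wsrep_debug=ON is set in the configuration, the log is likely to be verbose, so the lines around the time of ERROR 1047 are the most useful ones to post.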