InnoDB的IO问题,需要评估的评估

TL; DR – 与InnoDB有一些基于IO的问题。 在我签署表格以申请更多硬件之前,请先看看是否同意我的评估

为一个客户端工作,我inheritance了一个大的InnoDB MySQL数据库服务器。 根据Information_Schema,所有表(大约180个模式)的数据大小约为34.9GB,总大小约为38GB。

它位于运行心跳和DRBD(主动/被动)的Xen DomU(Debian)上,通过gigibit交叉同步data_dir分区并提供故障转移。 注:Web服务器,caching服务器和数据库服务器都是位于两台服务器上的Xen虚拟服务器,以实现冗余和故障转移。

总数据大小由存储在数据库中的大约25GB BLOB数据(图像,video,file upload等)组成,以允许多个前端Web服务器将这些文件caching在其本地文件系统上。 这些服务器上的caching大概平均每周刷新10次。

其余的数据(大约9GB)是经常更新的实际内容。 在前端有多个Web服务器,还有坐在它们前面的静态文件caching。 networking服务器仍然每天大约有50万次点击。 Web服务器只有在数据库在IO上等待时才被加载,并且在这种情况发生之前似乎能够处理大量的查询和连接。

该应用程序是一个名为eZ Publish的基于PHP的CMS。 我们在这个设置中托pipe了大约90个站点。

InnoTop每天显示约4000万个查询。 平均大约500 qps。 数据库服务器有2个CPU,一个很less使用,运行32Bit内核,并有2GB的内存。 InnoDB缓冲池已经分配了512MB。 我知道这并不多。 服务器没有更多的免费。

根据下面的输出和阅读在线博客和本书,我得出结论,服务器严重缺乏资源,如果我们有足够的内存来存储经常访问的9GB数据(如果不是所有的35GB的数据)和/或至less10GB光纤通道复制更快的磁盘。

这个评估看起来是否准确? 有什么设置可以在这里改变,可以带来任何性能上的好处吗?

〜$ vmstat 1 10

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- rb swpd free buff cache si so bi bo in cs us sy id wa 4 3 32 312280 26852 710860 0 0 2 1 8 2 3 1 86 10 1 16 32 60964 27144 955584 0 0 1402 19095 2102 1921 31 13 0 55 0 16 32 245340 27320 771756 0 0 1050 15640 1389 1990 11 8 31 49 1 19 32 294744 27464 722744 0 0 1634 12470 2356 3082 0 1 49 49 2 7 32 465268 27756 556484 0 0 2393 2982 3127 3339 8 6 29 57 2 7 32 364820 28108 654308 0 0 1542 10695 2773 2614 12 7 21 60 1 20 32 144328 28428 871120 0 0 1766 9515 3110 2882 29 10 4 57 1 14 32 231284 28060 761364 0 0 1496 16435 2913 3058 25 13 3 59 0 5 32 335140 28144 659352 0 0 986 21689 2197 2483 1 1 43 55 0 8 32 367516 28200 627312 0 0 666 7098 1519 1602 0 0 50 49 

〜$ iostat -x

sda8是DRBD复制的文件系统

 avg-cpu: %user %nice %system %iowait %steal %idle 2.68 0.00 1.21 9.94 0.10 86.08 Device: rrqm/s wrqm/sr/sw/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util sda1 0.03 3.59 0.46 3.85 18.55 59.53 18.13 0.70 163.56 2.70 1.16 sda2 0.04 0.01 0.02 0.03 0.50 0.33 16.35 0.01 102.47 5.11 0.03 sda7 0.00 0.00 0.00 2.43 0.00 2.43 1.00 0.04 16.02 16.02 3.89 sda8 126.18 38.46 29.41 33.89 1244.76 578.80 28.81 0.24 3.75 3.33 21.05 

mysql> SHOW ENGINE INNODB STATUS \ G

 Status: ===================================== 101214 11:35:28 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 4 seconds ---------- SEMAPHORES ---------- OS WAIT ARRAY INFO: reservation count 296294, signal count 169637 Mutex spin waits 0, rounds 3304249, OS waits 12485 RW-shared spins 397195, OS waits 245310; RW-excl spins 44610, OS waits 37492 ------------ TRANSACTIONS ------------ Trx id counter 0 1849382073 Purge done for trx's n:o < 0 1849375557 undo n:o < 0 0 History list length 302 Total number of lock structs in row lock hash table 0 LIST OF TRANSACTIONS FOR EACH SESSION: -- TRIMMED OP -- -------- FILE I/O -------- I/O thread 0 state: waiting for i/o request (insert buffer thread) I/O thread 1 state: waiting for i/o request (log thread) I/O thread 2 state: waiting for i/o request (read thread) I/O thread 3 state: waiting for i/o request (write thread) Pending normal aio reads: 0, aio writes: 0, ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0 Pending flushes (fsync) log: 0; buffer pool: 1 1588467 OS file reads, 491628 OS file writes, 166422 OS fsyncs 1 pending preads, 0 pending pwrites 191.95 reads/s, 28245 avg bytes/read, 3.25 writes/s, 2.75 fsyncs/s ------------------------------------- INSERT BUFFER AND ADAPTIVE HASH INDEX ------------------------------------- Ibuf: size 3, free list len 6, seg size 10, 18975 inserts, 18793 merged recs, 16850 merges Hash table size 2212699, used cells 1083042, node heap has 1459 buffer(s) 8141.96 hash searches/s, 1822.54 non-hash searches/s --- LOG --- Log sequence number 33 3880839250 Log flushed up to 33 3880839250 Last checkpoint at 33 3875354146 0 pending log writes, 0 pending chkp writes 123375 log i/o's done, 2.75 log i/o's/second ---------------------- BUFFER POOL AND MEMORY ---------------------- Total memory allocated 681812036; in additional pool allocated 1048576 Buffer pool size 32768 Free buffers 0 Database pages 31309 Modified db pages 3167 Pending reads 1 Pending writes: LRU 129, flush list 0, single page 0 Pages read 3360890, created 9011, written 408990 331.17 reads/s, 0.25 creates/s, 0.00 writes/s Buffer pool hit rate 990 / 1000 -------------- ROW OPERATIONS -------------- 3 queries inside InnoDB, 0 queries in queue 4 read views open inside InnoDB Main thread process no. 7916, id 2395159472, state: sleeping Number of rows inserted 73371, updated 55517, deleted 47643, read 154375744 1.50 inserts/s, 1.25 updates/s, 0.00 deletes/s, 9428.64 reads/s ---------------------------- END OF INNODB MONITOR OUTPUT ============================ 

mysql> SHOW STATUS;

 +-----------------------------------+------------+ | Variable_name | Value | +-----------------------------------+------------+ | Aborted_clients | 700 | | Aborted_connects | 1 | | Binlog_cache_disk_use | 0 | | Binlog_cache_use | 0 | | Bytes_received | 116 | | Bytes_sent | 177 | | Com_select | 1 | | Com_show_status | 1 | | Com_xxx (OP rest were 0) | 0 | | Compression | OFF | | Connections | 166487 | | Created_tmp_disk_tables | 0 | | Created_tmp_files | 94 | | Created_tmp_tables | 1 | | Delayed_errors | 0 | | Delayed_insert_threads | 0 | | Delayed_writes | 0 | | Flush_commands | 1 | | Handler_write | 131 | | Handler_xxx (OP rest were 0) | 0 | | Innodb_buffer_pool_pages_data | 31224 | | Innodb_buffer_pool_pages_dirty | 2942 | | Innodb_buffer_pool_pages_flushed | 408710 | | Innodb_buffer_pool_pages_free | 0 | | Innodb_buffer_pool_pages_latched | 12 | | Innodb_buffer_pool_pages_misc | 1544 | | Innodb_buffer_pool_pages_total | 32768 | | Innodb_buffer_pool_read_ahead_rnd | 39211 | | Innodb_buffer_pool_read_ahead_seq | 2735 | | Innodb_buffer_pool_read_requests | 279758942 | | Innodb_buffer_pool_reads | 1341622 | | Innodb_buffer_pool_wait_free | 0 | | Innodb_buffer_pool_write_requests | 1744350 | | Innodb_data_fsyncs | 166166 | | Innodb_data_pending_fsyncs | 2 | | Innodb_data_pending_reads | 3 | | Innodb_data_pending_writes | 0 | | Innodb_data_read | 3356102656 | | Innodb_data_reads | 1581795 | | Innodb_data_writes | 491260 | | Innodb_data_written | 877349888 | | Innodb_dblwr_pages_written | 408828 | | Innodb_dblwr_writes | 10622 | | Innodb_log_waits | 2 | | Innodb_log_write_requests | 783217 | | Innodb_log_writes | 116708 | | Innodb_os_log_fsyncs | 123325 | | Innodb_os_log_pending_fsyncs | 1 | | Innodb_os_log_pending_writes | 0 | | Innodb_os_log_written | 362475008 | | Innodb_page_size | 16384 | | Innodb_pages_created | 8991 | | Innodb_pages_read | 3350432 | | Innodb_pages_written | 408828 | | Innodb_row_lock_current_waits | 0 | | Innodb_row_lock_time | 10732 | | Innodb_row_lock_time_avg | 195 | | Innodb_row_lock_time_max | 2426 | | Innodb_row_lock_waits | 55 | | Innodb_rows_deleted | 47499 | | Innodb_rows_inserted | 73260 | | Innodb_rows_read | 153981836 | | Innodb_rows_updated | 55359 | | Key_blocks_not_flushed | 60 | | Key_blocks_unused | 85607 | | Key_blocks_used | 30452 | | Key_read_requests | 43862323 | | Key_reads | 102541 | | Key_write_requests | 5199340 | | Key_writes | 95356 | | Last_query_cost | 0.000000 | | Max_used_connections | 132 | | Ndb_cluster_node_id | 0 | | Ndb_config_from_host | | | Ndb_config_from_port | 0 | | Ndb_number_of_data_nodes | 0 | | Not_flushed_delayed_rows | 0 | | Open_files | 7266 | | Open_streams | 0 | | Open_tables | 12674 | | Opened_tables | 0 | | Prepared_stmt_count | 0 | | Qcache_free_blocks | 4709 | | Qcache_free_memory | 16054648 | | Qcache_hits | 21241525 | | Qcache_inserts | 1070573 | | Qcache_lowmem_prunes | 560462 | | Qcache_not_cached | 2537364 | | Qcache_queries_in_cache | 12205 | | Qcache_total_blocks | 31665 | | Questions | 27104225 | | Rpl_status | NULL | | Select_full_join | 0 | | Select_full_range_join | 0 | | Select_range | 0 | | Select_range_check | 0 | | Select_scan | 1 | | Slave_open_temp_tables | 0 | | Slave_retried_transactions | 0 | | Slave_running | OFF | | Slow_launch_threads | 0 | | Slow_queries | 0 | | Sort_merge_passes | 0 | | Sort_range | 0 | | Sort_rows | 0 | | Sort_scan | 0 | | Ssl_xxx (OP removed as not used) | | | Table_locks_immediate | 9749827 | | Table_locks_waited | 359 | | Tc_log_max_pages_used | 0 | | Tc_log_page_size | 0 | | Tc_log_page_waits | 0 | | Threads_cached | 0 | | Threads_connected | 20 | | Threads_created | 4862 | | Threads_running | 16 | | Uptime | 67476 | +-----------------------------------+------------+ 

mysql> SHOW VARIABLES

 +---------------------------------+-----------------------------+ | Variable_name | Value | +---------------------------------+-----------------------------+ | auto_increment_increment | 1 | | auto_increment_offset | 1 | | automatic_sp_privileges | ON | | back_log | 50 | | binlog_cache_size | 32768 | | bulk_insert_buffer_size | 8388608 | | completion_type | 0 | | concurrent_insert | 1 | | connect_timeout | 10 | | default_week_format | 0 | | delay_key_write | ON | | delayed_insert_limit | 100 | | delayed_insert_timeout | 300 | | delayed_queue_size | 1000 | | div_precision_increment | 4 | | engine_condition_pushdown | OFF | | expire_logs_days | 0 | | flush | OFF | | flush_time | 0 | | ft_boolean_syntax | + -><()~*:""&| | | ft_max_word_len | 84 | | ft_min_word_len | 4 | | ft_query_expansion_limit | 20 | | ft_stopword_file | (built-in) | | group_concat_max_len | 1024 | | have_archive | YES | | have_bdb | NO | | have_blackhole_engine | NO | | have_compress | YES | | have_crypt | YES | | have_csv | YES | | have_dynamic_loading | YES | | have_example_engine | NO | | have_federated_engine | YES | | have_geometry | YES | | have_innodb | YES | | have_isam | NO | | have_merge_engine | YES | | have_ndbcluster | DISABLED | | have_openssl | DISABLED | | have_query_cache | YES | | have_raid | NO | | have_rtree_keys | YES | | have_symlink | YES | | init_connect | | | init_file | | | init_slave | | | innodb_additional_mem_pool_size | 1048576 | | innodb_autoextend_increment | 8 | | innodb_buffer_pool_awe_mem_mb | 0 | | innodb_buffer_pool_size | 536870912 | | innodb_checksums | ON | | innodb_commit_concurrency | 0 | | innodb_concurrency_tickets | 500 | | innodb_data_file_path | ibdata1:10M:autoextend | | innodb_data_home_dir | | | innodb_doublewrite | ON | | innodb_fast_shutdown | 1 | | innodb_file_io_threads | 4 | | innodb_file_per_table | ON | | innodb_flush_log_at_trx_commit | 1 | | innodb_flush_method | | | innodb_force_recovery | 0 | | innodb_lock_wait_timeout | 50 | | innodb_locks_unsafe_for_binlog | OFF | | innodb_log_arch_dir | | | innodb_log_archive | OFF | | innodb_log_buffer_size | 1048576 | | innodb_log_file_size | 5242880 | | innodb_log_files_in_group | 2 | | innodb_log_group_home_dir | ./ | | innodb_max_dirty_pages_pct | 90 | | innodb_max_purge_lag | 0 | | innodb_mirrored_log_groups | 1 | | innodb_open_files | 300 | | innodb_rollback_on_timeout | OFF | | innodb_support_xa | ON | | innodb_sync_spin_loops | 20 | | innodb_table_locks | ON | | innodb_thread_concurrency | 8 | | innodb_thread_sleep_delay | 10000 | | interactive_timeout | 50 | | join_buffer_size | 536866816 | | key_buffer_size | 132120576 | | key_cache_age_threshold | 300 | | key_cache_block_size | 1024 | | key_cache_division_limit | 100 | | language | /usr/share/mysql/english/ | | large_files_support | ON | | large_page_size | 0 | | large_pages | OFF | | lc_time_names | en_US | | license | GPL | | local_infile | ON | | locked_in_memory | OFF | | log | OFF | | log_bin | OFF | | log_bin_trust_function_creators | OFF | | log_error | | | log_queries_not_using_indexes | OFF | | log_slave_updates | OFF | | log_slow_queries | OFF | | log_warnings | 1 | | long_query_time | 10 | | low_priority_updates | OFF | | lower_case_file_system | OFF | | lower_case_table_names | 0 | | max_allowed_packet | 16776192 | | max_binlog_cache_size | 4294967295 | | max_binlog_size | 1073741824 | | max_connect_errors | 50 | | max_connections | 800 | | max_delayed_threads | 20 | | max_error_count | 64 | | max_heap_table_size | 268435456 | | max_insert_delayed_threads | 20 | | max_join_size | 18446744073709551615 | | max_length_for_sort_data | 1024 | | max_prepared_stmt_count | 16382 | | max_relay_log_size | 0 | | max_seeks_for_key | 4294967295 | | max_sort_length | 1024 | | max_sp_recursion_depth | 0 | | max_tmp_tables | 32 | | max_user_connections | 0 | | max_write_lock_count | 4294967295 | | multi_range_count | 256 | | myisam_data_pointer_size | 6 | | myisam_max_sort_file_size | 2147483647 | | myisam_recover_options | OFF | | myisam_repair_threads | 1 | | myisam_sort_buffer_size | 8388608 | | myisam_stats_method | nulls_unequal | | ndb_autoincrement_prefetch_sz | 32 | | ndb_force_send | ON | | ndb_use_exact_count | ON | | ndb_use_transactions | ON | | ndb_cache_check_time | 0 | | net_buffer_length | 16384 | | net_read_timeout | 30 | | net_retry_count | 10 | | net_write_timeout | 60 | | new | OFF | | old_passwords | OFF | | open_files_limit | 65535 | | optimizer_prune_level | 1 | | optimizer_search_depth | 62 | | pid_file | /var/run/mysqld/mysqld.pid | | port | 3306 | | preload_buffer_size | 32768 | | protocol_version | 10 | | query_alloc_block_size | 8192 | | query_cache_limit | 2097152 | | query_cache_min_res_unit | 4096 | | query_cache_size | 67108864 | | query_cache_type | ON | | query_cache_wlock_invalidate | OFF | | query_prealloc_size | 8192 | | range_alloc_block_size | 2048 | | read_buffer_size | 131072 | | read_only | OFF | | read_rnd_buffer_size | 262144 | | relay_log_purge | ON | | relay_log_space_limit | 0 | | rpl_recovery_rank | 0 | | secure_auth | OFF | | server_id | 0 | | skip_external_locking | ON | | skip_networking | OFF | | skip_show_database | OFF | | slave_compressed_protocol | OFF | | slave_load_tmpdir | /tmp/ | | slave_net_timeout | 3600 | | slave_skip_errors | OFF | | slave_transaction_retries | 10 | | slow_launch_time | 2 | | socket | /var/run/mysqld/mysqld.sock | | sort_buffer_size | 2097144 | | sql_big_selects | ON | | sql_mode | | | sql_notes | ON | | sql_warnings | OFF | | ssl_ca | | | ssl_capath | | | ssl_cert | | | ssl_cipher | | | ssl_key | | | storage_engine | MyISAM | | sync_binlog | 0 | | sync_frm | ON | | system_time_zone | NZDT | | table_cache | 32362 | | table_lock_wait_timeout | 50 | | table_type | MyISAM | | thread_cache_size | 8 | | thread_stack | 131072 | | time_format | %H:%i:%s | | time_zone | SYSTEM | | timed_mutexes | OFF | | tmp_table_size | 134217728 | | tmpdir | /tmp | | transaction_alloc_block_size | 8192 | | transaction_prealloc_size | 4096 | | tx_isolation | REPEATABLE-READ | | updatable_views_with_limit | YES | | version | 5.0.32-Debian_7etch10 | | version_comment | Debian etch distribution | | version_compile_machine | i486 | | version_compile_os | pc-linux-gnu | | wait_timeout | 50 | +---------------------------------+-----------------------------+ 

由于字符限制,某些path设置被删除。

平均大约500 qps

这是第一个挑战。 单个(快速)硬盘只能达到200-250 IOPS。

在你的iostat转储中,我们看到sda8正在做165 IOPS,对于一个复制磁盘来说并不差。 平均队列大小仍然很小,平均等待时间和平均服务时间略长于3毫秒。 换句话说,光盘不是瓶颈,至less在你运行iostat的时候不是。

很明显,RAM缓冲区和其他InnoDB优化已经在工作,大大减less了IO请求的数量(否则,你会试图做500个请求,而sda8将无法如此快速地服务)。

那么,你的问题是什么?

编辑

好吧,重负载的数字画了一个完全不同的图片,IO数量堆积,响应时间受到很大影响 。 绝对是一些硬件重做的情况。

首先,我要重新评估DRBD。 三个选项:

  1. 保持DRBD,但在主机之间获得更好的延迟。 无论是10GbE,还是一些更好的连接。 我会说InfiniBand可能是一个选项。
  2. 用一些外部共享存储(SAN存储)replaceDRBD。 一个可以提供大量的IOPS。 以太网上的iSCSI不会削减它。 无论是高端的FC还是共享的SAS。
  3. 考虑数据库复制,而不是共享存储。 这样,存储对于数据库来说是本地的,并且延迟也是本地的。 复制发生在'事实'之后,所以它不会影响IOPS。

然后,无论您select哪种解决scheme,您仍然需要比当前磁盘所能提供的多得多的IOPS。

获得高IOPS主要有两种方法:

  1. 添加更多的主轴。 通常在RAID10中,您可以获得的IOPS大致等于所有活动磁盘的IOPS总和。 这就是为什么高转速磁盘可用于如此低的容量,你通常需要大量的磁盘来获得更高的速度。 20-30的驱动只是加起来的主轴并不是闻所未闻的; 这就是你需要进入〜6000 IPS球场的数量。 (一个15kRPM磁盘给出180-250 IOPS)
  2. 使用SSD。 高端闪存驱动器可以获得20,000 IOPS。 一定要使用“服务器”固态硬盘,用于工作站的固态硬盘将在几个月内死亡。 看看SLC和MLC之间的区别,例如,英特尔的X-25E和X-25M。

165 IOPS? 你从哪里读到? 我认为你通过添加rrqm(读取请求合并)和wrqm(写入请求合并)而不是使用r / s + w / s(磁盘实际IOPS)来计算错误。 另外磁盘的利用率只有21% – 通常不是添加更多主轴的标志。 对于给定的LUN,一切低于70%都可以认为是正常的。

显然,只有一小部分查询正在打击主轴。