Large lag on MySQL replication (Relay_Log_Pos and Exec_Master_Log_Pos do not increase)

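Today two of my slaves (one MySQL 5.1, the other MariaDB 5.5; the master is MySQL 5.1) started lagging. Some lag is normal for us, sometimes even 10000 seconds, because the slaves run on weaker hardware, but this time I am really worried. The lag on both servers keeps rising, and at this point they are 25K seconds behind the master. So I started investigating what is wrong. Going through the master's logs and the slaves' MySQL logs gave me nothing. The servers run CentOS 5, and the MariaDB one runs CentOS 6.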

This is the SHOW SLAVE STATUS output from the MariaDB slave:

 MariaDB [(none)]> show slave status\G
 *************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: masterserevr
                   Master_User: slaveuser
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: mysqld-bin.006778
           Read_Master_Log_Pos: 401041447
                Relay_Log_File: relay-bin.020343
                 Relay_Log_Pos: 14867924
         Relay_Master_Log_File: mysqld-bin.006777
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB: ses,phar
            Replicate_Do_Table:
        Replicate_Ignore_Table: portal.aaa_jm_tmp,portal.newsletter
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 14867639
               Relay_Log_Space: 1474785535
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 26484
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 1
 1 row in set (0.00 sec)

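Comparing several of these outputs, I noticed that Relay_Log_Pos and Exec_Master_Log_Pos do not increase. I tried restarting the slave process, but nothing changed and the lag keeps growing. The next step was to find the query on which replication had stopped.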
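Instead of eyeballing several outputs, two snapshots of the status can be compared with a small script. This is only a rough sketch: the helper names are my own, and the second snapshot is made up for illustration; only the field names and the first snapshot's values come from the output above.

```python
# Fields that move when the SQL thread applies events.
APPLY_FIELDS = ("Relay_Log_File", "Relay_Log_Pos",
                "Relay_Master_Log_File", "Exec_Master_Log_Pos")
# Fields that move when the I/O thread fetches events from the master.
FETCH_FIELDS = ("Master_Log_File", "Read_Master_Log_Pos")


def parse_slave_status(text):
    """Parse the vertical output of SHOW SLAVE STATUS into a dict of strings."""
    status = {}
    for line in text.splitlines():
        stripped = line.strip()
        # Skip the prompt, the row banner and the "1 row in set" footer.
        if ":" not in stripped or stripped.startswith("*"):
            continue
        key, _, value = stripped.partition(":")
        status[key.strip()] = value.strip()
    return status


def changed_fields(before, after, fields):
    """Return the names in `fields` whose values differ between snapshots."""
    return [name for name in fields if before.get(name) != after.get(name)]


def sql_thread_stalled(before, after):
    """True when new events were fetched but no apply position moved."""
    fetched = changed_fields(before, after, FETCH_FIELDS)
    applied = changed_fields(before, after, APPLY_FIELDS)
    return bool(fetched) and not applied


FIRST = """
          Master_Log_File: mysqld-bin.006778
      Read_Master_Log_Pos: 401041447
           Relay_Log_File: relay-bin.020343
            Relay_Log_Pos: 14867924
    Relay_Master_Log_File: mysqld-bin.006777
      Exec_Master_Log_Pos: 14867639
"""
# Hypothetical second snapshot: only the fetch position has advanced.
SECOND = FIRST.replace("401041447", "401099999")

print(sql_thread_stalled(parse_slave_status(FIRST), parse_slave_status(SECOND)))
```

Unchanged apply positions do not prove by themselves that the SQL thread is idle; it may be busy inside one long transaction, so the result is a hint on where to look rather than a diagnosis.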

Using mysqlbinlog:

 mysqlbinlog relay-bin.020343 > /root/RelayLogQueries1.txt

In RelayLogQueries1.txt I found position 14867924:

 # at 14867924
 #130927 10:03:21 server id 1  end_log_pos 14867709  Query  thread_id=160780134  exec_time=3  error_code=0
 SET TIMESTAMP=1380269001/*!*/;
 /*!\C utf8 *//*!*/;
 SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=9/*!*/;
 BEGIN
 /*!*/;
 # at 14867994
 # at 14868101
 # at 14868669
 # at 14869417
 # at 14869873
 # at 14870663
 # at 14871697
 # at 14872055
 # at 14872845
 # at 14873747
 # at 14874591
 # at 14875387
 # at 14876265
 # at 14877039
 # at 14877985
 # at 14878299
 # at 14879091
 # at 14879853
 # at 14880255
 # at 14881029
 .
 .
 .
 # at 117398235
 # at 117399219
 # at 117400203
 # at 117401191
 # at 117402179
 # at 117403167
 # at 117403969
 # at 117404957
 # at 117405945
 # at 117406933
 # at 117407921
 # at 117408909
 # at 117409897
 # at 117410885
 # at 117411873
 # at 117412861
 # at 117413849
 # at 117414837
 # at 117415785
 # at 117416797
 # at 117417839
 # at 117418595
 # at 117419585
 #130927 10:03:21 server id 1  end_log_pos 14867816  Table_map: `test`.`pac_list` mapped to number 216570427
 #130927 10:03:21 server id 1  end_log_pos 14868384  Update_rows: table id 216570427
 #130927 10:03:21 server id 1  end_log_pos 14869132  Update_rows: table id 216570427
 #130927 10:03:21 server id 1  end_log_pos 14869588  Update_rows: table id 216570427
 #130927 10:03:21 server id 1  end_log_pos 14870378  Update_rows: table id 216570427
 #130927 10:03:21 server id 1  end_log_pos 14871412  Update_rows: table id 216570427
 #130927 10:03:21 server id 1  end_log_pos 14871770  Update_rows: table id 216570427
 #130927 10:03:21 server id 1  end_log_pos 14872560  Update_rows: table id 216570427
 #130927 10:03:21 server id 1  end_log_pos 14873462  Update_rows: table id 216570427
 .
 .
 .

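Now I am confused: first, I don't know how to interpret this log (whether it looks fine or wrong), and second, I don't know how to fix the problem.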
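One way to get a first overview of a dump like this is to count the row events per table. The following is a minimal sketch, assuming the text format that mysqlbinlog printed above (`Table_map: ... mapped to number N` followed by `Update_rows: table id N`); the function name and the shortened sample are my own.

```python
import re
from collections import Counter

# Event header fragments as they appear in the mysqlbinlog text output.
TABLE_MAP = re.compile(r"Table_map: `([^`]+)`\.`([^`]+)` mapped to number (\d+)")
ROW_EVENT = re.compile(r"(Write_rows|Update_rows|Delete_rows): table id (\d+)")


def count_row_events(lines):
    """Count row events per (table, event type) in mysqlbinlog text output."""
    tables = {}          # table id -> quoted `db`.`table` name
    counts = Counter()
    for line in lines:
        mapped = TABLE_MAP.search(line)
        if mapped:
            db, table, table_id = mapped.groups()
            # Keep the backticks so a dot inside a table name stays unambiguous.
            tables[table_id] = f"`{db}`.`{table}`"
            continue
        event = ROW_EVENT.search(line)
        if event:
            kind, table_id = event.groups()
            name = tables.get(table_id, f"table id {table_id}")
            counts[(name, kind)] += 1
    return counts


# Three lines copied from the dump above, used as a tiny sample.
SAMPLE = [
    "#130927 10:03:21 server id 1  end_log_pos 14867816  Table_map: `test`.`pac_list` mapped to number 216570427",
    "#130927 10:03:21 server id 1  end_log_pos 14868384  Update_rows: table id 216570427",
    "#130927 10:03:21 server id 1  end_log_pos 14869132  Update_rows: table id 216570427",
]

for (table, kind), total in count_row_events(SAMPLE).items():
    print(table, kind, total)
```

For the real file, the same function can be fed an open file handle, for example `count_row_events(open("/root/RelayLogQueries1.txt"))`. A very large count for a single table inside one BEGIN block would suggest that the slave is working through one big row-based transaction.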

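Sometimes, when I get replication errors, this trick helps:

 SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1; START SLAVE;

But this time there are no errors, and both the IO and SQL slave threads are running.

Could setting SQL_SLAVE_SKIP_COUNTER = 1 bring replication back?

What can I do to diagnose this problem further and fix it without setting up the replicas from scratch (the last-resort scenario I want to avoid)?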

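EDIT: A developer accidentally copied the table pac_list (200 MB, 600,000 records). He named the copy test.pac_list (with a dot in the table name). He wanted to create the copy in the database test, but he made a mistake and created a table called test.pac_list in the same database as the original table. After discovering the mistake, he dropped the table test.pac_list and created the table pac_list in the new database. Could this be the cause of such a large lag?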

Run SHOW FULL PROCESSLIST to see which query replication is hanging on.
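On a busy slave the process list can be long, so it may help to filter it down to the replication threads, which MySQL lists under the user `system user`. Here is a small sketch that works on rows already fetched as dictionaries (for example from a driver's dict cursor); the helper names and the sample rows are hypothetical, while the column names are those of SHOW FULL PROCESSLIST.

```python
def replication_threads(rows):
    """Keep only the rows that belong to the slave I/O and SQL threads."""
    return [row for row in rows if row.get("User") == "system user"]


def describe(row):
    """Format one process list row as a single readable line."""
    info = row.get("Info") or "(no statement text)"
    return (f"thread {row['Id']}: state={row.get('State')!r}, "
            f"time={row.get('Time')}s, info={info}")


# Hypothetical rows, shaped like the output of SHOW FULL PROCESSLIST.
ROWS = [
    {"Id": 1, "User": "system user", "Host": "", "db": None, "Command": "Connect",
     "Time": 120, "State": "Waiting for master to send event", "Info": None},
    {"Id": 2, "User": "system user", "Host": "", "db": "test", "Command": "Connect",
     "Time": 26484, "State": "Reading event from the relay log", "Info": None},
    {"Id": 3, "User": "app", "Host": "localhost", "db": "portal", "Command": "Sleep",
     "Time": 4, "State": "", "Info": None},
]

for row in replication_threads(ROWS):
    print(describe(row))
```

The State and Time columns of the SQL thread are the interesting part: they show what the thread is currently doing and for how long.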

Also, I see there is a BEGIN:

 # at 14867924
 SET TIMESTAMP=1380269001/*!*/;
 SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=9/*!*/;
 BEGIN

So could it be that replication is blocked until that block ends?

http://dev.mysql.com/doc/refman/5.0/en/begin-end.html