以下是兩種slave復制異常的情況。導致的原因都是由於跨機房同步,slave的機房突然掉電導致的。
案例一、
這個錯誤大原因是Read_Master_Log_Pos: 1028687822的pos號在主庫上是沒有的.
處理方法:獲取這個pos號的前一個pos號,從新開啟同步,這裡注意如果是row模式的話就沒有問題.如果是mix的或者statement的話,就需要去分析binlog,確認具體執行到哪個pos號了,不然可能會導致數據不一致。
(andy:db:)[(none)] 11:18:39> show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 192.168.11.24
Master_User: slave
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: mysql-bin.000342
Read_Master_Log_Pos: 1028687822
Relay_Log_File: localhost-relay-bin.000767
Relay_Log_Pos: 4
Relay_Master_Log_File: mysql-bin.000342
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1028687822
Relay_Log_Space: 120
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: ‘Client requested master to start replication from position > file size; the first event ‘mysql-bin.000342′ at 1028687822, the last event read from ‘./mysql-bin.000342′ at 4, the last byte read from ‘./mysql-bin.000342′ at 4.’
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1024
Master_UUID: e9143523-c116-11e2-a8a1-0022195d25da
Master_Info_File: /usr/local/mysql-6/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp: 140725 06:55:26
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)
案例二、
這個是由於slave的relay-log損壞了.可以考慮使用Exec_Master_Log_Pos: 439512771pos號,從新開啟同步.或者使用relay_log_recovery=ON參數啟動。
(root:bi:)[(none)] 10:56:01> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.20.100.16
Master_User: slave
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: mysql-bin.000076
Read_Master_Log_Pos: 474667384
Relay_Log_File: localhost-relay-bin.000137
Relay_Log_Pos: 212337426
Relay_Master_Log_File: mysql-bin.000076
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1594
Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master’s binary log is corrupted (you can check this by running ‘mysqlbinlog’ on the binary log), the slave’s relay log is corrupted (you can check this by running ‘mysqlbinlog’ on the relay log), a network problem, or a bug in the master’s or slave’s MySQL code. If you want to check the master’s binary log or slave’s relay log, you will be able to know their names by issuing ‘SHOW SLAVE STATUS’ on this slave.
Skip_Counter: 0
Exec_Master_Log_Pos: 439512771
Relay_Log_Space: 474669681
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1594
Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master’s binary log is corrupted (you can check this by running ‘mysqlbinlog’ on the binary log), the slave’s relay log is corrupted (you can check this by running ‘mysqlbinlog’ on the relay log), a network problem, or a bug in the master’s or slave’s MySQL code. If you want to check the master’s binary log or slave’s relay log, you will be able to know their names by issuing ‘SHOW SLAVE STATUS’ on this slave.
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1016
Master_UUID: 7b6b6934-b981-11e3-bc82-c81f66de76ac
Master_Info_File: /data/mysql/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 140725 10:54:50
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)
[root@localhost data]# mysqlbinlog localhost-relay-bin.000137 > a.txt
ERROR: Error in Log_event::read_log_event(): ‘read error’, data_len: 7736, event_type: 31
ERROR: Could not read entry at offset 215681738: Error in log format or read error.
WARNING: The range of printed events ends with a row event or a table map event that does not have the STMT_END_F flag set. This might be because the last statement was not fully written to the log, or because you are using a -stop-position or -stop-datetime that refers to an event in the middle of a statement. The event(s) from the partial statement have not been written to output.
總結:這兩個錯誤的導致的原因是機房掉電,也就是沒有crash safe.
如何讓slave cras safe?
可以參照我另外一篇文章
《mysql-5.6 Slave支持crash-safe》
為什麼我沒有開啟crash-safe ? 因為我slave是mariadb-10.0,不是官方的,並且我開啟了multi-source。