萬盛學電腦網

 萬盛學電腦網 >> 數據庫 >> mysql教程 >> MysqlMaster切換日志分析示例

MysqlMaster切換日志分析示例

MHA監控復制架構的主服務器,一旦檢測到主服務器故障,就會自動進行故障轉移。即使有些從服務器沒有收到最新的relay log,MHA自動從最新的從服務器上識別差異的relay log並把這些日志應用到其他從服務器上,因此所有的從服務器保持一致性了,下面我們來看一個MysqlMaster切換日志分析,希望對各位有幫助。

下面是MHA從啟動,到failover的完整manager.log,以及集群中所有mysql的general.log.我們可以通過這些日志,更多的了解和分析MHA的運行原理.對用好MHA有很多幫助。
架構如下:
DB1 MASTER
DB2 SLAVE,candidate_master
DB3 SLAVE

DB1:

[root@localhost mysql]# cat /var/lib/mysql/localhost.log

140928 10:17:04     5 Connect   [email protected] on
                    5 Query     set autocommit=1
                    5 Query     SELECT CONNECTION_ID() AS Value
                    6 Connect   [email protected] on
                    6 Query     set autocommit=1
                    6 Query     SELECT CONNECTION_ID() AS Value
                    6 Query     SET wait_timeout=86400
                    6 Query     SELECT @@global.server_id As Value
                    6 Query     SELECT VERSION() AS Value
                    6 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                    6 Query     SHOW MASTER STATUS
                    6 Query     SELECT @@global.relay_log_info_repository AS Value
                    6 Query     SELECT @@global.datadir AS Value
                    6 Query     SELECT @@global.relay_log_info_file AS Value
                    6 Query     SELECT @@global.datadir AS Value
                    6 Query     SHOW SLAVE STATUS
                    6 Query     SELECT @@global.read_only As Value
                    6 Query     SELECT @@global.relay_log_purge As Value
140928 10:17:11     7 Connect   [email protected] on
                    7 Query     select @@version_comment limit 1
                    7 Query     select count(*) as c from mysql.user
                    6 Quit
                    7 Quit
                    8 Connect   [email protected] on
                    8 Query     set autocommit=1
                    8 Query     SET wait_timeout=6
                    8 Query     SELECT GET_LOCK('MHA_Master_High_Availability_Monitor', '12') AS Value
                    8 Query     SELECT 1 As Value
140928 10:17:14     8 Query     SELECT 1 As Value
140928 10:17:17     8 Query     SELECT 1 As Value
140928 10:17:20     8 Query     SELECT 1 As Value
140928 10:17:23     8 Query     SELECT 1 As Value
140928 10:17:26     8 Query     SELECT 1 As Value
140928 10:17:29     8 Query     SELECT 1 As Value
140928 10:17:32     8 Query     SELECT 1 As Value
140928 10:17:35     8 Query     SELECT 1 As Value
140928 10:17:38     8 Query     SELECT 1 As Value
DB2:

[root@localhost mysql]# cat /var/lib/mysql/localhost.log

140928 10:17:04   214 Connect   [email protected] on
                  214 Query     set autocommit=1
                  214 Query     SELECT CONNECTION_ID() AS Value
                  215 Connect   [email protected] on
                  215 Query     set autocommit=1
                  215 Query     SELECT CONNECTION_ID() AS Value
                  215 Query     SET wait_timeout=86400
                  215 Query     SELECT @@global.server_id As Value
                  215 Query     SELECT VERSION() AS Value
                  215 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                  215 Query     SHOW MASTER STATUS
                  215 Query     SELECT @@global.relay_log_info_repository AS Value
                  215 Query     SELECT @@global.datadir AS Value
                  215 Query     SELECT @@global.relay_log_info_file AS Value
                  215 Query     SELECT @@global.datadir AS Value
                  215 Query     SHOW SLAVE STATUS
                  215 Query     SELECT @@global.read_only As Value
                  215 Query     SELECT @@global.relay_log_purge As Value
                  215 Query     SHOW SLAVE STATUS
                  215 Query     SELECT Repl_slave_priv AS Value FROM mysql.user WHERE user = 'slave'
140928 10:17:09   216 Connect   [email protected] on
                  216 Query     select @@version_comment limit 1
                  216 Query     set sql_log_bin=0
                  216 Query     create table if not exists mysql.apply_diff_relay_logs_test(id int)
                  216 Query     insert into mysql.apply_diff_relay_logs_test values(1)
                  216 Query     update mysql.apply_diff_relay_logs_test set id=id+1 where id=1
                  216 Query     delete from mysql.apply_diff_relay_logs_test
                  216 Query     drop table mysql.apply_diff_relay_logs_test
140928 10:17:10   216 Quit
140928 10:17:11   215 Quit
140928 10:17:50   217 Connect   [email protected] on
                  217 Query     set autocommit=1
                  217 Query     SELECT CONNECTION_ID() AS Value
                  218 Connect   [email protected] on
                  218 Query     set autocommit=1
                  218 Query     SELECT CONNECTION_ID() AS Value
                  218 Query     SET wait_timeout=86400
                  218 Query     SELECT @@global.server_id As Value
                  218 Query     SELECT VERSION() AS Value
                  218 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                  218 Query     SHOW MASTER STATUS
                  218 Query     SELECT @@global.relay_log_info_repository AS Value
                  218 Query     SELECT @@global.datadir AS Value
                  218 Query     SELECT @@global.relay_log_info_file AS Value
                  218 Query     SELECT @@global.datadir AS Value
                  218 Query     SHOW SLAVE STATUS
                  218 Query     SELECT @@global.read_only As Value
                  218 Query     SELECT @@global.relay_log_purge As Value
                  218 Query     SHOW SLAVE STATUS
                  219 Connect   [email protected] on
                  219 Query     set autocommit=1
                  219 Query     SELECT CONNECTION_ID() AS Value
                  220 Connect   [email protected] on
                  220 Query     set autocommit=1
                  220 Query     SELECT CONNECTION_ID() AS Value
                  220 Query     SET wait_timeout=86400
                  220 Query     SELECT @@global.server_id As Value
                  220 Query     SELECT VERSION() AS Value
                  220 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                  220 Query     SHOW MASTER STATUS
                  220 Query     SELECT @@global.relay_log_info_repository AS Value
                  220 Query     SELECT @@global.datadir AS Value
                  220 Query     SELECT @@global.relay_log_info_file AS Value
                  220 Query     SELECT @@global.datadir AS Value
                  220 Query     SHOW SLAVE STATUS
                  220 Query     SELECT @@global.read_only As Value
                  220 Query     SELECT @@global.relay_log_purge As Value
                  220 Query     SHOW SLAVE STATUS
                  220 Query     SELECT GET_LOCK('MHA_Master_High_Availability_Failover', '1') AS Value
                  220 Query     SHOW SLAVE STATUS
                  220 Query     STOP SLAVE IO_THREAD
                  220 Query     SHOW SLAVE STATUS
140928 10:17:51   220 Query     SHOW SLAVE STATUS
                  220 Query     SHOW SLAVE STATUS
                  220 Query     STOP SLAVE SQL_THREAD
                  220 Query     SHOW SLAVE STATUS
                  220 Query     SHOW MASTER STATUS
                  220 Query     SELECT @@global.read_only As Value
                  221 Connect   [email protected] on
                  221 Query     SELECT UNIX_TIMESTAMP()
                  221 Query     SHOW VARIABLES LIKE 'SERVER_ID'
                  221 Query     SET @master_heartbeat_period= 1799999979520
                  221 Query     SET @master_binlog_checksum= @@global.binlog_checksum
                  221 Query     SELECT @master_binlog_checksum
                  221 Query     SELECT @@GLOBAL.GTID_MODE
                  221 Query     SHOW VARIABLES LIKE 'SERVER_UUID'
                  221 Query     SET @slave_uuid= 'dda2f003-4244-11e4-b851-000c29da163f'
                  221 Binlog Dump       Log: 'mysql-bin.000016'  Pos: 688
                  220 Query     STOP SLAVE
                  220 Query     SHOW SLAVE STATUS
                  220 Query     RESET SLAVE /*!50516 ALL */
                  220 Query     SHOW SLAVE STATUS
                  220 Query     SELECT RELEASE_LOCK('MHA_Master_High_Availability_Failover') As Value
                  220 Quit
DB3:

[root@localhost ~]# cat /var/lib/mysql/localhost.log
140928 10:17:04   223 Connect   [email protected] on
                  223 Query     set autocommit=1
                  223 Query     SELECT CONNECTION_ID() AS Value
                  224 Connect   [email protected] on
                  224 Query     set autocommit=1
                  224 Query     SELECT CONNECTION_ID() AS Value
                  224 Query     SET wait_timeout=86400
                  224 Query     SELECT @@global.server_id As Value
                  224 Query     SELECT VERSION() AS Value
                  224 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                  224 Query     SHOW MASTER STATUS
                  224 Query     SELECT @@global.relay_log_info_repository AS Value
                  224 Query     SELECT @@global.datadir AS Value
                  224 Query     SELECT @@global.relay_log_info_file AS Value
                  224 Query     SELECT @@global.datadir AS Value
                  224 Query     SHOW SLAVE STATUS
                  224 Query     SELECT @@global.read_only As Value
                  224 Query     SELECT @@global.relay_log_purge As Value
                  224 Query     SHOW SLAVE STATUS
140928 10:17:10   225 Connect   [email protected] on
                  225 Query     select @@version_comment limit 1
                  225 Query     set sql_log_bin=0
                  225 Query     create table if not exists mysql.apply_diff_relay_logs_test(id int)
                  225 Query     insert into mysql.apply_diff_relay_logs_test values(1)
                  225 Query     update mysql.apply_diff_relay_logs_test set id=id+1 where id=1
                  225 Query     delete from mysql.apply_diff_relay_logs_test
                  225 Query     drop table mysql.apply_diff_relay_logs_test
140928 10:17:11   225 Quit
                  224 Quit
#開始failover流程的監測
140928 10:17:50   226 Connect   [email protected] on
                  226 Query     set autocommit=1
                  226 Query     SELECT CONNECTION_ID() AS Value
                  227 Connect   [email protected] on
                  227 Query     set autocommit=1
                  227 Query     SELECT CONNECTION_ID() AS Value
                  227 Query     SET wait_timeout=86400
                  227 Query     SELECT @@global.server_id As Value
                  227 Query     SELECT VERSION() AS Value
                  227 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                  227 Query     SHOW MASTER STATUS
                  227 Query     SELECT @@global.relay_log_info_repository AS Value
                  227 Query     SELECT @@global.datadir AS Value
                  227 Query     SELECT @@global.relay_log_info_file AS Value
                  227 Query     SELECT @@global.datadir AS Value
                  227 Query     SHOW SLAVE STATUS
                  227 Query     SELECT @@global.read_only As Value
                  227 Query     SELECT @@global.relay_log_purge As Value
                  227 Query     SHOW SLAVE STATUS
#正式開始failover的監測
                  228 Connect   [email protected] on
                  228 Query     set autocommit=1
                  228 Query     SELECT CONNECTION_ID() AS Value
                  229 Connect   [email protected] on
                  229 Query     set autocommit=1
                  229 Query     SELECT CONNECTION_ID() AS Value
                  229 Query     SET wait_timeout=86400
                  229 Query     SELECT @@global.server_id As Value
                  229 Query     SELECT VERSION() AS Value
                  229 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                  229 Query     SHOW MASTER STATUS
                  229 Query     SELECT @@global.relay_log_info_repository AS Value
                  229 Query     SELECT @@global.datadir AS Value
                  229 Query     SELECT @@global.relay_log_info_file AS Value
                  229 Query     SELECT @@global.datadir AS Value
                  229 Query     SHOW SLAVE STATUS
                  229 Query     SELECT @@global.read_only As Value
                  229 Query     SELECT @@global.relay_log_purge As Value
                  229 Query     SHOW SLAVE STATUS
                  229 Query     SELECT GET_LOCK('MHA_Master_High_Availability_Failover', '1') AS Value
                  229 Query     SHOW SLAVE STATUS
                  229 Query     STOP SLAVE IO_THREAD
                  229 Query     SHOW SLAVE STATUS
140928 10:17:51   229 Query     SHOW SLAVE STATUS
                  229 Query     SHOW SLAVE STATUS
                  229 Query     SHOW SLAVE STATUS
                  229 Query     SHOW SLAVE STATUS
                  229 Query     STOP SLAVE SQL_THREAD
                  229 Query     SHOW SLAVE STATUS
                  229 Query     STOP SLAVE
                  229 Query     SHOW SLAVE STATUS
                  229 Query     RESET SLAVE
                  229 Query     CHANGE MASTER TO MASTER_HOST = '192.168.153.151' MASTER_USER = 'slave' MASTER_PASSWORD =  MASTER_PORT = 3306 MASTER_LOG_FILE = 'mysql-bin.000016' MASTER_LOG_POS = 688
                  229 Query     START SLAVE
                  230 Connect Out       [email protected]:3306
                  229 Query     SHOW SLAVE STATUS
                  229 Query     SELECT RELEASE_LOCK('MHA_Master_High_Availability_Failover') As Value
                  229 Quit
140928 10:18:07   236 Connect   root@localhost on
                  236 Query     select @@version_comment limit 1
                  236 Query     select USER()
140928 10:18:10   236 Quit
MHA Manage.log:

[root@localhost ~]# cat /var/log/masterha/app1/manager.log
#啟動前的准備工作

#檢查服務器狀態,獲取相關參數設置
Sun Sep 28 10:17:03 2014 - [info] MHA::MasterMonitor version 0.55.
Sun Sep 28 10:17:04 2014 - [info] Dead Servers:
Sun Sep 28 10:17:04 2014 - [info] Alive Servers:
Sun Sep 28 10:17:04 2014 - [info]   192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:04 2014 - [info]   192.168.153.151(192.168.153.151:3306)
Sun Sep 28 10:17:04 2014 - [info]   192.168.153.152(192.168.153.152:3306)
Sun Sep 28 10:17:04 2014 - [info] Alive Slaves:
Sun Sep 28 10:17:04 2014 - [info]   192.168.153.151(192.168.153.151:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:04 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:04 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Sep 28 10:17:04 2014 - [info]   192.168.153.152(192.168.153.152:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:04 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:04 2014 - [info]     Not candidate for the new Master (no_master is set)
Sun Sep 28 10:17:04 2014 - [info] Current Alive Master: 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:04 2014 - [info] Checking slave configurations..
Sun Sep 28 10:17:04 2014 - [info]  read_only=1 is not set on slave 192.168.153.151(192.168.153.151:3306).
Sun Sep 28 10:17:04 2014 - [warning]  relay_log_purge=0 is not set on slave 192.168.153.151(192.168.153.151:3306).
Sun Sep 28 10:17:04 2014 - [info]  read_only=1 is not set on slave 192.168.153.152(192.168.153.152:3306).
Sun Sep 28 10:17:04 2014 - [warning]  relay_log_purge=0 is not set on slave 192.168.153.152(192.168.153.152:3306).
Sun Sep 28 10:17:04 2014 - [info] Checking replication filtering settings..
Sun Sep 28 10:17:04 2014 - [info]  binlog_do_db= , binlog_ignore_db=
Sun Sep 28 10:17:04 2014 - [info]  Replication filtering check ok.
#測試ssh連接是否成功
Sun Sep 28 10:17:04 2014 - [info] Starting SSH connection tests..
Sun Sep 28 10:17:07 2014 - [info] All SSH connection tests passed successfully.
#測試MHA node是否可用
Sun Sep 28 10:17:07 2014 - [info] Checking MHA Node version..
Sun Sep 28 10:17:08 2014 - [info]  Version check ok.
Sun Sep 28 10:17:08 2014 - [info] Checking SSH publickey authentication settings on the current master..
Sun Sep 28 10:17:08 2014 - [info] HealthCheck: SSH to 192.168.153.150 is reachable.
Sun Sep 28 10:17:09 2014 - [info] Master MHA Node version is 0.54.
Sun Sep 28 10:17:09 2014 - [info] Checking recovery script configurations on the current master..
Sun Sep 28 10:17:09 2014 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql,/var/log/mysql --output_file=/var/log/masterha/app1/save_binary_logs_test --manager_version=0.55 --start_file=mysql-bin.000026
#創建MHA日志目錄
Sun Sep 28 10:17:09 2014 - [info]   Connecting to [email protected](192.168.153.150)..
  Creating /var/log/masterha/app1 if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /var/lib/mysql, up to mysql-bin.000026
Sun Sep 28 10:17:09 2014 - [info] Master setting check done.
#master的檢查到此完畢

#開始檢查slave的差異日志應用權限
Sun Sep 28 10:17:09 2014 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sun Sep 28 10:17:09 2014 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.153.151 --slave_ip=192.168.153.151 --slave_port=3306 --workdir=/var/log/masterha/app1 --target_version=5.6.20-68.0-log --manager_version=0.55 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Sun Sep 28 10:17:09 2014 - [info]   Connecting to [email protected](192.168.153.151:22)..
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to mysql-relay.000003
    Temporary relay log file is /var/lib/mysql/mysql-relay.000003
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sun Sep 28 10:17:10 2014 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.153.152 --slave_ip=192.168.153.152 --slave_port=3306 --workdir=/var/log/masterha/app1 --target_version=5.6.20-68.0-log --manager_version=0.55 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Sun Sep 28 10:17:10 2014 - [info]   Connecting to [email protected](192.168.153.152:22)..
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to mysql-relay.000003
    Temporary relay log file is /var/lib/mysql/mysql-relay.000003
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sun Sep 28 10:17:11 2014 - [info] Slaves settings check done.
Sun Sep 28 10:17:11 2014 - [info]
192.168.153.150 (current master)
 +--192.168.153.151
 +--192.168.153.152

#調試master_ip_failover_script
Sun Sep 28 10:17:11 2014 - [warning] master_ip_failover_script is not defined.
#調試shutdown_script
Sun Sep 28 10:17:11 2014 - [info] Checking shutdown script status:
Sun Sep 28 10:17:11 2014 - [info]   /opt/master_ip_failover.sh --command=status --ssh_user=root --host=192.168.153.150 --ip=192.168.153.150
Sun Sep 28 10:17:11 2014 - [info]  OK.
Sun Sep 28 10:17:11 2014 - [info] Set master ping interval 3 seconds.
#設置二次檢查的主機
Sun Sep 28 10:17:11 2014 - [info] Set secondary check script: masterha_secondary_check -s 192.168.153.151 -s 192.168.153.152
Sun Sep 28 10:17:11 2014 - [info] Starting ping health check on 192.168.153.150(192.168.153.150:3306)..

#MHA啟動完畢,進入監測狀態
Sun Sep 28 10:17:11 2014 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

#監測DB1服務器掛了
Sun Sep 28 10:17:41 2014 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)

#通過定義的二次監測,確認master是否掛了
Sun Sep 28 10:17:41 2014 - [info] Executing seconary network check script: masterha_secondary_check -s 192.168.153.151 -s 192.168.153.152  --user=root  --master_host=192.168.153.150  --master_ip=192.168.153.150  --master_port=3306
Sun Sep 28 10:17:41 2014 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql,/var/log/mysql --output_file=/var/log/masterha/app1/save_binary_logs_test --manager_version=0.55 --binlog_prefix=mysql-bin
Monitoring server 192.168.153.151 is reachable, Master is not reachable from 192.168.153.151. OK.
Sun Sep 28 10:17:42 2014 - [info] HealthCheck: SSH to 192.168.153.150 is reachable.
Monitoring server 192.168.153.152 is reachable, Master is not reachable from 192.168.153.152. OK.

#確認master掛了,開始進入failover流程
#再試嘗試連接master和master的ssh
Sun Sep 28 10:17:42 2014 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Sun Sep 28 10:17:44 2014 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sun Sep 28 10:17:44 2014 - [warning] Connection failed 1 time(s)..
Sun Sep 28 10:17:47 2014 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sun Sep 28 10:17:47 2014 - [warning] Connection failed 2 time(s)..
Sun Sep 28 10:17:50 2014 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sun Sep 28 10:17:50 2014 - [warning] Connection failed 3 time(s)..
Sun Sep 28 10:17:50 2014 - [warning] Master is not reachable from health checker!
Sun Sep 28 10:17:50 2014 - [warning] Master 192.168.153.150(192.168.153.150:3306) is not reachable!
Sun Sep 28 10:17:50 2014 - [warning] SSH is reachable.
#通過配置文件,監測其他slave的狀態
Sun Sep 28 10:17:50 2014 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/app1.cnf again, and trying to connect to all servers to check server status..
Sun Sep 28 10:17:50 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Sep 28 10:17:50 2014 - [info] Reading application default configurations from /etc/app1.cnf..
Sun Sep 28 10:17:50 2014 - [info] Reading server configurations from /etc/app1.cnf..
Sun Sep 28 10:17:50 2014 - [info] Dead Servers:
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:50 2014 - [info] Alive Servers:
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.151(192.168.153.151:3306)
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.152(192.168.153.152:3306)
Sun Sep 28 10:17:50 2014 - [info] Alive Slaves:
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.151(192.168.153.151:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:50 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:50 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.152(192.168.153.152:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:50 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:50 2014 - [info]     Not candidate for the new Master (no_master is set)
#再次監測slave的配置是否有變化,是否符合failover條件
Sun Sep 28 10:17:50 2014 - [info] Checking slave configurations..
Sun Sep 28 10:17:50 2014 - [info]  read_only=1 is not set on slave 192.168.153.151(192.168.153.151:3306).
Sun Sep 28 10:17:50 2014 - [warning]  relay_log_purge=0 is not set on slave 192.168.153.151(192.168.153.151:3306).
Sun Sep 28 10:17:50 2014 - [info]  read_only=1 is not set on slave 192.168.153.152(192.168.153.152:3306).
Sun Sep 28 10:17:50 2014 - [warning]  relay_log_purge=0 is not set on slave 192.168.153.152(192.168.153.152:3306).
Sun Sep 28 10:17:50 2014 - [info] Checking replication filtering settings..
Sun Sep 28 10:17:50 2014 - [info]  Replication filtering check ok.
Sun Sep 28 10:17:50 2014 - [info] Master is down!
Sun Sep 28 10:17:50 2014 - [info] Terminating monitoring script.
Sun Sep 28 10:17:50 2014 - [info] Got exit code 20 (Master dead).
Sun Sep 28 10:17:50 2014 - [info] MHA::MasterFailover version 0.55.
#正式開始failover
Sun Sep 28 10:17:50 2014 - [info] Starting master failover.
Sun Sep 28 10:17:50 2014 - [info]

#再次對slave配置做檢查
Sun Sep 28 10:17:50 2014 - [info] * Phase 1: Configuration Check Phase..
Sun Sep 28 10:17:50 2014 - [info]
Sun Sep 28 10:17:50 2014 - [info] Dead Servers:
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:50 2014 - [info] Checking master reachability via mysql(double check)..
Sun Sep 28 10:17:50 2014 - [info]  ok.
Sun Sep 28 10:17:50 2014 - [info] Alive Servers:
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.151(192.168.153.151:3306)
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.152(192.168.153.152:3306)
Sun Sep 28 10:17:50 2014 - [info] Alive Slaves:
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.151(192.168.153.151:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:50 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:50 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.152(192.168.153.152:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:50 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:50 2014 - [info]     Not candidate for the new Master (no_master is set)
Sun Sep 28 10:17:50 2014 - [info] ** Phase 1: Configuration Check Phase completed.
Sun Sep 28 10:17:50 2014 - [info]

#對原Master做master_ip_failover_script和shutdown_script的操作
Sun Sep 28 10:17:50 2014 - [info] * Phase 2: Dead Master Shutdown Phase..
Sun Sep 28 10:17:50 2014 - [info]
Sun Sep 28 10:17:50 2014 - [info] Forcing shutdown so that applications never connect to the current master..
Sun Sep 28 10:17:50 2014 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master ip address.
Sun Sep 28 10:17:50 2014 - [info] Executing SHUTDOWN script:
Sun Sep 28 10:17:50 2014 - [info]   /opt/master_ip_failover.sh --command=stopssh --ssh_user=root  --host=192.168.153.150  --ip=192.168.153.150  --port=3306
Sun Sep 28 10:17:51 2014 - [info]  Power off done.

Sun Sep 28 10:17:51 2014 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Sun Sep 28 10:17:51 2014 - [info]

#開始差異日志的恢復:獲取slave最後得到的binlog位置
Sun Sep 28 10:17:51 2014 - [info] * Phase 3: Master Recovery Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] The latest binary log file/position on all slaves is mysql-bin.000026:120
Sun Sep 28 10:17:51 2014 - [info] Latest slaves (Slaves that received relay log files to the latest):
Sun Sep 28 10:17:51 2014 - [info]   192.168.153.151(192.168.153.151:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:51 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:51 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Sep 28 10:17:51 2014 - [info]   192.168.153.152(192.168.153.152:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:51 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:51 2014 - [info]     Not candidate for the new Master (no_master is set)
Sun Sep 28 10:17:51 2014 - [info] The oldest binary log file/position on all slaves is mysql-bin.000026:120
Sun Sep 28 10:17:51 2014 - [info] Oldest slaves:
Sun Sep 28 10:17:51 2014 - [info]   192.168.153.151(192.168.153.151:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:51 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:51 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Sep 28 10:17:51 2014 - [info]   192.168.153.152(192.168.153.152:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:51 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:51 2014 - [info]     Not candidate for the new Master (no_master is set)
Sun Sep 28 10:17:51 2014 - [info]
#獲取原master的binlog日志
Sun Sep 28 10:17:51 2014 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [warning] Dead Master is not SSH reachable. Could not save it's binlogs. Transactions that were not sent to the latest slave (Read_Master_Log_Pos to the tail of the dead master's binlog) were lost.
Sun Sep 28 10:17:51 2014 - [info]

#確定新的master
Sun Sep 28 10:17:51 2014 - [info] * Phase 3.3: Determining New Master Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Sun Sep 28 10:17:51 2014 - [info] All slaves received relay logs to the same position. No need to resync each other.
Sun Sep 28 10:17:51 2014 - [info] Searching new master from slaves..
Sun Sep 28 10:17:51 2014 - [info]  Candidate masters from the configuration file:
Sun Sep 28 10:17:51 2014 - [info]   192.168.153.151(192.168.153.151:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:51 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:51 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Sep 28 10:17:51 2014 - [info]  Non-candidate masters:
Sun Sep 28 10:17:51 2014 - [info]   192.168.153.152(192.168.153.152:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:51 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:51 2014 - [info]     Not candidate for the new Master (no_master is set)
Sun Sep 28 10:17:51 2014 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Sun Sep 28 10:17:51 2014 - [info] New master is 192.168.153.151(192.168.153.151:3306)
Sun Sep 28 10:17:51 2014 - [info] Starting master failover..
Sun Sep 28 10:17:51 2014 - [info]
From:
192.168.153.150 (current master)
 +--192.168.153.151
 +--192.168.153.152

To:
192.168.153.151 (new master)
 +--192.168.153.152
Sun Sep 28 10:17:51 2014 - [info]
#在new master上應用差異的binlog日志
Sun Sep 28 10:17:51 2014 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] * Phase 3.4: Master Log Apply Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Sun Sep 28 10:17:51 2014 - [info] Starting recovery on 192.168.153.151(192.168.153.151:3306)..
Sun Sep 28 10:17:51 2014 - [info]  This server has all relay logs. Waiting all logs to be applied..
Sun Sep 28 10:17:51 2014 - [info]   done.
Sun Sep 28 10:17:51 2014 - [info]  All relay logs were successfully applied.
#獲取new master的binlog位置。
Sun Sep 28 10:17:51 2014 - [info] Getting new master's binlog name and position..
Sun Sep 28 10:17:51 2014 - [info]  mysql-bin.000016:688
Sun Sep 28 10:17:51 2014 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.153.151', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000016', MASTER_LOG_POS=688, MASTER_USER='slave', MASTER_PASSWORD='xxx';
#如果有master_ip_failover_script,那麼給new master設置VIP
Sun Sep 28 10:17:51 2014 - [warning] master_ip_failover_script is not set. Skipping taking over new master ip address.
Sun Sep 28 10:17:51 2014 - [info] ** Finished master recovery successfully.
Sun Sep 28 10:17:51 2014 - [info] * Phase 3: Master Recovery Phase completed.
Sun Sep 28 10:17:51 2014 - [info]

#開始恢復其他slave,也是從原master的binlog對比來做恢復
Sun Sep 28 10:17:51 2014 - [info] * Phase 4: Slaves Recovery Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] -- Slave diff file generation on host 192.168.153.152(192.168.153.152:3306) started, pid: 8387. Check tmp log /var/log/masterha/app1/192.168.153.152_3306_20140928101750.log if it takes time..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] Log messages from 192.168.153.152 ...
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Sun Sep 28 10:17:51 2014 - [info] End of log messages from 192.168.153.152.
Sun Sep 28 10:17:51 2014 - [info] -- 192.168.153.152(192.168.153.152:3306) has the latest relay log events.
Sun Sep 28 10:17:51 2014 - [info] Generating relay diff files from the latest slave succeeded.
Sun Sep 28 10:17:51 2014 - [info]

Sun Sep 28 10:17:51 2014 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] -- Slave recovery on host 192.168.153.152(192.168.153.152:3306) started, pid: 8389. Check tmp log /var/log/masterha/app1/192.168.153.152_3306_20140928101750.log if it takes time..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] Log messages from 192.168.153.152 ...
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] Starting recovery on 192.168.153.152(192.168.153.152:3306)..
Sun Sep 28 10:17:51 2014 - [info]  This server has all relay logs. Waiting all logs to be applied..
Sun Sep 28 10:17:51 2014 - [info]   done.
Sun Sep 28 10:17:51 2014 - [info]  All relay logs were successfully applied.
#差異日志應用完成以後,切換所有slave到new master。
Sun Sep 28 10:17:51 2014 - [info]  Resetting slave 192.168.153.152(192.168.153.152:3306) and starting replication from the new master 192.168.153.151(192.168.153.151:3306)..
Sun Sep 28 10:17:51 2014 - [info]  Executed CHANGE MASTER.
Sun Sep 28 10:17:51 2014 - [info]  Slave started.
Sun Sep 28 10:17:51 2014 - [info] End of log messages from 192.168.153.152.
Sun Sep 28 10:17:51 2014 - [info] -- Slave recovery on host 192.168.153.152(192.168.153.152:3306) succeeded.
Sun Sep 28 10:17:51 2014 - [info] All new slave servers recovered successfully.
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] * Phase 5: New master cleanup phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] Resetting slave info on the new master..
Sun Sep 28 10:17:51 2014 - [info]  192.168.153.151: Resetting slave info succeeded.
Sun Sep 28 10:17:51 2014 - [info] Master failover to 192.168.153.151(192.168.153.151:3306) completed successfully.
Sun Sep 28 10:17:51 2014 - [info]

#failover操作完成,生成failover報告
----- Failover Report -----

app1: MySQL Master failover 192.168.153.150 to 192.168.153.151 succeeded

Master 192.168.153.150 is down!

Check MHA Manager logs at localhost.localdomain:/var/log/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.
Power off 192.168.153.150.
The latest slave 192.168.153.151(192.168.153.151:3306) has all relay logs for recovery.
Selected 192.168.153.151 as a new master.
192.168.153.151: OK: Applying all logs succeeded.
192.168.153.152: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.153.152: OK: Applying all logs succeeded. Slave started, replicating from 192.168.153.151.
192.168.153.151: Resetting slave info succeeded.
Master failover to 192.168.153.151(192.168.153.151:3306) completed

copyright © 萬盛學電腦網 all rights reserved