Replication slave lag monitoring using heartbeat and windows batch scripts.

AID: 1423
  • Status: Published

2890 points

  • BytheGhost_k8
  • TypeTutorial
  • Posted on2009-09-01 at 04:55:54
Awards
  • Community Pick
"Show Slave Status" command has a last column "Seconds_Behind_Master", which gives us idea about how much time slave is lagging behind master. It is an important to be considered parameter in monitoring and maintaining replication.

This article explains us a way to monitor replication slave lag time. It also includes a sample batch scripts to automate the monitoring process, makes it easy to understand.

Whats wrong with "Seconds_Behind_Master":

Show Slave Status command; does shows us Seconds_Behind_Master.

Now Documentation says: The field measures the time difference in seconds between the slave SQL thread and the slave I/O thread. If the network is slow, this is not a good approximation; the slave SQL thread may quite often be caught up with the slow-reading slave I/O thread, so Seconds_Behind_Master often shows a value of 0, even if the I/O thread is late compared to the master. In other words, this column is useful only for fast networks.

Now this has been discussed at a lot of places for a lot of times, the solution is to periodically insert a row into a "heartbeat" table on the master server. Let it get replicated on slave and check the "heartbeat"s on the slave. This will surely explain you replication time and the slave behind master.

So whats the idea?

Consider descriptions of following two mysql functions:

current_timestamp() : The function when used in a query replicates its output same across the slaves. Thus even if the query is executed at a later time on slave, the value stored would be same as what is there in the master.

sysdate() : The behavior of this function is different from the above. It stores the current output of the function at the slave when executed. Thus the output produced could be different from what was produced at the master.

Considering behavior, if we have both values inserted on master server; we will get it replicated on slave. But for sysdate() function, the time will be of slave's and that will help us calculating the slave lag comparing with current_timestamp() value.

Implementing Slave Behind Master Monitoring:

Continuing with the article: http://www.experts-exchange.com/articles/Database/Miscellaneous/Quick-Multi-MySQL-Server-Installation-with-Master-Master-Replication-on-Same-Windows-Box.html I've tried to implement replication slave lag monitoring through "heartbeat" method.

Here for making an example I've considered MySQL Server with port 3307 as Master and Server with port 3306 (Default) as Slave for monitoring Slave Lag.

Create table on Master MySQL Server:
create table heartbeat
(
  master_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  slave_time TIMESTAMP NOT NULL
) ENGINE=MyISAM;
                                    
1:
2:
3:
4:
5:

Select allOpen in new window



Periodical value insertion - Generate Heartbeat:
insert into heartbeat(slave_time) values(SYSDATE());
                                    
1:

Select allOpen in new window



One more case is, when slave is intentionally kept behind master. We can surely add up that time and make according insertion.

Insert Into heartbeat(slave_time) 
Values( from_unixtime((unix_timestamp(sysdate()) + time_to_sec('HH:MM:SS'))) ;
                                    
1:
2:

Select allOpen in new window


Here, HH:MM:SS is slave's known time of lagging behind master.


We have following queries to monitor slave-lag:

First query will tell us the time difference of slave and master query execution time. Which is actually caused by the functions explained above.

Select master_time, timediff(slave_time, master_time) 
From heartbeat 
Where DATE(master_time) = DATE(NOW()) 
Order By master_time;
                                    
1:
2:
3:
4:

Select allOpen in new window



This second query will tell us the time difference of last query execution time from replication to now.

select timediff(NOW(), max(master_time)) from heartbeat;
                                    
1:

Select allOpen in new window




Batch Scripts For Windows:

To understand above setup and making it little automated, I wrote couple of batch scripts.

Following batch script will auto insert one row into heartbeat table on per minute basis on Master.

##insert-master.bat
:begin
@echo off
set StartTime=%TIME%
set StartMin=%StartTime:~3,2%
:start
set EndTime=%TIME%
set EndMin=%EndTime:~3,2%
set /a Hour_Diff=StartMin - EndMin >nul
if %Hour_Diff%==0 (
goto start )
echo Beating...
mysql -uroot -pXXXX -P3307 -e "use master;insert into heartbeat(slave_time) values(SYSDATE());"
goto begin
pause
                                    
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:

Select allOpen in new window



Similar batch created to monitor the lag time on Slave MySQL Server.

##monitor-slave.bat
@echo off
echo Observing slave delay - per minute basis
echo SBM -  Slave Behind Master / Last Query Executed.
echo ETD - Execution Time Difference
echo ----------------------------------
echo SBM  -  ETD	- Slave Status
echo ----------------------------------
:begin
set StartTime=%TIME%
set StartMin=%StartTime:~3,2%
:start
set EndTime=%TIME%
set EndMin=%EndTime:~3,2%

set /a Hour_Diff=StartMin - EndMin >nul
if %Hour_Diff%==0 (
	goto start )
FOR /F "skip=1 tokens=*" %%I IN ('mysql -uroot -pXXXX -e "use master;select TIME_TO_SEC(timediff(NOW(), max(master_time)) ) last_query_time, second(timediff(max(slave_time), max(master_time))) master_slave_difference from heartbeat where DATE(master_time) = DATE(NOW()) order by master_time desc;"') DO set var=%%I%
@echo %var%
FOR /F "tokens=11 delims='|	'" %%G IN ('mysql -uroot -pXXXX -e "show slave status;"') do @echo			%%G 
FOR /F "tokens=12 delims='|	'" %%G IN ('mysql -uroot -pXXXX -e "show slave status;"') do @echo			%%G 

goto begin
pause
                                    
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:

Select allOpen in new window



PS: Set passwords accordingly.

Using sample batch scripts we will be able to understand the replication lag.
ETD, the Execution Time Difference from master to slave shows us how long slave takes to replicate once master has executed the statement. By query we can understand it is difference between current_timestamp() and sysdate().
SBM, Slave Behind Master, points us the time in seconds last query executed on slave from master.
The monitoring script will also tell you the IO and SQL thread status after each minute.


References:
http://sql.dzone.com/articles/better-mysql-replication
http://dev.mysql.com/doc/refman/5.0/en/show-slave-status.html
Asked On
2009-09-01 at 04:55:54ID1423
Tags

mysql

,

windows

,

replication

,

slave lag

,

batch

Topic

MySQL Server

Views
1758

Comments

Add your Comment

Please Sign up or Log in to comment on this article.

Join Experts Exchange Today

Gain Access to all our Tech Resources

Get personalized answers

Ask unlimited questions

Access Proven Solutions

Search 3.2 million solutions

Read In-Depth How-To Guides

1000+ articles, demos, & tips

Watch Step by Step Tutorials

Learn direct from top tech pros

And Much More!

Your complete tech resource

See Plans and Pricing

30-day free trial. Register in 60 seconds.

Loading Advertisement...

Top MySQL Server Experts

  1. johanntagle

    286,814

    Guru

    6,000 points yesterday

    Profile
    Rank: Sage
  2. Ray_Paseur

    216,557

    Guru

    0 points yesterday

    Profile
    Rank: Savant
  3. DaveBaldwin

    119,595

    Master

    1,400 points yesterday

    Profile
    Rank: Genius
  4. angelIII

    61,340

    Master

    0 points yesterday

    Profile
    Rank: Elite
  5. mwvisa1

    57,185

    Master

    30 points yesterday

    Profile
    Rank: Genius
  6. HainKurt

    41,850

    0 points yesterday

    Profile
    Rank: Genius
  7. ralmada

    39,250

    0 points yesterday

    Profile
    Rank: Genius
  8. Roads_Roads

    33,080

    0 points yesterday

    Profile
    Rank: Genius
  9. arnold

    29,812

    0 points yesterday

    Profile
    Rank: Genius
  10. theGhost_k8

    29,785

    0 points yesterday

    Profile
    Rank: Sage
  11. Kdo

    29,682

    0 points yesterday

    Profile
    Rank: Genius
  12. bportlock

    26,604

    0 points yesterday

    Profile
    Rank: Genius
  13. jason1178

    23,574

    0 points yesterday

    Profile
    Rank: Genius
  14. maeltar

    23,236

    0 points yesterday

    Profile
    Rank: Guru
  15. StingRaY

    21,500

    0 points yesterday

    Profile
    Rank: Wizard
  16. smadeira

    19,968

    0 points yesterday

    Profile
    Rank: Wizard
  17. fundacionrts

    18,200

    0 points yesterday

    Profile
    Rank: Master
  18. gr8gonzo

    17,019

    0 points yesterday

    Profile
    Rank: Sage
  19. ChrisStanyon

    16,964

    0 points yesterday

    Profile
    Rank: Sage
  20. pratima_mcs

    16,614

    0 points yesterday

    Profile
    Rank: Genius
  21. TempDBA

    16,400

    0 points yesterday

    Profile
    Rank: Sage
  22. Sharath_123

    16,268

    0 points yesterday

    Profile
    Rank: Genius
  23. for_yan

    16,000

    0 points yesterday

    Profile
    Rank: Genius
  24. matthewspatrick

    15,800

    0 points yesterday

    Profile
    Rank: Savant
  25. AielloJ

    13,732

    0 points yesterday

    Profile
    Rank: Wizard

Hall Of Fame