• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 605
  • Last Modified:

Tar/Gzip backup locks up server...accidentally deletes .MYI file

I have a large MySQL master/slave database running on two Solaris 10 x86 servers.

Once a day, I stop the slave server to make a backup. Because of the sheer size of the tables (and the amount of time it takes), I chose NOT to use mysqldump but instead tar/gzip all of the DB directories into a new folder, where I can restore them if needed.

Twice in the last seven months I have run into a bizarre problem: at a point AFTER the slave has stopped and the tar/gzip has begun, the entire server stops responding (SSH, console login) - though it is still running. This then requires a hard reboot.

There are three smaller tables used to temporarily store data. These tables can be empty from time to time. After the reboot, restarting the slave fails because one of these three tables has an .MYI file of 0K. I have to recreate the table from a backup to get the slave server restarted.

/var/adm/messages and /var/log/syslog is empty. So I have very little to troubleshoot with.

Disk space is not an issue.

#!/usr/bin/perl

#disconnect MySQL
system("/etc/init.d/mysql stop");

my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);

#relocate and tar proper files...
my $timestamp= time;
system("cd /var/lib/mysql; tar cEf - ./* mysql-relay-log.info mysql-master.info | gzip > /BK/daily-${timestamp}.tar.gz");

#restart MySQL
system("/etc/init.d/mysql start");
0
shambright
Asked:
shambright
  • 4
  • 3
  • 2
2 Solutions
 
jlevieCommented:
Since I can't think of a reason that the tar/gzip should lock up the server to the point that a reboot would be necessary (though it could make response really slow), my suspicion is that MySQL may still be running when the tar starts. Try manually running through the steps, but check (with ps) to make sure that all of the MySQL processes have exited before starting the backup.
0
 
ahoffmannCommented:
first ensure that your "mysql stop" call succedds and realy stops the server
second: is /BK a mounted filesystem (not on the same host)?
0
 
shambrightAuthor Commented:
jlevie - this is meant to be automated. I am not a fast enough typist to duplicate a script :)

ahoffman - /BK is a local directory

Added this:

system("/etc/init.d/mysql stop");
my ($exists,$tries);
$tries = 0;
PAUSE_FOR_EXIT:
sleep(3);
$exists = kill 0, `head -1 /var/run/mysql.pid`;
++$tries;
if (($tries > 120)&&($exists)) { exit; } #failed... Nagios will pick up failure
if ($exists) { goto PAUSE_FOR_EXIT; }

Open in new window


If it is that MySQL did not shut down before tar started, then this should solve it.

Any thoughts on why nothing showed up in the logs regarding being locked completely out?
0
Keep up with what's happening at Experts Exchange!

Sign up to receive Decoded, a new monthly digest with product updates, feature release info, continuing education opportunities, and more.

 
ahoffmannCommented:
did you try also:
  kill -9 ...
0
 
shambrightAuthor Commented:
I would need access to the server for that :)
0
 
ahoffmannCommented:
somehow, but you already use kill in the php script you posted
0
 
jlevieCommented:
I understand that you need it to be automated. The point of doing it manually (and making sure that all MySql processes have exited) is to figure out if the MySql exit is the problem or if something else is the problem.
0
 
ahoffmannCommented:
my comment about "kill" wasn't clear enough, sorry
I agree with jlevie that you forst test manually with
   /etc/init.d/mysql stop
check the messages and the status
then try to start your
    tar whatever....

the important thing is if mysql stops properly, if not try with
   kill <--pid-->
if that does not help, try with
   kill -9 <--pid-->

also, I highly recommend not to rely on /var/run/mysql.pid to contain the correct pid,
you better get the real actual pid using ps like for example
   ps ax|awk '/mysqld /{print $1}'
0
 
shambrightAuthor Commented:
It hasn't happened again since -- but EE wants this matter closed. Thanks for your input.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Get expert help—faster!

Need expert help—fast? Use the Help Bell for personalized assistance getting answers to your important questions.

  • 4
  • 3
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now