Solved

Tar/Gzip backup locks up server...accidentally deletes .MYI file

Posted on 2012-03-27
9
584 Views
Last Modified: 2012-04-13
I have a large MySQL master/slave database running on two Solaris 10 x86 servers.

Once a day, I stop the slave server to make a backup. Because of the sheer size of the tables (and the amount of time it takes), I chose NOT to use mysqldump but instead tar/gzip all of the DB directories into a new folder, where I can restore them if needed.

Twice in the last seven months I have run into a bizarre problem: at a point AFTER the slave has stopped and the tar/gzip has begun, the entire server stops responding (SSH, console login) - though it is still running. This then requires a hard reboot.

There are three smaller tables used to temporarily store data. These tables can be empty from time to time. After the reboot, restarting the slave fails because one of these three tables has an .MYI file of 0K. I have to recreate the table from a backup to get the slave server restarted.

/var/adm/messages and /var/log/syslog is empty. So I have very little to troubleshoot with.

Disk space is not an issue.

#!/usr/bin/perl

#disconnect MySQL
system("/etc/init.d/mysql stop");

my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);

#relocate and tar proper files...
my $timestamp= time;
system("cd /var/lib/mysql; tar cEf - ./* mysql-relay-log.info mysql-master.info | gzip > /BK/daily-${timestamp}.tar.gz");

#restart MySQL
system("/etc/init.d/mysql start");
0
Comment
Question by:shambright
  • 4
  • 3
  • 2
9 Comments
 
LVL 40

Assisted Solution

by:jlevie
jlevie earned 250 total points
ID: 37775016
Since I can't think of a reason that the tar/gzip should lock up the server to the point that a reboot would be necessary (though it could make response really slow), my suspicion is that MySQL may still be running when the tar starts. Try manually running through the steps, but check (with ps) to make sure that all of the MySQL processes have exited before starting the backup.
0
 
LVL 51

Expert Comment

by:ahoffmann
ID: 37775314
first ensure that your "mysql stop" call succedds and realy stops the server
second: is /BK a mounted filesystem (not on the same host)?
0
 

Author Comment

by:shambright
ID: 37783073
jlevie - this is meant to be automated. I am not a fast enough typist to duplicate a script :)

ahoffman - /BK is a local directory

Added this:

system("/etc/init.d/mysql stop");
my ($exists,$tries);
$tries = 0;
PAUSE_FOR_EXIT:
sleep(3);
$exists = kill 0, `head -1 /var/run/mysql.pid`;
++$tries;
if (($tries > 120)&&($exists)) { exit; } #failed... Nagios will pick up failure
if ($exists) { goto PAUSE_FOR_EXIT; }

Open in new window


If it is that MySQL did not shut down before tar started, then this should solve it.

Any thoughts on why nothing showed up in the logs regarding being locked completely out?
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 51

Expert Comment

by:ahoffmann
ID: 37783313
did you try also:
  kill -9 ...
0
 

Author Comment

by:shambright
ID: 37787923
I would need access to the server for that :)
0
 
LVL 51

Expert Comment

by:ahoffmann
ID: 37788051
somehow, but you already use kill in the php script you posted
0
 
LVL 40

Expert Comment

by:jlevie
ID: 37789853
I understand that you need it to be automated. The point of doing it manually (and making sure that all MySql processes have exited) is to figure out if the MySql exit is the problem or if something else is the problem.
0
 
LVL 51

Accepted Solution

by:
ahoffmann earned 250 total points
ID: 37790162
my comment about "kill" wasn't clear enough, sorry
I agree with jlevie that you forst test manually with
   /etc/init.d/mysql stop
check the messages and the status
then try to start your
    tar whatever....

the important thing is if mysql stops properly, if not try with
   kill <--pid-->
if that does not help, try with
   kill -9 <--pid-->

also, I highly recommend not to rely on /var/run/mysql.pid to contain the correct pid,
you better get the real actual pid using ps like for example
   ps ax|awk '/mysqld /{print $1}'
0
 

Author Closing Comment

by:shambright
ID: 37845004
It hasn't happened again since -- but EE wants this matter closed. Thanks for your input.
0

Featured Post

Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Hello fellow BSD lovers, I've created a patch process for patching openjdk6 for BSD (FreeBSD specifically), although I tried to keep all BSD versions in mind when creating my patch. Welcome to OpenJDK6 on BSD First let me start with a little …
I have been running these systems for a few years now and I am just very happy with them.   I just wanted to share the manual that I have created for upgrades and other things.  Oooh yes! FreeBSD makes me happy (as a server), no maintenance and I al…
Learn several ways to interact with files and get file information from the bash shell. ls lists the contents of a directory: Using the -a flag displays hidden files: Using the -l flag formats the output in a long list: The file command gives us mor…
In a previous video, we went over how to export a DynamoDB table into Amazon S3.  In this video, we show how to load the export from S3 into a DynamoDB table.

679 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question