?
Solved

MySQL hanging - CPU usage down to 0

Posted on 2011-10-18
32
Medium Priority
?
318 Views
Last Modified: 2012-06-27
Hi,

I had a problem previously with MySQL hanging with CPU usage down to 0.  With teh fantastic help on ExpertsExchange of mwvisa1 he managed to identify it as an issue with query cache set too high - please see http://www.experts-exchange.com/Database/MySQL/Q_27283597.html

Problem seemed to be fully resolved by setting query_cache_size=0 and it's been running fine now for about 6 weeks.

Now suddenly out of the blue, it has started happening again!! Arrrgggh! The setting hasn't changed and noting else of note has changed either.

I'm at a real loss here, I though we'd nailed this problem and now I don't know where to start!

Any help would be very very welcome

Thanks
Capture.JPG
0
Comment
Question by:cp30
  • 16
  • 15
32 Comments
 
LVL 61

Expert Comment

by:Kevin Cross
ID: 36990185
cp30, it is unfortunate this returned. Do you happen to have a development environment you can upgrade? And guess, first are you able to reproduce the issue in development? If this takes 6 weeks to manifest, then it may take awhile to test, but if it is reproducible then you can upgrade that system to 5.5.16: http://dev.mysql.com/downloads/installer/5.5.html

Just feels like a bug.
0
 

Author Comment

by:cp30
ID: 36990202
Unfortunately due to the environment required to run the system I can not currently afford to implement a dev environment :(

I've tried restarting MySQL and even the server and still no luck.

How straightforward is the upgrade process? Is it easily rolled back?

Thanks
0
 

Author Comment

by:cp30
ID: 36990222
Were many bugs fixed between 5.5.15 and 5.5.16? Just seems I'd have to get lucky that it was caught in that 1 version increment.

Is there anything that could have been filling up for last 6 weeks that I can clear down that you can think of?

Cheers
0
Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

 
LVL 61

Expert Comment

by:Kevin Cross
ID: 36990229
That is the what is curious that it ran six weeks straight fine and all of a sudden is unrecoverable even after a reboot. It sounds like data related. What is the current size of databases and load?
0
 

Author Comment

by:cp30
ID: 36990257
The load is always busy but always much  busier at weekends so should be fine now.  

Database size is 190MB.  Lots of data being loaded in but it gets removed regularly as it becomes stale so will never grow that big.

When it stops responding, the RDP to the server becomes unresponsive if you try and access windows explorer and the small MS SQL server that runs on the same server is also unresponsive during the period of cpu at 0.
0
 
LVL 61

Expert Comment

by:Kevin Cross
ID: 36990262
As far as what changed, the first bit is interesting but for the Commercial version. If I remember correctly, you are using the Community Server. http://dev.mysql.com/doc/refman/5.5/en/mysql-nutshell.html
Although, if the connections are the issue, then a full reboot of the server should have fixed the issue even temporarily. So something has changed in data or health of server, we just need to find it. Hopefully some others will come along. If not, I will see if I can get a few folks to stop by.
0
 
LVL 61

Expert Comment

by:Kevin Cross
ID: 36990269
Okay. I just saw your last post. Possibly this is a Windows issue. For some reason I remember something about RDP freezes, but that was a while back. We had a mix of Windows 2000 and Windows 2003 servers at the time and it ended up being a Windows bug that had to be hotfixed. There were some other issues with combinations of Antivirus and backup software, so may be good to list environment again for new participants and to refresh my memory.
0
 
LVL 61

Accepted Solution

by:
Kevin Cross earned 2000 total points
ID: 36990282
From the Windows side of things, check for failing hardware and normal troubleshooting paths. For example, there is a Warning in the System Logs of the screenshot you showed. What does that say? What are other messages being traced during times of issue?
0
 

Author Comment

by:cp30
ID: 36990291
Hi,  RDP doesn't actually freeze, it is responsive, just if I try and access windows explorer it says not responding until the freeze is over.  I can still drag a window via RDP etc during a freeze


Dedicate Server
6 core CPU
8GB ram
Windows Web Server 2008R2 (not running IIS)
MySQL 5.5.15
Microsoft SQL Server 2008R2 Express
Backup: SyncBackPro


Yes, I agree, it does kind of suggest that it may be a windows issue, but the fact that we fixed it last time so conclusively with exact same symptoms by changing the query cache seems to suggest that MySQL was causing it then so seems a big coincidence that windows should now have a similar problem.  
0
 

Author Comment

by:cp30
ID: 36990351
Hi,

Host have offered to swap hard drives out as one fell out of the mirror (could be unrelated but worth a shot) but means me rebuilding from scratch.  

If I install latest version of MySQL am I ok to restore my backup from 5.515 to 5.5.16?

Cheers
0
 
LVL 61

Expert Comment

by:Kevin Cross
ID: 36990379
That should work just fine. They should be able to add one drive, let it stripe working one in mirror, then once the new drive is online pull the other old drive and put other new drive in. But then again, if a clean start is possible, then might be worth it to ensure it isn't some corruption that was showing up based on MySQL load. The fact that the query cache fixed it previously, maybe it was indicative of memory failing versus a hard drive; however, the drive is what failed so could have been an odd combination of going out to memory and needing to write to pagefile which is on disk. A stretch, but would be a logical correlation to memory and hard drive. Guess my point is, I would ask they do a memory test also, especially if you have to be down anyway ...
0
 

Author Comment

by:cp30
ID: 36990481
Wow, there sure does seem to be a disk problem, getting worse by the minute as I try to backup some files.  Just getting what I can, is it the my.ini in program files that includes all the config?  I'm guessing that that and a local backup I have of db should be enough to recreate?

Thanks
0
 
LVL 61

Expert Comment

by:Kevin Cross
ID: 36990504
Yes, my.ini is the configuration file.
0
 
LVL 61

Expert Comment

by:Kevin Cross
ID: 36990509
If you have backups, you should be fine. I am paranoid, so I would probably grab everything in the MySQL data directory (concerned with mysql system database folder/files and your user database folders/files).
0
 

Author Comment

by:cp30
ID: 36990514
Every time I try and grab those files the server hangs then blue screens!!!! This is getting very bad, but I suppose it shows there's obviously some kind of corruption/failure somewhere.
0
 
LVL 61

Expert Comment

by:Kevin Cross
ID: 36990544
Yes. You will need MySQL stopped. And any backup services that may hook into those files. But it may be too late. Grab what you can then. If those files are part of the problem area on the disk, what you save may be highly corrupted, so time to cross fingers and hope the provider gets disks in quickly. Sounds bad. Sorry.
0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 36999046
my guesses would go in this order:
* either some RAM components are bad
* some disk bad sectors
* network card is bad
0
 

Author Comment

by:cp30
ID: 36999073
Hi, both drives have now been replaced and I've completed rebuild of server. Just waiting to see if load cause any repetition of problem. Thanks for help and suggestions, hopefully disks will sort it. Fingers crossed!!!
0
 

Author Comment

by:cp30
ID: 37006713
HI, I'm having a lot of problems since the rebuild, not sure what's causing it, it's using the same .ini config as before so guessing that performance should be the same.

Current problem is that deletes are taking a long time.

I have one table with 20,000 records and I try to delete with something like following

select * FROM odds.tblStaticArbList
WHERE newFlag = 0 and outcomeTypeId = 1;

There was an index on newFlag and outcomeTypeId which I thought should do the trick, but was very slow, so I changed the newFlag incex to include both fields newFlag and outcomeTypeId, but running the following explain ......

explain select * FROM odds.tblMainTable
WHERE newFlag = 0 and outcomeTypeId = 1;

tells me.....

'1', 'SIMPLE', 'tblMainTable', 'ref', 'outcomeTypeId,newFlag', 'outcomeTypeId', '5', 'const', '113266', 'Using where'

So it's having to check 113,266 rows?

That doesn;t seem right as those fields are indexed and there's only 20,000 records in the table at the minute (although changing constantly through deletes/inserts).

Any advice on if this is a problem?

Thanks


0
 
LVL 61

Expert Comment

by:Kevin Cross
ID: 37006756
Sounds like index needs to be rebuilt. Try deleting the index and re-adding.
0
 

Author Comment

by:cp30
ID: 37006805
Hi, I did that a while ago while site was down then it was returning rows=1 but as soon as deletes and inserts started on the table the rows value started going up, is there any reason that it should need to be rebuilt, am I causing that?
0
 
LVL 61

Expert Comment

by:Kevin Cross
ID: 37009724
Maybe still an indication of hardware like memory being out of sorts. a3, have you ever seen this behavior with indexes in MySQL?

I know the order of the columns in the index matter, but it almost appears here that your combined index is not being picked up just the individual ones and it is selecting the one it thinks works best. Have you tried deleting all the indexes and just adding a combo index with newFlag first, then outcomeTypeId?

Did your ISP ever do a full diagnostics on system?

Is it possible for the ISP to move the new mirrored drives to an identical piece of hardware? That way, the server should boot to the drives without incident, but you will have the benefit of seeing how the OS and MySQL run on a different set of memory modules, network cards, motherboard, etc.
0
 

Author Comment

by:cp30
ID: 37009869
Thanks for the comments.....

I had to get something working quickly, so I have actually changed the way that the logic works to circumvent the large deletes. It now works by running my long running complicated views and saving the result of each one to a temp table I create, and it then does a double rename to make the live table old, and temp table to live. Now my website is doing a union on all the 7 live tables which, at the minute, seems to be working a lot better. Does this make sense, sound ok? It's basically to stop every request for data having to execute the 7 complicated views.

I will have to keep and eye on db to check for any other weird errors. My users have experienced so much downtime over last few days (disk swap was far from painless with rebuild) that I'm a little reluctant to ask isp to run memory test just yet, think I'll email them and ask how long it will take to schedule test for next week maybe.
0
 
LVL 61

Expert Comment

by:Kevin Cross
ID: 37009909
I agree. I thought they would have done that when they had the system down. But they probably scrambled to do the quickest thing possible to bring you back online. Stop-gap solutions are not always perfect, but often a necessity in the real world. Glad you found one. If you were deleting all the data, I would say use TRUNCATE. If it is almost all, then you can push the good data to a new table, then truncate the old one, then move back the good data. That may help.
0
 

Author Comment

by:cp30
ID: 37011712
Hi,  another weird one. I had an error in db and it turns out the the following statement

DECLARE v_result decimal;
SET v_result = p1 / p2;

Open in new window


is causing error....

Error Code: 1264
Out of range value for column

Open in new window


When I supply 100 and 1 as p1 and p2.  Should this cause error when trying to set a decimal to 100 divided by 1?

Thanks
0
 
LVL 61

Expert Comment

by:Kevin Cross
ID: 37011743
What is the precision and scale of the decimal v_result?
0
 

Author Comment

by:cp30
ID: 37011815
Hi, I've tried a few and none seem to work, it was 10,4 and I have tried 16,4 and not specifying at all, am I missing something?

Is there a way to trap an error in the function and return null, at the minute the function is used in a large view and returning an error means the whole view cannot be called, which is a real pain.

Thanks
0
 

Author Comment

by:cp30
ID: 37011833
Ahhhhh - Sorry, ignore that, just realized the problem is with the RETURNS precision of function, not the V_RESULT variable.  Th error was a little misleading when it mentioned V_RESULT.

Thanks anyway ;-)
0
 
LVL 61

Expert Comment

by:Kevin Cross
ID: 37011840
You are welcome.
0
 

Author Comment

by:cp30
ID: 37011844
Out of interest, is there a way to add an error handler to a function, real pain when it stops a whole view loading and not always easy to track down the culpable row? I can ask open a new question if you'd rather? ;-)
0
 
LVL 61

Expert Comment

by:Kevin Cross
ID: 37011875
See if these help (if not, then yes it would probably be good to ask a new question as that will get others to provide input also and be better searchable for new readers):

MySQL Signal Conditions
http://dev.mysql.com/doc/refman/5.5/en/signal.html

MySQL Declare Handlers
http://dev.mysql.com/doc/refman/5.5/en/declare-handler.html

Kevin
0
 

Author Closing Comment

by:cp30
ID: 37016747
The hanging seems to be resolved since the disk swap so fingers crossed this will be ok now.  Thanks, as always, for your help mwvisa1, you are an absolute star.

Cheers
0

Featured Post

Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this blog, we’ll look at how improvements to Percona XtraDB Cluster improved IST performance.
Recursive SQL is one of the most fascinating and powerful and yet dangerous feature offered in many modern databases today using a Common Table Expression (CTE) first introduced in the ANSI SQL 99 standard. The first implementations of CTE began ap…
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
In this video, Percona Solutions Engineer Barrett Chambers discusses some of the basic syntax differences between MySQL and MongoDB. To learn more check out our webinar on MongoDB administration for MySQL DBA: https://www.percona.com/resources/we…
Suggested Courses

615 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question