asked on

Mysql Crashing Intermittently

I am on an AWS virtual server running Centos 6 and Mysql 5.5.2.

My log shows

161115 17:13:49 mysqld_safe Number of processes running now: 0
161115 17:13:49 mysqld_safe mysqld restarted
161115 17:13:53 [Note] /usr/libexec/mysql55/mysqld (mysqld 5.5.52) starting as process 20516 ...
161115 17:13:54 [Note] Plugin 'FEDERATED' is disabled.
161115 17:13:54 InnoDB: The InnoDB memory heap is disabled
161115 17:13:54 InnoDB: Mutexes and rw_locks use GCC atomic builtins
161115 17:13:54 InnoDB: Compressed tables use zlib 1.2.8
161115 17:13:54 InnoDB: Using Linux native AIO
161115 17:13:55 InnoDB: Initializing buffer pool, size = 128.0M
161115 17:13:56 InnoDB: Completed initialization of buffer pool
161115 17:13:57 InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
161115 17:13:58  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
161115 17:14:09  InnoDB: Waiting for the background threads to start
161115 17:14:10 InnoDB: 5.5.52 started; log sequence number 172999606026
161115 17:14:10 [Note] Server hostname (bind-address): '0.0.0.0'; port: 3306
161115 17:14:10 [Note]   - '0.0.0.0' resolves to '0.0.0.0';
161115 17:14:10 [Note] Server socket created on IP: '0.0.0.0'.
161115 17:14:10 [Note] Event Scheduler: Loaded 0 events
161115 17:14:10 [Note] /usr/libexec/mysql55/mysqld: ready for connections.
Version: '5.5.52'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)
161115 17:29:20 mysqld_safe Number of processes running now: 0
161115 17:29:21 mysqld_safe mysqld restarted
161115 17:29:25 [Note] /usr/libexec/mysql55/mysqld (mysqld 5.5.52) starting as process 24461 ...
161115 17:29:26 [Note] Plugin 'FEDERATED' is disabled.
161115 17:29:27 InnoDB: The InnoDB memory heap is disabled
161115 17:29:27 InnoDB: Mutexes and rw_locks use GCC atomic builtins
161115 17:29:27 InnoDB: Compressed tables use zlib 1.2.8
161115 17:29:27 InnoDB: Using Linux native AIO
161115 17:29:28 InnoDB: Initializing buffer pool, size = 128.0M
InnoDB: mmap(137363456 bytes) failed; errno 12
161115 17:29:29 InnoDB: Completed initialization of buffer pool
161115 17:29:29 InnoDB: Fatal error: cannot allocate memory for the buffer pool
161115 17:29:29 [ERROR] Plugin 'InnoDB' init function returned error.
161115 17:29:29 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
161115 17:29:29 [ERROR] Unknown/unsupported storage engine: InnoDB
161115 17:29:29 [ERROR] Aborting

161115 17:29:29 [Note] /usr/libexec/mysql55/mysqld: Shutdown complete

161115 17:29:29 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
161115 17:54:45 mysqld_safe Starting mysqld daemon with databases from /data/mysql
161115 17:54:45 [Note] /usr/libexec/mysql55/mysqld (mysqld 5.5.52) starting as process 25139 ...
161115 17:54:45 [Note] Plugin 'FEDERATED' is disabled.
161115 17:54:45 InnoDB: The InnoDB memory heap is disabled
161115 17:54:45 InnoDB: Mutexes and rw_locks use GCC atomic builtins
161115 17:54:45 InnoDB: Compressed tables use zlib 1.2.8
161115 17:54:45 InnoDB: Using Linux native AIO
161115 17:54:45 InnoDB: Initializing buffer pool, size = 128.0M
161115 17:54:45 InnoDB: Completed initialization of buffer pool
161115 17:54:45 InnoDB: highest supported file format is Barracuda.
InnoDB: Log scan progressed past the checkpoint lsn 172999655127
161115 17:54:45  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 172999655147
161115 17:54:48  InnoDB: Waiting for the background threads to start
161115 17:54:49 InnoDB: 5.5.52 started; log sequence number 172999655147
161115 17:54:49 [Note] Server hostname (bind-address): '0.0.0.0'; port: 3306
161115 17:54:49 [Note]   - '0.0.0.0' resolves to '0.0.0.0';
161115 17:54:49 [Note] Server socket created on IP: '0.0.0.0'.
161115 17:54:49 [Note] Event Scheduler: Loaded 0 events
161115 17:54:49 [Note] /usr/libexec/mysql55/mysqld: ready for connections.
Version: '5.5.52'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

Open in new window

It seems the buffer pool size is incorrect but when it tries to increase it an error occurs. Please help me get this mysql running more reliably.

Thanks,

Dan Craciun

>>the buffer pool size is incorrect

Nope. InnoDB tries to allocate 128M but can't, so it fails.

Check your other processes and see what eats the RAM. Or decrease buffer pool to 32 or 16M and accept the performance hit.

HTH,
Dan

sharingsunshine

ASKER

Check your other processes and see what eats the RAM

Open in new window

Do you have any suggestions on how to do this? By the time it is crashed the processes have changed so TOP really doesn't help. Looking at the logs I don't see any corresponding event. Is there some tool that can be used?

noci

Use top
and press M after it starts, gives you the Memory hogs on top.
Start mysql from another session and watch.

miron

I'd use top command to find out physical memory RAM available on the computer.
Here is good example : https://www.cyberciti.biz/faq/linux-check-memory-usage/

Then check my.cnf file in the root folder of MySQL installation. Look for line

innodb_buffer_pool_size=128M

depending on how much memory RAM is free and available modify 128 to appropriate value.

Hope it helps :)

sharingsunshine

ASKER

the events happen at odd hours so I can't just start another session unless it happens while I am staring at TOP. Instead I need some way to log mysql memory usage and the other processes. I have 7.5 GB of memory so how do I determine free memory?

noci

The easiest is the command: free
top also shows this info...
# free
total used free shared buff/cache available
Mem: 1029992 367868 54792 32976 607332 590500
Swap: 4201616 398628 3802988

top:
%Cpu(s): 2.0 us, 1.6 sy, 0.7 ni, 92.8 id, 0.0 wa, 0.0 hi, 2.9 si, 0.0 st
KiB Mem : 1029992 total, 54356 free, 368072 used, 607564 buff/cache
KiB Swap: 4201616 total, 3802988 free, 398628 used. 590300 avail Mem

SOLUTION

Dan Craciun

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

SOLUTION

miron

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

sharingsunshine

ASKER

I see the 4th line
https://gyazo.com/07431928d1afc8873c7114314057ba05

but I am unclear how it will help with the following

that should help you figure out how much memory you would like to specify in the my.cnf.

Open in new window

noci

Is this a serious setup? No insult intended but this is not realy a sane setup in my view.
8GB memory, with 31GB Swap space.... The rule of thumb for Swapspace like 4* memory etc. was when a few MB up about 1GB of memory was available. That rule of thumb does not apply on modern systems. You are running on AWS, there the injury is even worse... Virtualized Systems should have the right amount of memory, swapping is way too much overhead.
esp. using the emulated hardware to manage the IO. Now i do setup those kind os systems with some swapping memory (1-2GB) so that when the swapspace gets touched i have an early warning of a pending disaster. That way one can plan on expanding the VM or memory of the system involved.

The swap usage does indicate that memory has been exhausted, and that should have had a pretty heavy impact. I would expect a slugish machine not crashing...
Now how big do you make your buffers? regularly.... esp. for IO based systems it makes no sense to do one IO several times (swapping buffers) for the same set of data.

HTTPD using ~3% of memory, mysql ~3%...., who is using the memory...
When runnint top, press M to get a view on memory usage please.
Can you publish those here?

sharingsunshine

ASKER

It looks to me like the swap number is never changing. Here it is several days later and it is showing the same value as before.

top - 17:15:41 up 94 days, 22:23,  1 user,  load average: 2.36, 1.16, 0.76
Tasks: 113 total,   1 running, 112 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   7697008k total,  7064160k used,   632848k free,   359800k buffers
Swap: 31449596k total,  7064160k used,   632848k free,   359800k buffers
Swap: 31449596k total,   832416k used, 30617180k free,  4596328k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                
19579 mysql     20   0 1345m 231m 5244 S  0.0  3.1 967:14.97 mysqld                 
 7737 apache    20   0  932m 128m  588 S  0.0  1.7   0:58.89 httpd                  
 8011 apache    20   0  931m 126m  712 S  0.0  1.7   0:58.03 httpd                  
 8367 apache    20   0  628m  90m  18m S  0.0  1.2   1:03.57 httpd                  
 8370 apache    20   0  626m  90m  20m S  0.0  1.2   0:27.01 httpd                  
 3789 apache    20   0  633m  90m  23m S  0.0  1.2   4:56.16 httpd                  
 6412 apache    20   0  634m  89m  23m S  0.0  1.2   1:33.87 httpd                  
 7282 apache    20   0  634m  89m  23m S  0.0  1.2   1:33.87 httpd                  
top - 17:16:03 up 94 days, 22:24,  1 user,  load average: 1.56, 1.06, 0.73
Tasks: 112 total,   1 running, 111 sleeping,   0 stopped,   0 zombie
Cpu(s): 10.1%us,  1.3%sy,  0.0%ni, 88.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   7697008k total,  7062836k used,   634172k free,   359816k buffers
Swap: 31449596k total,   832416k used, 30617180k free,  4596344k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                   
19579 mysql     20   0 1345m 231m 5244 S  2.3  3.1 967:15.26 mysqld                    
 7737 apache    20   0  932m 128m  588 S  0.0  1.7   0:58.89 httpd                     
 8011 apache    20   0  931m 126m  712 S  0.0  1.7   0:58.03 httpd                     
 8367 apache    20   0  628m  90m  18m S 10.0  1.2   1:03.87 httpd                     
 8370 apache    20   0  626m  90m  20m S  0.0  1.2   0:27.01 httpd                     
 3789 apache    20   0  633m  90m  23m S  0.0  1.2   4:56.16 httpd                     
 6412 apache    20   0  634m  89m  23m S  0.0  1.2   1:33.93 httpd                     
 7286 apache    20   0  632m  87m  22m S  1.3  1.2   2:08.31 httpd                     
 7326 apache    20   0  622m  86m  20m S  9.6  1.2   2:53.28 httpd                     
 7074 apache    20   0  621m  84m  19m S  0.0  1.1   1:03.43 httpd                     
 7321 apache    20   0  620m  84m  19m S  0.0  1.1   3:03.16 httpd                     
 8951 apache    20   0  621m  84m  19m S  0.0  1.1   0:18.23 httpd                     
 7206 apache    20   0  620m  83m  19m S  0.0  1.1   1:19.88 httpd                     
 8374 apache    20   0  621m  82m  18m S  0.0  1.1   0:32.25 httpd                     
 9441 apache    20   0  622m  82m  17m S  0.0  1.1   0:33.44 httpd                     
 9437 apache    20   0  621m  82m  17m S  0.0  1.1   0:12.53 httpd                     
 9234 apache    20   0  621m  81m  17m S  0.0  1.1   0:10.91 httpd                     
 9230 apache    20   0  620m  81m  17m S  0.0  1.1   0:09.40 httpd                     
 8948 apache    20   0  619m  81m  18m S  0.0  1.1   2:18.47 httpd                     
 9842 apache    20   0  618m  78m  17m S  0.0  1.0   0:01.17 httpd                     
 9440 apache    20   0  602m  62m  17m S  0.0  0.8   0:07.88 httpd                     
 9835 apache    20   0  589m  49m  16m S  0.0  0.7   0:01.80 httpd                     
 9834 apache    20   0  579m  39m  16m S  0.0  0.5   0:00.56 httpd                     
 9721 root      20   0  115m 6696 5608 S  0.0  0.1   0:00.01 sshd                      
 8318 postfix   20   0 93084 6676 5652 S  0.0  0.1   0:00.00 pickup                    
 9886 postfix   20   0 93076 6452 5424 S  0.0  0.1   0:00.00 anvil                     
 9759 root      20   0  177m 4376 3756 S  0.0  0.1   0:00.00 sudo                      
 8780 editor_c  20   0 15352 4288 3448 S  0.0  0.1   0:00.05 imap                      
 9941 root      20   0 32228 4144 3456 S  0.0  0.1   0:00.06 auth                      
 9729 ec2-user  20   0  115m 4024 2936 S  0.0  0.1   0:00.01 sshd                      
 9818 dovecot   20   0 29876 3668 3212 S  0.0  0.0   0:00.02 auth                      
 2184 root      20   0 71460 3612 3272 S  0.0  0.0   0:28.41 saslauthd                 
 2189 root      20   0 71460 3612 3272 S  0.0  0.0   0:28.45 saslauthd

Open in new window

looking at the above values I don't see any big memory user. Yes, this is an e-commerce site that has been operational since 2000 and with Amazon for 3 years.

sharingsunshine

ASKER

I was looking at the wrong column for the free swap. Instead I see it is more being used. I you saying I should reduce the actual size of the swap?

noci

n virtualised system you have no benefit from swap... the IO is over virtualised adapter so you get double the overhead on save & retrieving memory from there. You can use some to handle a sudden overwhelming overflow... As an early warning of trouble.
ie. if swap get used to had a lack of memory. Virtualised swapped memory is far more slow then real memory.
Even worse then in real hardware.
The amount of swap used is more than freespace so swap is needed now. Indicating you need more memory for your system.
To see if swapping is going on: use vmstat 10
and check the si & so columns (Swapin/Swap out)
I have a server running with 128GB of memory, and 8G of Swap with an alarm on swap going above a few MB.
And it is a bad deal to slow memory access a few thousand time to disk speed. But it helps in the case when there is a sudden surge in load to prevent a sudden death of a random process.

A workload that fitted your system well a few years back, might be more then your system can handle now.
The server i mentioned was sufficient 3 years ago it was envisioned to allow double the amount of work in 4 years...., it will be replaced soon with a bigger system because the workload far more then doubled in the mean time.

sharingsunshine

ASKER

Is there any way to find out what is eating the memory because our volume is 60% less than it was before this started happening?

ASKER CERTIFIED SOLUTION

noci

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

sharingsunshine

ASKER

I'll do some more digging on why it is needing that much swap.