Can anyone tell me what this process is doing?

Hello Experts,

My server has been running really slowly for the past few days.  I ran ps aux over SSH to examine the processes.  Among them I found the following (the website domain has been changed):

mysql     2920 56.3  6.0 924548 496056 ?       Sl   Oct10 1034:02 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/lib/mysql/example.com.err --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock --port=3306

Can anyone tell me what this mysqld command is supposed to be doing?
Asked by: OmniUnlimited

Barthax commented:
This is the process that runs the MySQL database server on this machine.  MySQL stores its information in a series of directories and, at a minimum, needs only the --basedir option to specify where all the other directories are relative to.  In this case, several further options override the defaults:

--basedir: specifies the base directory from which relative settings are resolved (unless overridden by other options).
--datadir: the directory in which the databases are actually stored.
--user: the local user account the mysql server should run as (by default the "current user" is used and at boot up that would be root: bad idea).
--log-error: the file in which errors are logged.
--pid-file: the file into which the Process Identifier is placed.  In this snapshot of ps, it would be 2920 but could easily change if the mysqld is restarted.  Useful for scripts to find the mysqld pid and send it signals.
--socket: the Unix domain socket file used for local client connections (the file exists but is empty: it holds no information).
--port: the TCP port to listen on.  The default is 3306 anyway.
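
For anyone who wants to pick such a command line apart programmatically, here is a minimal Python sketch. It assumes every option has the `--name=value` form seen in the ps output; the helper name `parse_options` is hypothetical and not part of MySQL.

```python
import shlex

# Command line copied from the ps aux output in the question.
CMDLINE = (
    "/usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql "
    "--user=mysql --log-error=/var/lib/mysql/example.com.err "
    "--pid-file=/var/run/mysqld/mysqld.pid "
    "--socket=/var/lib/mysql/mysql.sock --port=3306"
)


def parse_options(cmdline):
    """Split a command line into (program, {option name: value})."""
    program, *args = shlex.split(cmdline)
    options = {}
    for arg in args:
        if arg.startswith("--"):
            # "--port=3306" becomes name "port" and value "3306".
            name, _, value = arg[2:].partition("=")
            options[name] = value
    return program, options


program, options = parse_options(CMDLINE)
print(program)
for name, value in options.items():
    print(f"{name:10} {value}")
```

The point of the listing is only that the long line is one program (mysqld) followed by seven settings, not seven separate commands.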

OmniUnlimited (Author) commented:
OK, so it is as I was suspecting, simply a change of settings?  If that is the case, what would make it run for over 1000 hours and use up over half the processor time?

slubek commented:
The 56.3% CPU that ps aux reports is an average over the whole lifetime of the process, not the current load. Better to look at the output of top.

OmniUnlimited (Author) commented:
@slubek: Yeah, you were right.  ps was wrong.  But it's way worse in top:

 2920 mysql     15   0  902m 488m 5496 S 116.3  6.1   1319:42 mysqld

This is showing over 100% (how is that even possible?) CPU usage.  So now I come back to the same question: if this command is just changing settings, why is it taking over 1300 hours to complete?

slubek commented:
1. On multicore computers CPU usage can be over 100%.
2. mysqld is a daemon - it runs in the background and never completes. The TIME+ column shows the CPU time it has accumulated since it was started (about 1300 minutes here), not how long a command has been running. Nothing to worry about.
3. To find out why your server is slow, it is better to check the 3rd (CPU), 4th (Mem) and 5th (Swap) lines of the top output. Can you show them to us?
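
A note on the TIME and TIME+ figures quoted in this thread: ps aux and top print accumulated CPU time as minutes:seconds, so 1034:02 is roughly 17 hours of CPU time rather than 1000 hours of wall-clock time. The Python sketch below does the conversion. The core count of 4 is an assumption, taken from the per-CPU kernel threads (migration/0 to migration/3) visible in the full top listing later in the thread.

```python
def cpu_time_seconds(field):
    """Convert a minutes:seconds[.hundredths] CPU-time field to seconds."""
    minutes, seconds = field.split(":")
    return int(minutes) * 60 + float(seconds)


NUM_CORES = 4  # assumption: inferred from migration/0..3 in the top listing

for label, field in (("ps aux TIME", "1034:02"), ("top TIME+", "1319:42")):
    hours = cpu_time_seconds(field) / 3600
    print(f"{label}: {field} = {hours:.1f} CPU-hours")

# A multi-threaded process can keep several cores busy at once, so top can
# report up to 100% per core for a single process.
print(f"maximum %CPU for one process on {NUM_CORES} cores: {NUM_CORES * 100}%")
```

This is also why readings such as 116.3% or 209.1% are possible: top counts each fully used core as 100%.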

OmniUnlimited (Author) commented:
@slubek: Sure, here they are:

Cpu(s): 26.9%us, 25.9%sy,  0.0%ni, 11.6%id, 35.5%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   8145368k total,  8100388k used,    44980k free,  1146028k buffers
Swap:  4184924k total,      688k used,  4184236k free,  3640168k cached

I can see reasons why the server is so slow (take memory usage for example.)  My question is, what can I do about it?  If you are telling me mysqld is a daemon that never completes, that means that I can't kill it without suffering consequences, yet in all the listings, this process is the one that stands far above the rest in resource use.  Can I reduce its usage at all?

Seth Simmons (Sr. Systems Administrator) commented:
it runs as a service and was started either by some application that was installed or at boot the last time the server was rebooted

to stop it cleanly, do service mysql stop (and subsequently start to start it again) and it will properly shut down the database

looking at the top output you have there, the actual memory usage is only 488mb so it's not using much at all.  the file cache is 3.6gb so the remaining processes are using approximately 4gb combined

OmniUnlimited (Author) commented:
@seth2740: What happens if there are database accesses happening at the time I shut down mysql?  I have cron jobs running on this server, the databases could be accessed at any time.  I don't want to interrupt those processes.

My question concerns the process I listed in the ps aux output in my original question.  Can I kill process 2920 (which appears to only be a settings change process) without affecting mysql operations elsewhere?  And if this is a process that needs to run, can I reduce its resource usage?

BTW, you read the output from top wrong.  You said it indicates only 488 MB memory usage.  If you look again, that is memory that is free.  Almost 8 GB of memory is actually being used (almost 99.5% of the available) which is why I said, "I can see reasons why the server is so slow."

Seth Simmons (Sr. Systems Administrator) commented:
2920 is the pid for the mysql service
if you kill that process the databases will not be available and mysql would do a recovery check on the next startup since it was not shut down cleanly

as i said before, the memory usage is small which means other processes are making up the remaining 4gb physical memory used

i was reading the output of top correctly.  look at your comment from 17:31 where you said "it's way worse in top".  the RES column is 488mb which is how much memory mysql is actually using

half the physical memory is file cache so no concern there

OmniUnlimited (Author) commented:
@seth2740: My apologies, I thought you were referring to the total memory usage in my comment 39567174, not the usage from the process itself.

In fact, I am unable to reconcile the process memory usage figures with the figures top gives me for overall.  When I do a shift-m in top, nothing appears to be using much memory.  In fact, the mysql process is the very top process in memory users, all others appear negligible.  Where is the almost 8 GB memory usage coming from?  Can this be a case of processes not releasing memory when they finish?  If so, how can I free up the memory?

Seth Simmons (Sr. Systems Administrator) commented:
it depends how you have the columns sorted
if you didn't change anything, the default is sort by cpu so if mysql is busy you would see it near or at the top

if you do < character 3 times it will sort by RES with the process using the most memory on top

the roughly 3.6gb of file cache is just other files (.o or .so mostly) used by different processes.  by default, the kernel keeps as much as possible of that file cache in physical memory for faster access.  when a process requests more physical memory, the kernel will release some of the file cache for it.

to manually clear the file cache, run (as root) sync ; echo 3 > /proc/sys/vm/drop_caches

of course, the file cache will gradually grow as needed but right after doing that if you look at top, the cache would have dropped along with the total memory usage.

if you had a process not releasing memory that would be bad, and the system would likely show adverse effects, which i highly doubt is a factor here.  it's just a matter of understanding the output of some of these basic tools
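
The arithmetic behind "most of the used memory is cache" can be sketched in Python from the Mem and Swap lines posted earlier. This is only an illustration: it subtracts both the buffers and the cached figures from "used", which is the same calculation the older free command shows on its "-/+ buffers/cache" line, and the remainder still includes kernel memory as well as processes.

```python
import re

# Summary lines copied from the top output posted earlier (values in kB).
MEM_LINE = "Mem:   8145368k total,  8100388k used,    44980k free,  1146028k buffers"
SWAP_LINE = "Swap:  4184924k total,      688k used,  4184236k free,  3640168k cached"


def fields(line):
    """Return a dict such as {'total': 8145368, 'used': 8100388, ...}."""
    return {name: int(value) for value, name in re.findall(r"(\d+)k (\w+)", line)}


mem = fields(MEM_LINE)
swap = fields(SWAP_LINE)  # top prints the page-cache size on the Swap line

reclaimable_kb = mem["buffers"] + swap["cached"]
app_used_kb = mem["used"] - reclaimable_kb

print(f"used as reported:       {mem['used'] / 1024:8.0f} MiB")
print(f"buffers + cache:        {reclaimable_kb / 1024:8.0f} MiB")
print(f"used without the cache: {app_used_kb / 1024:8.0f} MiB")
```

On these figures a little over 3 GiB is in use once buffers and cache are set aside, which is far from the 99.5% that the raw "used" value suggests.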

OmniUnlimited (Author) commented:
Here is the output from top after running sync:

Cpu(s):  0.2%us,  0.7%sy,  0.0%ni, 72.7%id, 24.7%wa,  0.2%hi,  1.6%si,  0.0%st
Mem:   8145368k total,  1816628k used,  6328740k free,    13952k buffers
Swap:  4184924k total,      688k used,  4184236k free,    61036k cached

CPU usage on the process itself also dropped:

 2920 mysql     15   0  902m 489m 5500 S  2.0  6.2   1556:20 mysqld

Unfortunately, there seems to be no noticeable difference in the actual performance of the server.  It still seems to be running a bit slow.

Huh, will you look at that, the figures went right back up after only 4 minutes:

 2920 mysql     15   0  902m 489m 5500 S 209.1  6.2   1561:32 mysqld

Seth Simmons (Sr. Systems Administrator) commented:
yeah, i would expect that - especially with something like mysql or oracle
they are designed to use a lot of physical memory for databases

if you run top and sort by RES you'll see a descending list of which processes are using physical memory

OmniUnlimited (Author) commented:
Here is the list sorted by RES:

 2920 mysql     15   0  902m 490m 5500 S 299.8  6.2   1650:07 mysqld
  446 root      34  19  353m 296m  556 D  0.0  3.7   0:40.34 updatedb
16104 root      18   0  579m 277m  644 D  0.0  3.5  22:27.19 cp
 4352 apache    15   0  373m  66m 4168 S  0.0  0.8   0:06.56 httpd
 3025 root      15   0  157m  44m 2444 S  0.0  0.6   0:00.90 spamd
 4219 apache    18   0  347m  44m 4432 S  2.0  0.6   0:33.63 httpd
 2113 apache    15   0  345m  43m 4212 S  0.0  0.6   0:15.03 httpd
 7943 root      16   0  150m  43m 6832 S  0.0  0.5   0:00.80 sw-engine
 3026 popuser   15   0  157m  43m  976 S  0.0  0.5   0:00.00 spamd
 3027 popuser   18   0  157m  42m  892 S  0.0  0.5   0:00.01 spamd
 7891 root      18   0  148m  42m 6432 S  0.0  0.5   0:00.71 sw-engine
 4212 apache    15   0  342m  41m 4152 S  0.0  0.5   0:07.38 httpd
 4208 apache    16   0  336m  36m 4160 S  9.0  0.5   0:09.51 httpd
 2807 apache    16   0  336m  36m 4188 S  0.0  0.5   0:18.11 httpd
 2108 apache    15   0  336m  36m 4184 S  0.0  0.5   0:18.51 httpd
 4213 apache    16   0  336m  36m 4544 S  0.0  0.5   0:06.34 httpd
 3634 apache    15   0  336m  36m 4168 S  0.0  0.5   0:10.63 httpd

The cp is because we are doing a massive transfer of files to a slave drive.

Any clue here as to what could be slowing us down?

Barthax commented:
Disk activity is something you need to be checking given that list.  You mention large copying of files and your list also shows updatedb which is another source of (potentially) big disk activity.  Run atop to get an idea of the disk activity.

Note that your httpd is listed multiple times with some significant activity - these will likely be accessing your mysqld databases and potentially have poorly written scripts which also cause more database activity than is necessary.

Seth Simmons (Sr. Systems Administrator) commented:
what do you mean by slave drive?  is it nfs or locally attached?
the fact that the process status is D (i/o wait) might be a concern; though it could also be because updatedb is competing for resources
i would kill -9 446 and see if that helps

updatedb just creates a list of all files on the system so if you did something like locate bash it would look at that file listing.  you can always run it manually later when that cp is done; you can also tell it to not do nfs mounts by adding nfs to the prune file system line in /etc/updatedb.conf then it would only do local volumes

i would definitely kill that updatedb process as that could cause a bottleneck

slubek commented:
I see some sw-engine entries in your top results, so I suppose the server has Plesk installed. Google says that high load on servers running Plesk can be caused by brute-force login attempts against Plesk. Grep your httpd logs to see how many "login" strings they contain.
One more remark: for security and performance reasons it is not good to have the database (mysqld) server on the same machine where the httpd server is running.
BTW, how do you check performance of your server? What is your machine load (you can find it in first line of top output)?
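
The log check suggested here can be sketched in Python as a per-address count of requests that mention "login". The sample lines are copied from the access-log excerpt the author posts further down, truncated after the response size (user-agent strings dropped); the function name `login_hits_by_ip` is hypothetical.

```python
from collections import Counter

# Access-log lines from the excerpt posted later in this thread,
# truncated after the response size.
LOG_LINES = [
    '180.76.5.138 - - [07/Oct/2013:07:47:09 -0700] "GET /login.php HTTP/1.1" 302 26',
    '180.76.5.134 - - [07/Oct/2013:07:47:11 -0700] "GET /imp/login.php HTTP/1.1" 200 3196',
    '66.249.73.4 - - [08/Oct/2013:12:49:35 -0700] "GET /login.php HTTP/1.1" 302 26',
    '66.249.73.4 - - [08/Oct/2013:12:49:37 -0700] "GET /imp/login.php HTTP/1.1" 200 3227',
    '66.249.73.4 - - [10/Oct/2013:18:12:38 -0700] "GET /login.php HTTP/1.1" 302 26',
    '66.249.73.4 - - [10/Oct/2013:18:12:56 -0700] "GET /imp/login.php HTTP/1.1" 200 3224',
]


def login_hits_by_ip(lines):
    """Count lines containing 'login', grouped by client address."""
    hits = Counter()
    for line in lines:
        if "login" in line:
            # The client address is the first whitespace-separated field.
            hits[line.split()[0]] += 1
    return hits


for ip, count in login_hits_by_ip(LOG_LINES).most_common():
    print(f"{count:5d}  {ip}")
```

A brute-force attack would normally show up as hundreds or thousands of hits from a few addresses in a short time, so a handful of hits spread over several days would not fit that pattern.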

Sandy commented:
To me it seems like cPanel, and I do suggest you increase worker threads in Apache and also increase client threads in MySQL.

Otherwise this will eat up all the resources in the system. ;)

OmniUnlimited (Author) commented:
Thank you Experts for your input:

@Barthax:

"Run atop to get an idea of the disk activity."

I'm sorry, I am not familiar with the command "atop".  When I attempt to run it on the server, I get a -bash: atop: command not found error.

"Note that your httpd is listed multiple times with some significant activity - these will likely be accessing your mysqld databases and potentially have poorly written scripts which also cause more database activity than is necessary."

This is fine.  And the solution would be...?

@seth2740: The slave drive is locally attached.

Can killing the updatedb process have any adverse effect on the cron jobs that run constantly on this machine?

@slubek: Very astute.  The server does indeed have Plesk installed.  Here are the results from the httpd access log:

180.76.5.138 - - [07/Oct/2013:07:47:09 -0700] "GET /login.php HTTP/1.1" 302 26 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
180.76.5.134 - - [07/Oct/2013:07:47:11 -0700] "GET /imp/login.php HTTP/1.1" 200 3196 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
66.249.73.4 - - [08/Oct/2013:12:49:35 -0700] "GET /login.php HTTP/1.1" 302 26 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.73.4 - - [08/Oct/2013:12:49:37 -0700] "GET /imp/login.php HTTP/1.1" 200 3227 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.73.4 - - [10/Oct/2013:18:12:38 -0700] "GET /login.php HTTP/1.1" 302 26 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.73.4 - - [10/Oct/2013:18:12:56 -0700] "GET /imp/login.php HTTP/1.1" 200 3224 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

"for security and performance reasons it is not good to have the database (mysqld) server on the same machine where the httpd server is running."

This is news to me, I have never heard that before.  Both httpd and mysql are important for the websites that this server houses.  To separate any one of them from the website I would suspect would degrade, not improve, performance.  If we were to house the mysql on another server, for example, we would have to access it through the network which would slow things down considerably, wouldn't it?

"how do you check performance of your server?"

My complaint is from subjective observations of the performance of our websites.  There is no quantitative analysis.

"What is your machine load?"

top - 15:31:56 up 2 days, 13:53,  1 user,  load average: 4.74, 5.62, 5.56

@Sandeep_Agarwal_:

"To me it seems like cPanel"

No, CPanel is too invasive for our applications.

"I do suggest you increase worker threads in Apache and also increase client threads in MySQL."

Sorry, I'm kind of new to this.  Can you elaborate on what worker and client threads are, and how increasing them will help my situation?

slubek commented:
Load average from last 1, 5 and 15 minutes is your performance indicator. On Linux it means that, on average, about 5 processes were running, waiting for a CPU or blocked on disk I/O during those periods. Can you show us full top -n 1 -b results?
BTW, in that StackOverflow thread* you can find a discussion of the pros of putting the db and http servers on different machines. Performance is one of them.
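
To put the load figures in proportion, they can be divided by the number of cores. A minimal Python sketch follows, using the top header line the author posted; the core count of 4 is an assumption, inferred from the migration/0 to migration/3 kernel threads in the full top listing elsewhere in this thread.

```python
import re

# First line of the top output posted by the author.
TOP_HEADER = "top - 15:31:56 up 2 days, 13:53,  1 user,  load average: 4.74, 5.62, 5.56"
NUM_CORES = 4  # assumption: inferred from migration/0..3 in the top listing


def load_averages(header):
    """Extract the 1, 5 and 15 minute load averages from a top header."""
    match = re.search(r"load average: ([\d.]+), ([\d.]+), ([\d.]+)", header)
    return tuple(float(value) for value in match.groups())


for minutes, load in zip((1, 5, 15), load_averages(TOP_HEADER)):
    print(f"{minutes:2d} min: load {load:.2f} = {load / NUM_CORES:.2f} per core")
```

A per-core value above 1 means more work was queued than the cores could take on at once. Because Linux also counts processes blocked on disk I/O, a high load average alone does not say whether the CPU or the disk is the limiting resource.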

Seth Simmons (Sr. Systems Administrator) commented:
i don't know what cron jobs you have on your system but killing updatedb is perfectly safe

again, it only creates an index of the file names on the system and sounds like it's competing for resources while running along side the file copy - which would explain the load and the slow response.  it might be creating a list of files on that other drive that the files are being copied to.  wherever that other drive is mounted, you can exclude that by adding the mount point on the prune paths line in /etc/updatedb.conf

mysql is also fairly busy so unless you have a mysqldump process running (which would definitely have a performance hit), i would spend a little time looking at what mysql is doing (phpMyAdmin is a good tool for this)

a couple of native tools that i have used to check certain performance areas are iostat and vmstat.  if you try to run iostat and it says command not found, you can install it from the CentOS repository by doing yum install sysstat

then i would run iostat -p 3 5 which will run 5 times in 3 second intervals then stop; you could post the output of that for review.  also vmstat 3 5 which will give additional statistics.  if you leave off the 5, they will keep running every 3 seconds until you stop manually.  i've used these many times to help see where a potential bottleneck exists and what device/partition the i/o is coming from

OmniUnlimited (Author) commented:
@slubek:

"Load average from last 1, 5 and 15 minutes is your performance indicator."

How do I obtain this?

"Can you show us full top -n 1 -b results?"

Here you go:
# top -n 1 -b
top - 22:09:51 up 2 days, 20:31,  1 user,  load average: 3.08, 3.65, 3.54
Tasks: 149 total,   1 running, 148 sleeping,   0 stopped,   0 zombie
Cpu(s): 11.3%us,  9.2%sy,  0.0%ni, 52.3%id, 26.5%wa,  0.1%hi,  0.5%si,  0.0%st
Mem:   8145368k total,  8101076k used,    44292k free,  1208132k buffers
Swap:  4184924k total,      688k used,  4184236k free,  3971632k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
30238 root      15   0 12740  996  716 R  1.0  0.0   0:00.04 top
12617 root      18   0 58316 2856 2164 D  0.5  0.0   5:00.68 statistics
16104 root      18   0  579m 277m  644 D  0.5  3.5  30:58.16 cp
    1 root      15   0 10304  688  576 S  0.0  0.0   0:01.52 init
    2 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   0:04.22 ksoftirqd/0
    4 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/0
    5 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/1
    6 root      34  19     0    0    0 S  0.0  0.0   0:03.88 ksoftirqd/1
    7 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/1
    8 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/2
    9 root      34  19     0    0    0 S  0.0  0.0   0:02.99 ksoftirqd/2
   10 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/2
   11 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/3
   12 root      34  19     0    0    0 S  0.0  0.0   0:04.31 ksoftirqd/3
   13 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/3
   14 root      10  -5     0    0    0 S  0.0  0.0   0:00.08 events/0
   15 root      10  -5     0    0    0 S  0.0  0.0   0:00.01 events/1
   16 root      10  -5     0    0    0 S  0.0  0.0   0:00.02 events/2
   17 root      10  -5     0    0    0 S  0.0  0.0   0:00.03 events/3
   18 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 khelper
   87 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kthread
   94 root      10  -5     0    0    0 S  0.0  0.0   0:00.02 kblockd/0
   95 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kblockd/1
   96 root      10  -5     0    0    0 S  0.0  0.0   0:00.61 kblockd/2
   97 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kblockd/3
   98 root      14  -5     0    0    0 S  0.0  0.0   0:00.00 kacpid
  226 root      14  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/0
  227 root      14  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/1
  228 root      14  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/2
  229 root      14  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/3
  232 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 khubd
  234 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kseriod
  324 root      15   0     0    0    0 S  0.0  0.0   0:00.00 khungtaskd
  326 root      15   0     0    0    0 S  0.0  0.0   0:43.13 pdflush
  327 root      10  -5     0    0    0 S  0.0  0.0  23:07.06 kswapd0
  328 root      18  -5     0    0    0 S  0.0  0.0   0:00.00 aio/0
  329 root      19  -5     0    0    0 S  0.0  0.0   0:00.00 aio/1
  330 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 aio/2
  331 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 aio/3
  471 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 kpsmoused
  518 root      12  -5     0    0    0 S  0.0  0.0   0:00.00 ata/0
  519 root      13  -5     0    0    0 S  0.0  0.0   0:00.00 ata/1
  520 root      13  -5     0    0    0 S  0.0  0.0   0:00.00 ata/2
  521 root      12  -5     0    0    0 S  0.0  0.0   0:00.00 ata/3
  522 root      14  -5     0    0    0 S  0.0  0.0   0:00.00 ata_aux
  528 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 scsi_eh_0
  529 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 scsi_eh_1
  530 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 scsi_eh_2
  531 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 scsi_eh_3
  532 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 scsi_eh_4
  533 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 scsi_eh_5
  544 root      13  -5     0    0    0 S  0.0  0.0   0:00.00 kstriped
  565 root      10  -5     0    0    0 S  0.0  0.0   1:33.91 kjournald
  595 root      10  -5     0    0    0 S  0.0  0.0   0:00.25 kauditd
  628 root      12  -4 13644 1976  504 S  0.0  0.0   0:00.16 udevd
 1634 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 kmpathd/0
 1635 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kmpathd/1
 1636 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kmpathd/2
 1637 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kmpathd/3
 1638 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 kmpath_handlerd
 1663 root      14  -5     0    0    0 S  0.0  0.0   0:00.00 kjournald
 1912 root      10  -5     0    0    0 S  0.0  0.0   0:00.09 kondemand/0
 1913 root      10  -5     0    0    0 S  0.0  0.0   0:00.20 kondemand/1
 1914 root      10  -5     0    0    0 S  0.0  0.0   0:00.30 kondemand/2
 1915 root      10  -5     0    0    0 S  0.0  0.0   0:00.20 kondemand/3
 2371 root      11  -4 92888  908  588 S  0.0  0.0   0:15.51 auditd
 2373 root       7  -8 81812  980  628 S  0.0  0.0   0:01.04 audispd
 2395 root      15   0 28920  18m  328 S  0.0  0.2   0:00.02 restorecond
 2408 root      15   0  5912  616  496 S  0.0  0.0   0:10.27 syslogd
 2411 root      15   0  3808  436  348 S  0.0  0.0   0:00.01 klogd
 2425 root      18   0 10764  372  244 S  0.0  0.0   0:03.68 irqbalance
 2442 root      16   0 14512  528  388 S  0.0  0.0   0:01.54 mcstransd
 2468 root      18   0  3804  564  468 S  0.0  0.0   0:00.00 acpid
 2495 root      18   0 54404 1500 1112 S  0.0  0.0   0:00.23 automount
 2525 sw-cp-se  15   0 61976 2796 1256 S  0.0  0.0   0:01.15 sw-cp-serverd
 2540 root      15   0 62580 1224  656 S  0.0  0.0   0:03.23 sshd
 2557 root      15   0 21600  908  692 S  0.0  0.0   0:00.39 xinetd
 2620 root      15   0 40900  796  588 S  0.0  0.0   0:00.10 couriertcpd
 2622 root      16   0 33584  908  728 S  0.0  0.0   0:00.04 courierlogger
 2630 root      18   0 40900  800  588 S  0.0  0.0   0:00.00 couriertcpd
 2632 root      18   0 33584  908  728 S  0.0  0.0   0:00.00 courierlogger
 2638 root      15   0 40900  800  588 S  0.0  0.0   0:00.11 couriertcpd
 2640 root      18   0 33584  904  728 S  0.0  0.0   0:00.05 courierlogger
 2647 root      25   0 40900  796  584 S  0.0  0.0   0:00.00 couriertcpd
 2649 root      25   0 33584  844  676 S  0.0  0.0   0:00.00 courierlogger
 2664 qmails    16   0  3852  532  424 S  0.0  0.0   0:14.50 qmail-send
 2666 qmaill    16   0  3800  520  440 S  0.0  0.0   0:00.41 splogger
 2667 root      15   0  3840  424  320 S  0.0  0.0   0:00.03 qmail-lspawn
 2668 qmailr    15   0  3840  424  316 S  0.0  0.0   0:00.23 qmail-rspawn
 2669 qmailq    18   0  3796  380  308 S  0.0  0.0   0:03.14 qmail-clean
 2689 root      18   0  6456  364  284 S  0.0  0.0   0:00.00 gpm
 2752 named     21   0  244m 4436 1924 S  0.0  0.1   0:00.11 named
 2798 root      25   0 65936 1308 1072 S  0.0  0.0   0:00.00 mysqld_safe
 2920 mysql     15   0  902m 489m 5500 S  0.0  6.2   2673:35 mysqld
 3025 root      15   0  157m  44m 2444 S  0.0  0.6   0:00.96 spamd
 3026 popuser   15   0  157m  43m  976 S  0.0  0.5   0:00.01 spamd
 3027 popuser   18   0  157m  42m  892 S  0.0  0.5   0:00.01 spamd
 3078 root      18   0  308m  18m 8852 S  0.0  0.2   0:04.31 httpd
 3080 apache    15   0  217m 6128  620 S  0.0  0.1   0:00.01 httpd
 3206 root      15   0 74768 1256  648 S  0.0  0.0   0:01.23 crond
 3242 xfs       18   0 20360 1216  724 S  0.0  0.0   0:00.00 xfs
 3267 root      18   0 18688  464  304 S  0.0  0.0   0:00.00 atd
 3352 root      18   0 18372  568  312 S  0.0  0.0   0:00.00 smartd
 3355 root      17   0  3796  480  412 S  0.0  0.0   0:00.00 mingetty
 3356 root      18   0  3796  484  412 S  0.0  0.0   0:00.00 mingetty
 3357 root      18   0  3796  484  412 S  0.0  0.0   0:00.00 mingetty
 3358 root      18   0  3796  488  416 S  0.0  0.0   0:00.00 mingetty
 3359 root      18   0  3796  480  412 S  0.0  0.0   0:00.00 mingetty
 3360 root      21   0  3796  480  412 S  0.0  0.0   0:00.00 mingetty
 4685 opennet   18   0 64280 4916 2416 S  0.0  0.1   0:00.07 test.fcgi
 7610 root      17   0  130m 2796 1760 S  0.0  0.0   0:00.00 crond
 7611 root      18   0  8704 1040  880 S  0.0  0.0   0:00.00 run-parts
 7941 root      19   0  8704  960  820 S  0.0  0.0   0:00.00 50plesk-daily
 7942 root      18   0  8772  664  552 S  0.0  0.0   0:00.00 awk
 7943 root      16   0  150m  43m 6832 S  0.0  0.5   0:00.80 sw-engine
 7956 root      18   0 58164 2172 1760 S  0.0  0.0   0:00.01 statistics
12295 root      17   0  130m 2796 1760 S  0.0  0.0   0:00.00 crond
12296 root      15   0  8704 1032  880 S  0.0  0.0   0:00.00 run-parts
12601 root      18   0  8704  964  820 S  0.0  0.0   0:00.00 50plesk-daily
12602 root      15   0  8772  668  552 S  0.0  0.0   0:00.00 awk
12603 root      17   0  148m  42m 6456 S  0.0  0.5   0:00.90 sw-engine
15916 root      15   0     0    0    0 S  0.0  0.0   5:17.51 pdflush
15938 root      10  -5     0    0    0 S  0.0  0.0   0:50.98 kjournald
22896 root      15   0  111m  16m 3292 S  0.0  0.2   0:00.57 sshd
22914 root      15   0 66076 1600 1184 S  0.0  0.0   0:00.09 bash
24747 root      15   0 33764 1684 1348 S  0.0  0.0   0:00.09 couriertls
24749 popuser   15   0 40480 1680 1000 S  0.0  0.0   0:05.73 imapd
29419 popuser   15   0 40104 1312  992 S  0.0  0.0   0:00.09 imapd
29560 apache    15   0  336m  36m 4168 S  0.0  0.5   0:11.45 httpd
29577 apache    16   0  336m  36m 4184 S  0.0  0.5   0:20.90 httpd
29624 apache    16   0  336m  36m 4164 S  0.0  0.5   0:06.67 httpd
29625 apache    15   0  336m  36m 4184 S  0.0  0.5   0:09.35 httpd
29629 apache    15   0  357m  53m 4180 S  0.0  0.7   0:09.36 httpd
29630 apache    16   0  336m  36m 4168 S  0.0  0.5   0:09.06 httpd
29634 apache    15   0  336m  36m 4564 S  0.0  0.5   0:13.23 httpd
29635 apache    15   0  336m  36m 4532 S  0.0  0.5   0:06.05 httpd
29636 apache    15   0  336m  36m 4144 S  0.0  0.5   0:08.06 httpd
29641 apache    16   0  351m  49m 4212 S  0.0  0.6   0:13.22 httpd
29643 apache    16   0  336m  36m 4184 S  0.0  0.5   0:06.35 httpd
29647 apache    15   0  336m  36m 4172 S  0.0  0.5   0:10.77 httpd
29648 apache    15   0  343m  42m 4524 S  0.0  0.5   0:15.33 httpd
29650 apache    16   0  336m  36m 4156 S  0.0  0.5   0:09.24 httpd
29784 apache    16   0  336m  36m 4144 S  0.0  0.5   0:02.72 httpd
29785 apache    16   0  335m  36m 4124 S  0.0  0.5   0:01.65 httpd
29787 apache    15   0  336m  36m 4140 S  0.0  0.5   0:03.09 httpd
29789 apache    15   0  336m  36m 4152 S  0.0  0.5   0:03.55 httpd
29790 apache    15   0  336m  36m 4164 S  0.0  0.5   0:03.66 httpd
30093 apache    16   0  336m  36m 4132 S  0.0  0.5   0:01.21 httpd

As far as the StackOverflow thread is concerned, again, this is news to me.  I interpret the discussion as the points being somewhat controversial, rather than providing a definitive statement on the benefits of running two vs. one server.

@seth2740:

"killing updatedb is perfectly safe"

Actually, at the time of this writing, updatedb was no longer running on the server.

"i would spend a little time looking at what mysql is doing (phpMyAdmin is a good tool for this)"

I know you are probably going to laugh at my ignorance in this matter, but all I ever used phpMyAdmin for was to check structure and data in the tables, create and modify tables, import and export tables, etc.  How can phpMyAdmin help me to see what mysql is doing?

"i would run iostat -p 3 5 which will run 5 times in 3 second intervals then stop; you could post the output of that for review"

Sure, here it is (domain has been changed):
# iostat -p 3 5
Linux 2.6.18-194.17.1.el5 (example.com)       10/12/2013

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          11.33    0.04    9.89   26.48    0.00   52.25

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              94.34      1245.72       514.83  308607989  127541890
sda1              0.00         0.00         0.00        854          2
sda2              0.00         0.01         0.01       1525       1440
sda3             94.34      1245.71       514.83  308604690  127540448
sdb              12.50         3.14       598.85     778712  148354854
sdb1             12.50         3.14       598.85     776678  148354846

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.24    0.00    5.50   29.64    0.00   56.62

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              81.00       792.00       589.33       2376       1768
sda1              0.00         0.00         0.00          0          0
sda2              0.00         0.00         0.00          0          0
sda3             81.00       792.00       589.33       2376       1768
sdb               1.33         0.00       258.67          0        776
sdb1              1.33         0.00       258.67          0        776

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           7.16    0.00    5.08   28.98    0.00   58.78

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              74.00       752.00         0.00       2256          0
sda1              0.00         0.00         0.00          0          0
sda2              0.00         0.00         0.00          0          0
sda3             74.00       752.00         0.00       2256          0
sdb               0.00         0.00         0.00          0          0
sdb1              0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.83    0.00    4.91   26.81    0.00   63.45

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              82.06       821.26       621.93       2472       1872
sda1              0.00         0.00         0.00          0          0
sda2              0.00         0.00         0.00          0          0
sda3             82.06       821.26       621.93       2472       1872
sdb               1.00         0.00       369.44          0       1112
sdb1              1.00         0.00       369.44          0       1112

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.67    0.00    5.00   30.08    0.00   59.25

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              95.00      1298.67       528.00       3896       1584
sda1              0.00         0.00         0.00          0          0
sda2              0.00         0.00         0.00          0          0
sda3             95.00      1298.67       528.00       3896       1584
sdb               0.00         0.00         0.00          0          0
sdb1              0.00         0.00         0.00          0          0

#

also vmstat 3 5 which will give additional statistics

Here are the results from that:
# vmstat 3 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  2    692  49460 1353900 3568636    0    0   156   139   39   27 11 10 52 26  0
 1  1    692  50980 1355836 3565732    0    0   277  3017 1503 2644 35 14 25 25  0
 2  2    692  51884 1356756 3566136    0    0   357   172 1170 1162 29  5 39 28  0
 1  1    692  50916 1357552 3566520    0    0   320   288 1137 1274 27  8 38 27  0
 3  1    692  49968 1358300 3567036    0    0   340   392 1180 1491 27  9 38 26  0
#

0
Seth SimmonsSr. Systems AdministratorCommented:
in phpMyAdmin if you click processes, you can see if there are any queries running
updatedb must have finished on its own

i'm assuming the mysql database is on sda3 since that's where the majority of the i/o is
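
A rough way to check that reading of the iostat output is to add up the blocks read and written per partition.  This is only a sketch: it uses the last iostat sample posted above, and the function name blocks_per_device is made up for illustration.

```python
# Last iostat sample from the output posted above.
IOSTAT_SAMPLE = """\
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              95.00      1298.67       528.00       3896       1584
sda1              0.00         0.00         0.00          0          0
sda2              0.00         0.00         0.00          0          0
sda3             95.00      1298.67       528.00       3896       1584
sdb               0.00         0.00         0.00          0          0
sdb1              0.00         0.00         0.00          0          0
"""


def blocks_per_device(report):
    """Map each device name to Blk_read + Blk_wrtn for one iostat sample."""
    totals = {}
    for line in report.strip().splitlines():
        fields = line.split()
        if not fields or fields[0] == "Device:":
            continue  # skip the header row and blank lines
        totals[fields[0]] = int(fields[4]) + int(fields[5])
    return totals


totals = blocks_per_device(IOSTAT_SAMPLE)
# Whole disks (sda, sdb) repeat their partitions' numbers, so compare partitions only.
partitions = {dev: n for dev, n in totals.items() if dev[-1].isdigit()}
busiest = max(partitions, key=partitions.get)
print(busiest, partitions[busiest])
```

This only shows which partition is busy.  It does not prove the database is there; running df /var/lib/mysql would show which filesystem actually holds the datadir.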

from the vmstat output, the first column (r) is the run queue.  if that number is ever higher than the total number of processing cores, then you have cpu bottleneck.  the si and so columns (swap in/swap out) will show any activity to the swap partition.  i would be concerned if it was constantly or frequently a non-zero value but you don't have that here so it doesn't appear to be an issue.
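
The two checks above (run queue against core count, and any swap activity) could be scripted roughly like this.  It is a sketch only: the rows are the vmstat output posted above, check_vmstat is a made-up name, and the core count of 4 is the figure reported later in this thread.

```python
# Data rows from the vmstat 3 5 output posted above (the two header lines are dropped).
# Column order: r b swpd free buff cache si so bi bo in cs us sy id wa st
VMSTAT_ROWS = """\
 1  2    692  49460 1353900 3568636    0    0   156   139   39   27 11 10 52 26  0
 1  1    692  50980 1355836 3565732    0    0   277  3017 1503 2644 35 14 25 25  0
 2  2    692  51884 1356756 3566136    0    0   357   172 1170 1162 29  5 39 28  0
 1  1    692  50916 1357552 3566520    0    0   320   288 1137 1274 27  8 38 27  0
 3  1    692  49968 1358300 3567036    0    0   340   392 1180 1491 27  9 38 26  0
"""


def check_vmstat(rows, cores):
    """Return (highest run queue, run queue exceeded cores?, any swap in/out?)."""
    max_r = 0
    swapping = False
    for line in rows.strip().splitlines():
        fields = line.split()
        r, si, so = int(fields[0]), int(fields[6]), int(fields[7])
        max_r = max(max_r, r)
        swapping = swapping or si > 0 or so > 0
    return max_r, max_r > cores, swapping


max_r, cpu_bound, swapping = check_vmstat(VMSTAT_ROWS, cores=4)
print("max run queue:", max_r, "cpu bottleneck:", cpu_bound, "swapping:", swapping)
```

Keep in mind the first row vmstat prints is an average since boot, so the later rows say more about what is happening right now.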

how is the disk system configured?  any raid?  when you have any kind of database (mysql, microsoft sql/exchange), it's always good to have the database on a raid 10 for best performance

the i/o wait is a concern though: it (wa column on the right from vmstat output and on the cpu line in top) is hovering around the mid/upper 20's, which means the system is busy but spending a lot of time waiting on the disk.  i see that file copy is still running so it looks like that could be the bottleneck competing for disk resources

what is that statistics process in top?  the process state shows D (i/o wait) so wondering if that's a php script?
0
slubekCommented:
1. You can see your system load average in the first line of top. It can also be viewed with the uptime command.
2. From your top output I see two processes (PID 12617 and 16104) in D (uninterruptible sleep) state - they are probably responsible for slowing your system. Try to kill -9 them.
0
Seth SimmonsSr. Systems AdministratorCommented:
if a linux process is in state D (uninterruptible sleep, normally waiting on i/o), kill -9 (sigkill) won't take effect until the i/o it is blocked on completes
if the process was in R, S, or T state then kill -9 would work
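
To see which processes are in state D without reading through top, something like the following should work.  It is a sketch: the sample text is made up (the pairing of the PIDs from this thread with the statistics and cp commands is a guess), and uninterruptible and live_scan are hypothetical names.

```python
import subprocess


def uninterruptible(ps_output):
    """Return (pid, command) pairs whose STAT field starts with 'D'."""
    found = []
    for line in ps_output.strip().splitlines()[1:]:  # [1:] skips the header row
        pid, stat, comm = line.split(None, 2)
        if stat.startswith("D"):
            found.append((int(pid), comm))
    return found


def live_scan():
    """Run ps on this machine and report its D-state processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,stat,comm"],
        capture_output=True, text=True, check=True,
    )
    return uninterruptible(result.stdout)


# Made-up sample in the shape of `ps -eo pid,stat,comm` output.
SAMPLE = """\
  PID STAT COMMAND
 2920 Sl   mysqld
12617 D    statistics
16104 D+   cp
"""
print(uninterruptible(SAMPLE))
```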
0
OmniUnlimitedAuthor Commented:
@seth2740:

in phpMyAdmin if you click processes, you can see if there are any queries running

Wow, I never noticed that link.  Thanks!  As of this writing, this is the only thing I see when I click on that link (db names have been changed):

Show Full Queries 	ID 	User 	Host 	Database 	Command 	Time 	Status 	SQL query
Kill 	183090 	example_dbuser 	localhost 	example_db 	Sleep 	20 	--- 	---
Kill 	183095 	example_dbuser 	localhost 	example_db 	Sleep 	14 	--- 	---
Kill 	183118 	example_dbuser 	localhost 	None 	Query 	0 	--- 	SHOW PROCESSLIST 

i'm assuming the mysql database is on sda3 since that's where the majority of the i/o is

I would suspect you are right.

from the vmstat output, the first column (r) is the run queue.  if that number is ever higher than the total number of processing cores, then you have cpu bottleneck.

How do I find out how many processing cores I have available?

how is the disk system configured?  any raid?

sda and sdb are slave drives configured using standard linux partitioning and formatting.  To my knowledge, there is no raid.

the i/o wait is a concern though: it (wa column on the right from vmstat output and on the cpu line in top) is hovering around the mid/upper 20's, which means the system is busy but spending a lot of time waiting on the disk.  i see that file copy is still running so it looks like that could be the bottleneck competing for disk resources


Wow, I think you are absolutely right!  I am so glad you taught me those commands so I can monitor that more closely.

what is that statistics process in top?  the process state shows D (i/o wait) so wondering if that's a php script?

Actually, logging is enabled on this server (and we are monitoring a number of things) so I would suspect that statistics is simply completing the logging functions.

@slubek:

You can see your system load average in the first line of top. It can also be viewed with the uptime command.

Ok, so when I combine this with your last statement:

Load average from last 1, 5 and 15 minutes is your performance indicator.

How do I obtain this?  Do I run top -n 1 -b, then run it again after a minute, 5 minutes and 15 minutes, or is there a command that will automatically do this for me?

From your top output I see two processes (PID 12617 and 16104) in D (uninterruptible sleep) state - they are probably responsible for slowing your system. Try to kill -9 them.

The two processes to which you refer are the statistics process (which, as I explained above, is doing our logging) and the cp process (which, as I explained, is responsible for a massive transfer of files over to the slave drive).  If these two processes are truly responsible for the slowdown, then I'm afraid there is little I can do about that right now.  Both processes need to run to completion.
0
Seth SimmonsSr. Systems AdministratorCommented:
in top, if you press the number 1 it will expand/collapse the cpu info
by default it's collapsed so you see just the average of all of them; press 1 and it will expand to a line item for each core.  if, for example, you have 4 cores, you'll see 0, 1, 2 and 3
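
Outside of top, the count can also be read with nproc or from /proc/cpuinfo.  A small sketch of the same thing in Python follows; the sample text is a trimmed, made-up stand-in for /proc/cpuinfo and count_processors is a hypothetical name.  Note that these count logical processors, which can be more than the number of physical cores when hyper-threading is on.

```python
import os


def count_processors(cpuinfo_text):
    """Count the 'processor : N' entries in /proc/cpuinfo-style text."""
    return sum(
        1 for line in cpuinfo_text.splitlines() if line.startswith("processor")
    )


# Trimmed, made-up stand-in for the contents of /proc/cpuinfo on a 4-cpu box.
SAMPLE_CPUINFO = (
    "processor\t: 0\nmodel name\t: example cpu\n\n"
    "processor\t: 1\nmodel name\t: example cpu\n\n"
    "processor\t: 2\nmodel name\t: example cpu\n\n"
    "processor\t: 3\nmodel name\t: example cpu\n"
)
print(count_processors(SAMPLE_CPUINFO))

# On the machine itself (may return None if the count cannot be determined):
print(os.cpu_count())
```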
0
OmniUnlimitedAuthor Commented:
@seth2740:

Cool!  There are 4 processors.  I didn't know if you were interested in the breakdown or not, but here it is:

Cpu0  : 40.5%us, 59.5%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni, 96.7%id,  3.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  1.7%sy,  0.0%ni,  2.7%id, 95.3%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu3  :  9.6%us,  1.0%sy,  0.0%ni, 87.1%id,  2.0%wa,  0.3%hi,  0.0%si,  0.0%st

So from what I am seeing, the run queue is not exceeding the number of processors.  I really think the I/O is the problem.
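
For anyone who wants to pull the wait figures out of that breakdown automatically, here is a rough sketch.  It parses the four Cpu lines exactly as pasted above; the regular expression is an assumption about this particular top layout and may need adjusting for other versions.

```python
import re

# The per-cpu lines from top, as pasted above.
TOP_CPU_LINES = """\
Cpu0  : 40.5%us, 59.5%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni, 96.7%id,  3.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  1.7%sy,  0.0%ni,  2.7%id, 95.3%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu3  :  9.6%us,  1.0%sy,  0.0%ni, 87.1%id,  2.0%wa,  0.3%hi,  0.0%si,  0.0%st
"""

WA_PATTERN = re.compile(r"^(Cpu\d+)\s*:.*?([\d.]+)%wa")


def iowait_by_cpu(text):
    """Return {cpu name: %wa} for every per-cpu line found in the text."""
    result = {}
    for line in text.splitlines():
        match = WA_PATTERN.match(line)
        if match:
            result[match.group(1)] = float(match.group(2))
    return result


wa = iowait_by_cpu(TOP_CPU_LINES)
worst_cpu = max(wa, key=wa.get)
mean_wa = sum(wa.values()) / len(wa)
print(worst_cpu, wa[worst_cpu], round(mean_wa, 2))
```

One cpu is almost entirely in i/o wait while the mean across the four works out to roughly 25%, which lines up with the wa column in the vmstat output.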
0
Seth SimmonsSr. Systems AdministratorCommented:
yes i was leaning towards that also but wanted to rule out any cpu bottleneck

if you were using raid with striping across 3+ disks, you would see a performance improvement in i/o so it makes more sense seeing those kind of i/o wait numbers when everything is only on 1 drive
0
OmniUnlimitedAuthor Commented:
@seth2740:

if you were using raid with striping across 3+ disks, you would see a performance improvement in i/o so it makes more sense seeing those kind of i/o wait numbers when everything is only on 1 drive

I agree.  I will need to see if we can upgrade the system to RAID.  So, do you have enough information to definitively say that it is an I/O problem?
0
Seth SimmonsSr. Systems AdministratorCommented:
considering everything is running from 1 drive, i/o wait is steadily high from both mysql activity and a file copy, yes

as far as putting in a raid solution, better to go with hardware raid with a decent controller.  whether you made changes to that system or go with a different system you will need to reinstall everything.  you could also convert to a virtual machine if that is an option and you aren't able to reinstall the web application and restore mysql database.  there are a couple ways you could go with this
0

slubekCommented:
How do I obtain this?
uptime and top show three numbers after "load average:". First is 1 minute, second - 5 minutes and third - 15 minutes average.
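
If you want those three numbers in a script rather than by eye, a sketch like this would do it.  The uptime line below is made up for illustration and parse_load_average is a hypothetical name; on Linux the same values are also available from os.getloadavg().

```python
import os
import re


def parse_load_average(line):
    """Extract the 1, 5 and 15 minute load averages from an uptime/top line."""
    match = re.search(
        r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", line
    )
    if match is None:
        raise ValueError("no load average found in: %r" % line)
    return tuple(float(value) for value in match.groups())


# Made-up uptime line, only to show the layout.
SAMPLE_UPTIME = " 14:02:11 up 12 days,  3:41,  2 users,  load average: 2.31, 2.10, 1.95"
one_min, five_min, fifteen_min = parse_load_average(SAMPLE_UPTIME)
print(one_min, five_min, fifteen_min)

# Straight from the kernel, where the platform supports it (Unix only):
if hasattr(os, "getloadavg"):
    print(os.getloadavg())
```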
0
OmniUnlimitedAuthor Commented:
@seth2740:

whether you made changes to that system or go with a different system you will need to reinstall everything.

Hmm.  Sounds to me like it might be better to start with a new server with raid hardware and controller and just transfer what we need to that, rather than trying to reconfigure what we currently have.

@slubek:

uptime and top show three numbers after "load average:". First is 1 minute, second - 5 minutes and third - 15 minutes average.

Well then, you already got the requested information.
0
OmniUnlimitedAuthor Commented:
I've requested that this question be closed as follows:

Accepted answer: 250 points for seth2740's comment #a39571969
Assisted answer: 250 points for Barthax's comment #a39566118
Assisted answer: 0 points for OmniUnlimited's comment #a39571979

for the following reason:

@Barthax: half the points go to you, because you did actually answer my original question.  Sorry I could not give all the points to you, but the expert help I received after you answered merits some sort of reward.

@seth2740: Your expertise has helped teach me a lot of valuable things about server technology.  I truly appreciate your willingness to share and to help resolve the situation that spawned the original question in the first place, i.e. the slowdown of our server.  Thank you very much.

And thanks to all the experts who participated.  You have truly helped to make EE what it is today, a place where you can get real life answers from real live experts.

Best Regards to all.

Jason
0
OmniUnlimitedAuthor Commented:
@Barthax: half the points go to you, because you did actually answer my original question.  Sorry I could not give all the points to you, but the expert help I received after you answered merits some sort of reward.  Also you did not answer my follow up question in comment ID: 39566211.

@seth2740: Your expertise has helped teach me a lot of valuable things about server technology.  I truly appreciate your willingness to share and to help resolve the situation that spawned the original question in the first place, i.e. the slowdown of our server.  Thank you very much.

And thanks to all the experts who participated.  You have truly helped to make EE what it is today, a place where you can get real life answers from real live experts.

Best Regards to all.

Jason
0
BarthaxCommented:
Thank you, Jason.  I expected the points to go elsewhere given the depth of answers given and I wouldn't have complained. ;)

A minor point I'd like to pick up on now that the question is closed. If you kill updatedb, you will have an orphaned temporary file in /var/lib/mlocate.  When updatedb is not running the directory should have just the one file: mlocate.db.  Killing updatedb will leave behind a mlocate.db.<temp> file which should be deleted.  Given that these files can be many hundreds of MB, killing it often is not a wise decision.  If you find yourself killing it often, instead just amend your cron.daily directory by moving the mlocate file to cron.weekly or cron.monthly.
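
A quick way to spot such leftovers is sketched below.  It only lists candidates and deletes nothing; leftover_mlocate_files is a made-up name, the demo runs against a throwaway directory with an invented temp file name, and reading the real /var/lib/mlocate will probably need root.  Only run it against the real directory when updatedb is not running.

```python
import os
import tempfile


def leftover_mlocate_files(directory="/var/lib/mlocate"):
    """List files named mlocate.db.<something>, i.e. everything except mlocate.db itself."""
    return sorted(
        name for name in os.listdir(directory) if name.startswith("mlocate.db.")
    )


# Demo against a throwaway directory so nothing on the real system is touched.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("mlocate.db", "mlocate.db.abc123"):  # second name is invented
        open(os.path.join(demo_dir, name), "w").close()
    demo_result = leftover_mlocate_files(demo_dir)

print(demo_result)
```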
0
OmniUnlimitedAuthor Commented:
Thanks Barthax, I've never been a fan of killing processes without knowing exactly what killing them would mean.  I appreciate the info.
0
PortletPaulfreelancerCommented:
at http:#a39568642 a URL was removed. Links to competitive question and answer sites are against the terms and conditions of Experts Exchange, and there is an increased monitoring for this underway.

Please avoid using such links.

Thanks,
PortletPaul
as Topic Advisor
0