  • Status: Solved
  • Priority: Medium
  • Security: Public

Apache gets slow until restart - why?

We are running a web app written in Perl on Apache 2.0, on Linux RHEL4, within an intranet.  Users complained that the app was slower than usual.  After testing everything from the network to the server drives, we found that the solution is to restart Apache.  Every time users report performance problems and we restart Apache, they say the app is very fast again for a day or so.

How would you recommend that we begin our search to figure out why restarting Apache improves performance?

Thanks!

spiroc
Asked by: cspiro
 
AdamsConsulting Commented:
I would recommend doing an strace on the apache process. Do a ps command to find the PID of an apache child process that is "slow" and then do an strace on that PID with:

strace -p [pid]

This should allow you to see which system calls are taking a long time. Compare this output with an strace captured while everything is functioning correctly.

If you don't have strace available, install it with:

up2date strace
 
cspiro (Author) Commented:
In this case, 'slow' means that a web app screen takes 6 seconds to load versus 1 second.  So, by the time we would try to run an strace on a pid it would be gone.

Is there a more generic way to output straces of all active pages served by apache over, let's say, a 5-minute period and then compare between a slow and a fast period?
 
AdamsConsulting Commented:
You could try an strace -f on the parent apache process and dump stdout and stderr to a file. The -f has it follow children. I agree that having only 6 seconds to catch the problem is going to make it difficult to troubleshoot. :(
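
One way to get the fixed-length capture asked about above is to start the trace in the background and stop it after a set time. This is only a minimal sketch: the helper name run_for and the variable APACHE_PID are made up for illustration, and it assumes that strace detaches cleanly when it receives SIGTERM, which is worth verifying on your version.

```shell
# Run a command in the background, stop it after SECS seconds,
# and wait for it to exit. run_for is a hypothetical helper name.
run_for() {
    secs=$1
    shift
    "$@" &
    cmd_pid=$!
    sleep "$secs"
    # SIGTERM rather than SIGINT: non-interactive shells start
    # background jobs with SIGINT ignored.
    kill "$cmd_pid" 2>/dev/null
    wait "$cmd_pid"
}

# Example (APACHE_PID is a placeholder for the parent httpd PID):
# record five minutes during a slow period and five during a fast one.
#   run_for 300 strace -f -t -p "$APACHE_PID" -o slow.txt
#   run_for 300 strace -f -t -p "$APACHE_PID" -o fast.txt
```

The two files can then be compared side by side. The function returns the exit status of the stopped command, which is normally non-zero because it was terminated by a signal.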
 
AdamsConsulting Commented:
Also, did this just start happening, and what changed around the time that it started happening?
 
cspiro (Author) Commented:
Would you be able to show the command that would accomplish "strace -f on the parent apache process and dump stdout and stderr to a file"?  Thanks!
 
AdamsConsulting Commented:
ps -efww |grep httpd

You want the httpd process that has a parent process of 1 and likely running as root. For example:

root     30680     1  0 05:42 ?        00:00:00 /usr/local/apache/bin/httpd
apache   30681 30680  0 05:42 ?        00:01:13 /usr/local/apache/bin/httpd
apache   30697 30680  0 05:42 ?        00:01:11 /usr/local/apache/bin/httpd

You want the PID from the first process, as the third column (parent process) is 1.
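
The manual lookup above could also be scripted. A minimal sketch, assuming the server binary is named httpd and that ps accepts -eo pid,ppid,args (as procps does); the function name find_apache_parent is made up for illustration:

```shell
# Print the PID of every httpd process whose parent PID is 1.
# Reads "PID PPID COMMAND..." lines on stdin, so it can also be
# run against saved ps output.
find_apache_parent() {
    awk '$2 == 1 && $3 ~ /httpd$/ { print $1 }'
}

# Typical use:
#   ps -eo pid,ppid,args | find_apache_parent
```

With the example listing above, this would print 30680.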

Then use that pid to run your strace, and direct standard out, as well as standard error, to a log file:

strace -f -p 30680 > output.txt 2>&1

Let that run until the incident recurs. Then press CTRL-c to exit strace and read the output.txt file.

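Optionally, if you want the trace to survive your login session (for example, because your session keeps timing out), run it under screen. Here strace's -o option writes the trace to the file, because a shell redirect would apply to screen itself rather than to strace:

screen strace -f -p 30680 -o output.txt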

If you don't have screen installed, install it with:

up2date screen

If you use the method with screen, you can leave the session by typing:

CTRL-a CTRL-d

Then feel free to log out, and the process will still be running, attached to a virtual terminal. To get back to the session, type:

screen -r

Then once attached, follow the instructions above by pressing CTRL-c to exit the strace.

I know this sounds confusing but you can't go wrong if you just follow my instructions. :)
 
AdamsConsulting Commented:
I forgot that you'll want to add the -t parameter to strace to log a timestamp if you'll be reviewing the logs later instead of in real time:

strace -t -f -p 30680 > output.txt 2>&1
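
A long trace is tedious to read by eye. strace also has a -T option, which appends the time spent in each system call as <seconds> at the end of the line. The sketch below assumes the trace was recorded with -T added (which the commands above do not do) and filters for calls that took at least a given number of seconds. The function name slow_syscalls is made up for illustration.

```shell
# Print trace lines whose system call took at least MIN seconds.
# Assumes the trace was recorded with -T, so each completed call
# ends in "<seconds>". Usage: slow_syscalls MIN FILE
slow_syscalls() {
    min=$1
    awk -v min="$min" '
        match($0, /<[0-9]+\.[0-9]+>$/) {
            # Strip the angle brackets and compare numerically.
            secs = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (secs >= min) print
        }
    ' "$2"
}

# Example, after recording with:
#   strace -t -T -f -p 30680 -o output.txt
# list every call that took one second or longer:
#   slow_syscalls 1 output.txt
```

Lines marked <unfinished ...> carry no duration and are skipped by this filter.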
 
cspiro (Author) Commented:
The client told me that the system was slow, and I recorded the trace below. It seems normal to me.
Process 32177 attached - interrupt to quit
14:13:12 select(0, NULL, NULL, NULL, {0, 337000}) = 0 (Timeout)
14:13:12 waitpid(-1, 0xbffffb70, WNOHANG|WSTOPPED) = 0
14:13:12 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
14:13:13 waitpid(-1, 0xbffffb70, WNOHANG|WSTOPPED) = 0
14:13:13 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
14:13:14 waitpid(-1, 0xbffffb70, WNOHANG|WSTOPPED) = 0
14:13:14 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
14:13:15 waitpid(-1, 0xbffffb70, WNOHANG|WSTOPPED) = 0
14:13:15 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
14:13:16 waitpid(-1, 0xbffffb70, WNOHANG|WSTOPPED) = 0
14:13:16 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
14:13:17 waitpid(-1, 0xbffffb70, WNOHANG|WSTOPPED) = 0
14:13:17 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
14:13:18 waitpid(-1, 0xbffffb70, WNOHANG|WSTOPPED) = 0
14:13:18 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
14:13:19 waitpid(-1, 0xbffffb70, WNOHANG|WSTOPPED) = 0
14:13:19 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
14:13:20 waitpid(-1, 0xbffffb70, WNOHANG|WSTOPPED) = 0
14:13:20 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
14:13:21 waitpid(-1, 0xbffffb70, WNOHANG|WSTOPPED) = 0
14:13:21 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
14:13:22 waitpid(-1, 0xbffffb70, WNOHANG|WSTOPPED) = 0
14:13:22 select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
Process 32177 detached

How would you interpret this?
 
AdamsConsulting Commented:
That looks like the strace from the parent process. I'm not sure why it didn't follow the children and show what they were doing. Did you forget to type the "-f" parameter?
 
cspiro (Author) Commented:
I am including the -f parameter.  man strace shows that -f is the correct parameter for child processes but the output is what I showed you above.

However, I did notice something unusual.  The first apache process had a pid of 476, whereas all the others started after 19480.  When I straced only that pid I got:
22:22:09 semop(9928711, 0xb7e42b44, 1 <unfinished ...>

Is that a lingering stuck pid?

Am I fishing too far??  Why isn't -f working??  Are we having fun yet??
 
cspiro (Author) Commented:
It turned out that hardware was failing and I/O was hiccupping.  I figured this out by running ps repeatedly in a program and seeing the pending jobs pile up while activity was still low.
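
The program itself was not posted, so the following is only a guess at the kind of sampling described. It assumes that the "pending jobs" show up as processes in uninterruptible sleep (state D in ps, usually meaning they are waiting on I/O). The function names count_blocked and sample_blocked are made up for illustration.

```shell
# Count processes whose state starts with D (uninterruptible
# sleep, usually waiting on I/O).
# Reads "STAT PID COMMAND..." lines on stdin.
count_blocked() {
    awk '$1 ~ /^D/ { n++ } END { print n + 0 }'
}

# Sample every INTERVAL seconds, COUNT times, printing a
# timestamp and the number of blocked processes.
sample_blocked() {
    interval=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s %s\n' "$(date '+%H:%M:%S')" \
            "$(ps -eo stat,pid,args | count_blocked)"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example: one sample per second for five minutes.
#   sample_blocked 1 300 > blocked.log
```

A count that keeps climbing while request volume is low would point at storage rather than at Apache or the Perl app.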