Solved

Flat out server looks idle to me

Posted on 2006-06-30
Last Modified: 2013-12-16
I have a couple of multi-threaded processes supposedly going flat out processing data, but top seldom shows anything to indicate that the processes are CPU bound.

Here's a typical display from top:
--------8<--------
11:45:25  up 21 days, 23:09,  3 users,  load average: 1.51, 1.65, 1.63
145 processes: 144 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states:   7.0% user  24.1% system    0.0% nice   0.0% iowait  68.2% idle
CPU1 states:   3.3% user   2.3% system    0.0% nice   1.3% iowait  91.4% idle
CPU2 states:   4.1% user  15.2% system    0.0% nice   1.4% iowait  78.1% idle
CPU3 states:   8.4% user   2.3% system    0.0% nice  20.4% iowait  67.1% idle
Mem:  8308812k av, 8006488k used,  302324k free,       0k shrd,  101100k buff
      3672472k active,            4107008k inactive
Swap: 2097136k av,       0k used, 2097136k free                 5340736k cached
--------8<--------

I guess that means they are I/O bound. What should I be looking at to confirm?
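One way to check this beyond watching top is to sample the kernel's CPU counters directly over a longer interval. The following is only a minimal Python sketch and not something from the thread: it assumes the aggregate `cpu` line in /proc/stat carries iowait as its fifth counter (presumably the case here, since top displays iowait), and the function names are invented for illustration.

```python
import time

# The first five counters on the aggregate "cpu" line, in kernel order.
FIELDS = ("user", "nice", "system", "idle", "iowait")

def parse_cpu_line(stat_text):
    """Return the aggregate CPU counters (in jiffies) from /proc/stat text."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            # Assumption: a kernel without iowait simply has fewer columns,
            # so missing counters are treated as zero.
            values += [0] * (len(FIELDS) - len(values))
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")

def cpu_shares(before, after):
    """Percentage of time per state between two samples.

    Only the five states in FIELDS are counted, so the shares are
    relative to their sum (irq/softirq time is ignored).
    """
    deltas = {k: after[k] - before[k] for k in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("counters did not advance between samples")
    return {k: 100.0 * d / total for k, d in deltas.items()}

def sample(interval=5.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and return the shares."""
    with open(path) as f:
        first = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_cpu_line(f.read())
    return cpu_shares(first, second)
```

Calling `sample()` on the server while the daemons are busy gives the averaged picture: a large iowait share alongside low user time would point towards local disk waits, while low iowait with high idle would suggest the processes are waiting on something else, such as the network.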
Question by:rstaveley
5 Comments
 
LVL 22

Expert Comment

by:pjedmond
ID: 17016951
You need to give more information on the type of process concerned. Then start looking at 'bandwidths' - a serial cable used in a communication process will almost certainly be the limiting factor.

I have a file server with a 100M ethernet cable - Copying files off the server to other servers will never get the CPU above about 20% (1.4GHz AMD Athlon).

In order to get more processing out of the server, I added a second ethernet card and gave it another IP in order to increase the bandwidth available (or I could have added a 1G card if the wiring was capable of coping with it).

(   (()
(`-' _\
 ''  ''
 
LVL 17

Author Comment

by:rstaveley
ID: 17017337
Thanks for the response, pjedmond. The LAN could be the issue.

One of the processes is doing a substantial amount of I/O over NFS. The other processes all do local disk I/O.

The processes essentially do a lot of data extraction and conversion from data files. There is a daemon written in C++ and a Java application running as a daemon.

(1) The C++ daemon accesses NFS. There is fairly lightweight encryption (Blowfish) and heavy-weight compression (BZip2).

(2) The Java daemon doesn't touch NFS, but gets the C++ daemon to fetch data for it. The Java daemon does a lot of analysis on the data and ultimately indexes information from it. [It is a Lucene application.]

The slowness is experienced in the inter-operation between one of the Java daemons and the C++ daemon. I haven't profiled it to find out which one is dragging its feet. I was hoping to get some sense from looking at system information available on the server.

There are two NICs in there. There's no good reason why we shouldn't be using a Gigabit NIC.

Is there something in the /proc which would tell me what NICs I have?

Is there a test I can perform easily to confirm that the NIC is the bottle-neck?
 
LVL 17

Author Comment

by:rstaveley
ID: 17017536
BTW... I see high numbers in nfsstat, but don't really understand how to interpret them.

--------8<--------
root@gse-mta-10:~# nfsstat -c
Client rpc stats:
calls      retrans    authrefrsh
597186811   697404     0      
Client nfs v2:
null       getattr    setattr    root       lookup     readlink  
0       0% 118888629 19% 70873   0% 0       0% 79509541 13% 0       0%
read       wrcache    write      create     remove     rename    
12252239  2% 0       0% 312135918 52% 25473775  4% 22108377  3% 0       0%
link       symlink    mkdir      rmdir      readdir    fsstat    
0       0% 0       0% 3238795  0% 3233045  0% 20275618  3% 1       0%
--------8<--------
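The raw counters are easier to read as ratios. As a hedged illustration (the helper name is made up, and what counts as "too many" retransmissions is left as a judgment call), this Python sketch works out the retransmission rate and the operation mix from the figures posted above:

```python
def nfs_client_summary(calls, retrans, op_counts):
    """Summarise nfsstat -c counters: retransmission rate and per-op share."""
    if calls <= 0:
        raise ValueError("calls must be positive")
    total_ops = sum(op_counts.values())
    shares = {op: 100.0 * n / total_ops for op, n in op_counts.items()}
    # Sort so the most frequent operations come first.
    ordered = dict(sorted(shares.items(), key=lambda kv: kv[1], reverse=True))
    return {"retrans_pct": 100.0 * retrans / calls, "op_share_pct": ordered}

# Counters copied from the nfsstat -c output above (zero-count ops omitted).
OPS = {
    "getattr": 118888629, "setattr": 70873, "lookup": 79509541,
    "read": 12252239, "write": 312135918, "create": 25473775,
    "remove": 22108377, "mkdir": 3238795, "rmdir": 3233045,
    "readdir": 20275618, "fsstat": 1,
}

summary = nfs_client_summary(calls=597186811, retrans=697404, op_counts=OPS)
print(f"retransmitted: {summary['retrans_pct']:.3f}% of RPC calls")
for op, pct in summary["op_share_pct"].items():
    print(f"{op:>8}: {pct:5.1f}%")
```

By these counters roughly 0.12% of RPC calls were retransmitted, and a little over half of all operations were writes, with getattr and lookup making up most of the rest. Note that the counters are cumulative since the client was booted or the stats were reset, so they describe the long-run mix rather than the current load.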
 
LVL 22

Accepted Solution

by:
pjedmond earned 125 total points
ID: 17017539
ifconfig

will tell you that you have NICs and that they are running and what they are configured as.

dmesg | grep eth

should give you the names of your drivers.

To test if the NIC is the bottleneck, try creating more network traffic - does performance get worse? (Note that processing may increase in order to cope with more timeouts.) Alternatively, start a huge scp process from that server to another - again, does performance get worse? And again, note that processing may increase.

Look at the 'nice' settings for the processes concerned. Perhaps you could give them a higher priority?
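A complementary check is to measure how close each interface already runs to its line rate. /proc/net/dev lists every interface with cumulative byte counters, so it also answers the question of what in /proc shows the NICs. The Python sketch below is an illustration under assumptions: the function names are hypothetical, the 100 Mbps default has to be set to the real link speed, and the counters are assumed not to wrap between the two samples.

```python
def parse_net_dev(text):
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 9:
            continue
        # Field 0 is received bytes; field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters

def utilisation(before, after, seconds, link_mbps=100.0):
    """Percent of link capacity used per interface as (rx_pct, tx_pct).

    Assumes the byte counters did not wrap between the two samples.
    """
    capacity_bits = link_mbps * 1_000_000 * seconds
    result = {}
    for name, (rx1, tx1) in after.items():
        if name not in before:
            continue
        rx0, tx0 = before[name]
        result[name] = (100.0 * 8 * (rx1 - rx0) / capacity_bits,
                        100.0 * 8 * (tx1 - tx0) / capacity_bits)
    return result
```

To use it, read /proc/net/dev, wait a known number of seconds while the daemons are working, read it again, and pass both parsed snapshots to `utilisation()`. Sustained figures near 100% in either direction would suggest the link is saturated; much lower figures would point the search elsewhere.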

man nice

for more details.

Does getting the C++ process to access a local NFS speed things up? That would again eliminate (some of) the potential network bottlenecks.

Hopefully some of the above ramblings will be of use?

(   (()
(`-' _\
 ''  ''
 
LVL 17

Author Comment

by:rstaveley
ID: 17017666
I see 100 Mbps from dmesg.

> Hopefully some of the above ramblings will be of use?

Yes they are. Especially:

> ...try creating more network traffic
> Does getting the C++ process to access a local NFS speed things up?

I'll set up some experiments.