Printing From Our AIX Server Is Slow

laurin1
We have an older IBM AIX (4.3) server that has suddenly become very slow when printing larger files. My team is very short on the skill set to manage this server (we're a Windows shop), so we're not even sure where to start. Can someone point us in the right direction on what to look for? We are printing to HP 4345 MFCs. Everything was working just fine until a few weeks ago.
Most Valuable Expert 2013
Top Expert 2013

Commented:
Hi,

it seems that the AIX server is overloaded in some respect.

Please log in to that machine and issue

uptime

Check "load average". Are the values elevated (greater than number of CPUs or even more)?
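
The load check above can be sketched in shell. This is only an illustration: load_is_high is a hypothetical helper name, and the sample line reuses the three load values reported later in this thread, with the leading part of the uptime line elided.

```shell
#!/bin/sh
# Sketch: compare the 1-minute load average with a CPU count.
# load_is_high is a hypothetical helper, not an AIX command.
load_is_high() {
    # $1 = one line of uptime output, $2 = number of CPUs
    echo "$1" | awk -v cpus="$2" '{
        sub(/.*load average: */, "")    # keep only the three averages
        split($0, avg, /, */)           # avg[1] is the 1-minute value
        if (avg[1] + 0 > cpus + 0) print "high"; else print "ok"
    }'
}

# Load values as reported later in this thread; the rest of the line is elided
sample="... load average: 0.02, 0.07, 0.12"
load_is_high "$sample" 1    # prints "ok"
```

On the server itself, something like load_is_high "$(uptime)" "$(lsdev -Cc processor | wc -l)" should work, assuming lsdev lists one line per processor.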

Next:

vmstat 1 10
You will get 10 lines of data. Look at the values under "us" and "sy", which are the CPU percentages spent in user processes and kernel processes, respectively. Are any of the values high (beyond 20 - 30) in more than 3-4 consecutive lines?
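
A rough way to automate that reading, assuming the 17-column vmstat layout that appears later in this thread ("us" in column 14, "sy" in column 15). busy_runs is a hypothetical helper name, and the thresholds are just the ones suggested above.

```shell
#!/bin/sh
# Sketch: report whether "us" or "sy" stays above a limit for several
# consecutive vmstat samples. busy_runs is a hypothetical helper.
busy_runs() {
    # stdin = vmstat output, $1 = percent limit, $2 = consecutive rows needed
    awk -v limit="$1" -v need="$2" '
        NF == 17 && $1 ~ /^[0-9]+$/ {       # data rows only, headers skipped
            if ($14 + 0 > limit + 0 || $15 + 0 > limit + 0) run++
            else run = 0
            if (run > longest) longest = run
        }
        END {
            if (longest + 0 >= need + 0) print "sustained"
            else print "ok"
        }
    '
}

# Sample input: the single vmstat row posted later in this thread
busy_runs 30 3 <<'EOF'
kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  2 44945  3002   0   0   0  56  137   0 148 3668  72  3  2 94  1
EOF
```

For the sample row this prints "ok". On a live system the input would come from vmstat 1 10 | busy_runs 30 3.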

Issue

ping hostname

against a well-known server hostname in your network. What is the value for "time=..."
Something beyond 50-200 ms?

If any of the above is true, first reboot the server soon and see if it gets better.

If all seems OK, issue

df
 
 and examine the columns under the headings "Free" and "%Used". Is "Free" very low somewhere (particularly for /var), or is "%Used" near 100 somewhere?

If so, clean up the concerned filesystem. Printing needs much free space under /var!
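
As a sketch of that check, the following filters df output for filesystems at or above a chosen %Used. It assumes the AIX df layout posted later in this thread (%Used in column 4, mount point in column 7); full_filesystems and sample_df are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: list mount points whose %Used is at or above a limit.
full_filesystems() {
    # stdin = df output, $1 = percent limit
    awk -v limit="$1" 'NR > 1 {
        used = $4
        sub(/%/, "", used)              # strip the percent sign
        if (used + 0 >= limit + 0) print $7 " " used "%"
    }'
}

# The df output posted later in this thread, used as sample input
sample_df() {
cat <<'EOF'
Filesystem    512-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4          131072     35776   73%     1701     6% /
/dev/hd2         1572864    399344   75%    16368     9% /usr
/dev/hd9var       131072     77056   42%      402     3% /var
/dev/hd3          393216    378416    4%      651     2% /tmp
/dev/hd1          393216    137448   66%      931     2% /home
/dev/hd7rmi     24641536   4430760   83%    13085     1% /rmi
EOF
}

sample_df | full_filesystems 80    # prints "/rmi 83%"
```

On the server, df | full_filesystems 80 would apply the same filter to live data.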

Also try

lpstat -a

(did this exist under 4.3?)

Do your printers show up very slowly, one by one, or is the output fast?

Please report your findings, and we'll see.

wmp

Author

Commented:
I really don't think it's overloaded. It was working just fine one day, and then suddenly now it takes 40+ seconds to start each print job. I've checked the CPU load you were referring to, but I checked it again just now and it is

0.02, 0.07, 0.12

vmstat only returns one line of data:

# vmstat
kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  2 44945  3002   0   0   0  56  137   0 148 3668  72  3  2 94  1

No, the numbers are low (3 and 2).

Network is fine. Ping times are 0ms

We've restarted the server already, FYI.

lpstat -a  All printers show up quick.



Author

Commented:
Oh, for df, /var has 75% used. I'm doing some stuff to clean it up.

Author

Commented:
Here is the current DF output:

# df
Filesystem    512-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4          131072     35776   73%     1701     6% /
/dev/hd2         1572864    399344   75%    16368     9% /usr
/dev/hd9var       131072     77056   42%      402     3% /var
/dev/hd3          393216    378416    4%      651     2% /tmp
/dev/hd1          393216    137448   66%      931     2% /home
/dev/hd7rmi     24641536   4430760   83%    13085     1% /rmi

We know that /rmi is a bit full. It's holding a database that is growing slowly over time. We can't do much about that right now, but this problem began almost overnight, and that mount point is growing about 1% every 6 months.
Most Valuable Expert 2013
Top Expert 2013

Commented:
/rmi is not the problem.

Your /var filesystem which contains the spool directories (/var/spool ...)
has only 64 MB in total,  with (just now) roundabout 38 MB free space, and only 16 MB free when 75% utilized.
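
The figures above follow from the df numbers for /var (131072 blocks total, 77056 free), since 2048 blocks of 512 bytes make 1 MB. A small sketch of the arithmetic, with blocks_to_mb as a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: convert a count of 512-byte blocks to megabytes.
blocks_to_mb() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 2048 }'
}

blocks_to_mb 131072                      # total size of /var: 64.0
blocks_to_mb 77056                       # free right now: 37.6
blocks_to_mb $((131072 * 25 / 100))      # free at 75% used: 16.0
```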

Could it be that the average size of your print jobs has grown over time?  How many printers do you run? How many print jobs? Have these numbers grown?

I'd strongly suggest enlarging /var, by at least doubling its size.

If you don't know how to check for available space or how to increase the size of a filesystem, please let me know.

(Btw. I wrote vmstat 1 10 , which indeed would have given 10 lines of output).

wmp


Author

Commented:
vmstat 1 10 (oops)

ok, that still shows no number above 11 (only 1 that high), most are 0-3 (both us and sy)

Increase the size of the filesystem? You mean add disks or re-allocate space?

Author

Commented:
We have this problem even if I just print out the motd. Print job size has not changed. We use some very standard print jobs that have not changed in years.
Most Valuable Expert 2013
Top Expert 2013

Commented:
>> You mean add disks or re-allocate space? <<

Not really. I mean using free space possibly available in rootvg to extend /var.

Check freespace using

lsvg rootvg

Free space is shown at the right beneath "Free PPs:"

The command to increase /var would be

chfs -a size=+additional_512_byte_blocks  /var

Note the + sign! It's important.

1 Megabyte is 2048 512 byte blocks.

So let's say you could afford giving 64 MB of the free space shown to /var, issue

chfs -a size=+131072 /var

Should the + sign not work under 4.3 (it's a long time since I used it),
just add the existing blocks and the planned additional blocks and issue (again for 64 additional MB):

chfs -a size=262144 /var
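
To double-check the block counts before running chfs, the arithmetic can be scripted. This sketch only prints the two command variants and changes nothing; mb_to_blocks and new_size are hypothetical helper names, and 131072 is the current size of /var taken from the df output above.

```shell
#!/bin/sh
# Sketch: work out chfs size arguments in 512-byte blocks.
mb_to_blocks() {
    echo $(( $1 * 2048 ))                # 1 MB = 2048 blocks of 512 bytes
}

new_size() {
    # $1 = current size in blocks, $2 = megabytes to add
    echo $(( $1 + $(mb_to_blocks "$2") ))
}

# Print the commands for review instead of executing them
echo "chfs -a size=+$(mb_to_blocks 64) /var"
echo "chfs -a size=$(new_size 131072 64) /var"
```

The two printed lines match the relative and absolute forms given above.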

---------------------------------------------

40 seconds delay? That's a typical value when there are DNS problems. Do you run a Domain Name Server? Is it OK?
What is the response time of "nslookup hostname" ?

And to be sure there are no problems with the server hardware, issue

errpt | pg

and look for suspect entries, particularly those containing "P" under the "T" heading and "H" under the "C" heading.
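
If the log is long, the summary can be filtered instead of paged. The sketch below keeps rows with "P" in the T column or "H" in the C column, which is one reading of the advice above (either condition is treated as suspect). suspect_entries and sample_errpt are hypothetical helper names, and the sample rows are the ones posted later in this thread.

```shell
#!/bin/sh
# Sketch: keep errpt summary rows of type P or class H.
suspect_entries() {
    # Columns: IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
    awk '$3 == "P" || $4 == "H"'
}

# Summary rows posted later in this thread, used as sample input
sample_errpt() {
cat <<'EOF'
C60BB505   0126123910 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
369D049B   0121131510 I O SYSPFS         UNABLE TO ALLOCATE SPACE IN FILE SYSTEM
2BFA76F6   0119181510 T S SYSPROC        SYSTEM SHUTDOWN BY USER
9DBCFDEE   0119181710 T O errdemon       ERROR LOGGING TURNED ON
192AC071   0119181310 T O errdemon       ERROR LOGGING TURNED OFF
C60BB505   0105170610 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
EOF
}

sample_errpt | suspect_entries    # prints the two C60BB505 rows
```

On the server the equivalent would be errpt | suspect_entries.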

wmp











Author

Commented:
So,  this says:

 TOTAL PPs:      1084 (69376 megabytes)
 FREE PPs:       996 (63744 megabytes)
 USED PPs:       88 (5632 megabytes)

Does that really mean we are only using 5 of our 64GB???

Author

Commented:
increased /var by three times (128MB.) df reports 16% used on /var now.

nslookup is fast. Printers are pointed to IP addresses, FYI, not hostnames.

I had actually been looking at the log. I believe the abnormal termination is something we've had all along (the software vendor has been regularly going in and removing core dump files because of it).

I am concerned about this unable to allocate space error. Did some looking for it, and can't find a solution or a cause:

C60BB505   0126123910 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
369D049B   0121131510 I O SYSPFS         UNABLE TO ALLOCATE SPACE IN FILE SYSTEM
2BFA76F6   0119181510 T S SYSPROC        SYSTEM SHUTDOWN BY USER
9DBCFDEE   0119181710 T O errdemon       ERROR LOGGING TURNED ON
192AC071   0119181310 T O errdemon       ERROR LOGGING TURNED OFF
C60BB505   0105170610 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED

Most Valuable Expert 2013
Top Expert 2013

Commented:
Yes, you've been using less than 6 GB of rootvg. I hope that's not bad news!

As for the errorlog entry:

Please examine the full message. Issue:

errpt -a -j 369D049B

Btw. this error occurred on January 21, 2010.
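
That date comes from the TIMESTAMP column of the errpt summary, which is laid out as MMDDhhmmyy. A minimal sketch of the decoding, where decode_errpt_ts is a hypothetical helper name and the century is assumed to be 20xx:

```shell
#!/bin/sh
# Sketch: turn an errpt summary timestamp (MMDDhhmmyy) into a readable date.
# Assumes two-digit years belong to the 2000s.
decode_errpt_ts() {
    echo "$1" | sed 's/^\(..\)\(..\)\(..\)\(..\)\(..\)$/20\5-\1-\2 \3:\4/'
}

decode_errpt_ts 0121131510    # prints "2010-01-21 13:15"
```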

If you're interested, examine your "Abnormal termination" problem with errpt -a -j C60BB505  
You should be able to locate the name of the affected executable there.

And your printing delays - do they happen all the time or intermittently? Are all printers affected?

This is going to get tricky, I fear.

wmp





Author

Commented:
Oh, yea, I'd looked at the detailed report as well.

---------------------------------------------------------------------------
LABEL:          JFS_FS_FULL
IDENTIFIER:     369D049B

Date/Time:       Thu Jan 21 13:15:52
Sequence Number: 108320
Machine Id:      000985FA4C00
Node Id:         rehabmgrnew
Class:           O
Type:            INFO
Resource Name:   SYSPFS

Description
UNABLE TO ALLOCATE SPACE IN FILE SYSTEM

Probable Causes
FILE SYSTEM FULL

        Recommended Actions
        USE FUSER UTILITY TO LOCATE UNLINKED FILES STILL REFERENCED
        INCREASE THE SIZE OF THE ASSOCIATED FILE SYSTEM
        REMOVE UNNECESSARY DATA FROM FILE SYSTEM

Detail Data
MAJOR/MINOR DEVICE NUMBER
000A 0007
FILE SYSTEM DEVICE AND MOUNT POINT
/dev/hd3, /tmp


The delays are consistent. Every single time Whether we are printing a whole page, or 3 lines of text.
Most Valuable Expert 2013
Top Expert 2013
Commented:
Well,
to clutch at any straw -
it might be that the /etc/qconfig.bin file is corrupted or fragmented.
Rename it for backup purposes - mv /etc/qconfig.bin /etc/qconfig.bin.save (No active printjobs present, if possible!)
then regenerate it with  qchk
You will now have a clean /etc/qconfig.bin, digested from your existing /etc/qconfig file.
Test it!
Next, how many printers do you run? Could you afford deleting the local queues (or at least one of them to test) at the AIX server and redefine them?
This would do some cleanup in /var/spool (particularly in /var/spool/lpd/pio).
Also try to get rid of possible garbage in /var/spool/qdaemon. Issue stopsrc -s qdaemon , delete everything under /var/spool/qdaemon, then issue startsrc -s qdaemon
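
The qconfig.bin and qdaemon steps above, collected into a dry-run sketch. By default it only prints each command; the run and reset_spooler helpers and the RUN switch are hypothetical, and the rm line assumes that "everything under /var/spool/qdaemon" means the plain files in that directory.

```shell
#!/bin/sh
# Sketch: the spooler cleanup steps as a dry run.
# Commands are printed unless RUN=1 is set, which should only be done
# on the AIX server and preferably with no active print jobs.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        eval "$1"
    else
        echo "would run: $1"
    fi
}

reset_spooler() {
    run 'mv /etc/qconfig.bin /etc/qconfig.bin.save'   # keep a backup copy
    run 'qchk'                                        # regenerates qconfig.bin, per the steps above
    run 'stopsrc -s qdaemon'
    run 'rm -f /var/spool/qdaemon/*'                  # assumption: plain files only
    run 'startsrc -s qdaemon'
}

reset_spooler
```

Reviewing the printed commands first, and testing a print job after the qchk step, keeps the two changes separable if only one of them helps.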
wmp

Author

Commented:
I'll try those, but maybe this points to something....

When I go into SMIT and choose Change / Show Print Queue Characteristics, choose the print queue name, then choose 1 - Printer Setup, I get

1820-037 An internal error or system error has occurred.  See the log file for further information.

Shouldn't I be able to see these? Also, I tried to add a print queue, and I choose Network Printer (is that correct?), then HP, then 5si, and then it asks me if I want to make this server a BOOTP/TFTP server (what does that have to do with printing?), I say No, and then I get

1800-106 An error occurred:

0782-626 Opening the ODM class "sm_cmd_hdr" failed (odmerrno is 5908).
Use local problem reporting procedures.

What is the proper way to configure a printer attached by IP address? Are these errors relevant?
Most Valuable Expert 2013
Top Expert 2013

Commented:
Yes,
these errors are relevant.
Could it be that you accidentally deleted the directories and files under /var/spool/lpd/pio? I assume so.
Try to restore the data from a backup.
Besides that, "Network Printer" is correct for HP JetDirect compatible printers. Since HP JetDirect cards can act as a DHCP client, you are given the choice to make your AIX system a DHCP server. If you give static IP addresses to your printers, there is no need for DHCP.
If your printers are not HP JetDirect compatible, choose "remote" as the attachment type, then continue with the choice for raw or filtered data, etc. You need to provide the hostname (IP) of the printer, and the name of its print queue (device).
wmp
 
 
 

Author

Commented:
No, they are HP 4345 MFCs. I did not delete anything under PIO. I just checked the copy I made of all these files before I began this, and it's the same as what's in there.

Author

Commented:
This was the solution to ADD print queues (won't be able to view current):

http://www.rootunix.org/AIX/smitprt.txt

Author

Commented:
Here's an interesting fact. I printed a test. On the test page, the time printed and the time queued are exactly the same. That means the delay is between the print command and the queuing process, correct?
Most Valuable Expert 2013
Top Expert 2013

Commented:
Where was the test page generated? If it's at the printer itself - of course time queued and time printed cannot differ much, as the printer does both.

Author

Commented:
No, from the server.

However, success!! Well, sort of.

I created a new print queue, and it works just fine (almost as soon as I send the job, I can hear the printer start.)

The problem is now, since I can't see the setup of the old ones, there are some configurations I don't know how to set. We have a "condensed" version of every queue configured. Meaning, we have lp11 as the standard queue and a "condensed" formatted version at lp12. Why can I not see the current setup? That has to be stored somewhere.
Most Valuable Expert 2013
Top Expert 2013

Commented:
You can look at /etc/qconfig.
Additional attributes are displayed using "lsvirprt". But that format is hard to read ...

Author

Commented:
/etc/qconfig does not exist.

Author

Commented:
never mind. I thought that was a directory.

Author

Commented:
That file does not show anything relevant to formatting. The other command does, but I'm not sure how to duplicate that.
Most Valuable Expert 2013
Top Expert 2013
Commented:
I fear you will have to recreate the queues "by hand" using the saved information from lsvirprt.
You can use
lsvirprt -q queue_name -d device_name > queue_name.defs     # or whatever name you like
to save the output

Author

Commented:
Ok, sweet. I can do that and use Winmerge to compare the two files. I think I can do this (crosses fingers.)
Most Valuable Expert 2013
Top Expert 2013

Commented:
Did you succeed?
wmp
 

Author

Commented:
Yes. Deleting and re-creating the print queue fixed the problem.
