APC Abends NW 3.12

Posted on 1997-12-09
Last Modified: 2012-06-21
Has anyone experienced any abends from the PowerChute software? Can they be caused by power fluctuations? Can a bad UPS cause software to crash? Any comments would be greatly appreciated. BTW, I have called APC and downloaded their so-called KB doc.
Question by:Eric B

Accepted Solution

by:
Zombite earned 100 total points
This is long and involved, but yes, APC software can cause abends.
From Novell:

Symptom

     Loading APCSMUPS.NLM (Powerchute Plus Smart-UPS monitor Version 4.00 January 1993)
     caused an abend on a NetWare 3.12 server with 16 MB RAM.

     The Abend message follows:

     Power Chute Plus Version4.00 January 1993

     Abend: Invalid Semaphore number passed to kernel
                    Power Chute Main process

     Loading SERVER -NA and then loading APCSMUPS.NLM still caused the same Abend.

     Copied SERVER.EXE from original disk and applied all latest patches but APCSMUPS.NLM
     would still abend.

     Solution

     After contacting APC they supplied the following information:

     The latest version of Powerchute Plus is 4.2.4 (at the time of this document's creation).
     Customers wishing to upgrade can quote part # AP9003-U, which may be ordered from us at
     800-800-4APC.

     Description: Preliminary Steps to Resolving NetWare 3.1x File Server Abends
     with PowerChute

     If you are experiencing a file server abend (abnormal end) error message with NetWare 3.1x while
     using PowerChute plus or PowerChute v/s for NetWare, it is necessary that some preliminary
     steps be performed or checked before the problem can be properly dealt with by APC Technical
     Support. This is regardless of whether the abend error occurs immediately when
     PowerChute's NLM is loaded at the file server or at random times long after the PowerChute NLM
     is loaded, either manually or via the file server's AUTOEXEC.NCF startup file.
     (Note: This document applies to NetWare 3.1x only.)

     What Is An Abend?
     When an abend error occurs, the NetWare file server has crashed and NetWare users no longer
     have access to the server's files or services. An abend occurs when program execution is halted
     abnormally. There are many abend messages, but the three most common are GPPE (General
     Protection Processor Exception), Page Fault Processor Exception, and NMI (Non-Maskable
     Interrupt).
     These three errors are all processor exceptions, meaning that they are generated by the
     processor. Novell's NetWare merely reports the message.

     The NetWare 3 operating system continually monitors the status of various server activities to
     ensure proper operation. If NetWare detects a condition that threatens the integrity of its
     internal data (such as an invalid parameter being passed in a function call, or certain hardware
     errors), it abruptly halts the active process and displays an "abend" message on the
     screen. ("Abend" is a computer science term signifying an ABnormal END of program.)
     The primary reason for abends in NetWare is to ensure the stability and integrity of the internal
     operating system data. For example, if the operating system detected invalid pointers to cache
     buffers and yet continued to run, data would soon become unstable or corrupted. Thus, an
     abend is NetWare's way of protecting itself -- and users -- against the predictable effects of data
     corruption.
     Abend messages are usually caused by consistency check errors, which are internal tests placed
     in the NetWare operating system by Novell software engineers. The primary function of
     consistency checks is to ensure the stability and integrity of internal operating system data.
     Abend errors can also be caused by insufficient server memory, DMA (Direct Memory Access)
     conflicts, or hardware and software interrupts.

     Preliminary Steps to Troubleshooting an Abend:
     An abend can be caused by either hardware or software. It is easier and cheaper to troubleshoot
     the software first. The steps listed here may solve the server abend problem, and may also prove
     to be valuable preventative maintenance that will avert other problems. The first section covers
     PowerChute and UPS related server configurations which may lead a NetWare file server to
     abend.

     PowerChute/UPS Related Checkpoint Steps:

     1 ) UPS Hardware/Software Incompatibility. If using PowerChute plus, ensure that you are not
     attempting to interface the software to a Smart-UPS v/s model in smart signaling mode (i.e. via
     the black interface cable with APC part # 940-0024B or 940-0024C). PowerChute plus is not
     compatible with a Smart-UPS v/s in smart signaling mode. PowerChute v/s (which is packaged
     with the Smart-UPS v/s units) must instead be used if you prefer the software to function in
     smart signaling mode.
     If using PowerChute v/s, ensure that you are not attempting to interface the software to a
     standard Smart-UPS model (any generation) or a Matrix-UPS model in smart signaling mode (i.e.
     via the black interface cable with APC part # 940-0024B or 940-0024C). PowerChute v/s is not
     compatible with a standard Smart-UPS or Matrix-UPS in smart signaling mode. PowerChute plus
     must instead be used if you prefer the software to function in smart signaling mode.
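
     The pairings in checkpoint 1 can be summarized as a small lookup. This is only an
     illustrative sketch: the table is read off the text above, and the dictionary and function
     names are hypothetical, not anything APC ships.

```python
# Smart signaling pairings as described in checkpoint 1.
# Names below are illustrative only; they are not APC identifiers.
SMART_SIGNALING_COMPATIBLE = {
    "PowerChute plus": {"Smart-UPS", "Matrix-UPS"},
    "PowerChute v/s": {"Smart-UPS v/s"},
}


def smart_signaling_ok(software: str, ups_model: str) -> bool:
    """Return True if the checkpoint describes this pairing as usable."""
    return ups_model in SMART_SIGNALING_COMPATIBLE.get(software, set())
```

     For example, smart_signaling_ok("PowerChute plus", "Smart-UPS v/s") is False, which is the
     first mismatch the checkpoint warns about.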

     2 ) File Server Name Length. Check the length of the file server's name. If you are running v4.2.x
     of PowerChute plus for NetWare or any version of PowerChute v/s for NetWare and the file
     server's name is 36 characters long or greater (up to the maximum 47 characters allowed for
     NetWare server names), then the server will immediately abend after PWRCHUTE.NLM (for
     PowerChute plus) or PCVS.NLM (for PowerChute v/s) loads. This is a bug with PowerChute plus
     and PowerChute v/s and will be corrected with a future release. The only resolution at this time is
     to reduce the file server's name to 35 characters or less.
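
     The name length rule in checkpoint 2 is simple enough to check mechanically. A minimal
     sketch, assuming you already know the server name as a string; the constant and function
     names are made up for illustration. It applies to v4.2.x of PowerChute plus and to any
     version of PowerChute v/s.

```python
# Limits taken from checkpoint 2: names of 36 to 47 characters trigger the abend.
MAX_SAFE_NAME_LENGTH = 35
MAX_NETWARE_NAME_LENGTH = 47


def server_name_is_safe(name: str) -> bool:
    """Return True if the name is short enough to avoid the described abend."""
    return len(name) <= MAX_SAFE_NAME_LENGTH
```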

     3 ) Communication Driver Rating. Ensure that if your file server is utilizing 16550 UART devices
     on the serial communication ports, the switch NOFIFO is used at the time that the
     AIOCOMX.NLM is loaded for the serial port that the UPS is utilizing. The presence of a 16550
     UART device can only be determined from the information echoed to the server console screen
     at the time that AIOCOMX.NLM is loaded. The following are sample load statements for
     AIOCOMX.NLM for both COM1 and COM2 on the NetWare file server:
     for COM1: LOAD AIOCOMX INT=4 PORT=3F8 NOFIFO <Enter>
     for COM2: LOAD AIOCOMX INT=3 PORT=2F8 NOFIFO <Enter>
     where NOFIFO stands for "no first-in first-out" and disables the buffering process utilized by
     high speed UART devices, along with reducing the maximum speed of the device from 19200 bps
     (bits per second) to 2400 bps. If you are not using a standard serial
     communication port and are rather interfacing the UPS to the file server via an expansion or
     multiport board, then it is crucial that you consult the documentation accompanying the board or
     the manufacturer of the board to determine if the board implements a high speed UART device
     and if so, to implement the method to disable FIFO on that device.
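
     If you keep server configuration notes in scripts, the two sample load statements from
     checkpoint 3 can be generated from a small table. This is a sketch only: the INT and PORT
     values are the ones quoted above, and the helper name is hypothetical.

```python
# INT and PORT values for the standard serial ports, as quoted in checkpoint 3.
COM_PORT_SETTINGS = {
    "COM1": (4, "3F8"),
    "COM2": (3, "2F8"),
}


def aiocomx_load_line(com_port: str) -> str:
    """Build the console load statement with FIFO buffering disabled."""
    interrupt, io_port = COM_PORT_SETTINGS[com_port]
    return f"LOAD AIOCOMX INT={interrupt} PORT={io_port} NOFIFO"
```

     The resulting line is what you would type at the server console or place in
     AUTOEXEC.NCF. Multiport boards are not covered by this table.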

     4 ) PowerChute File Attributes. Check the attributes of all PowerChute files in the file server's
     directory PWRCHUTE. Ensure that all of the files are flagged Rw (read write) and A (archive
     needed). To display the attributes of all of the PowerChute files, make PWRCHUTE the working
     directory and enter the command FLAG from the network drive prompt at the workstation.
     If any of the files are flagged Ro (read only), the flag must be changed.
     To do so, enter the following command from the network drive prompt at a NetWare workstation
     while the server's directory PWRCHUTE is the working directory:
     FLAG *.* RW A <Enter>
     If this step is needed, the PWRCHUTE.NLM (for PowerChute plus) or PCVS.NLM (for
     PowerChute v/s) must be unloaded and then reloaded at the file server's console prompt.

     5 ) A Moved PowerChute Directory. Check to ensure that the file server's directory PWRCHUTE
     has not been moved since installation without first unloading APCBKUPS.NLM or
     APCSMUPS.NLM (for v4.0.x or v4.1.x of PowerChute plus), PWRCHUTE.NLM (for v4.2.x of
     PowerChute plus), or PCVS.NLM (for PowerChute v/s). This could be a problem since a loaded
     PowerChute NLM expects the PWRCHUTE directory to be in a certain server location. If it
     cannot find this directory, server errors and even abends can occur. Most likely, if the directory
     was moved since installation, the PowerChute NLM will not properly load when the server
     reboots after the abend unless the load command for the PowerChute NLM in the file server's
     AUTOEXEC.NCF was also modified before the abend.
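
     Checkpoint 5 names a different NLM depending on the product and version. The mapping can
     be sketched as below. The function name is hypothetical, and the code assumes versions are
     written in dotted form such as "4.1.2"; other spellings are rejected rather than guessed.

```python
from typing import List


def powerchute_nlms(product: str, version: str = "") -> List[str]:
    """Return the NLM name(s) checkpoint 5 says to unload before moving PWRCHUTE."""
    if product == "PowerChute v/s":
        return ["PCVS.NLM"]
    if product == "PowerChute plus":
        if version.startswith(("4.0.", "4.1.")):
            # Either of these may be the one loaded, per the checkpoint.
            return ["APCBKUPS.NLM", "APCSMUPS.NLM"]
        if version.startswith("4.2."):
            return ["PWRCHUTE.NLM"]
    raise ValueError(f"No NLM listed for {product!r} version {version!r}")
```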

     6 ) Tape Backup Conflict. Was a tape backup running at the time of the abend? If so, specifically
     exclude the PWRCHUTE directory and its files from the tape backup. The reason for this is that
     at least one file (PWRCHUTE.LOG) and possibly two files (PWRCHUTE.LOG and
     PWRCHUTE.DAT when PowerChute plus is operating in smart signaling mode) are always open
     when either APCSMUPS.NLM or
     PWRCHUTE.NLM (for PowerChute plus) or PCVS.NLM (for PowerChute v/s) is loaded and
     running at the file server. Tape backup utilities can run into some problems when attempting to
     back up open files; this is why no users should be logged into the file server during a tape
     backup. Cheyenne ARCserve, a third party tape backup package, is notorious for these conflicts.
     Make sure that you also have the latest version of the tape backup software. If you are not sure
     of this, you should contact the manufacturer of the tape backup software product.

     7 ) Corrupt PowerChute. The PowerChute NLM (APCSMUPS.NLM, APCBKUPS.NLM or
     PWRCHUTE.NLM for PowerChute plus or PCVS.NLM for PowerChute v/s) or any of the files in
     the directory PWRCHUTE may have become corrupted. You should reinstall PowerChute plus or
     PowerChute v/s and load the newly installed PowerChute NLM at the server console.

     8 ) Third Party NLM Conflict. PowerChute may be conflicting with another NLM that is loaded at
     the file server. Unload all third party NLMs one at a time until the abends stop. You should
     then contact the manufacturer of any software under suspicion to ensure that you have the
     latest version of the NLM.

     9 ) Bad UPS Hardware (Smart Signaling Only). It is possible, though rare, that if any of the
     UPS-Link protocol generating components on the UPS's internal PCB have become
     defective, errors can occur if the PowerChute NLM receives corrupt code from the UPS. These
     errors can lead the server to abend. This can be suspected, though not confirmed, if the process
     mentioned in the abend message is the APC Polling process. The only resolution for this is to
     generate an RMA service order for the suspect UPS unit, although there is no definitive method
     to determine if the UPS signaling is faulty without extensive UPS-Link protocol troubleshooting.

     Non-PowerChute Related Operating System Checkpoint Steps:

     1 ) Update all LAN and Disk Drivers. Each manufacturer of LAN and disk cards must develop
     their own drivers. The only way to assure that you have the latest version of these drivers is to
     download them from the respective vendor. Even new hardware does not usually ship with the
     most current drivers. THIS STEP IS CRITICAL. You must be certain the drivers are the newest
     available from the respective vendor!!! Another part of this step is to have updated LAN
     support modules. These modules include MSM31?.NLM (where the ? represents the revision
     number or letter of the file) or MSM.NLM, ETHERTSM.NLM, and/or TOKENTSM.NLM (or any
     other TSM module that your system may require). You should also acquire the latest version of
     LANDR?.EXE (where the ? represents the revision number or letter of the file).

     2 ) Apply All the Latest NetWare Server OS Patches. There are known issues with the NetWare
     operating system and patches have been developed by Novell, Inc. to fix these issues. These
     patches may also solve other problems that they were not expected to address. The patches are
     each grouped into a compressed, self-extracting executable file. The files needed to obtain these
     patches are as follows:

     The file revision's number or letter (as of October, 1996 the latest patches were 310PT1.EXE and
     312PT9.EXE).

     The patches with the letters "PT" in the patch name are patches which have "passed testing" by
     Novell Laboratories.
     These files can also be found on Novell's NSE Pro CD-ROM, which is the Novell
     NetWare Support Encyclopedia CD-ROM.

     APC strongly recommends that you download and install these patches. As a general rule, if a
     user calls Novell Technical Support, problem resolution and/or escalation is not possible until
     that user has installed and applied the aforementioned patches.

     3 ) Recopy SERVER.EXE. File corruption can happen to any file, even to the SERVER.EXE file,
     which basically is the operating system for the NetWare file server. A corrupt SERVER.EXE can
     be difficult to track down. For this reason, it is easier to perform this step than to find out, after
     much troubleshooting, that a corrupt SERVER.EXE was the problem. If the corruption were only
     in server memory, the solution would be to down and exit the server and then power off the
     machine and turn it back on.
     Just in case the corruption has been written to disk, copy a fresh copy of SERVER.EXE from the
     original disks or from a write protected working copy.
     The same idea applies to any other file or files in the SYS:SYSTEM or SYS:PUBLIC directory that
     may have become corrupted.

     4 ) Update CLIB, STREAMS, and SPX Files. CLIB.NLM is a library of functions that many Novell
     and third party modules, including the PowerChute NLM, use to access the operating system
     functionality. Because of this, CLIB.NLM changes often. STREAMS.NLM works in conjunction
     with CLIB.NLM, but does not change as often.
     You should check and ensure that both of these modules are the current version. SPXS.NLM is
     used for much of the server to workstation communications. This NLM should also be updated
     to the current version. Patches are available from Novell, Inc. if updates to these files are needed.
     These patches are grouped into compressed, self-extracting executable files.
     The files needed in order to obtain these patches and their respective locations are as follows:
             
     /cgi-bin/search/download?/pub/updates/nwos/nw312/

     The file revision's number or letter (as of January, 1997 the latest patches were LIBUPB.EXE and
     STRTL5.EXE).
     These files can also be found on Novell's NSE Pro CD-ROM, which is the Novell
     NetWare Support Encyclopedia CD-ROM. APC strongly recommends that these patches be
     downloaded and installed.

     5 ) Do a Virus Scan of the DOS and NetWare Partition. This should be a habit
     during any troubleshooting.

     6 ) Other Things to Look at. Here is a list of items that have been known to
     cause server abends:
     - Power fluctuations at the power source (possibly due to failing UPS).
     - A failing computer internal power supply (not UPS).
     - A bad cooling fan (heat kills hardware).
     - A dry, hot or dusty environment can encourage hardware degradation and
     failure due to static electric discharge.
     - Check the server's error log (SYS$LOG.ERR located in the directory
     SYS:SYSTEM) for other clues.
     - Look for other problems that may end up being related. For example, lost connections, drive
     deactivation, climbing packet receive buffers, high dirty cache buffers, a high number of LAN
     errors, high utilization, etc.
     Another question to ask which may point you in the right direction is, "What changes have
     been made to the server environment lately?" You may be automatically inclined to say none,
     but may be incorrect. Did you recently increase the number of users for the server? Was there
     new software added? Was software upgraded? Is someone using software in a way different
     than it had previously been used? Is there new or different hardware? Have there been changes
     to the LAN, the routers, or the cabling? Have workstations or the file server been physically
     moved? Are there new printers on the LAN? Have SET parameters for the server been changed?
     Etc.

     How to Run NetWare's Debugger Utility:
     If all of the above steps have been checked and/or acted upon and the problem continues, then
     it is time to escalate the issue. The next and final step is to run the NetWare Debugger utility. To
     do so, the server must abend again. When it does, execute the following steps:

     1 ) Copy down the full abend error message. This is very important.

     2 ) Do not reboot the server. Instead press the keys <Shift>-<Alt>-<Shift>-<Esc> at the server
     in that order (the first <Shift> and <Alt> keys to be pressed are those on the left-hand side of
     the keyboard; the second <Shift> key to be pressed is that on the right-hand side of the
     keyboard). Keep each key pressed until all keys are pressed. Then release the keys. This will
     bring up the NetWare Debugger utility on the server's screen.

     3 ) Enter a "?" (without the quotes) and then press the <Enter> key. The NetWare debugger will
     report some vital information about the location in server memory where the break occurred.

     4 ) If the abend is easily reproducible, it is suggested to abend the server two or three more
     times, executing steps 1 through 3 above each time. The reason for this is to find if the
     information reported by step 3 is similar after each abend.
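
     Step 4 comes down to comparing what you copied down after each abend. A rough sketch of
     that comparison, assuming the notes were typed in by hand as plain strings; the function
     names are hypothetical, and the only normalization done is whitespace and letter case.

```python
def normalize(note: str) -> str:
    """Collapse whitespace and case so hand-copied notes compare cleanly."""
    return " ".join(note.split()).lower()


def abends_look_alike(notes) -> bool:
    """True if there are at least two notes and all match after normalization."""
    cleaned = {normalize(n) for n in notes}
    return len(notes) >= 2 and len(cleaned) == 1
```

     If the notes differ from one abend to the next, that is worth passing on as well, since it
     may point away from a single faulty module.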

Author Comment

by:Eric B
I have seen this document before. I have all the latest patches and have spoken to APC. Thank you for the help. I will leave this posted in case someone else has a concrete answer! If you think you deserve points for this, I will gladly give them to you.

Expert Comment

by:Zombite
If you wish to leave the question posted, you should reject the answer. Points not really important - correct answer is.