NetWare 6.5 SP8 performance issues on DL380 G6

qvfps
I recently purchased an HP DL380 G6 to replace our current Novell server. Before I installed any software, I ran burn-in tests for a couple of days and ran the full Insight Diagnostics several times.

When I tried to do a test installation, I ran into some issues with the P410i controller, but I resolved those by upgrading the controller firmware to the most current version and downloading the correct .HAM drivers from HP.

I renamed our current server, changed its IP address and made sure everything was working. I then did a clean install of NetWare 6.5 SP8 and added the new server into the tree with the original server name and IP address.

I thought everything was fine until I started copying files between the servers. Other than being pretty slow, which I initially blamed on the original server, there were no problems for the first few hours. Then I started getting error messages saying the file could not be copied: "\\server\volume is not accessible. You might not have permission to use this network resource" or "Cannot copy filename: The specified network name is no longer available."

If I logged on again I could access the file it failed on and restart the copy from there.    It would then run for a little while before failing again on a different file.  I tried upgrading the firmware on all components to the newest version and downloading and installing the Novell support pack from HP.  This seemed to work for a while.  I ran some file copies back and forth and I was getting good throughput and no errors.

However, when I resumed the actual file copies I ran into performance issues again. I went back to the second PC where I did the testing and I could browse directories OK, but then I started getting the "\\server\volume is not accessible" message.

I was getting an IPX routing error on the console, so I tried to open inetcfg, which was very slow. I ended up walking away before it came up. I also tried to run dsrepair on the new server, but it took forever and I ended up cancelling it.

Both the original and the new server are connected on gigabit ports, and the PCs I was using for testing are connected on 100M ports on the same HP switch. I had the status page up for the switch and was watching the traffic to get an idea of throughput, to see if there was unexpected traffic anywhere and to watch for transmission errors. The only ports which showed any activity were the ones I was using for testing, and there were no errors at all reported on the switch.

I opened a support call with HP but the only thing they said was that they would send someone out to replace the motherboard.  

Is there anything I have missed?  Any suggestions for troubleshooting the issue?
Commented:
Sounds like the CPU is really busy doing something else.
Try bringing the box up at the lower load stages.  Also try -NA (no autoexec), -NS (no startup) and -NDB (no NDS - I think it's -NDB - but I could be wrong on that, doing it from memory - check with Novell's support site).
If you bring it up -NA and start loading things one at a time pausing for a minute or two between each you can isolate where the problem seems to be originating (I say seems because it is possible that the latest module loaded is okay - it's only when it interacts with something loaded before).
I am concerned about the IPX router errors - that could cause the CPU to be very busy - is it possible to solve that before trying the above?
Good luck.

Hope this helps.

Author

Commented:
Thanks for the reply.  I forgot to mention that I was checking the health monitor and I did not see any excessive CPU usage.    I will try and find the source of the IPX errors but I think they are being generated elsewhere and just displayed on the console.  
Systems Administrator
Commented:
IPX router errors are generated when two servers on the same network broadcast different information about the IPX network number. Therefore, these servers won't communicate, and network traffic will be slow or non-existent.

Check in INETCFG whether both servers have the same IPX network number or, better yet, disable IPX on both servers, since you are already using IP.

Further, did you change the IP address on the old server with Remote Manager? That way you can be sure that the change is also reflected in eDirectory. Otherwise, check the following document, which mentions a whole lot of places where the IP address has to be changed:

http://www.novell.com/coolsolutions/feature/486.html

And about renaming servers:

http://support.novell.com/docs/Tids/Solutions/10080951.html

Author

Commented:
The IPX message is not coming from the local network or one of the servers. It is coming from one of our other locations. The error message appears on both the old and the new server, but it is only the new server which has the problem. I will look into identifying the issue as soon as possible.

I can't disable IPX because I have an old Windows 2000 Terminal Server which I need to keep around a little longer to run some software. It uses Gateway Services for NetWare, which only works with IPX.

I changed the IP address in inetcfg, and while I haven't looked at the above TID, I did follow a TID from Novell on changing IP addresses and server names which had several files to check.

Commented:
Check you have not got STP enabled on the HP switch (switch it to RSTP).
I wonder if you've missed something when you changed IP addresses/server names.  I'm not aware of Novell ever saying use INETCFG to change an IP address for NetWare 6.5 (for NetWare 6 yes, but not for 6.5) as you're supposed to use the IP address management application to change the address so that the change is made to the "other" applications (tomcat, slp, nds, DNS resolution etc).
I guess a way to confirm you've not left something out:
1) Map a drive to sys:\ and search from the root of SYS for the old ipaddr/dnsname in the "containing text" field.
2) Use PKIDIAG.NLM to check the SSL certificates.
3) Check \etc\hosts and \etc\hostname have the right entries and start dsrepair. Repair each server's address.
4) Check your services are registered with SLP (using slptool on the client).
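Step 1 can also be scripted from a workstation once the volume is mapped. The following is only a minimal sketch: the function name `find_references`, the size limit and the example drive letter and search strings are all assumptions for illustration, not anything supplied by Novell or HP.

```python
import os


def find_references(root, needles, max_bytes=5_000_000):
    """Yield (path, needle) for each file under root whose contents mention a needle.

    Matching is case-insensitive. Files larger than max_bytes are skipped,
    as are files that cannot be opened (for example, locked system files).
    """
    lowered = [needle.lower().encode("ascii") for needle in needles]
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) > max_bytes:
                    continue  # skip very large files to keep the scan quick
                with open(path, "rb") as handle:
                    data = handle.read().lower()
            except OSError:
                continue  # unreadable or locked file; move on
            for needle, raw in zip(needles, lowered):
                if raw in data:
                    yield path, needle
```

Assuming SYS: is mapped to a hypothetical drive S: and the old address and name were 192.168.1.10 and OLDSERVER, `for path, needle in find_references("S:\\", ["192.168.1.10", "oldserver"]): print(path, needle)` would list every file that still mentions either one.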

Hope this helps

deroode, Systems Administrator

Commented:
You will have to run dsrepair on both servers, and run an automated repair several times until all errors are gone. It seems to me that eDirectory is confused about who is who in eDirectory, and most likely your performance problems come from there.

Also in Dsrepair check your replica configuration, and time synchronization.

I presume that IPX is only installed on the servers and not on the workstations. Otherwise, with the IPX configuration error, you could very well be routing all traffic through the remote location...

Author

Commented:
I did run PKIDIAG on the servers I renamed to make sure the certificates were correct. I ran DSRepair on all servers until there were no errors and made sure everything was in sync. I edited the hosts and hostname files to make sure the IP address was correct.

IPX is only installed on the one NetWare server, the TS and the routers.

I have removed the server from production until I get the issue resolved.  HP replaced the system board so I will see if that makes any difference.

Author

Commented:
The system board was replaced and I am still having issues.   I tried to do a large copy and while the throughput was a little better the copy failed when I tried to do a second copy.

I have been trying to resolve the IPX routing error, but I cannot identify the source of the message. I have checked the other servers and the routers, but I cannot identify the IPX node. Any suggestions on how to identify and resolve the issue?
deroode, Systems Administrator

Commented:
What is the exact error message of the IPX router errors?

Check out the following:

SAP router configuration error detected
http://support.novell.com/docs/Tids/Solutions/10058319.html

Author

Commented:
Sorry I have not posted here for a few days now. I am still having issues with the server. I resolved the IPX errors without it making any difference. I have been going back and forth with HP on this. So far they say the issue is that I don't have the battery-backed write cache installed. I have it on order and hope to install it this weekend.

I managed to stop the server from locking up on file copies by enabling the write cache and setting it to 75% write / 25% read. However, while it no longer locks up when copying files, the throughput is not very good. It starts out fine, then drops down to almost nothing for a while, then shoots up for a short time before slowing down again. So far I cannot get it working nearly as well as the 6-year-old Dell I am trying to replace.
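To put numbers on that start-fast, stall, burst pattern, a timed write test from a workstation can help when comparing the two servers or different cache ratios. This is a rough sketch only: the function name, the sizes and the target path are assumptions, and the figures it reports include client and network overhead, not just the controller.

```python
import os
import time


def measure_write_throughput(path, total_mb=256, chunk_mb=4, window_mb=32):
    """Write total_mb of random data to path; return MB/s for each window_mb written."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    rates = []
    written_in_window = 0
    window_start = time.perf_counter()
    with open(path, "wb") as handle:
        for _ in range(total_mb // chunk_mb):
            handle.write(chunk)
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to flush its buffers for this file
            written_in_window += chunk_mb
            if written_in_window >= window_mb:
                elapsed = max(time.perf_counter() - window_start, 1e-9)
                rates.append(written_in_window / elapsed)
                written_in_window = 0
                window_start = time.perf_counter()
    return rates
```

Running it against a file on a mapped drive for each server (for example a hypothetical `S:\test.bin`) and comparing the lists should show whether the new server's rate collapses after the first few windows, which would be consistent with the cache filling up.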

If there are anymore suggestions I would appreciate hearing them.  If not I will post my results after the battery backed write cache is installed.

Commented:
Another thing to check is drivers.  Have you got the right drivers loaded?  Are there more up to date versions available?  I seem to recall that HP offers SmartStart (I think it's called) to put drivers on for you - perhaps grab the latest one from HP's site and try that?  Also update the firmware on RAID controllers and BIOS and the like (I seem to recall HP offers a CD with the latest firmware on for you to download and install from) - worth doing too - just wondering if the version of the firmware you downloaded and installed at the beginning was buggy.  The SP from Novell usually has the latest drivers - are you 100% sure the SP went on okay?
Is there anything else connected to the switch that the server is connected to?  If there is, do these other things have performance issues too?  If they do not it eliminates networking side of things.  
Also in the netware client you can change various name services timeouts (the "thing" that says "The specified network name is no longer available") - see here for details: http://www.novell.com/coolsolutions/appnote/620.html  This may help to alleviate pressure on you for the interim.
Are you 100% sure that you don't have something else using this IP and/or IPX address?  What about duplicate DNS entries, server names or tree names?  If you have duplicate addresses/names it is quite possible that some data will go to one node and some of the data would go to the other node - which is a really bad thing.
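For the name side of that check, the hosts files on the servers and the terminal server can be scanned for a name that resolves to more than one address. A minimal sketch, assuming the usual "address name [aliases]" hosts layout; the function name and the sample entries are made up for illustration, and this only inspects a file, it does not probe the network for a live duplicate.

```python
from collections import defaultdict


def find_host_conflicts(hosts_text):
    """Return {name: [addresses]} for names mapped to more than one address."""
    addresses_by_name = defaultdict(set)
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding space
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            addresses_by_name[name.lower()].add(address)
    return {
        name: sorted(addresses)
        for name, addresses in addresses_by_name.items()
        if len(addresses) > 1
    }


# Hypothetical hosts content: FS1 appears under both its old and new address.
sample = """
10.0.0.5   fs1 fs1.example.com
10.0.0.9   FS1        # entry left over from before the rename
10.0.0.7   fs2
"""
print(find_host_conflicts(sample))
```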
Check that the network connection on both the server and the switch ports the server plugs into are set manually to the same setting (say 1000M full duplex) - don't depend on auto negotiation.
There are a few updates to NSS since SP8 including "fixed a problem where auditing of file close events would impact server performance negatively" - perhaps apply this fix NSS Update for NetWare 6.5 Support Pack 8 1.0 (http://download.novell.com/Download?buildid=AGY8vlaXt2g~) - that URL might not work I think they're dynamic.  Also there's a post SP8 TCPIP patch out that deals with abends in connection table lookups - might be worth applying.

Hope this helps

Author

Commented:
Thanks for the update. I have made sure that I have all the latest BIOS, firmware and SPs from HP and have loaded them. The newest SmartStart CD does not include the Novell drivers, so you have to get them manually from the HP website.

The original server I am trying to replace is connected to the same switch, and its performance is a lot better than the new HP's. I have made sure that the IP address I used is unique and outside the DHCP range, so there should be no conflicts. IPX is only loaded on a Windows 2000 TS and is used minimally, but the IPX number is unique as well.

I have tried to manually set the port speed, but while I can do it on the switch, the HP Novell drivers do not allow you to fix it at 1000M. I can set it to 100M but that is it.

I will try the NSS update when I get a chance but I do not think that is the issue.  Right now I am just having poor performance writing to the server.

Author

Commented:
I think the issue is with the cache on the P410i controller. I have tried adding the 256MB Battery Backed Cache Module, but I cannot get the controller to recognize the additional 256MB of cache. I have set the cache to 50% read / 50% write, which seems to have helped, but it still doesn't perform as well as the 6-year-old Dell I am replacing.

Right now I am regretting going with HP. It has been nothing but problems.

Commented:
Well, after reading through your posts here, I finally got the answer as to why you have IPX installed at all, which is because you are likely using the Windows Services for Netware Client add-in for connectivity between the Windows environment and the NetWare environment, which forces you to use the IPX protocol.

Why not simply install the Novell Client software on the terminal server?  That way you don't need IPX at all and you will be able to use IP, the protocol that NetWare has preferred to use since v5.0.

If installing the Novell Client software isn't an option for some reason, which I can't see why it wouldn't be, I'd try configuring the NetWare server to be accessible by CIFS.

Reference - http://www.experts-exchange.com/Networking/Novell_Netware/Q_21659213.html?sfQueryTermInfo=1+client+novel+server+termin+window

The Microsoft Services for NetWare client is awful for providing connectivity to NetWare environments.  There is nothing wrong with installing the Novell client software on servers.  

However, there could be some detail about your environment that hasn't been disclosed as of yet explaining a necessity for using IPX if another server in your eDirectory tree is older than version 5.0.
I assume you checked for a duplex mismatch? Make sure to set both ends of the cable to AUTO, or hard set them to a matching duplex/speed. Also, what do your hosts files say about the IP addressing? Pardon me if I missed anything from above; I read through it pretty fast.

Author

Commented:
Thanks for the help. I am closing this even though the issue isn't fully resolved, because I think it is a hardware issue. I have a case open with HP on the Battery Backed Write Cache I purchased and why the server won't recognize it. Thanks for all the suggestions.
