intermittent "tree or server not found"

I'm having a bizarre problem with a client losing its connection to the server. I have a simple LAN with a single NetWare 6.5 SP3 server and about 25 clients. Only one of these clients, running Windows Server 2003 (SP1) with Client 4.91 SP2 and acting as a Terminal Server, loses the connection to the server, typically over the weekend. All is well for a couple of days, and then the users call to report that they cannot log in.

When this happens, trying to log in returns a "Tree or server not found" error. The only way to get things back to normal is to reboot the client. I have tried uninstalling the NetWare Client and reinstalling the latest version, and running "Repair local DS database" on the NetWare server, but the problem persists.

As I said, the only client computer on the LAN experiencing this problem is the Terminal Server. Pinging the NetWare server when the problem occurs works fine. All was well for a couple of years; the only thing I can recall doing on the client was installing Microsoft updates. I am trying to avoid having to rebuild it, so any suggestions would be most welcome.

PsiCop Commented:
Hmmm... I don't think this has anything to do with eDirectory. Stop repairing databases - this is a communications issue.

Is there a software firewall on the TermServ host? That'd be the first thing I'd look at. I suspect that the TermServ is not getting good SLP information.

How does the TermServ get its IP address? Static assignment or DHCP? If DHCP, what is serving the addresses - the NetWare server or something else? Are the proper Options set up?

If, when you encounter the error, you can put the IP address of NetWare server in the "Server" dialog box of Client 32 and the problem vanishes, that would confirm this is an SLP issue.
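A ping only proves ICMP reachability. As a complementary check, here is a minimal Python sketch that tests whether the server's NCP listener (TCP port 524, the port NCP over IP uses) accepts connections by IP address. If this succeeds while a login by name still fails, that points further toward name resolution (SLP) rather than basic transport. The function name and the example address are placeholders, not anything from this thread.

```python
import socket

NCP_PORT = 524  # NCP over TCP/IP


def tcp_port_open(host, port=NCP_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

For example, `tcp_port_open("192.0.2.10")` (a hypothetical server address) run on the Terminal Server while the error is occurring would show whether the NCP port itself is reachable.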
ShineOn Commented:
As PsiCop said, "Tree or server not found" is typically an SLP issue, or some other communications-related issue.  I concur that you shouldn't run dsrepair for a non-database issue.

One more thing to look at - was the Novell client installed Custom, with IP only?  If it was a "typical" installation, it may have installed both IP and IPX, and if so, could be getting "confused" between the protocols.  If that's the case, simply unbinding IPX from the client should suffice.

When you say, "The only way to get things back to normal is to reboot the client." - what does that mean?  Are you referring to the Windows Server 2003 server as "the client" or are you referring to the end-user's PC?

You say it started after you installed Microsoft updates on "the client."  Again, is "the client" the Windows Server 2003 Terminal Server?  What updates, specifically - was it W2K3 SP1, or had SP1 already been installed and this was post-SP1 hotfixes?  Can you post which hotfixes you installed? Check the Windows folder and make note of which hotfix folders have a create date of when you ran the updates, and let us know only those - we don't need stuff from a year or so ago... ;)

I think it's probably something that came with either SP1 or a hotfix, depending, but the fact that it happens over the weekend kind of bugs me too.  Is there a particular day and/or time when it happens?  If so, is there a scheduled event going on anywhere in the network, like a backup or a virus scan?  Is there a chance that anyone is doing a shutdown or a restart instead of a simple logoff when they end their terminal server session?
floyd99 Commented:
It's a slightly different problem, but check this thread and see if the solution there helps:

sgiurgeu (Author) Commented:
in reply to PsiCop's comments:

- there is no firewall on the TS

- TS has static IP address (remember, when the problem occurs, I can ping the NetWare server from TS)

- I concur that all this has to do with SLP; I'll wait for the next time it happens and investigate some more

in reply to ShineOn's comments:

- the Novell client has been installed with IP only

- when I referred to the "client", I meant the TS; no other PC on the network has developed this problem

- after installing W2k3 SP1 all was well; the Terminal Server is pretty much up to date with all patches released by Microsoft. The most recently installed updates are:

I am not saying that one of those is causing the problem, but judging by cause and effect, the updates are the only thing that has changed on the TS

- as for when it happens, it's quite sporadic. Generally, I have noticed it happening first thing in the morning, and most commonly on Monday morning. I have asked all users to notify me from now on when they encounter the problem, even if it is after hours, so I should have more information shortly.

- the only thing that is going at night and might have something to do with this is the regular backup on the NetWare server
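To line up the failures with the nightly backup window, timestamped probes can help. The following is a rough Python sketch that appends one line per probe to a log file, recording whether the server's NCP port (TCP 524) answered. The function names, log format, interval, and file path are all illustrative assumptions. Note that it only shows whether the server stops answering at some point; it does not exercise SLP name resolution on the client.

```python
import socket
import time
from datetime import datetime


def ncp_reachable(host, port=524, timeout=3.0):
    """Return True if a TCP connection to the NCP port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def format_probe(when, host, ok):
    """Build one log line: ISO timestamp, host, and probe result."""
    status = "OK" if ok else "FAIL"
    return f"{when.isoformat(timespec='seconds')} {host} ncp/524 {status}"


def monitor(host, logfile, interval_s=300, probes=None):
    """Append one line per probe; probes=None keeps running until interrupted."""
    done = 0
    while probes is None or done < probes:
        line = format_probe(datetime.now(), host, ncp_reachable(host))
        with open(logfile, "a", encoding="utf-8") as fh:
            fh.write(line + "\n")
        done += 1
        if probes is None or done < probes:
            time.sleep(interval_s)
```

Running something like `monitor("192.0.2.10", "ncp_probe.log")` (hypothetical address and file name) on the TS over a weekend would give a timeline to compare against the backup schedule and the first user reports.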

in reply to floyd99's comments:

- I already read that article (thanks for link, though); it doesn't apply to my problem. Rebooting the problematic computer (TS) fixes the problem right away (no repeated reboots are necessary)

As I said, I am more and more leaning towards SLP being the culprit. Next time when the problem happens I'll troubleshoot that way (unless I have users anxious to get on). I'll keep an eye on all these leads and keep you posted. Thanks all for the hints.
Do they access the TS as a TS client and log in to NetWare on their TS session, or is it using some sort of passthrough?  If the former, is it when they try to log in to NetWare on the TS session where it hangs on "tree or server not found?"  What happens if they put the server IP in the server name field as part of the NWGINA login?  (after pressing the "advanced" button, of course...)
sgiurgeu (Author) Commented:
The NetWare login is the primary login, so users first log in to NetWare and then to Windows. I am waiting for the problem to happen again and will try using the IP address of the NetWare server instead of its name under Advanced...  So far it has been working fine since Monday morning, but I'm sure it will happen again.

Will keep you posted.
What I'm wondering is, how are the users using the terminal server?  Are they doing a passthrough or are they getting a new login screen when they connect to the terminal server?  I'm assuming it's the latter, and what's happening is that when they connect to the terminal server and enter their user information into the terminal server session's NWGINA, that's where they're getting the "tree or server not found" problem, correct?

I had to ask about the passthrough, because IIRC it is possible, using ZENworks, to have an RDP session icon that passes the logged-in user's info through and automagically establishes the TS session without additional login required.  At least, that's what I understand to be true...  
sgiurgeu (Author) Commented:
When the users establish the connection to the terminal server, the first thing that they see is NWGINA. They enter the user name and password (for NDS) and if all is well they are in. For most of them, the Novell password is identical to their local Windows account password, so there is no additional login. Those who don't have the two passwords synchronized receive a second login dialog to log in to Windows (i.e. locally; the terminal server is configured as a standalone server, not a DC). There is no ZEN.

When the problem occurs, they receive the "tree or server not found" error right after NWGINA. As soon as they enter the username and password into NWGINA they see the red N cursor trying to connect, and after about a minute it comes back with "tree or server not found". They can then log in to Windows (workstation only) if they wish, but that doesn't help them much since the connection to the NetWare server could not be established. Subsequent attempts to log in to NDS produce the same result. Only rebooting the terminal server fixes the problem.

I have just enabled SLP debugging on the server (at level 127) and redirected to the log file, so I'm ready for the next time it happens.
Any news?

Have you tried the bad address cache/bad server name cache settings that were discussed in the PAQ link floyd99 posted?
sgiurgeu (Author) Commented:
I enabled the SLP debug on the server and the problem has not recurred since, so it is definitely SLP related. The other day I had to reboot the server and have not re-enabled SLP debug, so I'm waiting for it to happen again, at which point I will try using the server's IP instead of its name.

Will keep you posted.
Question has a verified solution.
