Solved

Terminal Server - stops accepting connections every so many days.

Posted on 2006-11-20
Last Modified: 2012-06-18
I have a Terminal Server running on Windows Server 2003 with SP1. Every so many days (6 to 11) I have to restart the server because one of the buildings connecting via Remote Desktop gets the server timeout message and cannot reconnect until the server itself is restarted. I have a VPN tunnel connecting this site and the main site where the TS sits. To troubleshoot, I restarted both firewalls, which didn't change anything; I also reset and logged off the disconnected users, and still nothing until the server is restarted. I can still connect via Remote Desktop on site, and I have another building that also stays connected. It is this one location that, no matter what I try, I cannot get reconnected unless I restart the TS server itself. I recently (11-14-06) installed an update in regards to KB 923630, and today the server needed a restart. I read that SP1 on Windows 2003 Server could cause a problem, but you would think it would affect everyone trying to use Remote Desktop, not just one site. There is nothing helpful in the Event Viewer either; nothing happens that is the same every time a restart needs to be done.

Any suggestions would be greatly appreciated.  

The hardware is an HP ProLiant DL140 with 2 GB RAM and a 2.80 GHz processor.
0
Question by:cormark
48 Comments
 
LVL 43

Expert Comment

by:Steve Knight
ID: 17982776
Sounds like a routing problem perhaps?  Can they PING the server at that time, and/or can it PING them?

If not does your server have more than one network interface, and perhaps a default gateway set on both interfaces?  If so might be just a case of removing one default gateway and if needed adding some static routes -- this sort of thing can often happen when Windows decides one default route is dead so it routes all traffic down the other route -- there should nearly always only be one default route.

If that's not the case please advise on if they lose comms altogether (PING as above) and will consider other things.

thanks

Steve
0
 
LVL 48

Expert Comment

by:Jay_Jay70
ID: 17982807
I am with Steve. I do not believe this is a Windows issue though; any more details on your WAN config would be good.
0
 

Author Comment

by:cormark
ID: 17982857
Yes, I can ping the TS server from their PC with no problem. The TS has only one network interface, and the default gateway is set on that one card. I cannot remember for sure, but I believe I tried a physical VPN connection through Network Connections and that worked to get into the TS even though there was a VPN tunnel already established. You would think resetting the firewall on both ends would correct the problem, but it did not. The only way to get them reconnected is restarting the TS server. We have even watched the firewalls and they act normal; they are set to keep alive....
0
 
LVL 43

Expert Comment

by:Steve Knight
ID: 17982926
Is this VPN tunnel between the two Windows boxes then or at the network level between two routers?  I presume the address ranges at each site don't overlap?

Can you give us an idea of how this connects together network wise please.
0
 

Author Comment

by:cormark
ID: 17982981
The VPN tunnel is through two WatchGuards: the main office has a Firebox 1000 and the remote location has a Firebox 50; they have a manual branch IPSec setup.  When the TS is down they can still get to the network drives, so they do not lose any network connections, just Remote Desktop.
0
 
LVL 48

Expert Comment

by:Jay_Jay70
ID: 17983018
So it's like just the RDP packets are dropped.... Are you able to see if they are actually dropped, or does everything pass through to the other end and simply fail at the server? Can you RDP to a different machine?
0
 
LVL 43

Expert Comment

by:Steve Knight
ID: 17983034
Very odd, so they have other comms to the same box over the VPN connection when the TS connection fails.  Do you still see connections getting as far as the server -- i.e. load network monitor on there and look for port 3389 activity or at a push use netstat -an | find "3389" to see what TS connections are coming through.
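
A minimal sketch of that netstat check in Python, assuming the output of netstat -an has been saved as text first. The function name summarize_rdp and the sample rows are made up for illustration; it only groups connections on local port 3389 by remote /24 and state, so a site that has dropped out is easier to spot.

```python
from collections import Counter

def summarize_rdp(netstat_text, port=3389):
    """Count TCP rows on the local RDP port, grouped by (remote /24, state)."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Windows "netstat -an" TCP rows: Proto, Local Address, Foreign Address, State
        if len(fields) != 4 or fields[0] != "TCP":
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if not local.endswith(":%d" % port):
            continue
        remote_ip = remote.rsplit(":", 1)[0]
        subnet = ".".join(remote_ip.split(".")[:3]) + ".x"
        counts[(subnet, state)] += 1
    return counts

# Hypothetical sample rows, not real output from the server in this thread.
sample = """\
  TCP    0.0.0.0:3389           0.0.0.0:0              LISTENING
  TCP    192.168.0.110:3389     192.168.110.15:3817    ESTABLISHED
  TCP    192.168.0.110:3389     192.168.110.66:3450    ESTABLISHED
  TCP    192.168.0.110:3389     192.168.0.142:4285     ESTABLISHED
"""

for (subnet, state), n in sorted(summarize_rdp(sample).items()):
    print(subnet, state, n)
```

Running it once while things work and once while the remote site is locked out gives two summaries to compare; the listener itself shows up as 0.0.0.x / LISTENING.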

Jay_Jay, any ideas?
Steve
0
 
LVL 43

Expert Comment

by:Steve Knight
ID: 17983036
:-)
0
 

Author Comment

by:cormark
ID: 17983062
I can't remember if I tried that. I believe I did and it worked fine, but I tried so many things I can't remember them all, and I also cannot leave them down for very long to troubleshoot... Remote Desktop to another workstation did work.  I would have to confirm that if the server stops responding again.
0
 
LVL 48

Expert Comment

by:Jay_Jay70
ID: 17983067
Way too many times we think alike....:)
0
 
LVL 48

Expert Comment

by:Jay_Jay70
ID: 17983076
if the RDP works to another machine then that blows my theory out of the water and points back to the server again.....I wonder if you have a dodgy NIC in the server
0
 

Author Comment

by:cormark
ID: 17983098
When they call and say they cannot connect, I look in TS Manager and see all their connections disconnecting.  I never tried Network Monitor or netstat; I can try this next time it occurs.  It is very strange, and the Event Viewer gives no help: nothing stands out other than there is about 1/2 between logging in the Event Viewer when someone calls to tell me they cannot connect, and nothing before it is even the same as the last time.
0
 
LVL 48

Expert Comment

by:Jay_Jay70
ID: 17983149
Hmm, very odd. You see all their sessions disconnecting? When it disconnects the external ones, do local ones get booted as well?
0
 

Author Comment

by:cormark
ID: 17987448
If it was a NIC issue I would think it would affect everyone.  I did make one change in the IPSec settings for this location that I noticed was different from the settings for my other location. I don't think this will make a difference, because I restarted both firewalls when this issue occurred and it did not correct the problem. I have a second NIC; should I try enabling this second NIC and disabling the first one...
0
 
LVL 48

Expert Comment

by:Jay_Jay70
ID: 17992575
i agree its a long shot - but had no other ideas!
0
 

Author Comment

by:cormark
ID: 17998498
I ran netstat -an when the connections were established and working correctly.

TCP IP ADDRESS of the TS 192.168.0.110:3389 192.168.110.#:3817 or 3450 or 1720 or 1980 or 1202 or 1479...
0
 
LVL 43

Expert Comment

by:Steve Knight
ID: 17999543
That's expected. Try it when it's not working and see if you can see anything coming in.  Also, if you set up Network Monitor now you can see the port 3389 connections coming in; then when it's not working, run it with the same filters and see what it shows - it could be it never reaches the server.
0
 

Author Comment

by:cormark
ID: 18039325
Okay, the TS did it again yesterday; it's been 7 days since the last restart.  I ran netstat when someone from that location was trying to get connected and did not see them coming in on TCP:3389. The existing connections were still showing as established even though in TS Manager they were disconnected; eventually they did drop, and then there was nothing in netstat for these few IP addresses on TCP:3389.  I could, however, connect from the TS server via Remote Desktop to a desktop that was trying to connect to the TS, so the reverse worked, from the TS to the remote location that could not connect.  The firewalls were showing that communication between both boxes was active. I also tried to re-generate keys to see if that would do anything, but none of this worked.  I had to kick the rest of the people off and restart the server - everything was fine after that!  HELP!!!! I have nothing else to go on at this point!
0
 

Author Comment

by:cormark
ID: 18040345
Since I have two NIC cards, could I enable the second NIC, enter a static IP for that card and have the building that is having the issue use that IP address? Would that cause conflicts with anything else?
0
 
LVL 43

Expert Comment

by:Steve Knight
ID: 18041070
That would depend upon your infrastructure, i.e. if you have access to their LAN to make a connection directly or via a VLAN trunked over some other connection then yes.  You could also add a second card on your same local subnet and give it a different IP address and set these troublesome users up to it.

If there was no port 3389 connection showing up in netstat, it does sound like it is getting lost before then, i.e. it is a VPN type issue - I can't recall you saying how the VPN connection between sites is created -- router to router or server to server?

Suggest you get network monitor running on there filtering and capturing all 3389 port traffic at the point it stops working and see what comes through.  Bit of a tutorial here on it, all I could find off hand:  http://www.windowsnetworking.com/articles_tutorials/Analyzing-Traffic-Network-Monitor.html
0
 

Author Comment

by:cormark
ID: 18041952
Thanks for this information - the two sites are set up with a VPN tunnel, WatchGuard to WatchGuard.  The strange thing is the two firewalls show they have a connection - they have a "Keep Alive" going, and at the time the users cannot connect the two firewalls are still communicating.  I can also, without changing anything, connect from the Terminal Server to a PC that cannot do the reverse to the Terminal Server: I jumped on the Terminal Server, saw these trouble users being disconnected, and then did Remote Desktop from that server onto one of these trouble PCs without a problem.  So that is why I don't see how it is the VPN tunnel.  I have two sites: "Oakton", which is the trouble site, and "3rd", which is configured the same way (VPN tunnel firewall to firewall, all the same settings) and does not have any issues.  I would think if it was a VPN issue, restarting both firewalls would take care of it, but it does not.  The Terminal Server just stops taking requests from this site.  Once I restart, bam, they all get connected until 7 days or so later.  I will try Network Monitor.  The only real difference, and I don't see how this would matter, is that "Oakton" is a T1 split data/phone whereas "3rd" is a data-only T1.  But that wouldn't explain why I could connect to them and why a restart solves everything....
0
 
LVL 43

Expert Comment

by:Steve Knight
ID: 18042406
Try a telnet to port 3389 on the TS from the dodgy site when it isn't working and see what comes through too, perhaps. A connection and a blank screen shows it is talking to something; the other option is getting a timeout back.

Still think it's odd and I'm a bit at a loss as to what it may be; let's see what netmon gives us.
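
For anyone without a telnet client handy, the same test can be sketched in a few lines of Python. This is only an illustration: probe_tcp is a hypothetical helper name and the 5 second timeout is an arbitrary choice. It reports whether the TCP connect to port 3389 succeeds, is refused, or times out, which is the distinction the telnet test is after.

```python
import socket

def probe_tcp(host, port=3389, timeout=5.0):
    """Try a plain TCP connect and report 'open', 'refused', 'timeout' or an error."""
    try:
        # create_connection performs the TCP handshake; nothing is sent afterwards.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        return "error: %s" % exc
```

Run from a PC at the affected site, probe_tcp("192.168.0.110") returning "open" would correspond to the blank telnet screen (something is answering on 3389), while "timeout" would suggest the connection attempt is getting no answer at all.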
0
 

Author Comment

by:cormark
ID: 18044870
I installed the Network Monitor tools but I don't understand how to setup a filter - will it hurt if I start to capture all?  Can someone tell me how to setup a trigger just to capture the TS traffic?  The Local IP address is 192.168.0.110 the remote address is 192.168.110.1 (which is the firewall).
0
 
LVL 43

Expert Comment

by:Steve Knight
ID: 18045463
OK, sorry.  In Network monitor press F8 to get capture filter then click on Address Pairs in the list then you can click on the Address button under Add.
Make sure it says Include at the top then choose the two IP addresses from the list.  If the list doesn't include the ones you want click on Edit Addresses and you can Add them.  If so give it a name, enter the address and type as IP.

Or if you feel like it, get hold of Ethereal or whatever it is called now and load that on; many more options to play with, though I wouldn't say easy!

hth

Steve
0
 

Author Comment

by:cormark
ID: 18048379
Should I have this run now or wait a few days before I think it's going to happen?  
0
 
LVL 43

Expert Comment

by:Steve Knight
ID: 18048819
You could run one now to see what it looks like when it is working then run it again when the problem has occurred and you have a comparison.
0
 

Author Comment

by:cormark
ID: 18048855
Okay this is what I have in the capture filters:

INCLUDE TERMINALSERVER IP <-> OAKTON IP WATCHGUARD (192.168.110.1)
INCLUDE *ANY GROUP <-> OAKTON IP ADDRESS (AS ABOVE)
INCLUDE *ANY GROUP <-> OAKTON (ETHERNET)

DOES THIS WORK AND WILL IT GIVE ME THE INFORMATION/DATA I NEED?
0
 
LVL 43

Expert Comment

by:Steve Knight
ID: 18048883
Looks like it... you should see the traffic between the two then - does the traffic appear to come from the watchguard, or do we need the correct source address there?
0
 

Author Comment

by:cormark
ID: 18048971
If I paused it to look at the results, I see address 1 (source) = TerminalServer and address 2 (destination) = all the IPs from Oakton.  I do not see the reverse, from the Oakton IPs to TerminalServer, which I believe I had set up at one point to see traffic both going to and coming from the TS.
0
 

Author Comment

by:cormark
ID: 18076483
Hello all, it was the 7th day yesterday and guess what, I needed to do a restart.  I did have Network Monitor going, but my administrator logged me off to do the restart before I could see what was being recorded.  Although I did see something interesting - at the time they called there were quite a few of the following entries in the WatchGuard log:

2006-12-04-15:17:49 IP discard from 192.168.110.4 port 3680 to 63.218.5.136 port 80 TCP ACK FIN PSH
(Attempt to send TCP packet on an unopen port)
2006-12-04-15:17:49 IP discard from 192.168.110.4 port 3659 to 63.218.5.136 port 80 TCP ACK FIN PSH
(Attempt to send TCP packet on an unopen port)
2006-12-04-15:17:49 IP discard from 192.168.110.4 port 3645 to 205.177.95.56 port 80 TCP ACK FIN PSH
(Attempt to send TCP packet on an unopen port)
2006-12-04-15:17:48 IP discard from 192.168.110.4 port 3606 to 205.177.95.56 port 80 TCP ACK FIN PSH
(Attempt to send TCP packet on an unopen port)

I also saw some of these types of entries in the Network Monitor log - not sure what time it happened, but the IP address information was the same.  Could this type of issue cause the server to stop taking requests?  If so, wouldn't restarting the firewall reset something enough to let connections reconnect?  I would think this would be a bandwidth issue and not a server issue.  Any thoughts?
0
 
LVL 43

Expert Comment

by:Steve Knight
ID: 18077057
Well, that's port 80 traffic onto the internet, I presume?  Is 192.168.110.4 the user? Before, you said "The Local IP address is 192.168.0.110 the remote address is 192.168.110.1 (which is the firewall)." So is it just that the user is losing ALL connections for some reason?  I don't suppose there are any port 3389 entries in the firewall logs?
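
One way to check a long firewall log for port 3389 entries is a small Python tally of the discard lines by destination port. This is a sketch only: the pattern is inferred from the four log lines quoted above, other WatchGuard entries may well be formatted differently, and discards_by_dest_port is a hypothetical name.

```python
import re
from collections import Counter

# Pattern assumed from the quoted lines:
# "... IP discard from <ip> port <n> to <ip> port <n> TCP ..."
DISCARD = re.compile(r"IP discard from \S+ port \d+ to \S+ port (\d+)")

def discards_by_dest_port(log_text):
    """Count 'IP discard' log lines per destination port."""
    counts = Counter()
    for line in log_text.splitlines():
        m = DISCARD.search(line)
        if m:
            counts[int(m.group(1))] += 1
    return counts

# Two of the entries quoted earlier in the thread.
sample = """\
2006-12-04-15:17:49 IP discard from 192.168.110.4 port 3680 to 63.218.5.136 port 80 TCP ACK FIN PSH
(Attempt to send TCP packet on an unopen port)
2006-12-04-15:17:48 IP discard from 192.168.110.4 port 3606 to 205.177.95.56 port 80 TCP ACK FIN PSH
(Attempt to send TCP packet on an unopen port)
"""

counts = discards_by_dest_port(sample)
print(dict(counts))                      # discards per destination port
print("3389 discards:", counts[3389])    # 0 when no RDP traffic was discarded
```

If the 3389 count stays at zero across the whole log, the discards are unrelated web traffic rather than dropped RDP packets.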

Steve
0
 

Author Comment

by:cormark
ID: 18077151
There are no 3389 entries in the firewall log, and yes, you are correct: 192.168.110.4 is an address at the remote building, 192.168.0.110 is the Remote Desktop server and 192.168.110.1 is the firewall at the remote building.   I have the log from the network monitoring software; I just don't know what it all means.  Is there a way to attach or email it to see if there is something in the log that I am missing?  I also ran netstat and saw the connections for a short time, and then they just started dropping off.
0
 
LVL 43

Expert Comment

by:Steve Knight
ID: 18077558
I believe you can upload them at www.ee-stuff.com using your normal EE user/password.  Zip them up and post a link here.

Steve
0
 

Author Comment

by:cormark
ID: 18094836
Well, I currently have Network Monitor running and a TCPViewer - Monday will be the 7th day.  I am hoping to be able to post some information / logs to help resolve this mystery.
0
 

Author Comment

by:cormark
ID: 18095289
I added points to this since it's been an issue since July.  If it does get figured out - I will be GRATEFUL to all who helped!!
0
 

Author Comment

by:cormark
ID: 18159522
I am going to try to post the Network Monitor capture - again it's been 7 days and a restart was just done.  I tried to restart the license server but that didn't correct it either.  I am desperate!!
0
 

Author Comment

by:cormark
ID: 18159621
Network Monitor files have been uploaded.  120806 was running this morning as well as when the disconnections occurred.  Thanks!
0
 
LVL 43

Expert Comment

by:Steve Knight
ID: 18159772
Can you give us a link to them?  I'm actually not going to be able to look at anything until at least this time tomorrow probably now anyway

Steve
0
 

Author Comment

by:cormark
ID: 18159816
0
 

Author Comment

by:cormark
ID: 18160123
The files are from the Network Monitor capture, so they have .CAP file extensions.  I do not see where I can save them any other way.  As a side note, I just connected the second NIC to our network; on Monday I am going to try to connect from the building that is having issues to this new IP address when they lose their connection to the server.  It is a holiday, so I am not sure I will be able to accomplish this remotely... I will give an update on Tuesday on whether they were able to connect via the second NIC.
0
 
LVL 43

Expert Comment

by:Steve Knight
ID: 18161997
Had a quick look at the files using Ethereal to read the CAP files.  The one from before the reboot does seem to show a large amount of incoming port 3389 traffic, both from the local subnet and the remote 192.168.110.x subnet, but no packets at all in the opposite direction.  This could be your filter, or it could be showing there is an issue at the Terminal Services end.

  No. Time        Source                Destination           Protocol Info
      1 0.000000    192.168.0.178         192.168.0.110         SMB      Trans2 Response<unknown>
      2 0.000000    192.168.110.68        192.168.0.110         TCP      2365 > 3389 [PSH, ACK] Seq=3421663824 Ack=989258531 Win=65519 Len=45
      3 0.000000    192.168.0.178         192.168.0.110         SMB      Trans2 Response<unknown>
      4 0.031250    192.168.0.17          192.168.0.110         TCP      ms-sql-s > 1923 [ACK] Seq=118042499 Ack=2296540248 Win=64391 Len=0
      5 0.031250    192.168.110.66        192.168.0.110         TCP      3055 > 3389 [ACK] Seq=1911779100 Ack=2720517220 Win=65003 Len=0
      6 0.062500    192.168.0.142         192.168.0.110         TCP      4285 > 3389 [PSH, ACK] Seq=3731810909 Ack=1285058908 Win=65535 Len=73
      7 0.062500    192.168.110.15        192.168.0.110         TCP      3708 > 3389 [PSH, ACK] Seq=3764859720 Ack=545479034 Win=64987 Len=45

The second capture (today) seems to show normal conversation to 3389 and replies back again to the source.
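
To put numbers on "incoming only", the rows from an Ethereal text export can be tallied by direction with a short Python sketch. It assumes the column layout of the excerpt above (No., Time, Source, Destination, Protocol, Info) and uses a hypothetical function name, rdp_direction_counts; the sample is the excerpt itself.

```python
import re
from collections import Counter

# Matches TCP rows like: "2 0.000000  192.168.110.68  192.168.0.110  TCP  2365 > 3389 ..."
ROW = re.compile(r"^\s*\d+\s+[\d.]+\s+(\S+)\s+(\S+)\s+TCP\s+(\S+) > (\S+)")

def rdp_direction_counts(rows, server="192.168.0.110", port="3389"):
    """Count TCP packets going to and coming from the server's RDP port."""
    counts = Counter()
    for row in rows.splitlines():
        m = ROW.match(row)
        if not m:
            continue  # non-TCP rows (e.g. SMB) and headers are ignored
        src, dst, sport, dport = m.groups()
        if dst == server and dport == port:
            counts["to_server"] += 1
        elif src == server and sport == port:
            counts["from_server"] += 1
    return counts

sample = """\
      1 0.000000    192.168.0.178         192.168.0.110         SMB      Trans2 Response<unknown>
      2 0.000000    192.168.110.68        192.168.0.110         TCP      2365 > 3389 [PSH, ACK] Seq=3421663824 Ack=989258531 Win=65519 Len=45
      3 0.000000    192.168.0.178         192.168.0.110         SMB      Trans2 Response<unknown>
      4 0.031250    192.168.0.17          192.168.0.110         TCP      ms-sql-s > 1923 [ACK] Seq=118042499 Ack=2296540248 Win=64391 Len=0
      5 0.031250    192.168.110.66        192.168.0.110         TCP      3055 > 3389 [ACK] Seq=1911779100 Ack=2720517220 Win=65003 Len=0
      6 0.062500    192.168.0.142         192.168.0.110         TCP      4285 > 3389 [PSH, ACK] Seq=3731810909 Ack=1285058908 Win=65535 Len=73
      7 0.062500    192.168.110.15        192.168.0.110         TCP      3708 > 3389 [PSH, ACK] Seq=3764859720 Ack=545479034 Win=64987 Len=45
"""

counts = rdp_direction_counts(sample)
print("to server:", counts["to_server"], "from server:", counts["from_server"])
```

A from_server count of zero over a whole capture would point either at a one-way capture filter or at the server not answering, so it is worth comparing against a capture taken while everything works.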

I'm none the wiser at the moment as to what is going on... did you say you can get to the server over RDP from the local network at the time? The above trace tends to show you couldn't.

Steve
0
 

Author Comment

by:cormark
ID: 18162055
Yes, I can get to the Terminal Server through RDP when the other location cannot.  I am on the same network, 192.168.0.X, which can connect when 192.168.110.X cannot.

It could be my filter; do you see traffic going back at all at any time?  If not, then I need to adjust my filter.  I could start the capture now while it is working and we can see if traffic is going back and forth, if that would help at all. I also just downloaded the TS Licensing utilities LSReport and LSView. Not sure they will help at all, but at this point I have nothing else to try.  I don't remember if I mentioned that I connected the second NIC; when the .110 address does not accept connections I will try the .123 address, and if that works it would indicate hardware issues... I am ready to rebuild this server; however, that would take a lot of time, and if in the end it didn't solve anything I would be really mad!
0
 

Author Comment

by:cormark
ID: 18162065
Another thought I was looking at Remote Desktop Web Connections, any thoughts or experience on this?  Looking for any solution other than a scheduled task to restart my server every Sunday!
0
 
LVL 43

Accepted Solution

by:
Steve Knight earned 250 total points
ID: 18162298
OK, not much time to look at the moment.  I don't think the web connections will make much difference, it is just a web front end to rdp afaik, i.e. an activex control or whatever does the rdp instead of mstsc.exe

The second capture did show two-way packets to / from 3389..... Was that the pre-reboot one, when it wasn't responding? Because that would be about right then.  It also tends to prove nothing but the server is at fault, as the packets hit the server on 3389 OK, just nothing responds back again.

It might also be worth looking in Perfmon.  You can set up a System Monitor or capture a counter log for the Terminal Services or Terminal Services Session objects.  For the first, you could select Active and Inactive sessions.  For TS sessions you could perhaps look at Total Errors and select All instances in the right-hand column.

Steve
0
 

Author Comment

by:cormark
ID: 18162711
The 120806 file was started in the morning and left running until I needed to restart the server, so it captured when it was working, and at about 10:33 the server stopped taking connections from .110.X addresses.  I will set up PerfMon to start on Saturday, Monday being the holiday, but I am going to do some things remotely to determine if I can make the connections on the other NIC.
0
 

Author Comment

by:cormark
ID: 18386716
This issue has not been resolved.  The best we are doing is restarting the TS every weekend.
0
 

Expert Comment

by:mbriese
ID: 38096976
We have exactly the same issue, and a WatchGuard VPN as well, but I don't see that being the cause.
0
