Andrej Pirman asked:

VPN PPTP tunnel to Windows 2003 server dropping

Hi,

I have an issue with the VPN connection to our company server farm.
One of our servers sits in the DMZ and serves as the application front-end server for OWA (it is also the Exchange proxy for RPC over HTTP), VPN and some web sites. OWA, Outlook and the web sites work without problems, but *some* of the users have problems with VPN.

Symptoms:
The VPN uses plain PPTP, and the first connection is always OK. But after some time, usually an hour or two, the VPN tunnel stops responding: the connection still shows status "Connected", but none of the remote resources are reachable anymore.
If I disconnect and try to reconnect immediately, VPN dial-up authentication stops at "Verifying username and password" and produces Error 628. I need to wait at least 5 minutes to reconnect successfully.
But even after 5 minutes the VPN breaks down again very quickly, after 5 or 10 minutes, so if I want to work remotely for a longer time, I need to wait at least 30 or 60 minutes before redialing the VPN.

What I tried already:
- I checked the VPN log on our server, but the failed retries and drops are not visible anywhere
- Checked the Event Log under Security events, but no authentication errors are recorded
- Since we were on 4 Mbps ADSL, I requested a line upgrade and now we are on 100 Mbps optical fibre with all-new networking equipment (previously ADSL modem -> D-LINK DFL-1600 firewall; now Fibre-to-Ethernet converter -> Cisco 2811 router & firewall)

The Cisco is configured to accept and forward the GRE protocol and TCP port 1723 to our RRAS server.
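
The TCP 1723 half of that forwarding can be verified from any outside machine with a few lines of Python. This is only a minimal sketch: the function name and the default timeout are made up for illustration, and it checks the PPTP control port only. It cannot test GRE (IP protocol 47), which is what carries the data once the tunnel is up.

```python
import socket

PPTP_CONTROL_PORT = 1723  # TCP control channel only; the GRE data channel is NOT tested here


def tcp_port_open(host, port=PPTP_CONTROL_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port completes within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve.
        return False
```

Calling `tcp_port_open("vpn.example.com")` (a placeholder hostname) from outside the firewall should return True if the port forward works. A True result says nothing about GRE, so a tunnel that connects and then goes silent can still have a GRE forwarding or state-tracking problem.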

I still don't know which part of the VPN path is causing the problem: the client side, the firewall, or our RRAS server.
Any idea how to diagnose it?
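
One way to narrow it down would be to timestamp exactly when the tunnel stops passing traffic, then compare those times against the RRAS and firewall logs. Below is a minimal Python sketch of such a logger. The function names, the 30-second interval and the idea of passing in a `check` callable are all assumptions for illustration, not anything taken from RRAS or the firewalls.

```python
import time
from datetime import datetime


def find_outages(samples):
    """Turn [(timestamp, reachable), ...] into a list of (down_since, up_again) windows.

    An outage still in progress at the end of the log is reported as (down_since, None).
    """
    outages = []
    down_since = None
    for ts, ok in samples:
        if not ok and down_since is None:
            down_since = ts                      # first failed probe of a new outage
        elif ok and down_since is not None:
            outages.append((down_since, ts))     # resource answered again
            down_since = None
    if down_since is not None:
        outages.append((down_since, None))
    return outages


def watch(check, interval=30.0, rounds=240):
    """Call check() every `interval` seconds, print a timestamped line, return the outages."""
    samples = []
    for _ in range(rounds):
        ok = bool(check())
        samples.append((time.time(), ok))
        print(datetime.now().isoformat(timespec="seconds"), "up" if ok else "DOWN")
        time.sleep(interval)
    return find_outages(samples)
```

Here `check` is any function that returns True while a resource behind the tunnel answers, for example a TCP connect to an internal file server. Run on the client while the VPN is dialed, the printed DOWN timestamps show whether the drops line up with anything logged on the server side.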
ASKER CERTIFIED SOLUTION
ChiefIT

ASKER

Very interesting reading, ChiefIT. I'll go through all of these and return with my findings. Thanks.
Is portfast my issue:

I have helped about 20 people overcome portfast issues. Though this article explains portfast, it doesn't really say how it affects the newer operating systems. Pay close attention to where it takes 50 seconds to negotiate the handshake for the computer on a port. Those 50 seconds are what time out the port. With the port frozen, it takes a good while to unfreeze. So it causes intermittent comms that don't show up readily in event logs or DCdiag reports. Once in a while you will get a 5719 error for any of a number of different server services. This could include DNS, Netlogon, DHCP, AD authentication, FRS, etc. So a 5719 error is usually a dead giveaway. But even if it doesn't appear, you could still have a portfast problem. The effect of portfast being disabled can be service specific, so you may be able to ping while DNS doesn't work.

Troubleshooting these errors with portfast is difficult. But new OSes need portfast enabled or they will time out on certain ports due to a NIC flood.

I was hesitant to provide this article because in a way I disagree with it. It will lead you to believe portfast is not your issue. Portfast can certainly be the issue, but the article does explain portfast pretty well. It all has to do with this 50-second discovery process. New OSes can't wait that long to send out their data packets. So pay close attention to that.
http://tcpmag.com/qanda/article.asp?EditorialsID=277
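
For reference, the 50-second figure matches the default timers of classic IEEE 802.1D spanning tree: 20 seconds of max age, plus the 15-second forward delay applied twice, once in the listening state and once in the learning state. A small sketch of the arithmetic follows. It assumes default timers, and nothing in this thread confirms which STP variant or timer values the switches here actually use.

```python
# Default IEEE 802.1D spanning tree timers, in seconds (assumed, not read from any device).
MAX_AGE = 20
FORWARD_DELAY = 15  # spent once in the listening state and once in the learning state


def worst_case_convergence(max_age=MAX_AGE, forward_delay=FORWARD_DELAY):
    """Delay when stale topology information must age out before the port can move on."""
    return max_age + 2 * forward_delay


def link_up_delay(forward_delay=FORWARD_DELAY):
    """Delay for a port that has just come up: listening plus learning, no ageing."""
    return 2 * forward_delay
```

With the defaults, `worst_case_convergence()` gives 50 and `link_up_delay()` gives 30. PortFast skips the listening and learning states on an edge port, which is why it removes that wait for a freshly connected host.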
Thanks for the explanation, ChiefIT. I see PortFast may be an issue somewhere deeper in our topology.
But regarding VPN... hmmm... there is no need to discover new routes, since the path is plain and simple:

Client -> INTERNET -> Our company's Cisco FW -> D-link Switch -> RRAS server

I also tried replacing the Cisco FW with the D-link DFL-1600 firewall, but the same behaviour occurred, no change at all.

Additionally, my main problem is that an already ESTABLISHED VPN link loses connectivity. The link is up, the lease has not expired, but the remote site is no longer accessible via VPN. Could an STP PortFast problem show up as such behaviour? I doubt it very much.
 
I have always liked this link when addressing a VPN issue:
http://www.microsoft.com/smallbusiness/support/articles/ref_net_ports_ms_prod.mspx

When you run into problems it is always nice to review the ports that need to be open for key components of the 2003 server services. You probably have the recommended ports open, but it's always nice to double-check. However, if you had a misconfigured port, you wouldn't have communications at all.
_________________________________________________________________________________
Could an STP PortFast problem show up as such behaviour? It certainly could. I have seen portfast be an issue on a single-layer switch that communicates internally only. Correct me if I am wrong, but it really sounds like you have intermittent DNS problems over your VPN link. You seem to be holding onto the IP address. Intermittency like this often comes from a misconfigured switch or router, especially if your key ports are open to your remote client.

One thing I would love to know is whether the D-link switch is a managed switch. If not, we might want to concentrate on the router. Another thing I need to know is whether you are using RRAS on the server to route through the server. If so, in essence you have a double NAT: NAT1 = firewall, NAT2 = 2003 server.

________________________________________________________________________________
Remember, I wasn't sure if the mode of operation was Cisco specific. Maybe D-link has the same problem. So it might be best to configure the mode of operation between the router and the switches to be exactly the same.
______________________________________________________________________________
The types of behavior you are seeing are indicative of a misconfigured switch and/or router. Do you have a network engineer at your site to help you track these issues down?

Hi ChiefIT,
here are some clarifications:

- Regarding PORTS: I have opened the GRE protocol and TCP port 1723 on both the Cisco and the D-link firewall

- Double NAT? Nope. Only the D-link firewall is doing NAT. The servers do not have NAT configured.

- Intermittent DNS problems? Hmmm... I doubt it very much, since remote resources are accessible neither by NetBIOS name nor directly via IP. Something else must be the problem.

- The D-link switch is managed, yes. It is a Layer 3 managed switch, but its configuration has not been altered from the defaults, so everything is open, nothing blocked.
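
The DNS question above can be checked from the client while the tunnel is in its "connected but dead" state, by separating name resolution from plain IP reachability. A minimal Python sketch, with made-up function names and labels; `ip_reachable` is simply whatever a ping or connect attempt to the server's IP address showed:

```python
import socket


def resolves(name):
    """Return True if the system resolver can turn `name` into an IPv4 address."""
    try:
        socket.gethostbyname(name)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def classify(name_resolves, ip_reachable):
    """Separate a name-resolution fault from a dead tunnel or routing fault."""
    if not ip_reachable:
        return "tunnel or routing problem"   # even the raw IP does not answer
    if not name_resolves:
        return "name resolution problem"     # the IP answers, only the name fails
    return "ok"
```

For example, `classify(resolves("fileserver"), ip_reachable=False)` returns "tunnel or routing problem" whatever the resolver says ("fileserver" is a placeholder name). That is the case described here, where the resources are unreachable by IP as well.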

The pain in the ass is the D-link DFL-1600 firewall, whose configuration is entirely object-based. Each IP, NIC port, rule... everything is an object, so there is no configuration screen where you can actually see which IP/port is allowed or routed where. And there are over 100 objects with weird names like "ip_wan1", "wan1_ip" and "ip1_wan", and I have trouble distinguishing what goes where.
But I'll dig into the firewall configuration, draw up a diagram and try to find the original logic there.
Hi guys,
thank you for your support. I finally got a stable optical link to my data center and got rid of ADSL. I also upgraded the D-LINK DFL firmware, and now VPN works like a charm. It seems it was an ADSL issue.
Thanks for the support; points will be split anyway.