VPN PPTP tunnel to Windows 2003 server dropping


I have an issue with VPN connection to our company server farm.
One of our servers resides in the DMZ and serves as the application front-end server for OWA (also Exchange proxy for RPC over HTTP), VPN and some web sites. OWA, Outlook and the web sites work without problems, but *some* of the users have problems with VPN.

VPN uses simple PPTP protocol, and the first connection is always OK. But after some time, usually an hour or two, the VPN tunnel stops responding: the VPN connection still shows status "Connected", but none of the remote resources are available anymore.
If I disconnect and try to reconnect immediately, VPN dial-up authentication stops at "Verifying username and password" and produces Error 628. I need to wait at least 5 minutes to reconnect successfully.
But even after 5 minutes, the VPN breaks down again very quickly, after 5 or 10 minutes, so if I want to work remotely for a longer time, I need to wait at least 30 or 60 minutes before redialing the VPN.

What I tried already:
- I checked VPN log on our server, but those failed retries and drops are not visible anywhere
- Checked Event Log under Security events, but no authentication errors are detected
- Since we were on 4 Mbps ADSL, I requested line improvements and now we are on 100 Mbps Optical fibre with all new networking equipment (previously ADSL modem -> D-LINK DFL-1600 firewall, now we have Fibre-to-Ethernet converter -> Cisco 2811 router & firewall)

Cisco is configured to accept and forward GRE protocol and port 1723 to our RRAS server.

I still don't know which part of VPN tunnel is causing problems - is it client side, maybe firewall, or our RRAS server.
Any idea how to diagnose the problem?
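
One way to start narrowing this down is to log, from the client, when a resource behind the tunnel stops answering, and then line those times up against the firewall and RRAS logs. The sketch below is only an illustration: it assumes you already collect (timestamp, reachable) samples with some periodic probe such as a ping, and the function name and sample data are hypothetical.

```python
from datetime import datetime, timedelta


def outage_intervals(samples):
    """Collapse (timestamp, reachable) samples into (start, end) outage windows.

    An outage starts at the first failed probe and ends at the next
    successful one. An outage still open at the end of the log gets end=None.
    """
    intervals = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                      # first failed probe opens an outage
        elif ok and start is not None:
            intervals.append((start, ts))   # first good probe closes it
            start = None
    if start is not None:
        intervals.append((start, None))
    return intervals


if __name__ == "__main__":
    # Made-up sample log: minutes after an arbitrary start time.
    t0 = datetime(2000, 1, 1, 9, 0)
    probes = [(0, True), (60, True), (90, False), (95, False), (100, True), (108, False)]
    log = [(t0 + timedelta(minutes=m), ok) for m, ok in probes]
    for start, end in outage_intervals(log):
        print(start, "->", end if end else "still down")
```

If the outage windows repeat at regular intervals, that would hint at a timer somewhere on the path (for example a firewall session timeout) rather than random line trouble.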
Asked by: Andrej Pirman
ChiefIT Commented:
Sounds like a portfast issue. Portfast skips the spanning tree discovery phase and puts the port straight into forwarding. Newer operating systems need portfast on the ports they are connected to in order to work right. Since configuring portfast is specific to the make and model of the switch, you might have to contact your manufacturer to find out how to enable portfast on the ports your computers use.

A little explanation of spanning tree and portfast.

Event ID 5719, spanning tree portfast: (an error that might pop up with portfast)

Another thing it could be is the mode of operation of the ports. As far as I know this is a Cisco quirk only. Cisco switches and routers need to be set to the same mode of operation. You might think a 100 Mb/full duplex router will talk to a switch configured to auto-negotiate. It doesn't. They either both have to be 100 Mb/full duplex, or both on auto-negotiate.
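
The usual explanation for this kind of mismatch can be modeled in a few lines. This is a simplified sketch, not vendor documentation: it assumes that a port forced to a fixed mode stops advertising its capabilities, so an auto-negotiating peer can detect the speed but falls back to half duplex. The function names and the assumption that an "auto" port supports 100/full are illustrative only.

```python
def resolve_link(side_a, side_b):
    """Return the (speed_mbps, duplex) each side ends up using.

    Each side is either the string "auto" or a forced (speed_mbps, duplex)
    tuple. Simplification: an auto side is assumed to support 100/full.
    """
    def effective(me, peer):
        if me != "auto":
            return me                  # forced settings are used as-is
        if peer == "auto":
            return (100, "full")       # both sides negotiate the best common mode
        # Peer is forced and sends no negotiation signal: the auto side
        # detects the speed but has to assume half duplex.
        return (peer[0], "half")

    return effective(side_a, side_b), effective(side_b, side_a)


def duplex_mismatch(side_a, side_b):
    """True when the two ends settle on different duplex modes."""
    a, b = resolve_link(side_a, side_b)
    return a[1] != b[1]


if __name__ == "__main__":
    print(duplex_mismatch((100, "full"), "auto"))         # True
    print(duplex_mismatch("auto", "auto"))                # False
    print(duplex_mismatch((100, "full"), (100, "full")))  # False
```

A link in the mismatched state still passes traffic, just with collisions and errors under load, which is why it tends to show up as intermittent trouble rather than a dead port.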

A third thing this could be is the service pack the server is on. SP1 for 2003 server has a quirk in it where it has too few MTU channels to work well. This can cause intermittent problems with communicating. SP2 has some networking errors as well that I am still reviewing.
SP1 issues:

An error that may/may not pop up with SP1 problems:

All three of these problems can leave nothing in the event logs and still produce intermittent comms. The server looks happy but has a belly ache.
Andrej Pirman (Author) Commented:
Very interesting reading, ChiefIT. I'll go through all these and return with findings. Thanx.
Is portfast my issue:

I have helped about 20 people overcome portfast issues. Though this explains portfast, it really doesn't say how it affects the newer operating systems. Pay close attention to where it takes 50 seconds to negotiate the handshake for the computer on a port. That 50 seconds is what times out the port. Once the port is frozen, it takes a good while to unfreeze. So it causes intermittent comms that are not easily seen in event logs or DCdiag reports. Once in a while you will get a 5719 error for any of a number of different server services. This could include DNS, Netlogon, DHCP, AD authentication, FRS, etc. So a 5719 error is usually a dead giveaway. But even if it doesn't appear, you could still have a portfast problem. The effect of portfast being disabled can be service specific, so you may be able to ping while DNS doesn't work.

Troubleshooting these errors with portfast is difficult. But new OSes need portfast enabled or they will time out on certain ports due to a NIC flood.

I was hesitant to provide this article because in a way I disagree with it. It will lead you to believe portfast is not your issue. Portfast can certainly be the issue, but the article does explain portfast pretty well. It all has to do with this 50 second discovery process. New OSes can't wait that long to send out their data packets. So pay close attention to that.
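
For reference, the 50 second figure lines up with the default timers of classic 802.1D spanning tree: 20 seconds max age plus 15 seconds each in the listening and learning states. The sketch below just does that arithmetic. It assumes the switch runs classic spanning tree with default timers (rapid spanning tree behaves differently), and the function name is made up for illustration.

```python
# Default IEEE 802.1D spanning tree timers, in seconds.
MAX_AGE = 20        # how long stale topology information is kept
FORWARD_DELAY = 15  # spent once in listening and once in learning


def seconds_until_forwarding(portfast=False, wait_max_age=False):
    """Rough delay before a port starts forwarding frames.

    portfast=True     : port skips listening/learning and forwards at once.
    wait_max_age=True : worst case, where old topology information
                        first has to age out.
    """
    if portfast:
        return 0
    delay = 2 * FORWARD_DELAY          # listening + learning
    if wait_max_age:
        delay += MAX_AGE
    return delay


if __name__ == "__main__":
    print(seconds_until_forwarding())                   # 30
    print(seconds_until_forwarding(wait_max_age=True))  # 50
    print(seconds_until_forwarding(portfast=True))      # 0
```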
Andrej Pirman (Author) Commented:
Thanx for explanation, ChiefIT, I see PortFast is an issue maybe somewhere deeper in our topology.
But regarding VPN...hmmm...there is no need to discover new routes, since it is plain simple:

Client -> INTERNET -> Our company's Cisco FW -> D-link Switch -> RRAS server

I also tried replacing the Cisco FW with the D-link DFL-1600 firewall, but the same behaviour occurred, no change at all.

Additionally, my main problem is that already ESTABLISHED VPN link is losing connectivity. Link is up, lease not expired, but remote site is not accessible via VPN anymore. Could STP PortFast problem reflect in such a behaviour? I doubt very much.
I have always liked this link when addressing a VPN issue:

When you run into problems it is always nice to review the ports that need to be open for key components of the 2003 server services. You probably have the recommended ports open, but it's always nice to double-check. However, if you had a misconfigured port, you wouldn't have communications at all.
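
A quick way to double-check the PPTP control port from outside the firewall is a plain TCP connect. This is a minimal sketch using only the Python standard library, with a made-up helper name and a placeholder host in the usage comment. Note that it only covers TCP 1723. The tunnel data itself travels over GRE (IP protocol 47), which a TCP test like this cannot verify.

```python
import socket

PPTP_CONTROL_PORT = 1723  # TCP control channel; tunnel data uses GRE (IP protocol 47)


def tcp_port_open(host, port=PPTP_CONTROL_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example (placeholder host name, replace with the public address of the VPN server):
#   tcp_port_open("vpn.example.com")
```

Running it in the minutes right after a drop, when redialing fails with Error 628, would show whether the control port is still reachable at all during that window.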
Could STP PortFast problem reflect in such a behaviour? It certainly could. I have seen portfast be an issue on a single-layer switch that only communicates internally. Correct me if I am wrong, but it really sounds like you have intermittent DNS problems over your VPN link. You seem to be holding onto the IP address. Intermittency like this often points to a misconfigured switch or router, especially if your key ports are open to your remote client.

One thing I would love to know is whether the D-link switch is a managed switch. If not, we might want to concentrate on the router. Another thing I need to know is whether you are using RRAS on the server to route through the server. If so, you have, in essence, a double NAT: NAT1 = firewall, NAT2 = 2003 server.

Remember, I wasn't sure if the mode of operation was Cisco specific. Maybe D-link has the same problem. So, it might be best to configure the mode of operation between router and switches to be the exact same.
The types of behavior you are seeing are indicative of a misconfigured switch and/or router. Do you have a network engineer at your site to help you track these issues down?

Andrej Pirman (Author) Commented:
Hi ChiefIT,
here are some clarifications:

- regarding PORTS I have opened GRE protocol on Cisco and D-link firewall, and port TCP 1723, also on both machines

- Double NAT? Nope. Only D-link Firewall is doing NAT. Servers do not have NAT configured.

- Intermittent DNS problems? Hmmm...I doubt it very much, since remote resources are accessible neither by NetBIOS name nor directly by IP. Something else must be the problem.

- D-link switch is managed, yes. It is Layer 3 managed switch, but its configuration has not been altered from default, so everything is opened, nothing blocked.

The pain in the ass is the D-link DFL-1600 firewall, whose configuration is entirely object-based. Each IP, NIC port, rule...everything is an object, so there is no configuration screen where you can actually see which IP/port is allowed or routed where. And there are over 100 objects with weird names like "ip_wan1", "wan1_ip" and "ip1_wan", and I have trouble distinguishing what goes where.
But I'll dig into Firewall configuration, write a scheme and try to find original logic there.
Andrej Pirman (Author) Commented:
Hi guys,
thank you for your support. I finally got a stable optical link to my data center and got rid of ADSL. I also upgraded the D-LINK DFL firmware, and now VPN works like a charm. It seems it was an ADSL issue.
thanx for support, points will be split anyways.
Question has a verified solution.
