PPPOA disconnects with large file uploads (only)

Last Modified: 2013-12-24
We have been experiencing intermittent/temporary ADSL PPPOA dropouts at a couple of sites.
Typically the ADSL modem is an Actiontec GT701 although we have also used an old Wirespeed and a Cisco 1710.

The test to recreate the problem:
Set up a continuing ping to a couple of remote sites.
Start a large file ftp transfer OUT of the local site.  i.e. an "upload".  
(Outgoing emails with larger attachments seem to cause the problem during normal operations).

With the ftp transfer going we see a few ping dropouts and then a total dropout.
The modem drops the PPPOA connection when this happens.
The problem solves itself after 20, 30 or 40 seconds - depending on the particular configuration of modem, etc. and this interruption time will be rather constant from outage to outage.
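To check how constant the interruption time really is, the continuous ping output can be reduced to a list of outage windows. The following is only a sketch: it assumes the ping results have already been turned into (timestamp, reply received) pairs, and the function name `outage_windows` and the `min_lost` threshold are made up for illustration.

```python
from typing import List, Tuple

def outage_windows(samples: List[Tuple[float, bool]],
                   min_lost: int = 3) -> List[Tuple[float, float]]:
    """Find runs of at least `min_lost` consecutive lost pings.

    samples: (timestamp_in_seconds, reply_received) pairs, sorted by time.
    Returns (first_lost_timestamp, first_reply_after_timestamp) pairs.
    A run that is still open when the samples end is not reported.
    """
    windows = []
    run_start = None   # timestamp of the first lost ping in the current run
    run_len = 0        # number of consecutive lost pings so far
    for ts, ok in samples:
        if not ok:
            if run_start is None:
                run_start = ts
            run_len += 1
        else:
            if run_start is not None and run_len >= min_lost:
                windows.append((run_start, ts))
            run_start = None
            run_len = 0
    return windows

if __name__ == "__main__":
    # Hypothetical one-per-second ping log: a single stray loss, then a long outage.
    log = [(0.0, True), (1.0, True), (2.0, False), (3.0, True)]
    log += [(float(t), False) for t in range(4, 34)]
    log += [(34.0, True)]
    for start, end in outage_windows(log):
        print(f"outage from t={start:.0f}s to t={end:.0f}s, {end - start:.0f}s long")
```

Single stray losses are ignored on purpose, so only the total dropouts that coincide with a PPPOA disconnect are counted.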

This happened at one site in July and we made it better / acceptable by changing the modem from PPPOE to PPPOA and assigning a public IP address to the modem.  

Then it started happening a couple of weeks ago at another site about 50 miles away.
The telco and the ISP are the same at both sites and things become common at some point before or when reaching the ISP equipment.  

I have good contact / communication with the ISP and only third-hand via the ISP with the telco.
We are all scratching our heads as to what must be happening and what we might do about it.

Rule out all of our office site equipment.
I can connect a bare laptop directly to the modem and recreate the problem.
Fred Marshall, Principal


OK - this is interesting although I don't necessarily understand all the terms.
I *do* understand SNR.

This situation looks like this:

The telco provides the copper and ... whatever else.
The ISP is separate and provides the PPPOA as I understand it.
The ISP is stumped.
I believe the DSLAM is with the telco.  So the DSLAM port and DSLAM questions would have to be put to them.

The ISP is indeed the "service provider" and all the telco provides is the line.
The ISP is responsible for the DSL although it flows through telco equipment.
Thus, the prodding needs to be through the ISP but likely lands with the telco.

There is no "Master Telco Frame" as nearly as I can determine.  The buildings are rather small.  We have switched modems, checked the lines, etc.


Press2Esc, Systems Integrator
Fred Marshall, Principal


OK - I can answer most of your questions off the top of my head:

First, we were running the modem in PPPOE bridge mode, so it looks like a switch port and has no IP of its own; we get multiple static public IP addresses that way.
Then, on a hunch, we switched to PPPOA, assigning one IP to the modem that is outside our block of addresses.  This seemed to help.  Actually, this was *the* solution for a couple of months at the first site to fail.  Then the second site failed and this "fix" didn't work.
Well, we never did isolate the root cause, so the fix was a bandaid to begin with.  And thus my question here at EE.

The multiple public IP addresses are "connected" via a single ADSL connection - so the devices share the bandwidth.  Most of the time the bandwidth is modest anyway.

VPN: yes on a couple of the IPs; no on others; like this:
VPN #1 and VPN#2 on an RV042 with its own IP.
3rd party VPN on a new Cisco VPN router with its own IP (not under our control / no access).
Internet firewall on a Juniper Networks SSG5 with its own IP.
... it just worked out best this way although we might have used one fewer IP address by running VPNs through the Juniper.  The RV042 VPNs just work and preceded the Juniper box.

For testing:
Continuously ping an address on the 3rd party VPN remote subnet.  This is critical to operations so it makes sense to test it.
continuously ping google through the firewall.
initiate a large file upload using ftp - through the firewall.
the ping times increase from around 100ms to over 1,000msec in some cases during the upload.
pings drop out perhaps more frequently but not *too* frequently if it's going to *not* fail.
pings drop out first one then a few then all of them continuously for 30 to 40 seconds in a "failure".  This causes the 3rd party VPN to drop out along with everything else.
Then the connection is automatically remade and the pings return.
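A rough way to read the jump in ping times: if the extra round-trip time is assumed to come entirely from packets queuing ahead of the slow upstream link (an assumption, not something measured here), the size of that backlog can be estimated from the figures in this post. The function name and the 1500-byte packet size are illustrative only.

```python
def backlog_bytes(idle_rtt_ms: float, loaded_rtt_ms: float,
                  uplink_kbps: float) -> float:
    """Estimate the data queued ahead of a ping on the uplink.

    Assumes all extra round-trip time is upstream queuing delay.
    milliseconds * kbit/s = bits, so dividing by 8 gives bytes.
    """
    extra_ms = loaded_rtt_ms - idle_rtt_ms
    return extra_ms * uplink_kbps / 8

if __name__ == "__main__":
    # Figures from this thread: ~100 ms idle, >1,000 ms during an upload, ~500k up.
    queued = backlog_bytes(100, 1000, 500)
    print(f"approx. backlog: {queued:.0f} bytes "
          f"(about {queued / 1500:.0f} full-size 1500-byte packets)")
```

A backlog of that order sitting in front of the link would delay every other packet by most of a second, including whatever keepalive traffic the PPP session depends on. Whether that is what actually triggers the disconnect is not established in this thread.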

We aren't dropping packets regularly unless two things are happening:
1) the "problem" exists - which generally means it is there for days or longer and maybe never goes away.  I don't know because we've always "fixed" it or it fixed itself after a number of days/weeks.
2) there is a large file upload happening like an email with a large attachment.  Most other things don't cause dropouts.

Packet losses increase from nil but slightly during an upload if there's not going to be a failure.

I don't have a tracert to google but could generate one if necessary.  Right now I don't know that I can make the failure occur.  It would only work pre-failure anyway.

The Cisco "modem" seems to have a bit better results but it too has failed on occasion.

I can get results from the Actiontec but I'm not sure it generates line stats.  
I don't have access to the Cisco "modem".
I'm not sure but I think the Cisco is still in use at the one site it's located.
I can try to access the Actiontec there - if I don't see it then it's not connected / in use.

Yes, we've run speakeasy.net tests often enough.  One site has 3M down and 500k up.  Another has around 1700k down and 500k up ... more or less.  Both of these sites have failed at one time or another.

The ISP is Reachone.  The telco is Centurytel.  
This is in SW Washington state.

After you've absorbed this, let me know what I can do to improve the information, including any of the specific things you've already asked for.

Fred Marshall, Principal


While it should be obvious because this is about PPPOA dropouts:

Rarely, one will see packet losses on one ping stream and not on the other, but of course not when there's a failure.  When one ping stream stops responding altogether, so does the other - coinciding with a PPPOA disconnect.
Press2Esc, Systems Integrator

Thanks for the info...

From your speedtest results, I suspect you may be having a provisioning a/o line capacity issue.  Unfortunately, it does not appear the Actiontec has a line spec listing...  

I would encourage you to contact the tech helpdesk and ask them for your specific line characteristics (e.g., noise, attenuation, capacity, provisioning, etc)...  When you get this critical info, post it. If applicable, I can decipher the info..

BTW, is the DSL line provisioned for 3M/? or 6M/512?  Is Reachone (ISP) simply a reseller and Centurytel owns the equipment (e.g., DSLAM, NOC, etc)?

Judging by your network needs and multiple subnets, you are very likely pushing (beyond?) your connection capacity..  How many total workstations, servers, routers/firewalls, etc. are sharing the DSL line?

My initial request for a tracert is because this data may indicate some latency a/o potential TTL issues.

Fred Marshall, Principal


I need to "package" this information so that I can pass it to the ISP who will pass it to the telco as appropriate.

Reachone is an ISP with servers and routers providing a variety of network services - IP address blocks, email, domain hosting, etc. etc.

Centurytel is a telco that provides communications (including ADSL) from the customer to an interim geographical point from which Reachone takes over on their own equipment / links, etc.

It's my understanding that the PPPOA termination is in a Reachone router.

The initial office to fail has 8 or 9 workstations.
I have described the devices sharing the DSL lines.
The second office to fail has 6 workstations.
Same description.

DSLAMs are Centurytel's.

Tracing route to google.com []
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  192.168.xxx.xxx
  2     1 ms     2 ms     1 ms  dsl-238-81.satsop.reachoneinternet.net [216.177.
  3    42 ms    45 ms    42 ms  dsl-230-1.satsop.reachoneinternet.net [216.177.2
  4    46 ms    46 ms    46 ms  fa0-0-cr01-pdxp.reachoneinternet.net [216.177.25
  5    50 ms    47 ms    55 ms  ip65-47-24-97.z24-47-65.customer.algx.net [65.47
  6    47 ms    52 ms    47 ms  p4-3-0.mar1.beaverton-or.us.xo.net [
  7    52 ms    52 ms    51 ms  p5-1-0-1.rar1.seattle-wa.us.xo.net [
  8    71 ms    73 ms    69 ms  p5-0-0.RAR2.SanJose-CA.us.xo.net []
  9    68 ms    71 ms    69 ms []
 10    69 ms    68 ms    70 ms []
 11    69 ms    72 ms    74 ms []
 12    71 ms    70 ms    70 ms
 13   244 ms    76 ms   197 ms
 14   154 ms   131 ms   150 ms
 15   153 ms   195 ms   158 ms
 16   151 ms   144 ms   137 ms
 17   144 ms   146 ms   148 ms  jc-in-f99.google.com []

Trace complete.
Press2Esc, Systems Integrator

fm, the trace looks good, no TTL issues..

reachone is your reseller, so i'm guessing you must go thru them (?). in any case, i would guess that reachone's clout with the isp, is predominately dependent on their $ize and contract.

as stated earlier, contact your provider and ask them for all your "line stats" aka dsl footprint.  also, have them verify your provisioning... hopefully, their tech support will understand the ramifications of the specs as they may relate to your issue.  if not, post 'em...

Fred Marshall, Principal


Yes.  We are Reachone's customer.

Have sent the request.
Fred Marshall, Principal


Here are the stats from the modem at our end.  This seems to be the best that they can suggest.  I note that the Downstream margin of 29 (I believe that's dB) and the Upstream margin of 19 never change .. for whatever that's worth.

I have found a workaround that at least I have control over.  We have a managed switch just downstream from the modem.  It has QoS capability.  It seems that if I do traffic shaping for upstream traffic bandwidth and set it *just below* the delivered speed from the otherwise unlimited link speed, then the ping times don't increase and the dropouts don't occur during a large file upload.  Without this internal bandwidth limit, the problems occur.  I don't think anyone will notice the difference in speed as it's "close".

Now, philosophically, I don't think that *we* should have to limit the bandwidth in order for the link to work properly - but a fix is a fix.
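For picking the "just below" figure, here is a back-of-envelope sketch of how much IP throughput an ADSL sync rate leaves once ATM cell overhead is taken out. It assumes the sync rate counts whole 53-byte ATM cells, that PPPOA uses VC-multiplexed encapsulation (2 bytes of PPP header plus the 8-byte AAL5 trailer per packet), and a 5% headroom; none of these are confirmed for this line, and the function names are illustrative.

```python
import math

ATM_CELL_BYTES = 53      # 5-byte header + 48-byte payload
ATM_PAYLOAD_BYTES = 48
AAL5_TRAILER_BYTES = 8   # appended to every AAL5 frame before padding
PPPOA_VCMUX_BYTES = 2    # assumption: VC-mux PPPoA; LLC encapsulation adds more

def cells_for_packet(ip_bytes: int, encap_bytes: int = PPPOA_VCMUX_BYTES) -> int:
    """Number of ATM cells needed to carry one IP packet over AAL5."""
    frame = ip_bytes + encap_bytes + AAL5_TRAILER_BYTES
    return math.ceil(frame / ATM_PAYLOAD_BYTES)   # last cell is padded out

def ip_rate_kbps(sync_kbps: float, ip_bytes: int = 1500) -> float:
    """IP-level throughput available at a given sync rate and packet size."""
    wire_bytes = cells_for_packet(ip_bytes) * ATM_CELL_BYTES
    return sync_kbps * ip_bytes / wire_bytes

def shaper_rate_kbps(sync_kbps: float, headroom: float = 0.95) -> float:
    """Suggested shaping rate: a fraction of the full-size-packet IP rate."""
    return headroom * ip_rate_kbps(sync_kbps)

if __name__ == "__main__":
    sync = 608  # US Connection Rate reported by the modem in this post
    print(f"IP rate with 1500-byte packets: {ip_rate_kbps(sync):.0f} kbps")
    print(f"IP rate with 40-byte packets:   {ip_rate_kbps(sync, 40):.0f} kbps")
    print(f"shaper setting with 5% headroom: {shaper_rate_kbps(sync):.0f} kbps")
```

With these assumptions a 608 sync rate works out to roughly 538 kbps of IP traffic for full-size packets, which is in line with the "500k up" seen in the speed tests. Small packets fare much worse per byte, so a shaper that counts Ethernet or IP bytes may need more headroom than the figure suggests.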

cat /proc/avalanche/avsar_modem_stats

AR7 DSL Modem Statistics:
[DSL Modem Stats]
      US Connection Rate:     608     DS Connection Rate:     3584
      DS Line Attenuation:    4       DS Margin:              29
      US Line Attenuation:    8       US Margin:              19
      US Payload :            647461440       DS Payload:             51335995

      US Superframe Cnt :     21648590        DS Superframe Cnt:      21648590
      US Transmit Power :     11      DS Transmit Power:      13
      LOS errors:             0       SEF errors:             0
      Frame mode:             3       Max Frame mode:         0
      Trained Path:           0       US Peak Cell Rate:      1433
      Trained Mode:           3       Selected Mode:          1
      ATUC Vendor ID: 1095516994      ATUC Revision:          1
      Hybrid Selected:        1

      [Upstream (TX) Interleave path]
      CRC:    0       FEC:    0       NCD:    1
      LCD:    0       HEC:    0

      [Downstream (RX) Interleave path]
      CRC:    0       FEC:    0       NCD:    0
      LCD:    0       HEC:    0

      [Upstream (TX) Fast path]
      CRC:    2       FEC:    34      NCD:    0
      LCD:    0       HEC:    0

      [Downstream (RX) Fast path]
      CRC:    1       FEC:    0       NCD:    0
      LCD:    0       HEC:    0

[ATM Stats]
      Good Cell Cnt:  13488780
      Idle Cell Cnt:  514246659

      Good Cell Cnt:  10694999
      Idle Cell Cnt:  3100166526
      Bad Hec Cell Cnt:       0
      Overflow Dropped Cell Cnt:      0

[SAR AAL5 Stats]
      Tx PDU's:       1900386
      Rx PDU's:       1854629
      Tx Total Bytes: 574556314
      Rx Total Bytes: 435557320
      Tx Total Error Counts:  0
      Rx Total Error Counts:  7243

[OAM Stats]
      Near End F5 Loop Back Count:    38822
      Near End F4 Loop Back Count:    0
      Far End F5 Loop Back Count:     1
      Far End F4 Loop Back Count:     0

Is it possible the router is set to "oversubscribe" the link?
Fred Marshall, Principal


Thank you!
I'm not sure what router you're referring to.  Here is the setup:

ISP gateway to assigned block of IP addresses.
PPPOA link
Local modem
"Internet Switch"  Linksys SWR208 managed switch.
Firewall-------VPN-------------VPN <<<<<public IP addresses on upstream side

(The VPNs are on separate devices for a number of reasons but conceptually could be implemented in the Juniper SSG5).

Now, it seems like we already had the problem .. but I can't recall for sure .. so we replaced our original "internet switch " with the SWR208 managed switch so we could monitor things.  Never did see dropped packets there.  But, the problem was clearly apparent (yet intermittent from site to site and from month to month) after installing these switches in place of "dumb" switches.
The VPNs don't have much traffic and don't seem to correlate with the problem.
It's the firewall upstream traffic that can be demonstrated to cause trouble.
The Juniper firewalls were introduced  somewhat recently too.  So maybe that would be the router you're referring to.

I don't know what "oversubscribe" means exactly so I don't know where to look.  In concept, OK, but in practice what other terms / settings might attach to that idea?  Or, where would you look in an SSG5 if you happen to know that one? I've certainly not deviated from the default settings in terms of QoS or whatever else it might be.

Press2Esc, Systems Integrator

I don't think it is a DSL line capacity issue... because your snr & attenuation stats look great.
With the exception of the AAL5 (ATM Adaptation Layer 5) Stats error count of 7243 under Rx Total Error, everything looks in order.    Also, not sure of the ramifications of running a PPPoA session over an ethernet connection.  
I am beginning to suspect some excessive LAN traffic affecting the overall availability of WAN bandwidth that is being provided to you via your ISP...  Kinda like the way P2P or some trojans can eat up network bandwidth..   I would start checking firewall ports and log files in search of unwanted / unknown traffic...   P2E
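To put the Rx error count in proportion, the counters from the modem stats posted earlier can be parsed and compared against the PDU totals. This is only a sketch: the parser assumes simple "name: number" lines, and because the counters are cumulative, it cannot tell whether the errors cluster around the dropouts.

```python
import re

# AAL5 counters copied from the modem stats posted earlier in this thread.
AAL5_STATS = """\
      Tx PDU's:       1900386
      Rx PDU's:       1854629
      Tx Total Bytes: 574556314
      Rx Total Bytes: 435557320
      Tx Total Error Counts:  0
      Rx Total Error Counts:  7243
"""

def parse_counters(text: str) -> dict:
    """Turn 'name: number' lines into a {name: int} dictionary."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?):\s+(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters

def error_ratio(errors: int, pdus: int) -> float:
    """Fraction of PDUs counted as errored (0.0 if no PDUs were seen)."""
    return errors / pdus if pdus else 0.0

if __name__ == "__main__":
    c = parse_counters(AAL5_STATS)
    rx = error_ratio(c["Rx Total Error Counts"], c["Rx PDU's"])
    tx = error_ratio(c["Tx Total Error Counts"], c["Tx PDU's"])
    print(f"Rx error ratio: {rx:.2%}   Tx error ratio: {tx:.2%}")
```

Re-reading the counters before and after a test upload and comparing the difference would show whether the Rx errors grow during a failure or were simply accumulated over the modem's uptime.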