Asked by Brian Withun

Diagnose Slow and Failed downloads over WLAN - Part III

A follow-up to my earlier question:
https://www.experts-exchange.com/questions/27949262/Diagnose-Slow-and-Failed-downloads-over-WLAN-Part-II.html

In that question, DavisMcCarn observed that:

"the wired machine is sending 80 times faster than the receiver and filling the memory buffer in the Trendnet creating the stall"

This could well be true, but maybe not.  The reason I think this may not be true is that when the device downloads the file through the AP, it takes several minutes and has an effective throughput of approximately 100KB/sec.

I can download that same file from the same server through the same AP using my Windows laptop rather than the "device" and the file downloads in a matter of seconds, at a rate closer to 4MB/sec.

If Windows7 can download the file at 4MB/sec, why would the AP "clog up" and introduce delays when the device is downloading the file at a mere 100KB/sec?
Brian Withun (ASKER)

Or is it precisely BECAUSE the device is so slow that data tends to back up on the AP?

Server ----> AP ----> Windows7 laptop (4MB/sec)

Server ----> AP ----> Device (100KB/sec)

If the server sends quickly, and Win7 receives quickly, all is well.
If the server sends quickly, and the Device receives slowly, the AP clogs up?

Is that what's happening?
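
To make the "AP clogs up" hypothesis concrete, here is a toy store-and-forward model. It is only a sketch: the buffer size, the rates and the function name are made-up assumptions, not measurements from the TrendNET, and it deliberately ignores TCP flow control and retransmission, which in a real transfer would slow the sender down.

```python
def simulate_ap_buffer(send_per_tick, drain_per_tick, buffer_bytes, total_bytes):
    """Toy model of an AP queue. One tick = 1 ms; all sizes are in bytes.

    Returns (dropped_bytes, peak_queue_bytes).
    All parameters are hypothetical; nothing here is specific to a real AP.
    """
    queued = 0
    sent = 0
    dropped = 0
    peak = 0
    while sent < total_bytes or queued > 0:
        if sent < total_bytes:
            # Wired side: the server pushes a burst into the AP.
            burst = min(send_per_tick, total_bytes - sent)
            sent += burst
            accepted = min(burst, buffer_bytes - queued)
            dropped += burst - accepted  # no room left: the excess is lost
            queued += accepted
        peak = max(peak, queued)
        # Wireless side: the receiver drains the queue at its own pace.
        queued = max(0, queued - drain_per_tick)
    return dropped, peak


# Same sender, two receivers: ~4MB/sec (laptop) and ~100KB/sec (device).
laptop = simulate_ap_buffer(4000, 4000, 65536, 1_000_000)
device = simulate_ap_buffer(4000, 100, 65536, 1_000_000)
print("laptop dropped/peak:", laptop)
print("device dropped/peak:", device)
```

In this model the fast receiver never lets the queue build up, while the slow receiver fills the assumed 64KB buffer within a few ticks and everything beyond that is dropped. Whether the real AP behaves this way is exactly the open question.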
It depends - what card is in the receiving 'device'?

What speed is the card, I mean?
Is it a wireless G (54M) card like the one in the laptop?

Also does this '5 second' delay happen once during the transfer? Or multiple times?

Also, as advised in your previous questions - have you moved the AP to channel 1 or 11? In your captures there are already multiple networks running on channels 5, 6 and 7, so this can cause interference no matter what the device is...
The receiving device has the zcomax XG-180MU (an IEEE 802.11g 54Mbps module)

The delay happens often throughout a large file transfer.

I can completely ignore other networks on or near channel 6.  I've tested with them present, and busy, present and idle, and not present.  They are not the primary factor in this problem as the characteristic delays occur independent of them.

I HAVE seen a "noticeable" improvement in my trials after having slowed down the transmitting NIC on the server to 10Mbit/Half Duplex.

The problem isn't gone, but it does seem to be qualitatively better.  The 5s gaps are far fewer and farther between.

I'm still testing.  I'd like to be able to throttle down the rate at which IIS serves this file.
Right, going to go at this a different way (your comment got me thinking)

Server ----> AP ----> Windows7 laptop (4MB/sec)

Server ----> AP ----> Device (100KB/sec)

Now looking at this here's my understanding...

Server to laptop is pushing at 4MB/sec - therefore time it takes to do transfer is quick
Server to device is 100KB/sec - therefore time to transfer is WAY slower

Now if there is ANY type of interference in the mix, you will get delays - why? Because the transfer takes a very long time, there's more chance that nearby interference will interrupt it, so you get breaks in comms, plus the 3-5 seconds the transfer needs to recover...

So next question - why is the transfer so slow? Is this by design? Surely if it's got a 54M card it should be able to push the transfer at the same speed as the laptop?

Also you say that other networks aren't in use and therefore can't be the source of interference - have to disagree there to be honest, so for the sake of argument can you change the channel to 1 or 11 and see if it makes any difference? It can't do any harm and will at least eliminate one more variable from the equation - diagnosing issues is hard, and taking other variables out of the loop is the first step in getting to the root cause...
One of the reasons the transfer is slower is the way the circuitry is laid out.  The data through the 802.11 module has to cross a particular point twice, so it interferes with itself.  This is being designed away, but for now it is a limitation.

I will alter the AP to operate on channel 11 and recapture the download.  I do not expect to see anything different, but as you say, for completeness it is worth a shot.

It appears that limiting the transmit speed of the server is having an effect on the number of gaps that appear in the application logfile.  Slowing it down reduces the errors.

Do consumer access points like this TrendNET thing actually store-and-forward data?  Do they have buffers that can actually fill up?
What's the exact model of the Trendnet?

APs are normally simple mini-switch type devices, but again it depends on the model. Obviously there are APs that can handle 25-50 users without hassle, so whether it's fit for the traffic or not comes down to the model...
The AP is a TEW-652BRP.

Attached is the Ch11 capture, and a corresponding application log.

The application log summary:

>  82.9 K/s
> gaps=42 time=36.9s
> chunks=38183 avg=0.00005s max=5.15s




There were a total of 38,183 network reads performed by the application and each read was requesting (and receiving) 436 bytes.

42 of these took longer than 0.01 seconds, the longest one taking a whopping 5.15 seconds.  These added up to a total of 36.9 wasted seconds on the 16MB file.

There were 6 instances where the read took ~5 seconds.  There were 2 instances that took about 1 second.  The rest were somewhere between 0.01 and 0.63s.
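
For reference, this is roughly how a summary like the one above can be derived from per-read timings. It is a sketch with an assumed function name and made-up sample durations, not the device's actual logging code; only the 436-byte read size, the 0.01s gap threshold and the 38,183 read count come from the log.

```python
def summarize_reads(durations, bytes_per_read, gap_threshold=0.01):
    """Summarize per-read durations (seconds) in the style of the log above."""
    gaps = [d for d in durations if d > gap_threshold]
    return {
        "chunks": len(durations),
        "gaps": len(gaps),                # reads slower than the threshold
        "gap_time": sum(gaps),            # total time spent in slow reads
        "max": max(durations) if durations else 0.0,
        "total_bytes": len(durations) * bytes_per_read,
    }


# Made-up durations, for illustration only.
sample = [0.001, 0.5, 5.0, 0.002]
print(summarize_reads(sample, 436))

# Sanity check on the real figures: 38,183 reads of 436 bytes each.
print(38183 * 436)  # 16647788 bytes, i.e. the ~16MB file
```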
Attachments: Cap29.txt, Cap29.zip
Sorry, I've been busy most of the day...ok I've looked at this capture and there's only 1 thing I can see that may help...

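Just before each 5 second gap there's always a DUP ACK. DUP ACKs happen when the receiver sees a hole in the data - i.e. it missed one or more packets (or got them out of order), so it keeps re-acknowledging the last byte it received in order, which asks the sender for a retransmission...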

So how to find out why this happens - hard to say exactly...

One thing you can do - run a capture while transferring the file from the server to the laptop. Can you see any similar drops/gaps? If not, then we have to conclude that some part of the code that you are writing is causing this drop, since if the laptop capture is clean then your infrastructure is probably in good shape...

As for finding the issue I'm unsure how I can assist - but again I see the transfer rate is slow - is this by design? Or why are you transferring at below 100KB/sec?
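
If eyeballing the two captures is too tedious, the check can be scripted. The sketch below assumes each capture has already been exported as a time-sorted list of (timestamp, is_dup_ack) pairs; that export format, the function name and the thresholds are assumptions for illustration (in Wireshark, for instance, the display filter tcp.analysis.duplicate_ack marks the duplicate ACKs).

```python
def gaps_preceded_by_dup_ack(packets, gap_threshold=1.0, lookback=3):
    """Find long silences in a capture and note whether a DUP ACK came just before.

    packets: list of (timestamp_seconds, is_dup_ack), sorted by timestamp.
    Returns a list of (gap_start_time, gap_length, dup_ack_seen_before).
    """
    results = []
    for i in range(1, len(packets)):
        gap = packets[i][0] - packets[i - 1][0]
        if gap >= gap_threshold:
            # Look at the last few packets before the silence started.
            window = packets[max(0, i - lookback):i]
            seen = any(is_dup for _, is_dup in window)
            results.append((packets[i - 1][0], gap, seen))
    return results


# Made-up packet list, for illustration only.
capture = [(0.00, False), (0.01, False), (0.02, True),
           (5.10, False), (5.11, False), (7.00, False)]
for start, length, seen in gaps_preceded_by_dup_ack(capture, lookback=2):
    print(f"gap of {length:.2f}s after t={start:.2f}s, DUP ACK before it: {seen}")
```

Running the same function over the by-device capture and the by-laptop capture would give a like-for-like list of gaps to compare.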
I think 100KB/sec is as fast as this architecture can go.  The device wasn't originally intended to be a wireless device and so when the call came, the zcomax module was fit in as best it could.  The entire board is being rearchitected with wifi as part of the design.  It will go faster then.

I would originally have agreed with you that because-the-laptop-is-faster-and-error-free that the problem is in the device -- something about the way it's talking is wrong.

I am not so sure anymore.  I think the problems may be cropping up because it's "slow" not because it's got a logic problem.

The laptop won't have the problems because the laptop performs a fast transfer.  At least this is my belief.

I will do the by-laptop file transfer capture.  That is the real measure.  Compare the by-device capture to the by-laptop capture and look at the traffic, right?  That should leave no more questions.  I'm just not skilled enough to look at two such captures and draw meaningful conclusions.
ASKER CERTIFIED SOLUTION (Shane McKeown): this solution is only available to members of Experts Exchange.

Brian Withun (ASKER)
I'm closing this issue as unresolved, but the discussion has been helpful and informative.
Thank you for your thoughts.