Brian Withun

Diagnose Slow and Failed downloads over WLAN

I have a situation I am trying to first understand, then diagnose.

In a nutshell, I've got a "device" which is having problems downloading a large file over a wireless network.  I want to first figure out what the device is doing wrong, then fix it.

Overall, I have four components:

> A TrendNET wireless "router" (broadcasting SSID='flash' on channel 6)
> A "Server" (WinXP) wired to the above router
> A Win7 "laptop" connected to 'flash' SSID
> A wireless "device" connected to 'flash' SSID

Using the laptop I can download a large file from the server over the wireless network and it works quickly and reliably.  This demonstrates that fast clean downloads are possible.

Then, using the device, I can download the same large file over the same wireless network.  This download is both slower and more prone to failure.  I want to discover why.

I have used Wireshark to record both a download from the laptop and a download from the device.  There are certainly differences in the traffic, but I'm not familiar enough with packets to know what I'm looking at.

Can anyone tell me:

A ) Have I sufficiently explained the problem ?

and

B ) What expert services or individuals exist to analyze wireless packets and prescribe specific changes to address a problem like this ?

or

C ) What should I be looking at, or what further actions can I take to narrow down the problem and possibly diagnose this on my own?



WLAN NOTE:
These components are all nearby and there are few if any competing wireless signals.  I consider this to be a CLEAR and CLOSE wireless environment.

DEVICE NOTE:
This device contains both firmware and software governing its operation.  I have the flexibility to change both, if I knew what specific changes would improve its wireless performance.  I suspect that the device can perform better than it currently does.

SERVER NOTE:
The server is mostly at idle.  It has no other users, no significant loads on its CPU or disk.

DOWNLOAD ERROR NOTE:
When the device is downloading, I can see that it exchanges packets with the server.  These packets contain 1460 bytes of data coming from the server during the download.  Most of the time these packets are flowing smoothly and fast enough.  Occasionally there is a FIVE SECOND GAP in data coming from the server.  It appears that the server has simply paused.  Whenever this happens, the gap almost always lasts 5 seconds, which is suspicious.  At the end of the 5 second "outage" the server will send a packet of data to the device, but that packet is never a full (1460-byte) packet; it is instead something smaller, like 152 bytes.  Does that mean anything?  These 5-second gaps do not occur when the laptop downloads the file, only when the device downloads the file.

ATTACHMENT NOTE:
The attached Wireshark capture is an excerpt that shows 5 packets coming from 192.168.10.143 (the server) of lengths 1556, 1556, 248, 1556, 1556 respectively.  Notice the small packet in the middle?  It should probably have been 1556 just like the others.  It also took about 5 seconds to arrive, where the others took around 0.01 seconds each.
hiccup.txt
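
The gap described above can be located mechanically once a capture is exported as (timestamp, length) pairs. The following is a minimal Python sketch under stated assumptions, not an analysis of the actual hiccup.txt: the frame lengths are the five quoted in the post, but the timestamps are illustrative values chosen to match the "about 0.01 seconds" and "about 5 seconds" spacing, and `find_gaps` and its 1-second threshold are hypothetical. It also checks one piece of arithmetic: a 1556-byte frame carrying 1460 bytes of data implies 96 bytes of per-frame overhead, so the 248-byte frame would carry 152 bytes, which matches the small packet size mentioned in the DOWNLOAD ERROR NOTE.

```python
from typing import List, Tuple

FULL_PAYLOAD = 1460  # bytes of data per full packet, as reported in the post
FULL_FRAME = 1556    # captured length of a full packet, as reported in the post


def find_gaps(frames: List[Tuple[float, int]],
              threshold: float = 1.0) -> List[Tuple[int, float, int]]:
    """Return (index, seconds_since_previous, length) for each late frame."""
    late = []
    for i in range(1, len(frames)):
        delta = frames[i][0] - frames[i - 1][0]
        if delta >= threshold:
            late.append((i, delta, frames[i][1]))
    return late


# Lengths come from the post; the timestamps are ASSUMED for illustration.
frames = [(0.00, 1556), (0.01, 1556), (5.01, 248), (5.02, 1556), (5.03, 1556)]

# Per-frame overhead implied by the two figures quoted in the post.
overhead = FULL_FRAME - FULL_PAYLOAD

late = find_gaps(frames)
for index, delta, length in late:
    print(f"frame {index}: {delta:.2f}s gap, {length} bytes captured, "
          f"{length - overhead} bytes of data")
```

Run against a real export, the same loop would list every stall along with the size of the first packet that follows it, which makes it easy to see whether the short packet always accompanies the 5-second gap.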
SOLUTION
John
This solution is only available to members.
I'd also like to know what 'device' is - any reason for the 'vagueness' for this?

Also, who owns the SSID 'pumademo'? It's running on channel 7 and is possibly interfering with the traffic. Have you tried switching your wireless to channel 11, by any chance?

From the capture, the majority of SSIDs in your area are running on channel 6, so switch to channel 1 or 11 and see if you have the same issues...
Brian Withun (ASKER)

DEVICE

The reason I haven't bothered to explain this device is that you've never heard of it before, so explaining it won't help.  It's a small hardware component called a puma, and we manufacture it in small numbers for specialty purposes.

As for packet fragmenting, I don't think that's the case.  I have seen the puma download an entire file wirelessly without any problems, and without any fragmenting.  It's rare, but it has happened.  More often it will encounter this 5 second delay, then recover.  This is not usually fatal, but it is a big waste of time, and time is an extremely important factor where these pumas are used.  If changing the firmware can save a second, I'll do it.

I would think that if fragmenting were occurring, it would occur always.  Even then, there'd be no reason for a full 5 second delay when it happens...
SOLUTION
This solution is only available to members.
OK, so the device is a 'homemade' device that you design and program, I assume? If so, that means the problem is probably related to the firmware...

That doesn't answer the question about your networks, though. You have a few SSIDs running on CH6, and 2 on CH5 and CH7. This will always lead to interference, so to diagnose this further you need to separate out to CH1 or CH11 and re-test...
One other thing I can see...there are 2 conversations happening here

192.168.10.143 is talking to 192.168.10.199 &
192.168.10.6 is also talking to 192.168.10.199

Since we know 143 is your server - what is 199?
The conversation between 10.6 and 10.199 is happening during your '5 second window'

See my attached screenshot from the wireshark capture...
conversations.jpg
I have run this experiment at a time when there were no other (known) access points on channel 6.  I see that the captured traffic that I provided does show competing signals, but I will ask you to ignore those because this 5s gap occurs even when they are absent.

The "pumademo" SSID is another in-house network and I'm certain that even when it is present, there is no traffic on it.

The attached TXT file illustrates the problem quite clearly, I think.  Over the course of downloading a 16MB file there is an accumulated waste of at least 46.0 seconds distributed throughout the transfer in the form of large gaps (~5s) and smaller ones.

I do write software for this device.  I'm checking the clock before and after each (network) FileRead() operation.  The way it works now is that each 'read' requests 730 bytes, until the file is fully received.  (This number was chosen because larger requests tended not to return as many bytes as I asked for, so instead I ask for only as many bytes as tend to be available at any given time.)  The call happens 22,808 times and on average it takes 0.00063 seconds to complete.  On 31 instances this call took an unusually long time (referred to as "gaps") -- as long as 5.02 seconds!  That is a long time to wait for a few hundred bytes to show up.
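
The timing approach described above (clock before and after each read, with slow calls counted as gaps) can be sketched in Python as follows. This is not the puma's actual code: `read` is a stand-in for the device's FileRead(), and `timed_download`, `gap_threshold`, and the injectable `clock` are hypothetical names introduced for illustration. The sketch also shows one common way to cope with reads that return fewer bytes than requested, which is to keep looping until the expected total has arrived instead of relying on each call to fill its request.

```python
import io
import time
from typing import Callable, List, Tuple


def timed_download(read: Callable[[int], bytes],
                   total: int,
                   chunk: int = 730,
                   gap_threshold: float = 1.0,
                   clock: Callable[[], float] = time.monotonic
                   ) -> Tuple[bytes, List[Tuple[int, float]]]:
    """Read `total` bytes and log every read that takes too long.

    Returns the data plus a list of (byte_offset, seconds) gap records.
    """
    received = bytearray()
    gaps: List[Tuple[int, float]] = []
    while len(received) < total:
        want = min(chunk, total - len(received))
        start = clock()
        data = read(want)  # may legitimately return fewer bytes than asked
        elapsed = clock() - start
        if not data:
            raise EOFError(
                f"stream ended after {len(received)} of {total} bytes")
        if elapsed >= gap_threshold:
            gaps.append((len(received), elapsed))
        received += data
    return bytes(received), gaps


# Usage with an in-memory stand-in that never returns more than 300 bytes.
source = io.BytesIO(bytes(2000))
payload, gaps = timed_download(lambda n: source.read(min(n, 300)), total=2000)
wasted = sum(seconds for _, seconds in gaps)
print(len(payload), len(gaps), wasted)
```

Recording the byte offset of each gap alongside its duration makes it possible to line the device's log up against the packet capture and confirm that each logged stall corresponds to a pause in data from the server.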

I don't really have the option to try a new wireless driver on this device.  What I can do is point to something in the packets that I think is wrong and ask my firmware team to change it.  I was hoping that sniffing the packets would turn up something obvious, but the only observation I've been able to make is that I know very little about wireless network packet traffic.
192.168.10.6

This is the laptop -- it is not involved in the download.  The puma device is acting wirelessly, and so I am interacting with it wirelessly from my laptop.  When it downloads a file from the server, it is also interacting with my laptop -- mostly just populating the console window I have opened to it so I can watch its download progress.

I have established that this 5s gap occurs even when I am not interacting with the device from my laptop.  The device creates an internal logfile which I can inspect after the fact to see that the download encountered the same number and magnitude of 'gaps'.

Is there something wrong with the way the device is ACKing the bytes received from the server?  Why are there two ACKs for each data packet received?

Why must both packet #2 and packet #4 exist in my capture?
I had omitted this attachment in my earlier post -- apologies for that
summary.txt
192.168.10.143 - Server (84:c9:b2:37:bd:9f)
192.168.10.199 - device (00:19:70:7a:c1:d5)
192.168.10.6 - laptop (d0:df:9a:28:2e:76)
ASKER CERTIFIED SOLUTION
This solution is only available to members.
You've already been helpful in the discussion.  I'll be opening a new question shortly with an entire transfer to examine.

I will be referencing this question from the new question.
Thank you - I was happy to help.
..... Thinkpads_User
Away for night but will check in with u in morning. Cheers