ScottNero


Strange slow networking - unidirectional

I have two machines:

Machine "A"
NT Server 4.0 SP6a
DLink DFE-538TX 10/100 NIC

Machine "B"
NT Workstation 4.0 SP6a
Realtek RTL-8139 10/100 NIC

The machines were connected via a 250 foot long cable with a non-switching hub in between. Apparently communications were so slow that they were unusable. The hub was removed and the cable was changed to a cross-over cable, with no effect. The cable was ripped out and re-run. After it was re-run, a professional analysis was done on it, showing perfect available bandwidth, impedance, etc. So the cable is not the problem. This is when the problem was dumped in my lap.

Upon further investigation, I discovered that the problem only existed in one direction. Transferring data from Machine B to Machine A worked fine. A 6 megabyte file, transferred from Machine B to Machine A (either from command prompt via a network share, Explorer, or FTP session logging into IIS on Machine A) completes in a couple of seconds. However, transferring the same file from Machine A to Machine B takes upwards of 45 seconds.
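To put rough numbers on that asymmetry, here is a back-of-the-envelope sketch. It assumes "6 megabyte" means 6 MiB and takes "a couple of seconds" as 2 seconds; both are assumptions, and only the 45-second figure comes from the measurements above.

```python
def throughput_mbps(size_bytes, seconds):
    """Effective throughput in megabits per second."""
    return size_bytes * 8 / seconds / 1_000_000

FILE_SIZE = 6 * 1024 * 1024  # assumption: "6 megabyte" taken as 6 MiB

# Assumption: "a couple of seconds" taken as 2.0 s for the fast direction.
b_to_a = throughput_mbps(FILE_SIZE, 2.0)
# 45 s is the figure reported for the slow direction.
a_to_b = throughput_mbps(FILE_SIZE, 45.0)

print(f"B -> A: {b_to_a:.1f} Mbit/s")
print(f"A -> B: {a_to_b:.2f} Mbit/s")
print(f"Slowdown factor: {b_to_a / a_to_b:.1f}x")
```

Under those assumptions the slow direction is running at roughly 1 Mbit/s, well below even a 10 Mb half-duplex link, which fits with changing the link speed making no real difference.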

The only network installed is TCP/IP. Bindings are Windows default bindings.

Pinging either machine from the opposite machine works perfectly. However, if the packet size is increased:

ping -l 4096 machineb

...it begins to drop packets. If the packet size is increased to 10240 bytes, approximately 95% of the ping packets are dropped.
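One way to read those ping numbers: a large ping is split into several IP fragments, and losing any one fragment loses the whole ping. The sketch below assumes a standard 1500-byte Ethernet MTU, a 20-byte IP header with no options, and that frames are lost independently and only in one direction. Those are modelling assumptions, not something measured on these machines.

```python
import math

MTU = 1500         # assumption: standard Ethernet MTU
IP_HEADER = 20     # assumption: IPv4 header without options
ICMP_HEADER = 8    # ICMP echo header

def fragment_count(ping_payload, mtu=MTU):
    """Number of IP fragments needed for one echo request of this size."""
    # Fragment data must be a multiple of 8 bytes (except the last fragment).
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil((ping_payload + ICMP_HEADER) / per_fragment)

def per_frame_success(observed_ping_loss, frames):
    """Per-frame delivery rate implied by an observed whole-ping loss rate,
    assuming independent losses across the given number of frames."""
    return (1 - observed_ping_loss) ** (1 / frames)

for size in (32, 4096, 10240):
    print(size, "bytes ->", fragment_count(size), "fragment(s)")

frames = fragment_count(10240)
implied_loss = 1 - per_frame_success(0.95, frames)
print(f"Implied per-frame loss at 95% ping loss: {implied_loss:.0%}")
```

If the model holds, 95% loss at 10240 bytes corresponds to losing on the order of a third of individual frames in the bad direction. Default-size pings fit in a single frame, so they would be expected to fail about that often too unless the loss only shows up on back-to-back frames.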

Running a network analysis tool shows that there are no problems with the network packets themselves - no runt packets, no CRC errors, nothing - everything looks just perfect.

Changing the NICs from AUTO to 100 Mb full duplex, 100 Mb half duplex, 10 Mb full duplex, or 10 Mb half duplex has no effect.

All NIC drivers are the most recent versions available, and I did attempt to uninstall and reinstall them as well as re-applying SP6a on both machines. Nothing had any effect.

DLink tech support doesn't think it has anything to do with them or their drivers. Unfortunately, the machines are off-site (out of the country) and the only access I have to them is via PCAnywhere, so swapping NICs or plugging a laptop in to one of them is not an option.

Anyone have any suggestions?
ASKER CERTIFIED SOLUTION
SysExpert
CyberPup

You may want to look at the transmit threshold on the NICs.  If you can get someone to watch the lights on the hub as data is transferred, ask them to see if you are getting a large number of collisions.  If so, slow down the DLink NIC's speed and adjust the transmit threshold in the NIC's properties.  I have found that with 100 Mb cards, the transmission speed is so high that the collision rate increases to the point that data will not move at all.  You cannot even print a document.  So, try reducing the speed on the DLink, adjust the buffers and transmit threshold, and see if that improves the throughput.


ASKER

I'll have a look at the transmit threshold, but a) there is no hub, it's a straight crossover cable between NICs, and b) bringing the speed down to 10 Mb did not make any difference (except that it took a lot longer to transfer data).
Actually, a crossover cable should make things worse, not better, since there is no control over collisions, traffic, etc. Also, one other thing you may try, if you have one available, is a switch rather than a hub.
With two PC's, a crossover cable, no other traffic, and 10240 byte ping packets being the ONLY traffic, there should not be collisions causing dropping of 95% of the packets. I think it's something other than collisions happening.
Well, Scott, I hate to say it because of the problem you have with the machines being located out of the country, but... I would almost guarantee the problem is in the NIC.  That is the only logical explanation.  I will add one more suggestion just in case, though.  Is either of the machines running any antivirus? If so, disable it and test the network again.  Some antivirus programs, especially McAfee, will slow a network to a crawl.  If you have a PGP program running, that will do it as well.
One other test to try...

set all NICs AND the Hub to the same settings-- 100/Full

Your hub may be continuously "autosensing" trying to find the connection speed.
What CPU speed, memory, and hard drive types are you using on each machine?

Try downloading HARDINFO, a shareware utility. It will analyze the hardware and test the network drive on the other PC.

If you are using PCAnywhere, do you have access to both PCs?

If not, try having them install PCAnywhere on the other PC, set it to host / NetBEUI, set up a second PCAnywhere session to it, and run the same tests.

I concur that one of your NICs is probably flaky. Can you send them a pair of identical NICs and have them replace them? (Drivers won't change.)
Check IEEE 802.1Q/p NIC settings -> try to disable support for 802.1Q/p
I am gonna stick with the bad NIC solution.  I fixed a machine today that had the same problem.  It would boot up and log onto the network, but would not browse.  Response dropped from very, very slow to zero after the first browse.  This was a new NIC, a D-Link, BTW.  Swapped it out and voila! It browses just fine.
I'm leaning towards the bad NIC myself. I think I've done all I can via PCAnywhere, it looks like I'm flying to South Dakota (yeehaw!) on Monday to get my hands on this problem. I'll let you know what I find...
Bring 2 spare NICs, with the newest drivers !

Good Luck !
I'd check the virus checker to see if the slow machine is scanning all files and not just program files.

HW
There is no virus checker. These are two secure machines with no outside network access. They are responsible for handling, dispensing and reconciling cash, so they are VERY secure, and there is zero chance for a virus to get on to them.

In any case, a virus scanner would not cause the problems we are seeing here. The problem is definitely network related, not file system related.
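To take the file system out of the picture entirely, one option is a memory-to-memory TCP test run once in each direction. The following is a minimal sketch, not a tool that was used here: the function names are made up for illustration, and it runs both ends over loopback purely to show the idea. On the real setup the receiver would run on one machine and the sender on the other, then the roles would be swapped.

```python
import socket
import threading
import time

def receive_all(server_sock, result):
    """Accept one connection and count bytes until the sender closes."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    result["bytes"] = total

def one_way_test(host="127.0.0.1", size=1024 * 1024):
    """Send `size` bytes from memory over TCP; return (bytes_received, seconds)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))          # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    result = {}
    receiver = threading.Thread(target=receive_all, args=(server, result))
    receiver.start()

    payload = b"\0" * size          # data comes from memory, not from disk
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        client.sendall(payload)
    receiver.join()                 # wait until every byte has been counted
    elapsed = time.perf_counter() - start

    server.close()
    return result["bytes"], elapsed

if __name__ == "__main__":
    received, seconds = one_way_test()
    print(f"{received} bytes in {seconds:.3f} s "
          f"({received * 8 / seconds / 1_000_000:.1f} Mbit/s)")
```

If the same asymmetry shows up with no disk involved on either side, that supports the network-only diagnosis.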
If there is no anti-virus software, then I'll agree that that's not the problem.

Disable TCP/IP on the two machines and run NetBEUI only.  This will tell you if it's a TCP/IP problem, and if it works and there is no need for TCP/IP, you could live with it.

Whatever the case, only run one protocol, TCP/IP or NetBEUI.

HW
I can't run NetBEUI for security reasons. I have to run TCP/IP because the application uses TCP/IP for communications. It is running TCP/IP as the only protocol.
Hmm, you said that these two machines were secure and I assume that this means that they are on a wire that no other machines are connected to, so I'm at a loss as to how the network protocol can have anything to do with security.  If anything, TCP/IP opens you up to more security holes (ports, etc).

But it's your network, so I don't want to tell you what to do.  Why don't you run NetBEUI just as a test? That way you would at least rule out the TCP/IP issues.

If you are running TCP/IP for communications reasons, then I have to guess that there are other machines involved (you have to be communicating with something don't you.) So I'd revisit the virus issue.  

If you're going to go to SD, then I'd suggest bringing two 3Com cards.  Also check all the cables to make sure they are wired correctly.

I'd also check to make sure that both machines are running the latest versions of IE (5.5 v2) and that they both have the same security patches installed.

Good luck

PS, if you're running PC Anywhere - Norton Anti-Virus is built into it.

HW
It's not up to me what network protocols they run, it's dictated by corporate standards. There are only two machines on the network (with a single crossover cable between them) so there is no security risk. It uses TCP/IP because that is what the application uses (direct port connections between apps) to communicate.

The reason for the security is that one of the PC's is embedded in a cash dispensing machine, and were someone to hack into the network and analyze the protocol for long enough, the (remote) potential is there for him to force the machine to start spitting out thousands of dollars in cash.

The cable is wired correctly, and we had it analyzed to confirm that the cable is in good condition.

And by the way, Norton Antivirus is not included with any version of PCAnywhere I've ever seen (including V10).
THOUSANDS OF DOLLARS DID YOU SAY ?? Give me the number and I'll try it with my PCAnywhere......Trust me...I'm a techie !! he he

I would agree with the majority and bet my last dollar on faulty or suspect NIC.....I recommend sticking to a good brand name (ie Intel, 3Com) and use the same make/model on both machines..Makes support a little easier.

Good luck.
CR
OK, I'm back from two days in Wonderful South Dakota (with two travel days thrown in). First off, let me say...there is a LOT of corn in South Dakota. :)

And to the guy who mentioned PCAnywhere security - in order for us to get in, we first have to call them, they then plug in the phone line, and their people watch what we do the entire time we're online. :)

In any case, the first thing I did was plug my laptop in instead of the server. The problem did not exist on the laptop. I then uninstalled the NIC from the server, pulled the NIC out, and replaced it with a different type of NIC (DLink DFE-530TX). Loaded its drivers, rebooted, and hey presto, the problem had vanished.

I'm going to try the problem (DFE-538TX) NIC in another machine here to see if it is physically defective, or if it was some combination of drivers. I'll let you all know once I have a result...
Well, I'm glad you got this solved !
I would not even keep the NIC for production.
It may work fine in another machine for a while and then crap out again.
I would label it suspect, and only use it for non-critical testing.

You had to fly all the way to corn country for a $30 NIC !
I would not chance that again....

I hope this helps !
Well, I plugged the NIC into another PC today, and wouldn't you know it - it works perfectly, even interacting with another Realtek like it was on site. So what was the cause? Who knows, but it was definitely related to that one NIC. SysExpert, you were the first one to mention the NIC, so the points go to you.

Thanks for all the suggestions, guys!