Question

TCP connections hang in SYN_SENT

Asked by: pfenerty

Hello,

I have an intermittent TCP connection problem that affects multiple PCs on a home LAN:

Background:
- all TCP connections from the LAN are masqueraded through an IPCop Linux firewall/gateway
- IPCop is a stock 1.4.6 install ... no addons or modifications
- packets leaving the gateway are routed onto the WAN by way of a Motorola SB3100 Cable Modem
- normal firewall/gateway/LAN behavior was observed for several months before problem was first noticed

Problem:
- all outbound TCP connections intermittently do not complete for a given PC
- all such connection attempts during the problem period remain in state SYN_SENT in gateway connection table
- ping and traceroute to WAN destinations work OK as always during the problem period
- problem occurs typically in Firefox 1.5.0.1 on fully patched Windows XP
- problem also reproduced in Netscape 4.7 on Linux 2.2.16 (!)
- typically only one PC is affected at a time
- i.e., other PCs on the LAN routinely establish TCP connections OK while the affected one cannot
- frequency of occurrence is typically several times per week, but typically not more than once per day

Problem Resolution:
- TCP connections, for the affected PC, begin routinely completing again after any of:
  (a) Rebooting the affected PC ... never the gateway
  (b) waiting sufficiently ... tens of minutes to hours
  (c) connection flooding ... multiple rapid repeat browser page reload requests from the affected PC ... 20 to 30 typically

Troubleshooting, so far:
- WinXP PCs run Norton AV, Spybot S&D, Ad-Aware SE
- IPCop firewall runs rkhunter
- (unreplied) outbound SYN packets from the affected PCs appear on the firewall WAN interface (!)
- these SYN packets appear to be well-formed, at least as far as I can tell, and seem to match subsequent, successfully SYN_ACKed packets

Discussion:
Initially I thought this would be a Microsoft problem, but then I saw it occur on an old Linux box. I packet-sniffed the gateway LAN interface during the problem and saw, coming from the affected PC, first a successful (UDP) DNS transaction, followed by groups of three unreplied TCP SYN request packets, one group for each connection attempt. At that point I thought for sure I'd find dropped packets at the firewall. But after inserting log messages up and down the gateway's netfilter chains, none of which caught anything, I eventually moved the sniffer to the gateway's WAN interface, and found there the same three lonely unreplied TCP SYN request packets that had been visible on the LAN side:

WAN interface packet capture:
- packets are sniffed against filter 'host 66.249.81.99' ... google_news server
- google_news server IPaddress determined just prior to packet capture using a non-affected PC
- STEP 1: unreplied SYN packets captured by google_news browser page request from affected PC
- STEP 2: affected PC is rebooted
- STEP 3: completed TCP connection packets captured as per STEP 1
- no changes made to the gateway, or to the laptop running ethereal, other than to start and stop packet capture, during above STEPs

So, where do the SYN packets go? Why are they ignored, intermittently? Is another subscriber on my cable feeder line hijacking them? Perhaps more to the point, at least initially, is if the packets are properly SNATed at the firewall, as they appear to be, how can the problem appear, at the WAN interface, to be localized to a single host on the LAN? And for different PCs, at different times?

??

Thanks so much!
Paul

****************************************************************************************************
From the above mentioned packet capture session, here's the first outbound SYN packet that remains UNREPLIED, such that the connection hangs in state SYN_SENT:

** Note that it is the 7th captured packet for the session. The captured browser page request was preceded by a single google_news 'ping' from the gateway, to verify last-minute reachability (3 ICMP ping requests, 3 replies).


No.     Time        Source                Destination           Protocol Info
      7 32.246799   72.134.170.173        66.249.81.99          TCP      3186 > http [SYN] Seq=0 Ack=0 Win=65535 Len=0 MSS=1460

Frame 7 (62 bytes on wire, 62 bytes captured)
    Arrival Time: Feb  3, 2006 12:57:16.931563000
    Time delta from previous packet: 30.040320000 seconds
    Time since reference or first frame: 32.246799000 seconds
    Frame Number: 7
    Packet Length: 62 bytes
    Capture Length: 62 bytes
    Protocols in frame: eth:ip:tcp
Ethernet II, Src: Shuttle_3a:ca:6f (00:30:1b:3a:ca:6f), Dst: USRoboti_40:54:70 (00:c0:49:40:54:70)
    Destination: USRoboti_40:54:70 (00:c0:49:40:54:70)
    Source: Shuttle_3a:ca:6f (00:30:1b:3a:ca:6f)
    Type: IP (0x0800)
Internet Protocol, Src: 72.134.170.173 (72.134.170.173), Dst: 66.249.81.99 (66.249.81.99)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
        0000 00.. = Differentiated Services Codepoint: Default (0x00)
        .... ..0. = ECN-Capable Transport (ECT): 0
        .... ...0 = ECN-CE: 0
    Total Length: 48
    Identification: 0xa07e (41086)
    Flags: 0x04 (Don't Fragment)
        0... = Reserved bit: Not set
        .1.. = Don't fragment: Set
        ..0. = More fragments: Not set
    Fragment offset: 0
    Time to live: 127
    Protocol: TCP (0x06)
    Header checksum: 0xd39b [correct]
        Good: True
        Bad : False
    Source: 72.134.170.173 (72.134.170.173)
    Destination: 66.249.81.99 (66.249.81.99)
Transmission Control Protocol, Src Port: 3186 (3186), Dst Port: http (80), Seq: 0, Ack: 0, Len: 0
    Source port: 3186 (3186)
    Destination port: http (80)
    Sequence number: 0    (relative sequence number)
    Header length: 28 bytes
    Flags: 0x0002 (SYN)
        0... .... = Congestion Window Reduced (CWR): Not set
        .0.. .... = ECN-Echo: Not set
        ..0. .... = Urgent: Not set
        ...0 .... = Acknowledgment: Not set
        .... 0... = Push: Not set
        .... .0.. = Reset: Not set
        .... ..1. = Syn: Set
        .... ...0 = Fin: Not set
    Window size: 65535
    Checksum: 0x6f71 [correct]
    Options: (8 bytes)
        Maximum segment size: 1460 bytes
        NOP
        NOP
        SACK permitted


****************************************************************************************************
From the above mentioned packet capture session, here's the first outbound SYN packet that is successfully SYN_ACKed, after reboot of the affected PC, such that a connection attempt completes:

No.     Time        Source                Destination           Protocol Info
      1 0.000000    72.134.170.173        66.249.81.99          TCP      3230 > http [SYN] Seq=0 Ack=0 Win=65535 Len=0 MSS=1460

Frame 1 (62 bytes on wire, 62 bytes captured)
    Arrival Time: Feb  3, 2006 13:14:38.485541000
    Time delta from previous packet: 0.000000000 seconds
    Time since reference or first frame: 0.000000000 seconds
    Frame Number: 1
    Packet Length: 62 bytes
    Capture Length: 62 bytes
    Protocols in frame: eth:ip:tcp
Ethernet II, Src: Shuttle_3a:ca:6f (00:30:1b:3a:ca:6f), Dst: USRoboti_40:54:70 (00:c0:49:40:54:70)
    Destination: USRoboti_40:54:70 (00:c0:49:40:54:70)
    Source: Shuttle_3a:ca:6f (00:30:1b:3a:ca:6f)
    Type: IP (0x0800)
Internet Protocol, Src: 72.134.170.173 (72.134.170.173), Dst: 66.249.81.99 (66.249.81.99)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
        0000 00.. = Differentiated Services Codepoint: Default (0x00)
        .... ..0. = ECN-Capable Transport (ECT): 0
        .... ...0 = ECN-CE: 0
    Total Length: 48
    Identification: 0xa8cd (43213)
    Flags: 0x04 (Don't Fragment)
        0... = Reserved bit: Not set
        .1.. = Don't fragment: Set
        ..0. = More fragments: Not set
    Fragment offset: 0
    Time to live: 127
    Protocol: TCP (0x06)
    Header checksum: 0xcb4c [correct]
        Good: True
        Bad : False
    Source: 72.134.170.173 (72.134.170.173)
    Destination: 66.249.81.99 (66.249.81.99)
Transmission Control Protocol, Src Port: 3230 (3230), Dst Port: http (80), Seq: 0, Ack: 0, Len: 0
    Source port: 3230 (3230)
    Destination port: http (80)
    Sequence number: 0    (relative sequence number)
    Header length: 28 bytes
    Flags: 0x0002 (SYN)
        0... .... = Congestion Window Reduced (CWR): Not set
        .0.. .... = ECN-Echo: Not set
        ..0. .... = Urgent: Not set
        ...0 .... = Acknowledgment: Not set
        .... 0... = Push: Not set
        .... .0.. = Reset: Not set
        .... ..1. = Syn: Set
        .... ...0 = Fin: Not set
    Window size: 65535
    Checksum: 0x7873 [correct]
    Options: (8 bytes)
        Maximum segment size: 1460 bytes
        NOP
        NOP
        SACK permitted

*****************************************

I found no Topic Area labelled 'TCP/IP', which would have been my first choice. So I have chosen "Linux Networking" (for the firewall gateway referenced below). Perhaps 'Broadband' would be a better choice? I suppose that depends on where the problem turns out to be.

I rate this 500 points not so much for urgency, as I have lived with this for a month or so by now. But I rate it pretty-damn-difficult, because I thought for sure that I'd have it figured out long ago.

This question has been solved and verified by the asker.


Asked on: 2006-02-08 at 13:05:02 (ID 21729170)
Tags: syn_sent, tcp
Topic: Linux Networking
Participating experts: 2
Points: 500
Comments: 15

Answers

 

by: dbardbar, posted on 2006-02-13 at 03:36:22, ID: 15940339

I have a suggestion, which might explain the situation.
Could it be that you have a machine in your network with the same IP as your GW? It might also be a network printer, a switch, or anything else with an IP.

This could explain why sometimes a machine can send SYN packets but is not getting anything back. The PC might be sending the traffic to the wrong MAC address instead of to the MAC address of the GW. Waiting long enough or rebooting the machine will clear its ARP cache, and the next time it sends an ARP request it will get the reply from the correct machine (your GW).


Hmmm... Actually, reading again through what you wrote (quite a lot... :-), you are saying that you saw the SYN packets on the external interface of your GW. That would seem to rule that out.


Still, what you are describing does sound similar to an identical IP problem.
Try to have a look at the MAC addresses of your PCs and GWs, and look at the arp caches of all the relevant machines before and after the problem occurs.

 

by: pfenerty, posted on 2006-02-13 at 12:15:12, ID: 15944605

Hi, and thanks for your suggestion.

I've checked around, and all the IPaddress assignments look right, with no duplicates. Arp caches all seem OK, with MAC addresses matching assigned IPaddresses, but then there's no sign of the problem right now. I will check this on an affected machine and report back at the next episode.

For what it's worth, your suspicion reminded me of the somewhat misguided exercise I tried upon first bringing the gateway online. It was to be an upgrade for my original gateway, a 486 linux 2.2 ipchains firewall. When I prepared to switch over to the new gateway, my cable modem Configuration Manager claimed only a "Max 1" for "Known CPE MAC Address", and had long ago learned the old gateway MAC address. Several years earlier, when I first signed up for the cable service, Max allowable MACs had been 3. So to make sure that the new gateway wouldn't get locked out by the cable modem, I decided to clone the old gateway MAC address, as per

ifconfig eth1 down hw ether old_gateway_MAC_address
ifconfig eth1 up

But then to assure a clean DHCP assignment for the new gateway, I brought both the gateway & modem down, and then back up again. Of course at that point the gateway came back up with its actual MAC address, not the cloned one, and the modem simply learned it and never complained. Afterwards, I assumed that the cloning was 100% non-persistent, and never thought about it again, until now. The old gateway MAC address isn't in there somewhere, is it?

Otherwise, the only not-quite-right configuration I can see is that a few of the machines that are not used to go out to the WAN still have the old gateway IPaddress as the default route. One of those machines is the winNT 4.0 PDC for the windows boxes, which also runs an SNMP agent that I once played around with, that still attempts to discover and then poll the network, but doesn't find many of the machines, including the gateway.

Thanks again.

 

by: pfenerty, posted on 2006-02-18 at 12:00:26, ID: 15990182

Had an episode yesterday. The ARP cache on the affected machine looks OK ... the only entry is the gateway.

 

by: RapidDelp, posted on 2006-02-26 at 17:33:22, ID: 16051889

I have been having odd connection problems of a similar kind, where adjusting the MTU helps. It does not seem to fit all of your symptoms, but it might be worth stepping up the packet size with ping from the affected machine; finding the maximum size that gets through might give you some insight.

ping -s 500
ping -s 1700
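
The payload sizes worth probing follow from simple header arithmetic: a ping payload of n bytes travels in a packet of n + 8 (ICMP echo header) + 20 (IPv4 header) bytes. Here is a minimal sketch of that arithmetic, assuming plain IPv4 with no IP options and a 1500-byte Ethernet MTU; the function names are only illustrative.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def packet_size(payload: int) -> int:
    """Total IPv4 packet size for a ping with the given payload size."""
    return payload + IPV4_HEADER + ICMP_HEADER


def max_ping_payload(mtu: int) -> int:
    """Largest ping payload that still fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


if __name__ == "__main__":
    mtu = 1500  # assumed Ethernet MTU
    for payload in (500, max_ping_payload(mtu), 1700):
        size = packet_size(payload)
        verdict = "fits in" if size <= mtu else "exceeds"
        print(f"payload {payload:4d} -> {size:4d} bytes, {verdict} MTU {mtu}")
```

Under these assumptions a 500-byte payload fits easily, while a 1700-byte payload has to be fragmented (or is dropped if fragmentation is not allowed), so the interesting boundary is somewhere in between.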

 

by: pfenerty, posted on 2006-02-26 at 17:55:00, ID: 16051960

Thanks, I will give it a try at the next opportunity. I wish I could trigger this problem at my convenience, but so far I have to wait for it to find me. Once or twice per week is all I get. The next time around I'd like to compare outgoing SYN request packets from the affected machine with those from one not affected. Since all the LAN machines get SNATed, it seems pretty funny that all but one get through OK ...

 

by: pfenerty, posted on 2006-03-15 at 10:54:05, ID: 16197238

Seems like the MTU issue would not apply here. In general, once a TCP connection is established, and the client asks for something big, and the server sends it back with the "don't fragment" bit set, then maybe some router along the path, unable to handle the large packet, and forbidden to fragment it, might return an ICMP notification instead, which maybe gets blocked somewhere, and so the requested object disappears without a trace.

Such "black hole router" problems can show up as

"really weird problems which can mainly be described such that everything works perfectly from your firewall/router, but your local hosts behind the firewall can't exchange large packets. This could mean such things as mail servers being able to send small mails, but not large ones, web browsers that connect but then hang with no data received, and ssh connecting properly, but scp hangs after the initial handshake. In other words, everything that uses any large packets will be unable to work."

http://iptables-tutorial.frozentux.net/chunkyhtml/x4700.html

But in my case, after requesting a TCP connection with a SYN packet, I get no SYN/ACK back from any server, no matter what, for the duration of the episode. Since no connection is ever established, there's never any client request for any payload, large or small. The missing SYN/ACK packet, which somehow never arrives at my gateway's WAN interface, for any of the SYN request re-tries, is on the order of 60 bytes ... not a fragmentation target. After the episode passes, TCP connections establish as before the episode, and as expected.

The most puzzling part of all this is that such an episode only occurs at one host on the LAN, while all other hosts remain unaffected. This is puzzling because the gateway/router SNATs the IPAddresses for all the hosts on the LAN, such that by the time an outbound packet appears at the WAN, all the LAN hosts are indistinguishable from each other, at least by way of source address.

By way of demonstration, here are diff files for two cases of SYN request packets for a host that has become unable to establish TCP connections.

CASE 1 compares the SYN request packet for a single host:  (<) connection hangs in SYN_SENT, compared with (>) connection becomes established. The host was rebooted in between connection attempts in order to 'fix' the problem. The packets compared here are the packets included in my original post.


4,8c4,8
< Frame 7 (62 bytes on wire, 62 bytes captured)
<     Arrival Time: Feb  3, 2006 12:57:16.931563000
<     Time delta from previous packet: 30.040320000 seconds
<     Time since reference or first frame: 32.246799000 seconds
<     Frame Number: 7
---
> Frame 1 (62 bytes on wire, 62 bytes captured)
>     Arrival Time: Feb  3, 2006 13:14:38.485541000
>     Time delta from previous packet: 0.000000000 seconds
>     Time since reference or first frame: 0.000000000 seconds
>     Frame Number: 1
24c24
<     Identification: 0xa07e (41086)
---
>     Identification: 0xa8cd (43213)
32c32
<     Header checksum: 0xd39b [correct]
---
>     Header checksum: 0xcb4c [correct]
37,38c37,38
< Transmission Control Protocol, Src Port: 3186 (3186), Dst Port: http (80), Seq: 0, Ack: 0, Len: 0
<     Source port: 3186 (3186)
---
> Transmission Control Protocol, Src Port: 3230 (3230), Dst Port: http (80), Seq: 0, Ack: 0, Len: 0
>     Source port: 3230 (3230)
52c52
<     Checksum: 0x6f71 [correct]
---
>     Checksum: 0x7873 [correct]


CASE 2 compares the (<) SYN request packet for a host unable to establish TCP connections, with a (>) SYN request packet for a host that can connect, and does. Both hosts are on the same LAN, and both packets are captured at the shared gateway's WAN interface. The connection requests were made within minutes of each other. The broken host remained broken before, during, and after the unaffected host successfully connected.


2c2
<     Arrival Time: Mar 12, 2006 07:38:57.224563000
---
>     Arrival Time: Mar 12, 2006 07:36:22.077999000
21c21
<     Identification: 0x68a7 (26791)
---
>     Identification: 0xa865 (43109)
29c29
<     Header checksum: 0xbd82 [correct]
---
>     Header checksum: 0x7dc4 [correct]
34,35c34,35
< Transmission Control Protocol, Src Port: 3146 (3146), Dst Port: http (80), Seq: 0, Ack: 0, Len: 0
<     Source port: 3146 (3146)
---
> Transmission Control Protocol, Src Port: 3423 (3423), Dst Port: http (80), Seq: 0, Ack: 0, Len: 0
>     Source port: 3423 (3423)
48,49c48,49
<     Window size: 16384
<     Checksum: 0x118b [correct]
---
>     Window size: 65535
>     Checksum: 0x0e8a [correct]


... not very different ... ports, IDs, checksums ... !!
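
The same comparison can be done field by field instead of on the text dump. This is only a sketch: the values are copied from the CASE 2 diff above, while the dictionary keys and the function name are made up for illustration.

```python
def differing_fields(a: dict, b: dict) -> dict:
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Values taken from the CASE 2 diff; key names are illustrative only.
hung = {
    "ip_id": 0x68A7, "ip_checksum": 0xBD82, "src_port": 3146,
    "dst_port": 80, "window": 16384, "tcp_checksum": 0x118B,
}
connected = {
    "ip_id": 0xA865, "ip_checksum": 0x7DC4, "src_port": 3423,
    "dst_port": 80, "window": 65535, "tcp_checksum": 0x0E8A,
}

if __name__ == "__main__":
    for field, (bad, good) in differing_fields(hung, connected).items():
        print(f"{field:13s} hung={bad:<6} connected={good}")
```

Checksums and IP IDs are expected to differ between any two packets, which leaves the window size (a property of the sending host's TCP stack) and the source port as the only candidates worth a closer look.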

 

by: RapidDelp, posted on 2006-03-16 at 06:06:15, ID: 16204955

Paul,

You clearly have a deep understanding of what is going on in the realm that you can observe.

I am just going to try some semi-random observations to see if they trigger other paths to explore. Maybe this will help.

From your descriptions, I think you are saying that the packet (SYN/ACK) does not get back to the WAN interface, so it is either not being generated by the far side (google news) or is getting eaten on the way back (the real WAN or your cable plant).  Can you confirm that the sniffer you have been reporting from is on the WAN side of the NAT function?  I think that it is, from reading what you have said.
Also, is it far-side host independent? (google news and all other TCP hosts)

You also say that rebooting the local machine, or waiting, will make it go away (start working again). Does the rebooting make it start working again every time? Or is it just the time duration (that it takes to reboot)?

You can ICMP ping the remote host from the non-working machine; can you tcp-ping it?  (OK, a simple question to ask, but you need a tool that I do not have, and although a google for "tcp-ping download" comes back with some possibilities, I have not tried any of these.)

If someone on the cable plant is taking over, you would still see the syn ack.

OK, just thoughts. feel free to abandon, or continue.  This is interesting.



 

by: pfenerty, posted on 2006-03-16 at 11:58:07, ID: 16208802

Thanks for continuing to chew on this ... I deeply appreciate any new ways to think about what's going on here. And yes, 'interesting' is one of several descriptors I apply to this.

There is no doubt about where the sniffing occurs. I plug a laptop into a hub along with the router WAN interface and the cable modem, and borrow an unused IPAddress for the session. All sniffed packets have the DHCP assigned IPAddress of the WAN interface.

And yes, it matters not what's on the far end. Any browser request for any URL eventually times out during an episode, as do mail client download requests.

I can't say for sure how recovery occurs, and so the possibility certainly exists that it's ultimately only timeout related, and everything else is coincidence. I do know that in the early days of troubleshooting this, it seemed to persist longer. Often I would go off and explore various network elements for awhile, and come back to find the problem remained. These days I only take data at the WAN interface, and exercise the broken machine more as a result. Twice recently the machine recovered before I was done. There have been times when the problem occurs while I am in no mood to be interrupted by a troubleshooting session, and so instead I simply hit the page refresh button some 20 - 30 times, at which point normal operation returns.

Regarding tcp-pinging, my next Edisonian troubleshooting step (I have to wait for the problem to come back ... I have never been able to trigger it) is to use Linux netcat, from a non-broken host on the LAN, during an episode, and attempt to establish connections from both (a) the same port number that the broken machine is attempting to connect from, and (b) a numerically very different port. The idea is that it would at least make some sense if this turns out to be socket related, such that it's the half-association, as seen from the router, that breaks.

More to the point, what I'd like to find, at least as far as understanding what's going on, is that half-associations for a -range- of ports break. Looking at the router's connection table during one episode, I saw that all the SYN_SENT hung connections started around port 3150, and continued on sequentially up through around 3186. Such behavior might explain why exercising the broken machine 'fixes' the problem ... i.e., by simply running out the broken range.
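
Pulling the hung source ports out of the router's connection table can be scripted. The sketch below assumes the table is available as text in the style of netfilter's /proc/net/ip_conntrack on the gateway; the sample lines, including the LAN addresses, are hypothetical and only mimic that layout.

```python
import re

# First "sport=" on a line is the source port of the original (LAN-side) tuple.
SPORT = re.compile(r"\bsport=(\d+)")

# Hypothetical sample entries, not copied from a real gateway.
SAMPLE = [
    "tcp      6 115 SYN_SENT src=192.168.1.10 dst=66.249.81.99 sport=3150 "
    "dport=80 [UNREPLIED] src=66.249.81.99 dst=72.134.170.173 sport=80 dport=3150 use=1",
    "tcp      6 118 SYN_SENT src=192.168.1.10 dst=66.249.81.99 sport=3186 "
    "dport=80 [UNREPLIED] src=66.249.81.99 dst=72.134.170.173 sport=80 dport=3186 use=1",
    "tcp      6 431999 ESTABLISHED src=192.168.1.11 dst=66.249.81.99 sport=3423 "
    "dport=80 src=66.249.81.99 dst=72.134.170.173 sport=80 dport=3423 [ASSURED] use=1",
]


def hung_source_ports(lines):
    """Return the sorted source ports of all entries stuck in SYN_SENT."""
    ports = []
    for line in lines:
        if "SYN_SENT" not in line:
            continue
        match = SPORT.search(line)
        if match:
            ports.append(int(match.group(1)))
    return sorted(ports)


if __name__ == "__main__":
    ports = hung_source_ports(SAMPLE)
    print("hung source ports:", ports)
    if ports:
        print("lowest:", ports[0], "highest:", ports[-1])
```

Logging the lowest and highest hung port over several episodes would show whether the broken range moves or stays put.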

Not sure how happy i'll be, from a security standpoint, to discover such a recurring mode of operation. My man-in-middle paranoia expressed earlier about sharing the cable feed with my neighbors came from working backwards from the 'solution' ... i.e., either my SYN packets don't reach their destination, or the returning SYN/ACKs don't reach me, for a targeted set of connections. "Paranoia is just reality on a finer scale." - Philo Gant

 

by: pfenerty, posted on 2006-03-16 at 13:04:13, ID: 16209569

Just occurred to me, as I posted the above, that maybe the broken port range is fixed, and not variable. So using netcat, I just sent a SYN request from port 4000 to google news and got a SYN/ACK right back. By the way, I first noticed the problem at google news, which loads a fairly lightweight webpage, reliably and rapidly, usually, which is why I continue to test there. Otherwise it's arbitrary, except that whenever I have no idea what's going on, I try to keep some set of parameters as unchanging as possible. Plus it's always nice to see what's going on in the real world.

Then I repeated the test from port 3160, one that had shown up in the broken range earlier and, yep, ethereal saw only three lonely SYNs, and zero SYN/ACKs.
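
For anyone without netcat at hand, the same test (connect from a chosen local source port and see whether the handshake completes) can be sketched with the standard socket module. This is an approximation of the netcat test, not a replacement for watching the wire: it assumes the gateway keeps the source port unchanged when masquerading, and the 5-second timeout is an arbitrary choice.

```python
import socket


def try_connect(host, dst_port, src_port, timeout=5.0):
    """Return True if a TCP handshake completes from the given source port.

    src_port=0 lets the operating system pick the source port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Allow quick re-use of the same local port between test runs.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.settimeout(timeout)
        sock.bind(("", src_port))
        try:
            sock.connect((host, dst_port))
            return True
        except OSError:
            # Covers timeouts (no SYN/ACK) as well as refused connections.
            return False


# Example use, mirroring the two netcat tests described above:
#   try_connect("news.google.com", 80, 4000)
#   try_connect("news.google.com", 80, 3160)
```

A False result only says the handshake did not complete within the timeout; a capture on the WAN interface is still needed to tell a missing SYN/ACK from any other failure.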

Now it looks like I know what's going on ... but ... how, and why ?

Is this a broken cable modem?

 

by: pfenerty, posted on 2006-03-18 at 11:07:21, ID: 16225945

Not that I can imagine how a modem could break only certain bits within a stream, but I'd sure like to discover that this problem originates from within my facility, and it sure seems to be on the WAN side of my router. Wishful thinking.

For what it's worth, the problem does not seem to spill over into UDP: when I replace the IPAddress for google news with 'news.google.com', I get back the UDP DNS lookup for both cases: netcat source port 4000, and netcat source port 3160. But I get no TCP SYN/ACK for that latter case. Works great from port 4000.

Last time this occurred on its own, from a browser, I watched the router's connection table to try to better define the range of broken ports. The first hung SYN_SENT connection originated from port 3127, but 3126 was used locally. Refreshing the browser page in steps, I saw a hung connection at 3194, and then the first established connection at 3199.  Not a very exciting range ...  not even a power of 2. The binary gets mildly interesting, rolling from 3194 to 3199:

3194 110001111010
3195 110001111011
3196 110001111100
3197 110001111101
3198 110001111110
3199 110001111111

... but that's only mirrored in four bits on the other end:

3127 110000110111
3126 110000110110
3125 110000110101
3124 110000110100
3123 110000110011
3122 110000110010
3121 110000110001
3120 110000110000

... three months into the problem, grasping at bits ...
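
The bit patterns above can be regenerated, and the "is this a bitmask-aligned block?" question checked, with a few lines. A small sketch; the 12-bit width simply matches the listing above and the helper names are made up.

```python
def port_bits(port: int, width: int = 12) -> str:
    """Binary representation of a port number, zero-padded to `width` bits."""
    return format(port, f"0{width}b")


def common_prefix(a: str, b: str) -> str:
    """Leading characters shared by two bit strings."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return "".join(shared)


if __name__ == "__main__":
    first_hung, last_seen_hung = 3127, 3194  # ports reported in this post
    for port in (first_hung, last_seen_hung, 3199):
        print(port, port_bits(port))
    prefix = common_prefix(port_bits(first_hung), port_bits(last_seen_hung))
    print("shared leading bits:", prefix)
```

The two ends of the observed range share only their first five bits, and a five-bit prefix would cover 3072 through 3199, far more than the hung ports. So the range does not look like a simple prefix or mask match on the port number.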

 

by: pfenerty, posted on 2006-03-18 at 11:35:39, ID: 16226090

Mapped out the broken port range using netcat:

3126 ok
3127 ng
...
3198 ng
3199 ok

... knowledge is power
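
Once per-port results are recorded, collapsing them into ranges is mechanical. A sketch, assuming the results are kept as a mapping from source port to True (SYN/ACK received) or False (no reply); the sample data is reconstructed from the ok/ng list above rather than taken from a log.

```python
def broken_ranges(results):
    """Collapse {port: ok?} into a list of (first, last) runs of failing ports."""
    ranges = []
    run_start = run_end = None
    for port in sorted(results):
        failed = not results[port]
        if failed and run_start is not None and port == run_end + 1:
            run_end = port  # extend the current run of failures
        else:
            if run_start is not None:
                ranges.append((run_start, run_end))
            run_start, run_end = (port, port) if failed else (None, None)
    if run_start is not None:
        ranges.append((run_start, run_end))
    return ranges


# Reconstructed from the list above: 3126 ok, 3127..3198 ng, 3199 ok.
results = {port: not (3127 <= port <= 3198) for port in range(3126, 3200)}

if __name__ == "__main__":
    for first, last in broken_ranges(results):
        print(f"broken: {first}-{last} ({last - first + 1} ports)")
```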

 

by: dbardbar, posted on 2006-03-18 at 11:54:56, ID: 16226180

And above 3199, it works OK?
If so, perhaps you should change the range of source ports on the machines.

http://www.microsoft.com/technet/community/columns/cableguy/cg1205.mspx


There's also a simple way to do so on Linux.
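
On Linux the local (ephemeral) port range is exposed as two numbers in /proc/sys/net/ipv4/ip_local_port_range. A rough sketch of checking whether a given range collides with the broken one mapped out earlier in this thread; the broken range is hard-coded from that mapping, and the helper names are illustrative.

```python
BROKEN = (3127, 3198)  # range mapped out with netcat earlier in the thread


def parse_port_range(text: str):
    """Parse the 'low high' pair as found in ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    return low, high


def overlaps(a, b) -> bool:
    """True if the inclusive ranges a and b share at least one port."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    path = "/proc/sys/net/ipv4/ip_local_port_range"
    try:
        with open(path) as handle:
            current = parse_port_range(handle.read())
    except (OSError, ValueError):
        current = None  # not Linux, or the file is unreadable
    if current is None:
        print("could not read", path)
    else:
        state = "overlaps" if overlaps(current, BROKEN) else "avoids"
        print(f"local port range {current[0]}-{current[1]} {state} {BROKEN[0]}-{BROKEN[1]}")
```

Writing a new pair of numbers to the same file (as root) changes the range; picking one that starts above 3198 would sidestep the broken ports.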

 

by: pfenerty, posted on 2006-03-18 at 12:26:59, ID: 16226326


Amazing what you can find, once you know what to look for:

Learn How Your ISA Server Helps Block MyDoom Traffic

Affected Ports

Table 1 lists affected ports known to be used by MyDoom. You should block those ports. This data is current as of 01:24:53, Monday, February 09, 2004.
#    Port Number    IP Protocol    Known to Be Used by MyDoom?
1    3127-3198      TCP            Yes


http://www.microsoft.com/isaserver/support/prevent/mydoom.mspx
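
That range also fits the "reload the page 20 to 30 times" cure. A back-of-the-envelope sketch, assuming the client hands out source ports sequentially upward, one per connection attempt (which is what the sequential ports in the gateway's connection table suggest), and that something upstream drops TCP traffic whose source port falls in the listed range.

```python
BLOCKED_FIRST, BLOCKED_LAST = 3127, 3198  # MyDoom range from the table above


def blocked_port_count() -> int:
    """Number of source ports inside the blocked range."""
    return BLOCKED_LAST - BLOCKED_FIRST + 1


def attempts_until_clear(next_port: int) -> int:
    """Connection attempts that will still fail before the port counter
    leaves the blocked range (0 if next_port is already outside it)."""
    if BLOCKED_FIRST <= next_port <= BLOCKED_LAST:
        return BLOCKED_LAST - next_port + 1
    return 0


if __name__ == "__main__":
    print("ports in blocked range:", blocked_port_count())
    for port in (3127, 3160, 3186, 3199):
        print(f"next source port {port}: {attempts_until_clear(port)} attempts still fail")
```

With 72 ports in the range and a browser opening several connections per page load, a few dozen reloads is about what it would take to walk the counter past 3198.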

 

by: pfenerty, posted on 2006-03-18 at 12:30:14, ID: 16226343

Thanks to all for listening. dbardbar came up with a workaround, so it's OK with me if you get the points.

-paul

 

by: dbardbar, posted on 2006-03-18 at 12:33:12, ID: 16226361
