Hello,
I have an intermittent TCP connection problem that affects multiple PCs on a home LAN:
Background:
- all TCP connections from the LAN are masqueraded through an IPCop Linux firewall/gateway
- IPCop is a stock 1.4.6 install ... no addons or modifications
- packets leaving the gateway are routed onto the WAN by way of a Motorola SB3100 Cable Modem
- normal firewall/gateway/LAN behavior was observed for several months before the problem was first noticed
Problem:
- all outbound TCP connections intermittently do not complete for a given PC
- all such connection attempts during the problem period remain in state SYN_SENT in gateway connection table
- ping and traceroute to WAN destinations work OK as always during the problem period
- problem occurs typically in Firefox 1.5.0.1 on fully patched Windows XP
- problem also reproduced in Netscape 4.7 on Linux 2.2.16 (!)
- typically only one PC is affected at a time
- i.e., other PCs on the LAN routinely establish TCP connections OK while the affected one cannot
- frequency of occurrence is typically several times per week, but typically not more than once per day
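To show how the stuck SYN_SENT entries can be tallied per LAN host, here is a minimal Python sketch. It assumes the gateway exposes its connection table in the Linux 2.4 /proc/net/ip_conntrack text layout; the sample lines and the 192.168.1.x LAN addresses are hypothetical illustrations, not data from my gateway, and the function name is my own.

```python
import re
from collections import Counter

# Illustrative lines in the Linux 2.4 /proc/net/ip_conntrack layout.
# The 192.168.1.x LAN addresses and timer values are hypothetical.
SAMPLE = """\
tcp      6 110 SYN_SENT src=192.168.1.5 dst=66.249.81.99 sport=3186 dport=80 [UNREPLIED] src=66.249.81.99 dst=72.134.170.173 sport=80 dport=3186 use=1
tcp      6 115 SYN_SENT src=192.168.1.5 dst=66.249.81.99 sport=3187 dport=80 [UNREPLIED] src=66.249.81.99 dst=72.134.170.173 sport=80 dport=3187 use=1
tcp      6 431990 ESTABLISHED src=192.168.1.7 dst=66.249.81.99 sport=1045 dport=80 src=66.249.81.99 dst=72.134.170.173 sport=80 dport=1045 [ASSURED] use=1
"""


def syn_sent_by_host(text):
    """Count SYN_SENT entries per original (pre-NAT) source address."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Expected leading fields: protocol name, protocol number, timer, state.
        if len(fields) < 5 or fields[0] != "tcp" or fields[3] != "SYN_SENT":
            continue
        # The first src= on the line is the original direction (the LAN host).
        match = re.search(r"src=(\S+)", line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for host, n in syn_sent_by_host(SAMPLE).most_common():
        print(host, n)
```

On the gateway itself, the sample text would be replaced by the contents of the real connection table; a count that piles up on one LAN address matches the "one PC at a time" symptom.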
Problem Resolution:
- TCP connections, for the affected PC, begin routinely completing again after any of:
(a) rebooting the affected PC ... never the gateway
(b) waiting sufficiently ... tens of minutes to hours
(c) connection flooding ... multiple rapid repeat browser page reload requests from the affected PC ... 20 to 30 typically
Troubleshooting, so far:
- WinXP PCs run Norton AV, Spybot S&D, Ad-Aware SE
- IPCop firewall runs rkhunter
- (unreplied) outbound SYN packets from the affected PCs appear on the firewall WAN interface (!)
- these SYN packets appear to be well-formed, at least as far as I can tell, and seem to match subsequent, successfully SYN_ACKed packets
Discussion:
Initially I thought this would be a Microsoft problem, but then I saw it occur on an old Linux box. So I packet-sniffed the gateway's LAN interface during the problem. From the affected PC I saw first only a successful (UDP) DNS transaction, followed by groups of three unreplied TCP SYN packets, one group for each connection attempt. At that point I thought for sure I'd find dropped packets at the firewall. But I inserted log messages up and down the gateway's netfilter chains, and none of them caught anything. I eventually moved the sniffer to the gateway's WAN interface, and found there the same three lonely unreplied TCP SYN packets that had been visible on the LAN side:
WAN interface packet capture:
- packets are sniffed against filter 'host 66.249.81.99' ... google_news server
- google_news server IP address determined just prior to packet capture using a non-affected PC
- STEP 1: unreplied SYN packets captured by google_news browser page request from affected PC
- STEP 2: affected PC is rebooted
- STEP 3: completed TCP connection packets captured as per STEP 1
- no changes made to the gateway, or to the laptop running ethereal, other than to start and stop packet capture, during above STEPs
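The check behind STEP 1 and STEP 3 (which SYNs ever get a SYN-ACK back) can be sketched in Python as below. This is only an illustration under stated assumptions: the `Packet` record and the function name are my own, the addresses and source ports are the ones from the captures, and the retransmission and SYN-ACK timestamps are hypothetical placeholders.

```python
from collections import namedtuple

# Simplified packet record; the field names are hypothetical, not Ethereal's.
Packet = namedtuple("Packet", "time src sport dst dport flags")


def unanswered_syns(packets):
    """Map each connection 4-tuple that never saw a SYN-ACK to its SYN count."""
    attempts = {}     # (src, sport, dst, dport) -> number of SYNs seen
    answered = set()  # 4-tuples (client's view) that received a SYN-ACK
    for p in packets:
        if p.flags == "SYN":
            key = (p.src, p.sport, p.dst, p.dport)
            attempts[key] = attempts.get(key, 0) + 1
        elif p.flags == "SYN,ACK":
            # A reply travels in the reverse direction, so swap the endpoints.
            answered.add((p.dst, p.dport, p.src, p.sport))
    return {k: n for k, n in attempts.items() if k not in answered}


GW, WEB = "72.134.170.173", "66.249.81.99"
capture = [
    Packet(32.246799, GW, 3186, WEB, 80, "SYN"),  # frame 7 of the first capture
    Packet(35.2, GW, 3186, WEB, 80, "SYN"),       # hypothetical retransmit time
    Packet(41.2, GW, 3186, WEB, 80, "SYN"),       # hypothetical retransmit time
    Packet(0.0, GW, 3230, WEB, 80, "SYN"),        # frame 1 after the reboot
    Packet(0.05, WEB, 80, GW, 3230, "SYN,ACK"),   # hypothetical reply time
]

if __name__ == "__main__":
    for conn, count in unanswered_syns(capture).items():
        print(conn, "SYNs sent:", count)
```

With these records only the port 3186 attempt is reported, with three SYNs, which mirrors the "groups of three" seen on both the LAN and WAN side.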
So, where do the SYN packets go? Why are they intermittently ignored? Is another subscriber on my cable feeder line hijacking them? Perhaps more to the point, at least initially: if the packets are properly SNATed at the firewall, as they appear to be, how can the problem appear, at the WAN interface, to be localized to a single host on the LAN? And to different PCs at different times?
??
Thanks so much!
Paul
**************************
From the above mentioned packet capture session, here's the first outbound SYN packet that remains UNREPLIED, such that the connection hangs in state SYN_SENT:
** Note that it is the 7th captured packet for the session. The captured browser page request was preceded by a single google_news 'ping' from the gateway, to verify last-minute reachability (3 ICMP ping requests, 3 replies).
No. Time Source Destination Protocol Info
7 32.246799 72.134.170.173 66.249.81.99 TCP 3186 > http [SYN] Seq=0 Ack=0 Win=65535 Len=0 MSS=1460
Frame 7 (62 bytes on wire, 62 bytes captured)
Arrival Time: Feb 3, 2006 12:57:16.931563000
Time delta from previous packet: 30.040320000 seconds
Time since reference or first frame: 32.246799000 seconds
Frame Number: 7
Packet Length: 62 bytes
Capture Length: 62 bytes
Protocols in frame: eth:ip:tcp
Ethernet II, Src: Shuttle_3a:ca:6f (00:30:1b:3a:ca:6f), Dst: USRoboti_40:54:70 (00:c0:49:40:54:70)
Destination: USRoboti_40:54:70 (00:c0:49:40:54:70)
Source: Shuttle_3a:ca:6f (00:30:1b:3a:ca:6f)
Type: IP (0x0800)
Internet Protocol, Src: 72.134.170.173 (72.134.170.173), Dst: 66.249.81.99 (66.249.81.99)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
0000 00.. = Differentiated Services Codepoint: Default (0x00)
.... ..0. = ECN-Capable Transport (ECT): 0
.... ...0 = ECN-CE: 0
Total Length: 48
Identification: 0xa07e (41086)
Flags: 0x04 (Don't Fragment)
0... = Reserved bit: Not set
.1.. = Don't fragment: Set
..0. = More fragments: Not set
Fragment offset: 0
Time to live: 127
Protocol: TCP (0x06)
Header checksum: 0xd39b [correct]
Good: True
Bad : False
Source: 72.134.170.173 (72.134.170.173)
Destination: 66.249.81.99 (66.249.81.99)
Transmission Control Protocol, Src Port: 3186 (3186), Dst Port: http (80), Seq: 0, Ack: 0, Len: 0
Source port: 3186 (3186)
Destination port: http (80)
Sequence number: 0 (relative sequence number)
Header length: 28 bytes
Flags: 0x0002 (SYN)
0... .... = Congestion Window Reduced (CWR): Not set
.0.. .... = ECN-Echo: Not set
..0. .... = Urgent: Not set
...0 .... = Acknowledgment: Not set
.... 0... = Push: Not set
.... .0.. = Reset: Not set
.... ..1. = Syn: Set
.... ...0 = Fin: Not set
Window size: 65535
Checksum: 0x6f71 [correct]
Options: (8 bytes)
Maximum segment size: 1460 bytes
NOP
NOP
SACK permitted
**************************
From the above mentioned packet capture session, here's the first outbound SYN packet that is successfully SYN_ACKed, after reboot of the affected PC, such that a connection attempt completes:
No. Time Source Destination Protocol Info
1 0.000000 72.134.170.173 66.249.81.99 TCP 3230 > http [SYN] Seq=0 Ack=0 Win=65535 Len=0 MSS=1460
Frame 1 (62 bytes on wire, 62 bytes captured)
Arrival Time: Feb 3, 2006 13:14:38.485541000
Time delta from previous packet: 0.000000000 seconds
Time since reference or first frame: 0.000000000 seconds
Frame Number: 1
Packet Length: 62 bytes
Capture Length: 62 bytes
Protocols in frame: eth:ip:tcp
Ethernet II, Src: Shuttle_3a:ca:6f (00:30:1b:3a:ca:6f), Dst: USRoboti_40:54:70 (00:c0:49:40:54:70)
Destination: USRoboti_40:54:70 (00:c0:49:40:54:70)
Source: Shuttle_3a:ca:6f (00:30:1b:3a:ca:6f)
Type: IP (0x0800)
Internet Protocol, Src: 72.134.170.173 (72.134.170.173), Dst: 66.249.81.99 (66.249.81.99)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
0000 00.. = Differentiated Services Codepoint: Default (0x00)
.... ..0. = ECN-Capable Transport (ECT): 0
.... ...0 = ECN-CE: 0
Total Length: 48
Identification: 0xa8cd (43213)
Flags: 0x04 (Don't Fragment)
0... = Reserved bit: Not set
.1.. = Don't fragment: Set
..0. = More fragments: Not set
Fragment offset: 0
Time to live: 127
Protocol: TCP (0x06)
Header checksum: 0xcb4c [correct]
Good: True
Bad : False
Source: 72.134.170.173 (72.134.170.173)
Destination: 66.249.81.99 (66.249.81.99)
Transmission Control Protocol, Src Port: 3230 (3230), Dst Port: http (80), Seq: 0, Ack: 0, Len: 0
Source port: 3230 (3230)
Destination port: http (80)
Sequence number: 0 (relative sequence number)
Header length: 28 bytes
Flags: 0x0002 (SYN)
0... .... = Congestion Window Reduced (CWR): Not set
.0.. .... = ECN-Echo: Not set
..0. .... = Urgent: Not set
...0 .... = Acknowledgment: Not set
.... 0... = Push: Not set
.... .0.. = Reset: Not set
.... ..1. = Syn: Set
.... ...0 = Fin: Not set
Window size: 65535
Checksum: 0x7873 [correct]
Options: (8 bytes)
Maximum segment size: 1460 bytes
NOP
NOP
SACK permitted
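To make the "seem to match" comparison concrete, here is a small Python sketch that diffs the header fields of the two SYN packets decoded above. The values are copied from the two dumps; the dictionary key names and the helper function are my own labels, and only the fields Ethereal displays are compared (the absolute TCP sequence numbers are not shown in the dumps, so they are left out).

```python
# Header fields copied from the two Ethereal decodes of the outbound SYNs.
# Key names are my own labels for the displayed fields.
unreplied = {
    "ip.len": 48, "ip.id": 0xA07E, "ip.flags": "DF", "ip.ttl": 127,
    "ip.checksum": 0xD39B, "tcp.srcport": 3186, "tcp.dstport": 80,
    "tcp.hdr_len": 28, "tcp.flags": "SYN", "tcp.window": 65535,
    "tcp.checksum": 0x6F71,
    "tcp.options": ("MSS=1460", "NOP", "NOP", "SACK permitted"),
}
answered = {
    "ip.len": 48, "ip.id": 0xA8CD, "ip.flags": "DF", "ip.ttl": 127,
    "ip.checksum": 0xCB4C, "tcp.srcport": 3230, "tcp.dstport": 80,
    "tcp.hdr_len": 28, "tcp.flags": "SYN", "tcp.window": 65535,
    "tcp.checksum": 0x7873,
    "tcp.options": ("MSS=1460", "NOP", "NOP", "SACK permitted"),
}


def differing_fields(a, b):
    """Return the sorted names of fields whose values differ between a and b."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


if __name__ == "__main__":
    for name in differing_fields(unreplied, answered):
        print(name, unreplied.get(name), answered.get(name))
```

The only displayed fields that differ are the IP identification, the source port, and the two checksums, all of which are expected to change from one connection attempt to the next. Nothing in the decoded headers appears to single out the unreplied packet.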
**************************
I found no Topic Area labelled 'TCP/IP', which would have been my first choice. So I have chosen "Linux Networking" (for the firewall gateway referenced below). Perhaps 'Broadband' would be a better choice? I suppose that depends on where the problem turns out to be.
I rate this 500 points not so much for urgency, as I have lived with this for a month or so by now. But I rate it pretty-damn-difficult, because I thought for sure that I'd have it figured out long ago.