Jonathan Anglesea

Linux connection over WAN

I have a remote site that is connected to our main site using a WAN between two Cisco routers. The Cisco GRE tunnel is up and stable.
At the remote site I have a mix of Windows 7 and Linux PCs on a single 192.168.37.x subnet with mask 255.255.255.0. The gateway is 192.168.37.10.

At my central site I have two RDS servers on a single subnet. The servers' IPs are 192.168.39.246 and 192.168.39.222, with subnet mask 255.255.255.0. I use EIGRP and RIP across both subnets and the route tables in the routers are good. The gateway is 192.168.39.10.

At the remote site all of the Windows 7 PCs can remote access both servers using mstsc.exe.
At the remote site only some Linux PCs can access both servers using rdesktop/ping.

I have three out of five Linux PCs at the remote site that can only access the server with IP address 192.168.39.222. They cannot reach the server address 192.168.39.246 using rdesktop or even ping. The server firewalls are the same. The Linux installations are identical, having been cloned from the same image.  This was working until about a week ago.  Now these three machines can only see the one server. There are no ACLs on the Cisco routers that would cause this. There are no firewall settings on the Linux boxes that I can see.  I have numerous Linux PCs on the central site - all can see both servers. If I try reaching the remote PCs from the ~246 server, only the two that can reach it can be pinged or accessed using VNC. The Linux PCs that cannot see it cannot be reached.

Any assistance would be welcome as I am stuck here.
noci

Is there anything in the failing systems' log files (/var/log/....)?
Kernel log, auth log, daemon log, messages? (Possibly split into more detailed logging?)

On the failing systems, did you try a traceroute -I (dash capital I, not lowercase l) besides the ping?
tcptraceroute (might need an install) can help to trace where traffic for port 3389 is going.

Did anything change on the PCs around the moment when they stopped working?
Was the firewall turned on on the PCs?
Disks full?
Does dmesg show more than the kernel log?
Confirm the segment and the default gateway.

What is the IP?

netstat -rn

If you are logged in locally on a system that you cannot reach remotely, can that system access anything external to itself?

In short, an errant incorrect entry could result in this issue.
Check the IP address, segment netmask, and default gateway.
A mistake in the segment netmask could explain the issue.

Try this: place a system on an IP ending in 245 or 247. Can it then connect to the 246 server using MSTSC?
If yes, the segment netmask is likely the issue.
Using 245 or 247 will likely place the new system within the directly connected range no matter what segment netmask is used.

The simplest is to check the basics in such things. IP, netmask, default gateway.
Then you can confirm the firewall rules, etc. and expand out.
You seem to have exhausted all external checks, but have not returned to check the system's own basic configuration.
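
To make the netmask point concrete, here is a minimal sketch using Python's standard ipaddress module. It shows how the mask alone decides whether a host sends traffic to its gateway or tries to reach the target directly on the local segment. The remote PC address 192.168.37.50 and the wrong mask 255.255.0.0 are assumptions made up for illustration; nothing in the thread says either one is actually in use.

```python
import ipaddress

def next_hop(host_ip: str, netmask: str, gateway: str, target: str) -> str:
    """Return 'on-link' if the host would try to reach target directly,
    otherwise return the gateway it would hand the packet to."""
    iface = ipaddress.ip_interface(f"{host_ip}/{netmask}")
    if ipaddress.ip_address(target) in iface.network:
        return "on-link"
    return gateway

# Server with the correct /24 mask: the remote PC is off-link, so use the gateway.
print(next_hop("192.168.39.246", "255.255.255.0", "192.168.39.10", "192.168.37.50"))
# prints 192.168.39.10

# Hypothetical wrong mask on the server: the remote PC now looks local,
# so replies would never be handed to the router.
print(next_hop("192.168.39.246", "255.255.0.0", "192.168.39.10", "192.168.37.50"))
# prints on-link

# A neighbour at .245 stays on-link even with a very narrow (hypothetical) mask.
print(next_hop("192.168.39.246", "255.255.255.252", "192.168.39.10", "192.168.39.245"))
# prints on-link
```

This is why the .245/.247 test is useful: those neighbours stay inside the server's local range for almost any mask, so they isolate a mask problem from a routing problem.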
Jonathan Anglesea

ASKER

I checked the systems' IP configurations first off - I should have confirmed that. Remember there are two PCs that do not connect so it is not just one bad mask. They get their configs from the Cisco router that is acting as a DHCP server. To be sure, I changed their configs to manually set IP addresses, netmasks and gateways, and then back again. I also changed their IP addresses from the "normal" ones to different addresses. Still no connectivity.

One observation re the traceroute. I tried this on the working and non-working boxes. No difference - neither finds the target. But if I use -I then the working one does and the non-working one doesn't.

working:

Using -I switch on traceroute

traceroute to 192.168.39.246 (192.168.39.246), 30 hops max, 60 byte packets
 1  192.168.37.10 (192.168.37.10)  1.392 ms  1.477 ms *
 2  * * *
 3  * * *
 4  * * *
 5  * * *
 6  *  -----redacted----- (192.168.39.246)  38.493 ms  38.472 ms

not working:
traceroute to 192.168.39.246 (192.168.39.246), 30 hops max, 60 byte packets
 1  192.168.37.10 (192.168.37.10)  1.392 ms  1.477 ms *
 2  * * *
etc until 30 attempts reached.
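
When comparing several traces like the two above, it can help to pull out the last hop that actually answered. The following is a rough sketch only: the function name last_responding_hop is made up for this example, and the parser assumes the Linux traceroute text layout shown in this thread.

```python
import re

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")                  # "<hop number> <rest of line>"
ADDR_RE = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)")     # "(a.b.c.d)"

def last_responding_hop(trace: str):
    """Return (hop_number, address) of the last hop that replied, or None."""
    last = None
    for line in trace.splitlines():
        hop = HOP_RE.match(line)
        if not hop:
            continue  # header line or free text
        addr = ADDR_RE.search(hop.group(2))
        if addr:
            last = (int(hop.group(1)), addr.group(1))
    return last

WORKING = """traceroute to 192.168.39.246 (192.168.39.246), 30 hops max, 60 byte packets
 1  192.168.37.10 (192.168.37.10)  1.392 ms  1.477 ms *
 2  * * *
 6  *  -----redacted----- (192.168.39.246)  38.493 ms  38.472 ms
"""

NOT_WORKING = """traceroute to 192.168.39.246 (192.168.39.246), 30 hops max, 60 byte packets
 1  192.168.37.10 (192.168.37.10)  1.392 ms  1.477 ms *
 2  * * *
"""

print(last_responding_hop(WORKING))      # (6, '192.168.39.246')
print(last_responding_hop(NOT_WORKING))  # (1, '192.168.37.10')
```

On the failing box the last answer comes from the local gateway, which fits the observation that packets leave in the right direction and are lost somewhere beyond it.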

 


dmesg shows a lot of info but nothing that jumps out as a potential cause.

The traceroute not returning data is not surprising: it uses a big set of UDP port numbers, which are often blocked.
traceroute -I uses ICMP, which "should" not be blocked. At least your local gateway responds, so the packets start off in the right direction but are apparently filtered further along the route.

I did ask for tcptraceroute (windows: tcptrace.exe) because the protocol you use is neither ICMP (it actually kind of does) nor UDP.
It uses TCP on port 3389. So even if the UDP/ICMP packets get blocked somewhere, the TCP/3389 should not.
Please try tcptraceroute; you might need to install it.

I get the impression the problem is at the central site.
Confirm your VPN connection.
Another option is to change the IP on the 246 server to another IP and see if it can be accessed.
I think we are all working on a similar premise in trying to determine the issue.
Presumably you also checked that the 246 server is listening on the port:
netstat -an | find /i ":3389"

The other possibility is that the RDP port was changed and this is why attempts to port 3389 fail.


On the other side, you can use nmap to scan the 246 IP for open ports.
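
If nmap is not installed on the remote PCs, a plain TCP connect test answers the same narrow question for a single port. This is a minimal sketch using only the Python standard library; the function name tcp_port_open and the 3-second default timeout are choices made for this example, not anything from the thread. It is not a replacement for nmap or tcptraceroute, since it only reports whether the handshake completes.

```python
import socket

def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

Running tcp_port_open("192.168.39.246", 3389) and tcp_port_open("192.168.39.222", 3389) from a failing PC and from a working PC would show whether TCP/3389 behaves the same way as ping does.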
The VPN is good. I have Linux PCs on the remote site that can connect to ~246, as can all the Windows PCs on the remote site. All the Linux PCs can connect to other hosts on the central site, including the ones that cannot connect to ~246. I have not created a host-specific firewall rule on ~246 to block some PCs on the remote site, which is the only way I would be able to recreate this behaviour intentionally. I have checked the firewall on ~246 and it is no different from the firewall on the central server ~222 that all the Linux PCs on the remote site are able to connect to. Ping, traceroute and rdesktop are all failing to connect. What I take from this is that the connectivity is not failing due to port, protocol or server issues.

At the central site all the PCs access the ~246 server using rdp and there are over 20 Linux PCs, 20 Windows PCs all connecting easily and without error so I do not think the server is the problem.
So you may want to try to see where the traffic for RDP goes. TCP/3389.

tcptraceroute   192.168.39.246 3389

may help (tcptraceroute != traceroute): https://github.com/mct/tcptraceroute
ASKER CERTIFIED SOLUTION
Jonathan Anglesea
Thank you both, I learned some new tricks here. Just not a definitive fix, as it self-resolved....dammit.