Solved

IP Masquerade problems in RH Linux

Posted on 2002-03-31
Last Modified: 2010-03-17
I have a RH Linux 6.2 box, configured to be an Internet gateway/firewall. I have the default IP Masquerading modules that came with 6.2. I dial out to the ISP using "ifup ppp0".

I have five machines that can access the gateway and the Internet (one or two at a time, slowly). My XP, w2000, and NT4 boxes all reach the Net through the gateway just fine. I have downloaded ISO CD images on them too.

My problem comes with the RH 7.2 server and RH 7.2 workstations. I can "surf" the net (through the RH 6.2 gateway) but frequently lock the gateway up. I then have to "ifdown ppp0"/"ifup ppp0" the gateway to get the 7.2 boxes to continue on the Internet. I can't download any files over about 50KB (via http) without locking up the gateway.

I need to stop this RH7.2 to 6.2 lockup problem.
Question by:emherman
34 Comments
 
LVL 40

Accepted Solution

by:
jlevie earned 300 total points
ID: 6909514
That sounds like either a software problem on the gateway box or some sort of hardware/OS mismatch. The first thing I'd do would be to make sure that the hardware configuration is sane. Linux, especially 6.2, isn't very happy when two devices wind up using the same IRQ. In some cases one or both of the devices in conflict will not work at all, or they may partially work.

The first thing I do with any box is to see if I can disable PnP mode in the BIOS. Frequently that's all that's necessary to restore sanity to the configuration. The next thing I'll do in the case of an on-board serial port or an ISA modem is to see if I can reserve the IRQ(s) used for "legacy" or "ISA" use to prevent a PCI card from trying to also use that IRQ. Also in the case of a modem card I'll either disable one of the on-board serial ports so as to free up that IRQ for the modem, or I'll see if I can set the modem's IRQ to be some unused IRQ.

After that exercise I'll look at the output of 'dmesg | grep -i irq' (right after a boot) and the contents of /proc/interrupts and /proc/pci. From those sources I'll make a list of IRQ assignments so I can check for any resource conflicts.
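
A minimal sketch of that conflict check, assuming the kernel prints every driver registered on an IRQ on one comma-separated line of /proc/interrupts. The function name shared_irqs is made up for illustration. A serial port or modem typically only shows up there while something has the port open, so it is worth running with the PPP link up.

```shell
# Hypothetical helper: print the /proc/interrupts lines where more than
# one driver is registered on the same IRQ.
shared_irqs() {
    file="${1:-/proc/interrupts}"
    # Numbered IRQ lines look like " 10:  30526  XT-PIC  eth0".
    # A comma after the driver name means the IRQ is shared.
    awk '/^ *[0-9]+:/ && /,/ { print }' "$file"
}

# Usage (no output means no shared IRQs were found):
#   shared_irqs
#   shared_irqs /path/to/saved-interrupts.txt
```

Sharing is not always a conflict (PCI devices can share an IRQ), so any line it prints is a place to look, not proof of a problem.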

After resolving any hardware configuration issues the next thing I'll look at will be the status of the OS. If the box isn't up to date with respect to the RedHat errata I'll either use up2date to install all applicable updates or I'll download the updates and manually apply them. Using up2date is the easiest and simplest method, but since I can't do that with all of my systems (some don't have Internet access) I've written a script to apply the updates. If you'd like a copy, send an email to jim@entrophy-free.net, referencing this question so I'll know which script you want, and I'll send it to you.

FYI: If everything is in order with the gateway router there's no problem with Linux clients going through to the Internet. In this case I'm not overly surprised that the Windows boxes work and the RedHat box has problems. TCP/IP isn't exactly a "native language" to Windows, whereas it is to Linux. Windows boxes tend not to use TCP/IP to its full potential and Linux does. That alone could cause the gateway to choke if there were other problems, per above.
0
 
LVL 1

Author Comment

by:emherman
ID: 6910127
I'll take a crack at your solutions tonight. thanks.
0
 
LVL 51

Expert Comment

by:ahoffmann
ID: 6910336
listening ..
0
 
LVL 1

Author Comment

by:emherman
ID: 6914602
OK this is what I know:

Using the Gateway (brand) GP6-350 (which is my Linux workstation), the BIOS was set to "Plug and Play O/S -- NO". I had both motherboard com ports enabled, but I disabled one of them (the one that represents com 2). The motherboard has an embedded Ensoniq sound chip which I use. The computer has a USR 56k ISA hardware modem with jumpers set to "plug and pray".

---------------------------------

Here are the results of "dmesg | grep -i irq":

PCI: Using IRQ router PIIX [8086/7110] at 00:07.0
Serial driver version 5.05c (2001-07-08) with MANY_PORTS MULTIPORT SHARE_IRQ SERIAL_PCI ISAPNP enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at port 0x02f8 (irq = 3) is a 16550A
PIIX4: not 100% native mode: will probe irqs later
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
PCI: Found IRQ 9 for device 00:07.2
usb-uhci.c: USB UHCI at I/O 0x1440, IRQ 9
PCI: Found IRQ 10 for device 00:0f.0
eth0: Lite-On 82c168 PNIC rev 33 at 0xd087f000, 00:A0:CC:3D:19:98, IRQ 10.
PCI: Found IRQ 11 for device 00:0c.0
es1371: found es1371 rev 4 at io 0x1400 irq 11

------------------------------

Here are the results of "cat /proc/interrupts":

           CPU0      
  0:    3858229          XT-PIC  timer
  1:        384          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  8:          1          XT-PIC  rtc
  9:          0          XT-PIC  usb-uhci
 10:      30526          XT-PIC  eth0
 11:       3029          XT-PIC  es1371
 12:      32664          XT-PIC  PS/2 Mouse
 14:      18617          XT-PIC  ide0
 15:      79719          XT-PIC  ide1
NMI:          0
ERR:          0

---------------------------------

Here are the results of "cat /proc/pci":

PCI devices found:
  Bus  0, device   0, function  0:
    Host bridge: Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge (rev 3).
      Master Capable.  Latency=64.  
      Prefetchable 32 bit memory at 0xf8000000 [0xfbffffff].
  Bus  0, device   1, function  0:
    PCI bridge: Intel Corporation 440BX/ZX - 82443BX/ZX AGP bridge (rev 3).
      Master Capable.  Latency=128.  Min Gnt=140.
  Bus  0, device   7, function  0:
    ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 2).
  Bus  0, device   7, function  1:
    IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 1).
      Master Capable.  Latency=64.  
      I/O at 0x1460 [0x146f].
  Bus  0, device   7, function  2:
    USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 1).
      IRQ 9.
      Master Capable.  Latency=64.  
      I/O at 0x1440 [0x145f].
  Bus  0, device   7, function  3:
    Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 2).
      IRQ 9.
  Bus  0, device  12, function  0:
    Multimedia audio controller: Ensoniq ES1371 [AudioPCI-97] (rev 4).
      IRQ 11.
      Master Capable.  Latency=96.  Min Gnt=12.Max Lat=128.
      I/O at 0x1400 [0x143f].
  Bus  0, device  15, function  0:
    Ethernet controller: Lite-On Communications Inc LNE100TX (rev 33).
      IRQ 10.
      Master Capable.  Latency=64.  
      I/O at 0x1000 [0x10ff].
      Non-prefetchable 32 bit memory at 0xf4000000 [0xf40000ff].
  Bus  1, device   0, function  0:
    VGA compatible controller: ATI Technologies Inc 3D Rage Pro AGP 1X/2X (rev 92).
      Master Capable.  Latency=66.  Min Gnt=8.
      Non-prefetchable 32 bit memory at 0xf5000000 [0xf5ffffff].
      I/O at 0x9000 [0x90ff].
      Non-prefetchable 32 bit memory at 0xf4100000 [0xf4100fff].


------------------------------------

Before I go any further: I don't know how to spot anything but an obvious interference there. How does it look to you?

0
 
LVL 40

Expert Comment

by:jlevie
ID: 6914748
Well, the most obvious thing that leaps out at me is that I see nothing that looks like eth1. So there's definitely something wrong with the hardware configuration.

Since I can't see any resources assigned for eth1, I'd guess that it's "hiding behind something". Could I see what 'ifconfig eth1' shows? The interrupt ought to be in that output.
0
 
LVL 16

Expert Comment

by:The--Captain
ID: 6915268
Jlevie's comment seems on the level (sorry, I've been waiting to make that pun for ages).  Another thing to try - disable that sound card and USB controller (if you are not using them), and any other hardware that is not in use.  I am also interested in the output of ifconfig -a, but for different reasons...  I have seen boxes in the past that give excessive ethernet collisions/errors (but only when talking to specific other machines) until enough hardware was swapped out of them to make them behave - I am wondering if this is one of those cases.

Cheers,
-Jon
0
 
LVL 1

Author Comment

by:emherman
ID: 6915433
These are the results of "/sbin/ifconfig -a". The reason that "/sbin/ifconfig eth1" didn't work is that the ethernet card is assigned to eth0. BTW - these results are coming from the RH7.2 workstation (known as "pig"). "cow" is the 7.2 server and "troll" is the RH6.2 gateway.

eth0      Link encap:Ethernet  HWaddr 00:A0:CC:3D:19:98
          inet addr:192.168.1.17  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3750 errors:1 dropped:0 overruns:0 frame:0
          TX packets:2963 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:1043305 (1018.8 Kb)  TX bytes:287599 (280.8 Kb)
          Interrupt:10 Base address:0xf000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:182 errors:0 dropped:0 overruns:0 frame:0
          TX packets:182 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:12410 (12.1 Kb)  TX bytes:12410 (12.1 Kb)
0
 
LVL 1

Author Comment

by:emherman
ID: 6915442
192.168.1.17 is the address for pig, 192.168.1.1 is the address for cow, 192.168.1.5 is the address for troll
0
 
LVL 40

Expert Comment

by:jlevie
ID: 6915456
Hmm, I think there's been a bit of confusion here. The data that I asked to see should have all come from the gateway box. I was wrong in asking about eth1. Looking back at the question I see that you are using PPP for the Internet link. For some reason I got it into my head that the gateway had two ethernets.

What does 'ifconfig -a' on the gateway show?
0
 
LVL 1

Author Comment

by:emherman
ID: 6915501
eth0      Link encap:Ethernet  HWaddr 00:A0:CC:D0:9F:8F  
          inet addr:192.168.1.5  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:281529 errors:0 dropped:0 overruns:0 frame:0
          TX packets:238860 errors:3 dropped:0 overruns:0 carrier:3
          collisions:0 txqueuelen:100
          Interrupt:10 Base address:0xd400

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:3924  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

ppp0      Link encap:Point-to-Point Protocol  
          inet addr:xxx.xxx.xxx.xxx P-t-P:xxx.xxx.xxx.xxx Mask:255.255.255.255
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:12508 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13670 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:10

I replaced the Internet addresses with "xxx.xxx.xxx.xxx" because they are valid external IP addresses. If you need them I can e-mail them to you.
0
 
LVL 1

Author Comment

by:emherman
ID: 6915518
I'm streaming real audio to my w2000 box now and operating as I normally do with my mail client on the Linux workstation. If I get the connection to fail, I'll get another "ifconfig -a" on the gateway (troll).
0
 
LVL 40

Expert Comment

by:jlevie
ID: 6915580
Hmm, you are showing carrier loss/TX errors on the ethernet controller. That's not normal and is probably an indicator of something wrong.
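
To see whether those counters keep climbing, something like the sketch below could be run before and after a failed download. It assumes the older net-tools ifconfig layout shown in this thread ("TX packets:N errors:N ... carrier:N"); newer ifconfig versions print these fields differently. The name tx_summary is invented for illustration.

```shell
# Hypothetical helper: read "ifconfig -a" output on stdin and print one
# line per interface with its TX packet, error and carrier counts.
tx_summary() {
    awk '
        /^[A-Za-z]/ { ifname = $1 }        # unindented line starts a new interface
        /TX packets:/ {
            for (i = 1; i <= NF; i++) {    # fields look like "errors:3"
                split($i, kv, ":")
                if (kv[1] == "packets") pk = kv[2]
                if (kv[1] == "errors")  er = kv[2]
                if (kv[1] == "carrier") ca = kv[2]
            }
            print ifname, "tx=" pk, "errors=" er, "carrier=" ca
        }
    '
}

# Usage:
#   /sbin/ifconfig -a | tx_summary
```

If the error and carrier numbers stay flat while the lockups continue, the ethernet side is a less likely suspect.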

What are you using in the inside network? A hub or switch and what make/model? You could have a bad port or cable, could you swap ports and/or cables?
0
 
LVL 1

Author Comment

by:emherman
ID: 6915686
I run three machines on the lower floor into a Netsurf 8 port "10/100 switch hub". I connect the "uplink" port (lower) to port 1 (upper) on another Netsurf 8 port switch (different model) and link to three more machines on the upper floor. Yeah they are no-name switches.

Currently I have 248862 TX packets and still only three errors. I'd like to wait to see the failure again to see what it does. Then I'll swap ports/cables.

Let me try to download something right quick..
0
 
LVL 1

Author Comment

by:emherman
ID: 6915787
OK, I knew I could make it fail easily.

I went to download a program (http) 902k in size using Netscape 6.2. I got 40k of it and then it blew the Internet connection. When this happens I telnet to the gateway "troll" (I know I need to run SSH) and the login prompt stalls (only when this happens). It will take two to three minutes to get the login prompt (versus about one second) from a remote computer. Once I get that, I can login and "ifdown ppp0" and "ifup ppp0" and things will go again. Sometimes I need to restart the gateway (troll) to get it to go.

If I run to the lower level, I can access the box directly, and login immediately... even when stalled.

All internet access is now stopped until I reset the ppp0 connection.

250565 TX packets and still only three errors.

** This is one of the times where I can't reestablish the ppp0 connection and I have to reboot...

On the completion of a reboot, I have 59 TX packets and 1 error and 1 carrier.

After the reboot, everything is fine
0
 
LVL 40

Expert Comment

by:jlevie
ID: 6915922
The login delay is most likely due to a DNS timeout. Since the Internet link is hosed the gateway can't get to DNS to check for a reverse lookup of the IP of the telnet client.
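
One way to test that theory, assuming the gateway looks names up in /etc/hosts before it tries DNS (the usual default), is to give the LAN clients entries in the gateway's /etc/hosts so the reverse lookup never has to leave the box. The helper name has_hosts_entry is hypothetical; the address and host name in the usage line are the ones given earlier in this thread.

```shell
# Hypothetical helper: succeed (exit 0) if the given IP address already
# has a line in the hosts file, fail (exit 1) otherwise.
has_hosts_entry() {
    ip="$1"
    file="${2:-/etc/hosts}"
    # Compare the first field of each line against the address.
    awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$file"
}

# Usage, as root on the gateway:
#   has_hosts_entry 192.168.1.17 || echo "192.168.1.17 pig" >> /etc/hosts
```

If the telnet login prompt then appears promptly even while the PPP link is stalled, the delay was the lookup and not a second problem.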

From what you described it doesn't sound like it's an ethernet problem. This sounds more like a software problem.

Has the gateway had all applicable RedHat errata applied to it? Or is it still running the 'as installed' packages?
0
 
LVL 1

Author Comment

by:emherman
ID: 6915969
Packages are as installed. However, I shut down several unneeded services. How do I get updates with a text based box? I'm still a point and click kind of guy. :-)
0
 
LVL 40

Expert Comment

by:jlevie
ID: 6916153
It's easy enough to get the updates, the problem comes in manually installing them. If you can give me a day or so I'll bring my 6.2 update script up to the current set of errata. It makes the job of applying the updates fairly easy. Send me an email (jim@entrophy-free.net) and I'll return the script to you.

Downloading the updates takes a while. You can be working on that in the meantime. Pick someplace on the 6.2 box where you have about 500MB of free space, preferably other than / or /usr. Then do:

# mkdir /where-theres-room/updates
# cd /where-theres-room/updates
# ncftp ftp.redhat.com
...
ncftp / > cd /pub/redhat/linux/updates/6.2/en/os
Directory successfully changed.
ncftp ...inux/updates/6.2/en/os >get -RT i386 i586 i686 images noarch

That will effectively mirror those dirs that contain errata that could be needed for your system. Not everything that will be downloaded will be used, but since I can't tell ahead of time exactly what is installed the script will intelligently attempt all of the updates, skipping any that don't correspond to an installed package.
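
For readers without the script: the "skip what isn't installed" behaviour can be approximated with rpm's freshen mode (rpm -F), which upgrades a package only if an earlier version is already installed. The sketch below is not the script mentioned above, only an illustration with an invented function name. It prints the command instead of running it, covers only the i386 and noarch directories (the i586/i686 directories hold CPU-specific builds that are better picked by hand), and leaves out kernel packages, which are normally installed alongside the old kernel with rpm -ivh so there is still something to boot if the new one fails.

```shell
# Hypothetical helper: print (do not run) an "rpm -Fvh" command covering
# the downloaded i386 and noarch updates, kernel packages excluded.
freshen_cmd() {
    base="${1:-.}"
    set --                                  # reuse "$@" as the package list
    for arch in i386 noarch; do
        for f in "$base/$arch"/*.rpm; do
            [ -e "$f" ] || continue         # directory missing or empty
            case "${f##*/}" in
                kernel*) continue ;;        # handle kernels separately
            esac
            set -- "$@" "$f"
        done
    done
    if [ $# -gt 0 ]; then
        echo "rpm -Fvh $*"
    fi
}

# Usage:
#   cd /where-theres-room/updates
#   freshen_cmd .        # review the printed command before running it
```

Passing all packages in a single rpm invocation lets rpm sort out the ordering between them; dependency errors still have to be resolved by hand.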

If you're interested I also have a script for 7.2.
0
 
LVL 1

Author Comment

by:emherman
ID: 6916201
OK on the 2.1GB drive I have 797MB available in the /usr directory. This is really a single function server so there is only one user on it.
0
 
LVL 1

Author Comment

by:emherman
ID: 6916236
...ncftp!  That's pretty cool!!!
0
 
LVL 1

Author Comment

by:emherman
ID: 6916241
I have AMD K6 333 on that box and it shows as an Intel 586 so I opted NOT to get the i686 files. If I need them please let me know.
0
 
LVL 1

Author Comment

by:emherman
ID: 6916614
FYI - I have the gateway box (troll) FTPing as you had said. No masquerading. The FTP process is working fine.
0
 
LVL 40

Expert Comment

by:jlevie
ID: 6916665
The update script can be modified to not require the 686 files, so it's okay not to download them.
0
 
LVL 51

Expert Comment

by:ahoffmann
ID: 6917472
Your problem looks pretty similar to one I fixed a couple of weeks ago: in my case the NIC was the culprit.
I also got similar behaviour with some switches/hubs.

I know of a problem with the Linux driver for Intel NICs up to kernel 2.4.12 (probably 2.4.15).
Not sure what your
> PCI: Found IRQ 10 for device 00:0f.0
> eth0: Lite-On 82c168 PNIC rev 33 at 0xd087f000, 00:A0:CC:3D:19:98, IRQ 10.
is, but it might be worth just replacing your NIC and then trying again.

FYI: I have also seen a NIC which started unexpectedly flooding my switch with millions of different MACs, so that the switch stopped working properly (it then behaves like a hub). I didn't dig deeper into this problem, that is, whether it was a hardware problem of the NIC or a driver problem. Replacing the NIC solved it.
0
 
LVL 1

Author Comment

by:emherman
ID: 6921471
I'm having problems downloading the files that you asked for (jlevie). I did just change network cards from a Netgear FA310TX (Lite-On) to a Zonet (Realtek compatible) NIC. I'll try to resume downloading the updates and see how things go.
0
 
LVL 1

Author Comment

by:emherman
ID: 6921479
I also changed the gateway's (troll) port in the switch from #3 to #8. Cable looked undamaged and was prefabricated cat-5.
0
 
LVL 51

Expert Comment

by:ahoffmann
ID: 6921726
AFAIK both NICs are low(est) cost, just keep in mind ...

BTW, there was a very interesting test of NICs in the German magazine c't 6/2002. The benchmark compares several commonly used NICs on Linux and Windoze.
0
 
LVL 1

Author Comment

by:emherman
ID: 6921834
Yeah, I realize they are cheap NICs. I have an (ISA) Intel PRO NIC with a chip code of "FA82595TX" on the shelf. I was tempted to drop that one in. So far so good on the Realtek.
0
 
LVL 1

Author Comment

by:emherman
ID: 6956829
OK - an update. I downloaded the updated 6.2 files for the gateway (troll) and I have run the install script emailed to me from jlevie. I am testing the network in regards to my downloading problem and getting the 7.2 workstation files via Red Hat's "up2date". I'll update as I know something new.
0
 
LVL 40

Expert Comment

by:jlevie
ID: 6957996
Check your email... There's a significant problem with the updates of your 6.2 system.
0
 
LVL 1

Author Comment

by:emherman
ID: 6960865
I ran the update script sent by jlevie. The first time through, the updates were "successful"; however, they didn't stop the problem. I had some dependency problems to fix and was sent a second script. I hastily ran the second script (not following the notes at the beginning of the script) and managed to create a non-bootable computer. :-(

Unfortunately, since this is a production computer for my small LAN, I had to get it up and running quickly. I ended up reinstalling the RH6.2 OS and starting over. I had to write over my /usr directory to have enough space on the 2.1GB drive. This means that I have to get the updates again from Red Hat (56k).

On the RH 7.2 workstation, I did successfully run "up2date" and got it updated to "2.4.9-31" (i686). However, this did not change the problem.

Some observations that may (or may not) change your thinking of the problem:

- I downloaded, via FTP at a command prompt (no X), the entire pile of updates for the gateway (troll) with only five disconnects... which could have come from the ISP. This was direct from the gateway to Red Hat.

- I downloaded under KDE, all the files for the RH7.2 workstation with about the same number of disconnects... which also could have come from the ISP. I was using the Red Hat "up2date" program in KDE. This was masqueraded from the workstation, through the gateway, to Red Hat

- The gateway provides IP Masquerading for all of the workstations. NT, w98, and w2000 boxes do not appear to have the lockup problem. It only appears to happen when the 7.2 workstations try to access the Internet via browsers. I'm not sure if getting the mail using the Linux boxes causes any lockups.

- I would get the problem when trying to get an HTTP (I think) download for a Netscape 6.2 program. I can also get it when actively "surfing", but it is not as predictable. I can get it using Konqueror too.

- I had to buy the CD for Netscape 6.2 since I could not successfully download it.

- I have problems downloading files from ANY web site using the Linux based browsers.

- When the lockup occurs, I telnet in to the gateway from the 7.2 workstation and reset the dial-up connection to the ISP. It takes about a minute or two to get a login prompt.
0
 
LVL 1

Author Comment

by:emherman
ID: 7198582
My network situation changed to the point that this question is no longer valid.

Cable became available in the area and I connected to it and dropped the dial-up. I went with other firewall options so the whole Linux 6.2 thing is also no longer applicable.

I would like to delete this question so I don't close out the question and have an erroneous one that others might purchase.

I thought there was an easy way to delete a question, but I can't seem to find it. Can someone tell me how to do it?

BTW - Thank you to all who have helped me on this question.
0
 

