Solved

host name lookup failure

Posted on 2006-11-27
Last Modified: 2012-08-14
I've been having problems with DNS lookups that have gotten rather acute today. My mailq is usually empty, but this morning I had 40 entries which are taking about 3 hours to get queued. They have the annotation: host name lookup failure. I also have about 500 sendmail processes running and I normally have about 40. All this is telling me sendmail is having problems connecting to servers and/or resolving hostnames.

Also, I have tried pinging various hosts (such as yahoo.com, experts-exchange.com) and the pings time out with no data. I've tried traceroute yahoo.com and I get: traceroute: unknown host yahoo.com. I've tried pinging IP addresses (ping 64.132.94.250) and that times out as well.

I suspect a problem with my DNS servers, but that wouldn't explain why I can't ping an IP.

This is becoming a big problem. I'm running Linux 2.4.29. I'm not running named, but I have nameservers configured in resolv.conf. How do I go about figuring out my problem?
Question by:jmarkfoley
 
LVL 1

Author Comment

by:jmarkfoley
ID: 18024251
Hey! Is there anyone left in this topic area?! This should be an easy one for some expert!
 
LVL 39

Expert Comment

by:noci
ID: 18026572
dig is the tool of the trade here:

  dig yahoo.com   (optionally with +trace)
  will tell you what it tries.

Does pinging an IP address still work?
  64.233.167.99   is an IP address of Google.
  If not, you may have an IP routing / filtering problem.

traceroute might help here:
  traceroute -I 64.233.167.99
  will show where it goes.

If traceroute works, try tcptraceroute.
  Source is here: http://michael.toren.net/code/tcptraceroute/
  It can attempt a traceroute for, say, HTTP, in case policy routing has been set up
  for that protocol differently from, say, SMTP.
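
The checks suggested above can be strung together in a small script so that they run the same way each time the problem shows up. This is only a sketch and was not posted in the thread: the function names `diagnose` and `run_checks` are made up for illustration, the test address and hostname are the ones mentioned above, and it assumes that `ping` accepts `-c` and `-w` and that `dig` exits non-zero when no server answers.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical; address and hostname come from the thread.
TEST_IP=${TEST_IP:-64.233.167.99}
TEST_NAME=${TEST_NAME:-yahoo.com}

# diagnose IP_OK DNS_OK   (1 = check passed, 0 = check failed)
# Prints a one-line verdict on which layer looks broken.
diagnose() {
    case "$1$2" in
        11)    echo "ok: IP connectivity and name lookup both work" ;;
        10)    echo "dns: bare IP works, name lookup fails" ;;
        00|01) echo "ip: bare IP unreachable, suspect routing or filtering" ;;
        *)     echo "usage: diagnose 0|1 0|1" >&2; return 2 ;;
    esac
}

# run_checks: ping a bare IP, query a name, and classify the result.
run_checks() {
    ip_ok=0
    dns_ok=0
    # -c 3: send three probes; -w 10: give up after ten seconds in total
    ping -c 3 -w 10 "$TEST_IP" >/dev/null 2>&1 && ip_ok=1
    # +time and +tries keep dig from retrying for long when nothing answers
    dig +time=3 +tries=1 "$TEST_NAME" >/dev/null 2>&1 && dns_ok=1
    diagnose "$ip_ok" "$dns_ok"
}
```

Calling `run_checks` while the problem is occurring gives a quick hint: a `dns:` verdict points at resolv.conf or the nameservers, while an `ip:` verdict means name resolution is probably just a victim of a connectivity problem.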
 
LVL 1

Author Comment

by:jmarkfoley
ID: 18033460
OK, next time I'll give dig a shot. I have more data on what's happening. As I said, I generally have about 40ish sendmail processes handling mail, nothing in the mailq, and normally ping and whois and traceroute and nslookup all work just spiffy. However, every so often I'll notice my sendmail process count climbing. Today I saw it over 100, and yesterday when I was having noticeable problems it was at 500. At these times ping, traceroute, whois, etc. all simply terminate with timeouts. I haven't tried dig. This condition is intermittent, but seems to be happening more frequently lately: once or twice a day, at different times of day. The condition can persist on the order of an hour (today) to several hours (yesterday).

Right now, I don't think it is a mail problem. I just think sendmail is being affected like ping et al because sendmail can't resolve domains. I'm thinking:

a) nameserver problem (but then why would it usually work? And why would pinging an actual IP not work?)

b) you mentioned ip routing / filtering problem.

How would I go about determining ip routing/filtering, via your traceroute suggestion? Problem is, when this happens traceroute times out too.

Your thoughts?
 
LVL 39

Expert Comment

by:noci
ID: 18033510
Pinging a name involves a name lookup; pinging an IP address doesn't. That makes it a quick check for name lookup failures (the ping by name fails while the ping by IP works).

Filter/routing problem:

1) Is your own routing OK?
netstat -rn   # show routes; does the network 0.0.0.0 with netmask 0.0.0.0 point
to your gateway?

If OK:
2) Use ping/traceroute -I to find out where the packets go.

Find out where things stop; the culprit is somewhere near there (probably one hop beyond
the last answering node).

If one protocol works and another doesn't, then you might have filter issues;
otherwise check the system where it all stops.
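
The two steps above can be made a little more mechanical by letting awk read the tool output. The following is a rough sketch under stated assumptions: `default_gateway` and `last_answering_hop` are hypothetical helper names, and the parsing assumes the usual column layout of `netstat -rn` and `traceroute -n` on Linux.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.

# default_gateway: reads `netstat -rn` output on stdin and prints the gateway
# of the default route (destination 0.0.0.0 with netmask 0.0.0.0).
# Exits non-zero when no default route is listed.
default_gateway() {
    awk '$1 == "0.0.0.0" && $3 == "0.0.0.0" { print $2; found = 1; exit }
         END { if (!found) exit 1 }'
}

# last_answering_hop: reads `traceroute -n` output on stdin and prints the
# hop number and address of the last hop that replied.
# A hop whose first probe timed out (shown as "*") is treated as silent.
last_answering_hop() {
    awk '$1 ~ /^[0-9]+$/ && $2 != "*" { hop = $1; addr = $2 }
         END { if (hop != "") print hop, addr; else exit 1 }'
}

# Example use (needs network access, so it is left commented out):
#   gw=$(netstat -rn | default_gateway) && ping -c 3 "$gw"
#   traceroute -n -I 64.233.167.99 | last_answering_hop
```

If the gateway answers pings but `last_answering_hop` never gets past hop 1, the place to look is the gateway itself or whatever sits directly behind it.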
 
LVL 1

Author Comment

by:jmarkfoley
ID: 18037657
I'm experiencing the problem right now. Here's my test results:

1 10:21:50 mfoley@server:~
> dig yahoo.com

; <<>> DiG 9.3.0 <<>> yahoo.com
;; global options:  printcmd
;; connection timed out; no servers could be reached
1 10:22:16 mfoley@server:~
> ping 64.233.167.99
PING 64.233.167.99 (64.233.167.99) 56(84) bytes of data.

The dig timed out. I had to Ctrl-C the ping after several minutes. A subsequent dig 64.233.167.99 worked! But when I immediately ran it again, it timed out. You see how flaky this is!

1. netstat -rn #looks OK. My gateway is 192.168.1.1

Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.1.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
0.0.0.0         192.168.1.1     0.0.0.0         UG        0 0          0 eth0

2. traceroute -I 64.233.167.99 #hangs. I killed it after 17 rows of asterisks.

It seems that where it all stops is right on the other side of my gateway (I can ping my gateway), but as you can see, one of the digs during this testing session made it through. What could it be? Downstream load? Bad router? Flaky service provider equipment?
 
LVL 1

Author Comment

by:jmarkfoley
ID: 18038004
btw - things seem to come in OK. I'm still getting lots of spam. Plus users are able to get to my web site, so it looks like the outside can connect to me; I just can't get out. Also, fyi, it is now 40 minutes after running my tests and pings and traceroutes are still not working. I am truly puzzled.
 
LVL 39

Expert Comment

by:noci
ID: 18038960
If people can reach you, then at least routing should work.
Is there any firewall (on this system?), and what do its filters look like?
 
LVL 1

Author Comment

by:jmarkfoley
ID: 18039264
The only firewall I believe I have set up is via the linksys router. The firewall settings in there are:

Block Anonymous Internet Request
Filter Multicast
Filter IDENT (Port 113)

The other option, Filter Internet NAT Redirection, is not selected.

It is really bad today. I still am having extreme problems. I still can't ping/traceroute.

 
LVL 39

Expert Comment

by:noci
ID: 18040249
A lot of modems have a problem if the upstream load is too large
(it will almost kill your total downstream).
The trick is to limit the outgoing stream (to the modem) to just below the maximum
(say 20-50 kbit per second below max). That leaves the modem
buffers almost empty, freeing the space for downloads.

Look for the wondershaper script on the internet:
http://lartc.org/wondershaper/

I am not sure it will work on the Linksys itself, although Linksys used to use Linux for
its OS, so I would expect it to be able to adopt the ideas. (It works for me.)

Even then, a massive upload should allow some pings to get through.

Even the first line is asterisks? I would expect something to return from the
default gw.
Any filters (iptables, ipchains) on your qmail box?
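
To make the "just below max" idea concrete, here is a minimal sketch that works out the shaped rate and prints a matching `tc` command. It is not what wondershaper does (wondershaper also prioritises interactive traffic); it swaps in a plain token bucket filter as the simplest possible illustration. The helper names, the interface `eth0` and the 256 kbit uplink figure are all assumptions, and the command is only printed, not executed.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; uplink figures are examples.

# shaped_rate UPLINK_KBIT MARGIN_KBIT
# Prints the rate to shape to: the uplink rate minus a safety margin.
shaped_rate() {
    uplink=$1
    margin=$2
    if [ "$margin" -ge "$uplink" ]; then
        echo "margin must be smaller than the uplink rate" >&2
        return 1
    fi
    echo $((uplink - margin))
}

# print_tbf_command DEV UPLINK_KBIT MARGIN_KBIT
# Prints (does not run) a token bucket filter command for the given device.
# Review the numbers, then run the printed command as root if it looks right.
print_tbf_command() {
    rate=$(shaped_rate "$2" "$3") || return 1
    echo "tc qdisc add dev $1 root tbf rate ${rate}kbit burst 1540 latency 50ms"
}

# Example, assuming a 256 kbit/s uplink and a 30 kbit/s margin:
#   print_tbf_command eth0 256 30
```

Printing the command instead of running it keeps the sketch harmless; the shaping can be undone with `tc qdisc del dev eth0 root`.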
 
LVL 1

Author Comment

by:jmarkfoley
ID: 18042473
I'm using sendmail. No I have no ipchains, etc.

My upstream is fairly small. In fact, I experience this problem with nothing going out. The only thing consuming bandwidth is rejected mail. I receive between 300,000 and 400,000 bogus emails per day. I haven't analyzed this activity, but right now I'm still getting about 8 bogus emails/second and my sendmail tasks are at a low of 22 and I can ping, etc. to my heart's content.

I'll check on your wondershaper link.

My building is served by a Cisco router administered by a 3rd party. Is it possible they are limiting upload based on time of day?
 
LVL 39

Expert Comment

by:noci
ID: 18043709
The exact mailer is not important (i probably misread mailq)..

Please be aware that even "no traffic" means a stream of output
of at least ICMP messages, or SYN-ACK + (optional data ACK) + FIN + FIN-ACK
even when not sending. Those packets are at least 28 bytes (ICMP) or 40 bytes (TCP) of headers alone.

If there is an upstream bottleneck, it can cause the same problems.

Also keep in mind that UDP traffic is the first to go if a link
gets saturated (UDP is a 'lossy' protocol); there is no guard against packet loss.
TCP was invented to have reliable links. TCP adds overhead, though (3 packets to
start + 3/4 packets to close + ACK packets for chunks of data).

Name resolution is a UDP (port 53) query/response protocol. If a link gets saturated
it will lose the data. Also, traceroute uses either UDP or ICMP.

The aforementioned tcptraceroute might give a slightly different view,
as it sends only the first TCP packet (SYN) and analyzes the answers
(ICMP when the complete path has not been covered yet, or SYN-ACK (bingo)).


Tools to investigate are:
iptraf: http://iptraf.seul.org/
Measures link load; menu driven. Look for the detailed interface statistics.

tcpdump: http://www.tcpdump.org/
See the packets in transit.
'tcpdump -vni ethX udp port 53' should show all UDP requests and answers.

Another thing: if you have problems resolving names, maybe adding your own nameserver can help (it will miss out on the first query but retain the name for
another time, some time later); mail would get delayed.

Also, do you send the mail out yourself, or do you use a smarthost setup?
If you have a smarthost available (the mailserver of your ISP), then that mailer has
to deal with name lookups etc. You just need to provide the name (or IP address)
of that smarthost (better chance to get mail out).
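
One rough way to see whether UDP really suffers more than TCP during an episode is to repeat the same probe a number of times and count the failures. The sketch below is an illustration rather than a tool from the thread: `count_failures` is a hypothetical helper, and the example invocations assume that `dig` supports `+tcp`, `+time` and `+tries` and exits non-zero when no server answers.

```shell
#!/bin/sh
# Sketch only: count_failures is a hypothetical helper name.

# count_failures N CMD [ARGS...]
# Runs CMD N times, discards its output, and prints how many runs failed
# (a run counts as failed when CMD returns a non-zero exit status).
count_failures() {
    n=$1
    shift
    fails=0
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1 || fails=$((fails + 1))
        i=$((i + 1))
    done
    echo "$fails/$n failed"
}

# Example use (needs network access, so it is left commented out):
#   count_failures 20 dig +time=2 +tries=1 yahoo.com        # DNS over UDP
#   count_failures 20 dig +tcp +time=2 +tries=1 yahoo.com   # DNS over TCP
#   count_failures 20 ping -c 1 -w 2 64.233.167.99          # ICMP to a bare IP
```

If the UDP and ICMP probes fail far more often than the TCP one, that would be consistent with a saturated link dropping packets; if everything fails equally, the cause is more likely somewhere else.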


 
LVL 1

Author Comment

by:jmarkfoley
ID: 18057321
I do have a nameserver running on my host. I didn't last week, but I configured that hoping it would help. It does not.

Yes, I am sending the mail myself so I am doing the lookup.

I will also check on your suggested tools. You're just a gold mine for this stuff!

Link saturation. Hmmm. What's that? Remember that when this problem occurs I can't even ping or traceroute using an IP address, DNS not involved!

I've been tracking incidents, and this inability to get out generally starts between 8:00 and 9:00am and tapers off around noon. I often have a spike at the end of the day between 3:00pm and 6:00pm. My system's attempts to get out (I don't know what else to call it) don't really change. That is, I'm not sending thousands of emails in the morning versus the evening. I might send 30 emails a day. Nor is sendmail handling extra spam at any point.

If it is some problem downstream, will I be able to diagnose it at all?
 
LVL 39

Accepted Solution

by:
noci earned 500 total points
ID: 18057954
Link saturation is more or less comparable to a huge traffic jam:
so much traffic that some bits get lost (UDP gets lost easily, TCP less easily).
Lost TCP traffic means retransmits, adding insult to injury.

Now some modems have a large amount of buffer memory, meant to hold traffic
while the transmitter is busy. With a traffic jam, that just heaps up;
newer traffic then gets lost, but the buffers still have to be sent,
and possibly the receiver has already asked for retransmits of some of that data.
Also keep in mind that in TCP/IP you need to ACK the reception of (sets of) packets;
those ACKs also get stuck in that jam.

Too many packets for upstream will also mean no buffers for downstream, and packet loss there.

Normally those traffic jams clear up after some time (minutes, not hours)
unless someone is also running file sharing tools etc. That will hurt if not restrained to a trickle (wondershaper does that). You can effectively limit
incoming traffic like you can with outgoing, you'll just have to handle those.

There might be another thing...
How do you connect your system to the Linksys? Is there a managed switch in between?
If so, do all settings match up (actual, not auto)? Are both sides of a cable set the same, both 100FullDuplex or both 100HalfDuplex? If not, it will wreak havoc on a connection.

Also, upstream problems are traceable to a certain extent. Just like traceroute,
there is a tool called pchar (http://www.kitchenlab.org/www/bmah/Software/pchar/)
that can estimate upstream characteristics hop by hop, but it needs a fairly clean first few hops to reliably test further hops, and it takes a while.
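
For the speed and duplex question above, the negotiated settings can be read from the Linux side and compared with what the switch or router port is set to. The sketch below assumes the `ethtool` utility is installed and that the NIC driver supports it; neither is stated in the thread, and on older setups `mii-tool` may be the only option. `link_settings` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: link_settings is a hypothetical helper name.

# link_settings: reads `ethtool DEV` output on stdin and prints the
# speed, duplex and auto-negotiation state on one line.
# The patterns are anchored so that "Supports auto-negotiation" and
# "Advertised auto-negotiation" lines are not picked up by mistake.
link_settings() {
    awk -F': *' '
        /^[ \t]*Speed:/            { speed = $2 }
        /^[ \t]*Duplex:/           { duplex = $2 }
        /^[ \t]*Auto-negotiation:/ { autoneg = $2 }
        END { print speed, duplex, "autoneg=" autoneg }'
}

# Example use (needs root and a real interface, so it is left commented out):
#   ethtool eth0 | link_settings
```

A line such as half duplex with auto-negotiation off on one end, while the other end is forced to full duplex, would be the kind of mismatch described above.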

 
LVL 1

Author Comment

by:jmarkfoley
ID: 18099438
I had an expert come in and look at the system and run some tests. What's happening is that so much attempted spam is coming into my system that I saturate the linksys router doing black-list lookups, sending reject messages, etc. I was advised to do 4 things:

1. Get a firewall running on my Linux box. Currently I have none.
2. Get an anti-virus tool. clam was recommended.
3. Move the mail to a different server/IP.
4. Get a more robust router. Linksys is designed for home use.

I'm going to do all of these. For #3 and #4 I think I'm just going to set up a new Linux box to do routing and firewalling.

Thanks for your help.
 
LVL 39

Expert Comment

by:noci
ID: 18100111
I guess if the Linksys is only used as a bare router it should be able to cope with the traffic (I didn't expect it to be able to filter mail too... ;)

Having a system behind it doing the mail etc. handling is what I do (different modem though). Also think about using wondershaper to limit the outgoing traffic to a little below line capacity (about 10Kbps should be sufficient).
It will prevent upstream data from killing your downstream.