rstaveley asked:

Postfix when 1st MX does not resolve

I have Postfix 2.1.5 set up on a Debian 3.1 server. On two occasions mail has looped back to the sender because the 1st MX record for the recipient resolved to an invalid host.

e.g. A message to someone@BlueH2Ogroup.com looped back to the sender, because tcsmg1.BlueH2Ogroup.com is non-existent.

rob@mini:~$  host -t MX BlueH2Ogroup.com
BlueH2Ogroup.com mail is handled by 10 tcsmg1.BlueH2Ogroup.com.
BlueH2Ogroup.com mail is handled by 50 mail.BlueH2Ogroup.com.
rob@mini:~$  host tcsmg1.BlueH2Ogroup.com
Host tcsmg1.BlueH2Ogroup.com not found: 3(NXDOMAIN)
rob@mini:~$  host mail.BlueH2Ogroup.com
mail.BlueH2Ogroup.com has address 167.206.59.3

How can I get Postfix to try the 2nd MX in the event that the 1st is unresolvable? [I believe it works OK, if the 1st can't be connected to. The problem is when the 1st does not resolve.]

Here are the MX-related bits of postconf:

root@mini:/etc/postfix# postconf | grep mx
best_mx_transport =
ignore_mx_lookup_error = no
parent_domain_matches_subdomains = debug_peer_list,fast_flush_domains,mynetworks,permit_mx_backup_networks,qmqpd_authorized_clients,relay_domains,smtpd_access_maps
permit_mx_backup_networks =
smtp_defer_if_no_mx_address_found = no
smtp_mx_address_limit = 0
smtp_mx_session_limit = 2
Kerem ERSOY
[This solution is only available to members.]

ASKER

I get the same DNS results on multiple networks with different ISPs. If you try the same host commands on BlueH2Ogroup.com, do you not get the same results?

This problem has bitten me before under the same circumstances, where the 1st MX did not resolve. Postfix bounces the message back to me saying that it loops back.

If you have Postfix, what do you get if you e-mail nospam@BlueH2Ogroup.com? [My other example is no longer good, because they've fixed their highest priority MX now.]

I wonder if Postfix has a bug in that it quits, if the 1st MX is an invalid host in the domain.
Kerem ERSOY

As you can see, I've posted the FAQ for you. According to it, there is only one case in which Postfix stops trying further MX hosts: Postfix connects to an MX host, the host responds on port 25, but the server dies before the transaction completes. Postfix will then not try the next MX record, because it had already made contact with the first MX server. There is a way to find out what happened: dig through your Postfix log at /var/log/maillog and see whether it was able to contact the host that has since gone dead.
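
As a rough illustration of that log check, here is a minimal Python sketch that pulls the relay and status out of Postfix smtp delivery lines. The regular expression and the function name delivery_attempts are assumptions made for this example, modelled only on the log format quoted in this thread; other Postfix versions may log extra fields. The log path is also an assumption (Debian systems often use /var/log/mail.log instead).

```python
import re

# Assumed pattern, modelled on the smtp delivery line quoted in this thread.
STATUS_RE = re.compile(
    r"postfix/smtp\[\d+\]: (?P<qid>[0-9A-F]+): to=<(?P<rcpt>[^>]*)>, "
    r"relay=(?P<relay>[^,]+), .*?status=(?P<status>\w+) \((?P<reason>.*)\)$"
)

def delivery_attempts(lines):
    """Yield (queue_id, relay, status, reason) for each smtp delivery line."""
    for line in lines:
        m = STATUS_RE.search(line.rstrip("\n"))
        if m:
            yield m["qid"], m["relay"], m["status"], m["reason"]

# Sample line copied from the asker's log excerpt.
sample = [
    "Nov 15 07:32:09 mini postfix/smtp[15160]: 777F418B3F: "
    "to=<XXXXX.XXXXX@blueh2ogroup.com>, relay=none, delay=9, "
    "status=bounced (mail for blueh2ogroup.com loops back to myself)",
]

for attempt in delivery_attempts(sample):
    print(attempt)
```

To scan a real log, pass an open file instead of the sample list, for example delivery_attempts(open("/var/log/maillog")). A relay value of "none" suggests the message was rejected before any remote host was contacted.
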
Hi,

ignore_mx_lookup_error = no

could solve the problem if you change it to

ignore_mx_lookup_error = yes

Please see the link below:

http://www.postfix.org/postconf.5.html
ASKER CERTIFIED SOLUTION
[This solution is only available to members.]

omarfarid, tcsmg1.BlueH2Ogroup.com cannot be resolved. It should be resolvable from the Internet, but there is no entry in DNS for it. You can try it yourself and I expect you will get the same result. My problem is not with MTAs which Postfix is unable to connect to. My problem is - I believe - when the highest priority MX record in DNS has an unresolvable name. I have had this happen with two different domains. The other domain has had its DNS fixed now, so I can only reproduce the problem right now with e-mail to BlueH2Ogroup.com.

Kerem, it doesn't seem to do what the FAQ says. I am attaching an excerpt from the mail log, with some XX-ing out of stuff that I guess I shouldn't post on the Internet.

You can see that it bounces because it "loops back to myself" straight away.

I'll try now with ignore_mx_lookup_error = yes, as omarfarid suggests and get back to you both.

Many thanks for your input!
Nov 15 07:31:54 mini postfix/cleanup[15133]: CD9B618A33: hold: header Received: from Bigun (XXXX-XXX.XXX.com [XX.XX.XX.XX])??by mail.XXX.XXX (Postfix) with ESMTP id CD9B618A33??for <XXX.XXXXX@BlueH2Ogroup.com>; Thu, 15 Nov 2007 07:31:53 + from XXXX-XXXX.XXX.com[XX.XX.XX.XX]; from=<rstaveley@XXXX.XX> to=<XXX.XXXX@BlueH2Ogroup.com> proto=ESMTP helo=<Bigun>
Nov 15 07:31:54 mini postfix/cleanup[15133]: CD9B618A33: message-id=<002e01c82759$9b0e5700$d12b0500$@com>
Nov 15 07:31:57 mini postfix/smtpd[15118]: disconnect from XXX-XX.XXX.com[XX.XX.XX.XX]
Nov 15 07:31:58 mini MailScanner[14071]: New Batch: Scanning 1 messages, 9746 bytes
Nov 15 07:32:00 mini MailScanner[14071]: Virus and Content Scanning: Starting
Nov 15 07:32:00 mini MailScanner[14071]: Requeue: CD9B618A33.E6EE3 to 08D9318B3F
Nov 15 07:32:00 mini postfix/qmgr[381]: 08D9318B3F: from=<rstaveley@XXXX.XX>, size=9469, nrcpt=1 (queue active)
Nov 15 07:32:00 mini MailScanner[14071]: Uninfected: Delivered 1 messages
Nov 15 07:32:00 mini spamd[32661]: connection from localhost [127.0.0.1] at port 1313
Nov 15 07:32:00 mini spamd[32661]: info: setuid to nobody succeeded
Nov 15 07:32:00 mini spamd[32661]: Creating default_prefs [/nonexistent/.spamassassin/user_prefs]
Nov 15 07:32:00 mini spamd[32661]: Cannot write to /nonexistent/.spamassassin/user_prefs: No such file or directory
Nov 15 07:32:00 mini spamd[32661]: Couldn't create readable default_prefs for [/nonexistent/.spamassassin/user_prefs]
Nov 15 07:32:00 mini spamd[32661]: processing message <002e01c82759$9b0e5700$d12b0500$@com> for nobody:65534.
Nov 15 07:32:00 mini spamd[32661]: clean message (0.0/5.0) for nobody:65534 in 0.4 seconds, 9263 bytes.
Nov 15 07:32:00 mini spamd[32661]: result: .  0 -  scantime=0.4,size=9263,mid=<002e01c82759$9b0e5700$d12b0500$@com>,autolearn=failed
Nov 15 07:32:00 mini postfix/pickup[15146]: ECF2118A33: uid=65534 from=<rstaveley@seseit.com>
Nov 15 07:32:00 mini postfix/cleanup[15121]: ECF2118A33: hold: header Received: by mail.XXXX.XXX (Postfix, from userid 65534)??id ECF2118A33; Thu, 15 Nov 2007 07:32:00 +0000 (GMT) from local; from=<rstaveley@seseit.com> to=<XXXX.XXXXX@blueh2ogroup.com>
Nov 15 07:32:00 mini postfix/cleanup[15121]: ECF2118A33: hold: header Received: from Bigun (XXX.XXXX.com [XX.XX.XX.xx])??by XXX.xxxx.com (Postfix) with ESMTP id CD9B618A33??for <lou.mcelwain@BlueH2Ogroup.com>; Thu, 15 Nov 2007 07:31:53 + from local; from=<rstaveley@XXXX.XXX> to=<XXXX.XXXX@blueh2ogroup.com>
Nov 15 07:32:00 mini postfix/cleanup[15121]: ECF2118A33: message-id=<002e01c82759$9b0e5700$d12b0500$@com>
Nov 15 07:32:00 mini postfix/pipe[15129]: 08D9318B3F: to=<XXX.XXXX@blueh2ogroup.com>, relay=spamassassin, delay=7, status=sent (mail.seseit.com)
Nov 15 07:32:00 mini postfix/qmgr[381]: 08D9318B3F: removed
Nov 15 07:32:06 mini MailScanner[14071]: New Batch: Scanning 1 messages, 9960 bytes
Nov 15 07:32:08 mini MailScanner[14071]: Virus and Content Scanning: Starting
Nov 15 07:32:08 mini MailScanner[14071]: Requeue: ECF2118A33.22893 to 777F418B3F
Nov 15 07:32:08 mini postfix/qmgr[381]: 777F418B3F: from=<rstaveley@XXXX.XXX>, size=9782, nrcpt=1 (queue active)
Nov 15 07:32:08 mini MailScanner[14071]: Uninfected: Delivered 1 messages
Nov 15 07:32:09 mini postfix/smtp[15160]: 777F418B3F: to=<XXXXX.XXXXX@blueh2ogroup.com>, relay=none, delay=9, status=bounced (mail for blueh2ogroup.com loops back to myself)
Nov 15 07:32:09 mini postfix/cleanup[15133]: 78E7718A33: message-id=<20071115073209.78E7718A33@mail.XXXXX.XXX>
Nov 15 07:32:10 mini postfix/qmgr[381]: 78E7718A33: from=<>, size=11478, nrcpt=1 (queue active)
Nov 15 07:32:10 mini postfix/local[15138]: 78E7718A33: to=<rob@XXXX.XXX>, orig_to=<rstaveley@XXXXX.XXX>, relay=local, delay=1, status=sent (delivered to command: /usr/bin/procmail -f- -a "$USER")

> ignore_mx_lookup_error = yes

No cigar, alas. I still get..

Nov 15 07:58:50 mini postfix/smtp[15452]: 5C12F18B3F: to=<XXXX.XXX@blueh2ogroup.com>, relay=none, delay=8, status=bounced (mail for blueh2ogroup.com loops back to myself)

The sanity-check:

root@mini:/var/log# postconf | grep mx
best_mx_transport =
ignore_mx_lookup_error = yes
parent_domain_matches_subdomains = debug_peer_list,fast_flush_domains,mynetworks,permit_mx_backup_networks,qmqpd_authorized_clients,relay_domains,smtpd_access_maps
permit_mx_backup_networks =
smtp_defer_if_no_mx_address_found = no
smtp_mx_address_limit = 0
smtp_mx_session_limit = 2
Come to think of it,

> ignore_mx_lookup_error (default: no)
>
>    Ignore DNS MX lookups that produce no response. By default, the Postfix SMTP client defers delivery and tries again after some delay. This behavior is required by the SMTP standard.
>
>    Specify "ignore_mx_lookup_error = yes" to force a DNS A record lookup instead. This violates the SMTP standard and can result in mis-delivery of mail.

Falling back to an A record isn't the solution here. The problem is that the 1st MX is unresolvable. Other MTAs seem to fall back to the next MX, but I think [my implementation of] Postfix must have a bug.
Oh... not Postfix.

Bingo. It is my resolver, which assumes that any unknown host is part of my network and which then gets misdirected by my own DNS, which has *.mynetwork.net pointing to my own MTA. Bah!

Silly me.

If I'd tried...

  telnet tcsmg1.BlueH2Ogroup.com 25

...as you suggested, omarfarid, rather than looking at host/nslookup, I would have seen it.

Likewise, ping....

root@mini:/var/log# ping tcsmg1.BlueH2Ogroup.com
PING tcsmg1.BlueH2Ogroup.com.mynetwork.net (xxx.xxx.xxx.xxx): 56 data bytes
64 bytes from xxx.xxx.xxx.xxx: icmp_seq=0 ttl=64 time=0.2 ms
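
The ping output above shows the search domain being appended. As a minimal sketch of why that happens, the following Python function models the order in which a typical stub resolver tries names when a search list is in effect. It is a simplified model of the usual resolver behaviour with the default ndots value of 1, not Postfix's or the C library's actual code, and candidate_names is a name made up for this example.

```python
def candidate_names(name, search, ndots=1):
    """Return the fully qualified names a stub resolver would try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as absolute: the search list is skipped.
        return [name]
    absolute = name + "."
    searched = [f"{name}.{domain.rstrip('.')}." for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as-is first, then fall back to the search list.
        return [absolute] + searched
    # Too few dots: search list first, the bare name last.
    return searched + [absolute]

print(candidate_names("tcsmg1.BlueH2Ogroup.com", ["mynetwork.net"]))
print(candidate_names("tcsmg1.BlueH2Ogroup.com.", ["mynetwork.net"]))
```

In this model the first call yields both the real name and tcsmg1.BlueH2Ogroup.com.mynetwork.net. When the real name returns NXDOMAIN, the second candidate is tried, and a *.mynetwork.net wildcard answers it. The second call, with a trailing dot, yields only the real name.
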

Many thanks for the help!
For the record, my fix was adding...

  search myisp.net

...to the top of /etc/resolv.conf. It had only listed my nameservers before that and I guess that means the default search was being applied, which is the local domain, which would have "wild-carded".
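
To illustrate the default search behaviour described above, here is a small Python sketch that approximates how the effective search list is derived from the contents of resolv.conf. It follows the rule documented in resolv.conf(5): with no search or domain line, the list defaults to the part of the local hostname after the first dot. The function name, the hostname mini.mynetwork.net and the 192.0.2.x nameserver addresses are placeholders invented for this example, and options such as ndots are ignored.

```python
def effective_search_list(resolv_conf_text, hostname):
    """Approximate the search list implied by resolv.conf contents."""
    search = None
    for raw in resolv_conf_text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped[0] in "#;":
            continue  # skip blank lines and comments
        fields = stripped.split()
        if fields[0] == "search" and len(fields) > 1:
            search = fields[1:]        # the last search/domain line wins
        elif fields[0] == "domain" and len(fields) > 1:
            search = fields[1:2]
    if search is not None:
        return search
    # No search/domain line: use the hostname's domain part as the default.
    _, _, local_domain = hostname.partition(".")
    return [local_domain] if local_domain else []

before = "nameserver 192.0.2.1\nnameserver 192.0.2.2\n"   # placeholder addresses
after = "search myisp.net\n" + before

print(effective_search_list(before, "mini.mynetwork.net"))
print(effective_search_list(after, "mini.mynetwork.net"))
```

Under these assumptions the first call returns the wildcarded local domain and the second returns only myisp.net, which matches the effect of the fix.
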

[I have misgivings now about my DNS set-up, but having *.mynetwork.net point to that box is handy for setting up quick and dirty virtual hosts... so I'll see if I can get away with it for the time being. It has been like that for a few years now!]
Hi,

I was just thinking that it was a DNS issue. It told you "loops back to myself". Why can't you find another solution for virtual hosts?
Yes, with hindsight the error message said it all.

It would be a smart move - I guess - to have the wild card point to a box that didn't have the MTA on it or perhaps, make all of the quick'n'dirty virtual hosts be called (say) *.web.mynetwork.net and avoid wildcarding my entire mynetwork.net domain.
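
A rough Python sketch of the difference between the two wildcard scopes follows. It uses a deliberately simplified suffix match; real DNS wildcard processing has further rules (for instance, a wildcard does not apply where a closer explicit name exists), and matches_wildcard is a helper invented for this example.

```python
def matches_wildcard(name, wildcard):
    """True if `name` falls under a wildcard owner such as '*.mynetwork.net'."""
    if not wildcard.startswith("*."):
        raise ValueError("wildcard must start with '*.'")
    parent = wildcard[2:].lower().rstrip(".")
    name = name.lower().rstrip(".")
    # Simplified rule: any name strictly below the parent matches.
    return name.endswith("." + parent)

misdirected = "tcsmg1.BlueH2Ogroup.com.mynetwork.net"

print(matches_wildcard(misdirected, "*.mynetwork.net"))
print(matches_wildcard(misdirected, "*.web.mynetwork.net"))
print(matches_wildcard("quicksite.web.mynetwork.net", "*.web.mynetwork.net"))
```

With the narrower *.web.mynetwork.net wildcard, the misdirected name no longer matches, while a quick virtual host under web.mynetwork.net (quicksite is a made-up name) still does.
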

The fix to resolv.conf fixes the immediate problem, but I know it would be better to try to do "the right thing", because otherwise these things tend to bite you when and where you least expect it.
Yeah first of all you should avoid a catchall MX. This is clearly violating the RFC for both DNS and SMTP.
Having the MX on a "catch all" violates the RFCs? Are you sure about that?