kapshure
asked on
Help with Postfix+Nagios Setup - CentOS
I'm currently rolling out Nagios for an internal business unit and have the lion's share of the setup completed, except that outbound notifications aren't working. I installed postfix with yum and ran through the setup steps over at server-world.info/en. I also modified the commands.cfg file per this URL:
http://www.infosecprojects.net/en/linuxtutorials/nagios-sendmail.html
setup info
postfix-2.3.3-2.1.el5_2
2.6.18-194.26.1.el5
CentOS 5.5
here is output from postconf -n
alias_database = hash:/etc/aliases
alias_maps = hash:/etc/aliases
body_checks = regexp:/etc/postfix/body_checks
command_directory = /usr/sbin
config_directory = /etc/postfix
daemon_directory = /usr/libexec/postfix
debug_peer_level = 2
header_checks = regexp:/etc/postfix/header_checks
html_directory = no
inet_interfaces = all
mail_owner = postfix
mailq_path = /usr/bin/mailq.postfix
manpage_directory = /usr/share/man
mydestination = $myhostname, localhost.$mydomain, localhost, $mydomain
mydomain = example.com
myhostname = nagios.example.com
mynetworks = 10.0.101.0/24, 127.0.0.0/8
myorigin = $mydomain
newaliases_path = /usr/bin/newaliases.postfix
queue_directory = /var/spool/postfix
readme_directory = /usr/share/doc/postfix-2.3.3/README_FILES
sample_directory = /usr/share/doc/postfix-2.3.3/samples
sendmail_path = /usr/sbin/sendmail.postfix
setgid_group = postdrop
unknown_local_recipient_reject_code = 550
tail on /var/log/maillog:
Nov 19 04:20:46 pov postfix/smtpd[22095]: fatal: open database /etc/aliases.db: No such file or directory
Nov 19 04:20:47 pov postfix/master[21874]: warning: process /usr/libexec/postfix/smtpd pid 22095 exit status 1
Nov 19 04:20:47 pov postfix/master[21874]: warning: /usr/libexec/postfix/smtpd: bad command startup -- throttling
tail /var/log/messages
nagios: Warning: Attempting to execute the command "/usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: PROBLEM\nHost: monitoredbox\nState: DOWN\nAddress: 10.0.101.221\nInfo: CRITICAL - Host Unreachable (10.0.101.221)\n\nDate/Time: Fri Nov 19 04:24:44 PST 2010\n" | /bin/mail -s "** PROBLEM Host Alert: zimbra is DOWN **" 5555555555@tmomail.net" resulted in a return code of 127. Make sure the script or binary you are trying to execute actually exists...
Nov 19 04:41:04 pov nagios: Auto-save of retention data completed successfully.
postfix is running:
ps -ef | grep postfix
root 21874 1 0 04:06 ? 00:00:00 /usr/libexec/postfix/master
postfix 21876 21874 0 04:06 ? 00:00:00 pickup -l -t fifo -u
postfix 21877 21874 0 04:06 ? 00:00:00 qmgr -l -t fifo -u
root 22172 21934 0 04:26 pts/0 00:00:00 grep postfix
I can also telnet to localhost on port 25, and to the public IP from my workstation. Each time it says connected, but the EHLO and HELO commands generate no response from the server. I'm focusing on researching the maillog errors right now; if anyone could lend a hand, that'd be great.
ASKER CERTIFIED SOLUTION
The newaliases command creates /etc/aliases.db from the contents of /etc/aliases. By the way, without an /etc/aliases.db file present, postfix will exhibit the other behavior you mentioned (no response to HELO or EHLO).
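As a rough sketch of that check, something like the following would tell you whether newaliases needs to be run. The function name aliases_db_stale is made up for illustration, and it assumes the database sits next to the source file with a .db suffix, as with hash:/etc/aliases in the postconf output above.

```shell
# Hypothetical helper: succeeds (exit 0) when the alias database needs a rebuild.
# $1 is the aliases source file; the database is assumed to be "$1.db".
aliases_db_stale() {
    src=$1
    db="$src.db"
    # Stale if the .db file is missing, or if the source was modified after it.
    [ ! -e "$db" ] || [ "$src" -nt "$db" ]
}
```

On the box in question you could run `aliases_db_stale /etc/aliases && newaliases` as root, and then restart postfix if smtpd is still being throttled.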
ASKER
@LunarNRG
You're right, mailx wasn't installed, so I've installed it.
I had already run the newaliases command, but I ran it again.
Now I've got this in maillog:
lost connection after EHLO from firewall.hostcompany.com[12.34.56.78] < our office IP
postfix/smtpd[23150]: disconnect from firewall.hostcompany.com[12.34.56.78] < our office IP
Are you running your telnet test from localhost? For example,
[user@host ~] telnet localhost 25
ASKER
It seems that installing mailx and running newaliases fixed the problem. I just got an alert from Nagios on my phone :)
Now I want to see a few more alerts come through, and then it looks like my notification setup may be complete!
One question I have:
I had sendmail.postfix configured as the only MTA, so why was mailx required? Could I have changed nagios.cfg to work with postfix instead of mailx?
Thanks again.
No problem, glad to hear it.
You're right, you don't really need mailx, but the Nagios defaults for host-notify-by-email, service-notify-by-email, etc. all use /usr/bin/mail, I believe. You could use /usr/sbin/sendmail for the same purpose, but you'd have to create your own macros.
Nagios just calls the command you specify in the config, and in your case /bin/mail was used, as in (from the previous warning):
/usr/bin/printf "%b" "***** Nagios *****
<snip>
Date/Time: Fri Nov 19 04:24:44 PST 2010\n" | /bin/mail -s "** PROBLEM Host Alert: zimbra is DOWN **" 5555555555@tmomail.net" resulted in a return code of 127
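For what it's worth, 127 is the status a shell reports when it cannot find the command it was asked to run, which fits a missing /bin/mail. A small sketch to see this for yourself (exit_status_of is a hypothetical helper, and the /nonexistent path is deliberately bogus):

```shell
# Hypothetical helper: run a command quietly and print its exit status.
exit_status_of() {
    status=0
    "$@" >/dev/null 2>&1 || status=$?
    echo "$status"
}

# A binary that does not exist yields 127, the shell's "command not found" status.
exit_status_of /nonexistent/bin/mail -s "test" nobody@example.com
```

Pointing the same helper at the real path from your command definition is a quick way to confirm whether the binary is there before Nagios tries to call it.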
I just now noticed that you mentioned following these instructions, which change the Nagios default:
http://www.infosecprojects.net/en/linuxtutorials/nagios-sendmail.html
... so it would seem your modification did not take. You may wish to review your settings and make sure they match the tutorial. Perhaps you need to restart the nagios service? Not sure. If you convince Nagios to use /usr/sbin/sendmail, then you can remove the mailx package.
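If you go the sendmail route, a minimal sketch of the idea might look like the following. The helper name build_notification and the example address are made up; the only thing assumed about postfix is that its sendmail command accepts -t, which tells it to take the recipients from the message headers.

```shell
# Hypothetical helper: print a complete message (headers, blank line, body).
# $1 recipient, $2 subject, $3 body; \n escapes in the body are expanded by %b.
build_notification() {
    printf 'To: %s\nSubject: %s\n\n%b\n' "$1" "$2" "$3"
}

# Example pipeline (not run here), assuming /usr/sbin/sendmail is postfix's:
#   build_notification 'ops@example.com' '** PROBLEM Host Alert **' \
#       'Host: monitoredbox\nState: DOWN' | /usr/sbin/sendmail -t
```

In commands.cfg the same kind of pipeline would go on the command_line of your notification command, with Nagios macros in place of the literal values, so treat this as a starting point rather than a drop-in replacement.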
HTH,
Marty
ASKER
I'm getting the alerts now, just not as timely as they should be. It seems that the UP alert comes back much faster than the DOWN, or sometimes vice versa. I may need to tweak some time thresholds in Nagios.
For now, though, the mail part seems to be working: we're getting service alerts on two phones and at an email address on a different mail server in a different domain.
Thanks again!