• Status: Solved
  • Priority: Medium
  • Security: Public

syslogd stops writing to messages

syslogd is running but it stops writing to messages.  I do a "logger -p user.error Hello" and it gets to the Console Output but not to the "messages" file.

I think it is because "messages" is written to an NFS mount.  The NFS server is not a Solaris server and it has frequent, long delays.  I think that syslogd times out when trying to write to messages and stops trying.  I have tried "truss -p <syslogd>" and "truss logger -p user.error Hello", but I can't see any difference between when syslogd successfully writes to messages (on one server) and when it fails (on another server).  Also, pstack doesn't indicate when a failed write occurs.

I looked into the door status calls, but they are all "C" program calls, and I won't be allowed to run or develop my own "C" code....

Any ideas on how I can tell that syslogd is not writing to the messages file, besides using "tail -1 messages"?


huffmana
Asked:
3 Solutions
 
blcarter14Commented:
Use -s so that it gets logged to the messages file as well.  
 
TintinCommented:
The -s flag won't help.  That causes logger to duplicate the message to STDERR.

Please show us the relevant line from /etc/syslog.conf
 
huffmanaAuthor Commented:
*.err;kern.debug;daemon.notice;mail.crit                  /var/nfs/messages
I don't think that it is the configuration, because syslog is working much of the time.  On about half of our servers, syslog is writing to /var/nfs/messages.  On those where it is not working, it starts writing again after the messages file is rolled over (I think).

/var/nfs/messages is on an NFS mount, and I know that the NFS server is sometimes not accessible for up to 5 minutes at a time... An EMC piece of S--- !
 
TintinCommented:
"error" is an invalid syslog level; try

logger -p user.debug "test message"

or

logger -p user.err "test message"
 
huffmanaAuthor Commented:
On the servers where syslogd is writing to messages, the following occurs:
logger -p kern.err HELLO                          -> messages written as -> user.error
logger -p kern.error HELLO                       -> messages written as -> user.error
logger -p user.err HELLO                          -> messages written as -> user.error
logger -p user.error HELLO                       -> messages written as -> user.error

It always gets changed to user, but it gets written to messages, so it is still a valid test.

I really noticed this when the power cord to the backup power supply was loose on a V440.  It logs an "input power unavailable" message every day.  I found that all log messages stopped for 8 days, including the power supply message.  This way I could answer the question "maybe there were no syslog messages for 8 days?"

What I was trying to do is find a way to detect that syslogd is not writing to messages and HUP it from a cron job that runs every 10 minutes.  I just wanted something more direct than trying a "logger" message and a "grep" to see if it got there.  But if that is what I have to do, I will... until I convince people to move messages off of the NFS.
 
TintinCommented:
Hmm, thinking about it some more, I think that if syslog can't write to a file (for whatever reason) it will buffer the messages in memory.  I can't quite remember what circumstances cause it to flush the buffer (I'm guessing a SIGHUP certainly would).

You could try setting up a cronjob to do

kill -HUP `cat /var/run/syslogd.pid`

to see if that makes any difference.

 
huffmanaAuthor Commented:
I started looking for a "time-out" value and found that syslogd uses a "door" utility to write to "messages."  So I looked into doors to see if I could find an error state.  But the door API appears to be "C" code only, and I will never be allowed to put a binary executable on these machines.  I can write scripts, including Perl....  But even if there were a Perl API to doors, I would not be allowed to add it....  Basically I'm stuck with using shell commands :-(  And there is not even a "man door"!

Yes, the plan is to HUP syslogd.  Syslogd will resume writing to the messages file if it is restarted (until the next EMC NFS slowdown, when it will stop again).

I guess that I'll just send a unique syslog message and manually grep to see if it gets to messages.  The worst thing that can happen is that I'll HUP syslogd when it doesn't need it.
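
The logger-and-grep check described here could be sketched roughly as follows. This is only a sketch under assumptions that are not confirmed in this thread: the pid file path is the one quoted in the earlier comment (on some Solaris releases it may be /etc/syslog.pid instead), and the 15-second wait, the 50-line window, and the function names are placeholders. It sticks to Bourne shell syntax and stock commands.

```shell
#!/bin/sh
# Sketch of a syslogd watchdog: log a unique marker, check that it reached
# the messages file, and HUP syslogd if it did not.
# ASSUMPTIONS (adjust per host): the paths, the wait time, and that user.err
# matches a selector in /etc/syslog.conf (*.err does, per the line quoted above).

MESSAGES=${MESSAGES:-/var/nfs/messages}
PIDFILE=${PIDFILE:-/var/run/syslogd.pid}   # as quoted in this thread; verify locally
WAIT=${WAIT:-15}                            # seconds to give syslogd to write the line

# Succeeds if marker string $1 appears in the last 50 lines of file $2.
marker_logged() {
    tail -50 "$2" 2>/dev/null | grep "$1" >/dev/null
}

# Returns 0 if the marker arrived; otherwise sends SIGHUP and returns 1.
check_syslogd() {
    marker="syslog-watchdog-`date +%Y%m%d%H%M%S`-$$"
    logger -p user.err "$marker"
    sleep "$WAIT"
    if marker_logged "$marker" "$MESSAGES"; then
        return 0
    fi
    # Marker missing: ask syslogd to re-read its config and reopen its files.
    kill -HUP `cat "$PIDFILE"`
    return 1
}
```

Calling check_syslogd as the last line of the script and running it from cron every ten minutes would give the behaviour described. Older Solaris cron does not accept */10, so the minute field would be written as 0,10,20,30,40,50. If the NFS server is hung, the tail itself may block on a hard mount, so one run can still be waiting when the next one starts.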
 
blcarter14Commented:
Can you write the log to the local box, and then run a cron job that copies / consolidates the logs to your nfs share?
 
huffmanaAuthor Commented:
That is the eventual plan.  To avoid duplicated records, the copy to NFS would happen after log maintenance rolls over a version, and would copy messages.1.  But the people who need access to the "all messages" log would not have a real-time view....   I wish that this were Solaris 10, with file descriptors that are more explanatory...
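
A rough sketch of that copy step, assuming syslog.conf has been changed to write to a local file and that rotation happens at most once a day. The local path /var/adm/messages.1, the destination directory /var/nfs/logs, and the function name are hypothetical placeholders, not anything confirmed in this thread.

```shell
#!/bin/sh
# Sketch: after local rotation, copy the rotated file to the NFS share under
# a host- and date-stamped name, so that a repeated run on the same day does
# not copy the same records twice.
# ASSUMPTIONS: rotation produces messages.1 at most once a day; both paths
# below are placeholders.

LOCAL_ROTATED=${LOCAL_ROTATED:-/var/adm/messages.1}
NFS_DIR=${NFS_DIR:-/var/nfs/logs}

# archive_rotated SRC DESTDIR
archive_rotated() {
    src=$1
    destdir=$2
    stamp=`date +%Y%m%d`
    dest="$destdir/messages.`uname -n`.$stamp"

    [ -f "$src" ] || return 1      # nothing has been rotated yet
    [ -f "$dest" ] && return 0     # today's copy already exists

    # Copy under a temporary name, then rename, so readers on the NFS side
    # never see a half-written file.
    cp "$src" "$dest.tmp" && mv "$dest.tmp" "$dest"
}
```

It would be called as archive_rotated "$LOCAL_ROTATED" "$NFS_DIR" from the same cron job that rolls the log, right after the roll. If the NFS server is slow at that moment, only the copy waits; syslogd keeps writing to the local file.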
 
blcarter14Commented:
Don't know if it's helpful, but when I had a few people who needed access to a log file but didn't know how to work in Linux, I just installed Apache and created a CGI script that would cat and echo out the log file to a web page.  Then they could see it anytime they wanted.
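
A CGI script of that kind might look something like the sketch below. The log path, the line count, and the function name are placeholders, and it assumes the web server's user can read the file. It sends plain text rather than HTML so that nothing in the log needs escaping.

```shell
#!/bin/sh
# Sketch of a read-only CGI log viewer.
# ASSUMPTIONS: the web server runs this file as a CGI program; LOGFILE and
# LINES_SHOWN are placeholders to be set for the local system.

LOGFILE=${LOGFILE:-/var/log/messages}
LINES_SHOWN=${LINES_SHOWN:-200}

# Print a CGI header block, then the last LINES_SHOWN lines of file $1.
show_log() {
    printf 'Content-Type: text/plain\r\n\r\n'
    tail -"$LINES_SHOWN" "$1"
}
```

The script would end with show_log "$LOGFILE". Anyone who can reach the URL can read the log, so access would need to be restricted in the web server configuration.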
 
huffmanaAuthor Commented:
One theory is that when the system is being booted and the NFS mount does not connect, the system will open /var/nfs/messages locally through the door feature.  So the door "C" code file handle is open.  Then, when the NFS server becomes "available" (contacted), the syslogd-door mount point is still connected to the actual local partition and does not change to the NFS mount point.  So the messages file is still being written locally, but it is not accessible except by syslog.

I have to wait until a scheduled outage to test this theory because auditing (bsmconv) is also on /var/nfs....  But who looks frequently at praudit?
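
One possible way to probe this theory with shell commands only, without unmounting anything, is to compare the inode that syslogd has open with the inode of the file visible at the path. This sketch assumes pfiles is available and may be run against syslogd (normally as root), and that its per-descriptor lines carry an "ino:" field; the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: is the file visible at a path one of the files syslogd has open?
# ASSUMPTIONS: pfiles output is piped in on stdin, and each regular-file
# descriptor line contains "S_IFREG" and an "ino:<number>" field.

# Read pfiles output on stdin; print the inode numbers of regular files.
open_inodes() {
    grep S_IFREG | sed -n 's/.*ino:\([0-9][0-9]*\).*/\1/p'
}

# Print the inode number of the file visible at path $1.
path_inode() {
    ls -i "$1" | awk '{ print $1 }'
}

# Succeeds if the inode of path $1 is among the inodes read from stdin.
path_is_open() {
    want=`path_inode "$1"`
    [ -n "$want" ] || return 1
    open_inodes | grep "^$want\$" >/dev/null
}
```

With the pid file path mentioned earlier in the thread, usage would be along the lines of: pfiles `cat /var/run/syslogd.pid` | path_is_open /var/nfs/messages. A failure would suggest syslogd is holding some other file, which fits the theory. A match on inode number alone is suggestive rather than proof, since a local and an NFS file can share a number; the dev: field on the same pfiles line, together with df -k /var/nfs, would show which filesystem is involved.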
 
gheistCommented:
You say log is accessible only by syslog - how do you expect to read it then?
 
huffmanaAuthor Commented:
Well, that is the whole problem, isn't it?  The underlying messages file in the /var partition cannot be seen until the NFS mount point is unmounted.  But then audit will very quickly fill up the partition, so I would have to bsmunconv, which is a NO-NO.....  So right now I can't even test the theory....

But this will be changing soon: I will be writing messages locally, then rolling and copying to NFS daily, or sooner....