multiple syslogd's running



At high load I occasionally see the number of syslogd processes shoot up:


kern.ipc.pipekva: 7139328
kern.nselcoll: 267591
# imapgate:     3219
# syslogd:       28



-bash-3.2$ ps -auxww | grep syslogd
root    31129 10.0  0.0  3900  1208  ??  Rs   15Mar10  42:22.35 /usr/sbin/syslogd -svv
root    95946  0.0  0.0  3900  1208  ??  S    12:24AM   0:00.00 /usr/sbin/syslogd -svv
root    95947  0.0  0.0  3900  1208  ??  S    12:24AM   0:00.00 /usr/sbin/syslogd -svv
root    95948  0.0  0.0  3900  1208  ??  S    12:24AM   0:00.00 /usr/sbin/syslogd -svv
root    95949  0.0  0.0  3900  1208  ??  S    12:24AM   0:00.00 /usr/sbin/syslogd -svv
root    95950  0.0  0.0  3900  1208  ??  S    12:24AM   0:00.00 /usr/sbin/syslogd -svv
root    95951  0.0  0.0  3900  1208  ??  S    12:24AM   0:00.00 /usr/sbin/syslogd -svv
root    95952  0.0  0.0  3900  1208  ??  S    12:24AM   0:00.00 /usr/sbin/syslogd -svv
root    95953  0.0  0.0  3900  1208  ??  S    12:24AM   0:00.00 /usr/sbin/syslogd -svv
root    95954  0.0  0.0  3900  1208  ??  S    12:24AM   0:00.00 /usr/sbin/syslogd ...


Any ideas on how to debug this?
VlearnsAsked:
VlearnsAuthor Commented:
3284 processes:10 running, 3258 sleeping, 16 waiting
CPU:  1.9% user,  0.0% nice, 19.8% system,  0.7% interrupt, 77.5% idle
Mem: 6475M Active, 1297M Inact, 3399M Wired, 1647M Buf, 4705M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   11 root        1 171 ki31     0K    16K CPU7    7 497.0H 76.56% idle: cpu7
   12 root        1 171 ki31     0K    16K CPU6    6 496.8H 76.37% idle: cpu6
   13 root        1 171 ki31     0K    16K CPU5    5 496.5H 74.66% idle: cpu5
   16 root        1 171 ki31     0K    16K CPU2    2 495.7H 69.58% idle: cpu2
   14 root        1 171 ki31     0K    16K RUN     4 496.1H 69.29% idle: cpu4
   15 root        1 171 ki31     0K    16K CPU3    3 495.5H 67.87% idle: cpu3
   17 root        1 171 ki31     0K    16K CPU1    1 491.6H 64.26% idle: cpu1
   18 root        1 171 ki31     0K    16K CPU0    0 489.6H 58.50% idle: cpu0
  566 root        1 101    0  5840K  1488K select  6 186:59 24.37% syslogd
26891 root        1 101    0  5840K  1488K ttywri  2   0:00 21.19% syslogd
26890 root        1 101    0  5840K  1488K ttywri  2   0:00 21.00% syslogd
26889 root        1 101    0  5840K  1488K ttywri  2   0:00 20.75% syslogd

26888 root        1 101    0  5840K  1488K ttywri  1   0:00 20.75% syslogd
26887 root        1 101    0  5840K  1488K ttywri  2   0:00 20.56% syslogd
26886 root        1 101    0  5840K  1488K ttywri  0   0:00 19.97% syslogd
26885 root        1 102    0  5840K  1488K ttywri  5   0:00 19.09% syslogd
26884 root        1 102    0  5840K  1488K ttywri  2   0:00 18.99% syslogd
26883 root        1 102    0  5840K  1488K ttywri  3   0:00 18.46% syslogd
26882 root        1 102    0  5840K  1488K ttywri  1   0:00 17.68% syslogd
26881 root        1 102    0  5840K  1488K ttywri  1   0:00 17.58% syslogd
0
tfewsterCommented:
I've seen a similar problem with cron on very heavily loaded systems. When cron (or any Unix process) starts a child process, it forks itself (i.e. creates a duplicate with the same name but its own PID, and with the cron daemon's PID as its PPID), then execs the child's code to overlay the duplicate, so the "right" name appears if you run `ps` a few seconds later. On a busy machine the exec takes longer, so there's more chance of seeing a duplicate name.

However, I'm not familiar with FreeBSD; Your system doesn't look that heavily loaded, and anyway I thought syslogd was just a "listener" to collect messages from other system processes and write them to its own logs, so I can't see why it should need to start child processes.
0
VlearnsAuthor Commented:
Any ideas on how to debug this? I can reproduce the problem.
0
tfewsterCommented:
The first thing I'd check is if these _are_ child processes of the real syslogd process, using the -d option to ps.

Or by using the -l or -j options and grepping the output for the syslogd PID, e.g.

ps -alxww | grep "$(cat /var/run/syslog.pid)" > /tmp/syslog_pid.out

# Then check any child processes by PID to see what they really are.
# wc reads stdin so it prints only the count; in "ps -l" output the
# first column is the UID and the second is the PID.
if [ $(wc -l < /tmp/syslog_pid.out) -gt 1 ]
then
  while read -r uid pid restofline
  do
     ps -p "$pid"
  done < /tmp/syslog_pid.out > /tmp/syslog_pid_children.out
fi
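
An alternative that avoids the temporary files is to filter a PID/PPID listing directly on the parent PID. This is only a sketch: `children_of` is a hypothetical helper name, and it assumes `ps -axo pid,ppid,command` is available on your system.

```shell
#!/bin/sh
# children_of PARENT_PID: read "PID PPID COMMAND..." rows on stdin and print
# the PID of every row whose PPID equals PARENT_PID. The header row is
# skipped naturally because "PPID" never equals a number.
children_of() {
    awk -v parent="$1" '$2 == parent { print $1 }'
}

# Example use on a live system (assumes the pid file path used above):
#   ps -axo pid,ppid,command | children_of "$(cat /var/run/syslog.pid)"
```

Each PID it prints can then be passed to `ps -p` to see what the child really is.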
0
VlearnsAuthor Commented:
What does `cat /tmp/syslog_pid.out | while read PID restofline` mean?

I am trying to get the bash script running.

Thanks

0
VlearnsAuthor Commented:
79697 syslogd  GIO   fd 23 wrote 111 bytes
       "Apr 26 15:30:26 <daemon.err> omega1 stunnel: LOG3[45483:4829184]: connect_remote: connection request timed out
       "
 79697 syslogd  RET   writev 111/0x6f
 79697 syslogd  CALL  open(0x5084a0,0x5,0)
 79697 syslogd  NAMI  "/dev/console"
 79697 syslogd  RET   open 8
 79697 syslogd  CALL  writev(0x8,0x7fffffffcdc0,0x7)
 79697 syslogd  GIO   fd 8 wrote 112 bytes
       "Apr 26 15:30:26 <daemon.err> omega1 stunnel: LOG3[45483:4829184]: connect_remote: connection request timed out\r
       "
 79697 syslogd  RET   writev 112/0x70
 79697 syslogd  CALL  close(0x8)
 79697 syslogd  RET   close 0
 79697 syslogd  CALL  open(0x407516,0,0x1b6)
 79697 syslogd  NAMI  "/var/run/utmp"
 79697 syslogd  RET   open 8
 79697 syslogd  CALL  fstat(0x8,0x7fffffffc5c0)
 79697 syslogd  RET   fstat 0
 79697 syslogd  CALL  read(0x8,0x512000,0x1000)
 79697 syslogd  GIO   fd 8 read 1012 bytes
       0x0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x001e 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x003c 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x005a 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0078 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0096 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x00b4 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x00d2 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x00f0 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x010e 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x012c 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x014a 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0168 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0186 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x01a4 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x01c2 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x01e0 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x01fe 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x021c 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x023a 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0258 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0276 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0294 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x02b2 0000 0000 0000 0000 0000 0000 0000 7474 7970 3000 0000 7961 6e00 0000 0000  |..............ttyp0...yan.....|
       0x02d0 0000 0000 0000 0000 3230 392e 3133 312e 3632 2e31 3133 0000 66b6 d44b 7474  |........209.131.62.113..f..Ktt|
       0x02ee 7970 3100 0000 7663 6861 7661 6e00 0000 0000 0000 0000 3230 372e 3132 362e  |yp1...vchavan.........207.126.|
       0x030c 3233 312e 3133 3200 320b d64b 7474 7970 3200 0000 6874 7275 6f6e 6700 0000  |231.132.2..Kttyp2...htruong...|
       0x032a 0000 0000 0000 3230 392e 3133 312e 3632 2e31 3133 0000 087c ce4b 7474 7970  |......209.131.62.113...|.Kttyp|
       0x0348 3300 0000 6875 676f 6775 0000 0000 0000 0000 0000 3636 2e32 3238 2e31 3632  |3...hugogu..........66.228.162|
       0x0366 2e35 0000 0000 8013 d64b 7474 7970 3400 0000 6874 7275 6f6e 6700 0000 0000  |.5.......Kttyp4...htruong.....|
       0x0384 0000 0000 3230 392e 3133 312e 3632 2e31 3133 0000 185b cf4b 7474 7970 3500  |....209.131.62.113...[.Kttyp5.|
       0x03a2 0000 7663 6861 7661 6e00 0000 0000 0000 0000 3230 372e 3132 362e 3233 312e  |..vchavan.........207.126.231.|
       0x03c0 3133 3200 640b d64b 7474 7970 3600 0000 7663 6861 7661 6e00 0000 0000 0000  |132.d..Kttyp6...vchavan.......|
       0x03de 0000 3230 372e 3132 362e 3233 312e 3133 3200 cb0d d64b                      |..207.126.231.132....K|

 79697 syslogd  RET   read 1012/0x3f4
 79697 syslogd  CALL  read(0x8,0x512000,0x1000)
 79697 syslogd  GIO   fd 8 read 0 bytes
       ""
 79697 syslogd  RET   read 0
 79697 syslogd  CALL  close(0x8)
 79697 syslogd  RET   close 0
 79697 syslogd  CALL  writev(0x19,0x7fffffffcdc0,0x7)
 79697 syslogd  GIO   fd 25 wrote 111 bytes
       "Apr 26 15:30:55 <daemon.err> omega1 stunnel: LOG3[45481:4861952]: connect_remote: connection request timed out
       "
 79697 syslogd  RET   writev 111/0x6f
 79697 syslogd  CALL  sendto(0x6,0x7fffffffc9b0,0x5e,0,0x50c130,0x10)
 79697 syslogd  GIO   fd 6 wrote 94 bytes
       "<27>Apr 26 15:30:55 stunnel: LOG3[45481:4861952]: connect_remote: connection request timed out"
 79697 syslogd  RET   sendto 94/0x5e
 79697 syslogd  CALL  sigprocmask(0x3,0x7fffffffce40,0x7fffffffce30)
 79697 syslogd  RET   sigprocmask 0
 79697 syslogd  CALL  select(0x8,0x513180,0,0,0)
 79697 syslogd  RET   select 1
 79697 syslogd  CALL  read(0x7,0x7fffffffd420,0x3ff)
 79697 syslogd  GIO   fd 7 read 116 bytes
       "<118>Apr 26 15:30:55 <daemon.err> omega1 stunnel: LOG3[45481:4861952]: connect_remote: connection request timed out
       "
 79697 syslogd  RET   read 116/0x74
 79697 syslogd  CALL  sigprocmask(0x1,0x7fffffffce40,0x7fffffffce30)
 79697 syslogd  RET   sigprocmask 0
 79697 syslogd  CALL  gettimeofday(0x7fffffffce50,0)
 79697 syslogd  RET   gettimeofday 0
 79697 syslogd  CALL  writev(0x16,0x7fffffffcdc0,0x7)
 79697 syslogd  GIO   fd 22 wrote 157 bytes
       "Apr 26 15:30:55 <console.info> omega1 kernel: Apr 26 15:30:55 <daemon.err> omega1 stunnel: LOG3[45481:4861952]: conn\
        ect_remote: connection request timed out
       "
 79697 syslogd  RET   writev 157/0x9d
 79697 syslogd  CALL  writev(0x19,0x7fffffffcdc0,0x7)
 79697 syslogd  GIO   fd 25 wrote 157 bytes
       "Apr 26 15:30:55 <console.info> omega1 kernel: Apr 26 15:30:55 <daemon.err> omega1 stunnel: LOG3[45481:4861952]: conn\
        ect_remote: connection request timed out
       "
 79697 syslogd  RET   writev 157/0x9d
 79697 syslogd  CALL  sendto(0x6,0x7fffffffc9b0,0x8b,0,0x50c130,0x10)
 79697 syslogd  GIO   fd 6 wrote 139 bytes
       "<118>Apr 26 15:30:55 kernel: Apr 26 15:30:55 <daemon.err> omega1 stunnel: LOG3[45481:4861952]: connect_remote: conne\
        ction request timed out"
 79697 syslogd  RET   sendto 139/0x8b
 79697 syslogd  CALL  sigprocmask(0x3,0x7fffffffce40,0x7fffffffce30)
 79697 syslogd  RET   sigprocmask 0
 79697 syslogd  CALL  read(0x7,0x7fffffffd420,0x3ff)
 79697 syslogd  RET   read -1 errno 35 Resource temporarily unavailable
 79697 syslogd  CALL  select(0x8,0x513180,0,0,0x7fffffffe720)
 79697 syslogd  RET   select 0
 79697 syslogd  CALL  fsync(0x16)
 79697 syslogd  RET   fsync 0
 79697 syslogd  CALL  fsync(0x19)
 79697 syslogd  RET   fsync 0
 79697 syslogd  CALL  select(0x8,0x513180,0,0,0)
0
VlearnsAuthor Commented:
I haven't been able to reproduce the multiple syslogd issue yet, but here is a ktrace on the syslogd PID.
0
VlearnsAuthor Commented:
bash-3.2$ vi data
 79697 syslogd  RET   select 1
 79697 syslogd  CALL  recvfrom(0x4,0x7fffffffe730,0x400,0,0x7fffffffebc0,0x7fffffffd8dc)
 79697 syslogd  GIO   fd 4 read 93 bytes
       "<27>Apr 26 15:39:56 stunnel: LOG3[45483:198656]: connect_remote: connection request timed out"
 79697 syslogd  RET   recvfrom 93/0x5d
 79697 syslogd  CALL  sigprocmask(0x1,0x7fffffffce40,0x7fffffffce30)
 79697 syslogd  RET   sigprocmask 0
 79697 syslogd  CALL  gettimeofday(0x7fffffffce50,0)
 79697 syslogd  RET   gettimeofday 0
 79697 syslogd  CALL  writev(0x17,0x7fffffffcdc0,0x7)
 79697 syslogd  GIO   fd 23 wrote 110 bytes
       "Apr 26 15:39:56 <daemon.err> omega1 stunnel: LOG3[45483:198656]: connect_remote: connection request timed out
       "
 79697 syslogd  RET   writev 110/0x6e
 79697 syslogd  CALL  open(0x5084a0,0x5,0)
 79697 syslogd  NAMI  "/dev/console"
 79697 syslogd  RET   open 8
 79697 syslogd  CALL  writev(0x8,0x7fffffffcdc0,0x7)
 79697 syslogd  GIO   fd 8 wrote 111 bytes
       "Apr 26 15:39:56 <daemon.err> omega1 stunnel: LOG3[45483:198656]: connect_remote: connection request timed out\r
       "
 79697 syslogd  RET   writev 111/0x6f
 79697 syslogd  CALL  close(0x8)
 79697 syslogd  RET   close 0
 79697 syslogd  CALL  open(0x407516,0,0x1b6)
 79697 syslogd  NAMI  "/var/run/utmp"
 79697 syslogd  RET   open 8
 79697 syslogd  CALL  fstat(0x8,0x7fffffffc5c0)
 79697 syslogd  RET   fstat 0
 79697 syslogd  CALL  read(0x8,0x512000,0x1000)
 79697 syslogd  GIO   fd 8 read 1012 bytes
       0x0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x001e 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x003c 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x005a 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0078 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0096 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x00b4 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x00d2 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x00f0 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x010e 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x012c 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x014a 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0168 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0186 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x01a4 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x01c2 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x01e0 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x01fe 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x021c 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x023a 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0258 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0276 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x0294 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |..............................|
       0x02b2 0000 0000 0000 0000 0000 0000 0000 7474 7970 3000 0000 7961 6e00 0000 0000  |..............ttyp0...yan.....|
       0x02d0 0000 0000 0000 0000 3230 392e 3133 312e 3632 2e31 3133 0000 66b6 d44b 7474  |........209.131.62.113..f..Ktt|
       0x02ee 7970 3100 0000 7663 6861 7661 6e00 0000 0000 0000 0000 3230 372e 3132 362e  |yp1...vchavan.........207.126.|
       0x030c 3233 312e 3133 3200 320b d64b 7474 7970 3200 0000 6874 7275 6f6e 6700 0000  |231.132.2..Kttyp2...htruong...|
       0x032a 0000 0000 0000 3230 392e 3133 312e 3632 2e31 3133 0000 087c ce4b 7474 7970  |......209.131.62.113...|.Kttyp|
       0x0348 3300 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000  |3.............................|
       0x0366 0000 0000 0000 1815 d64b 7474 7970 3400 0000 6874 7275 6f6e 6700 0000 0000  |.........Kttyp4...htruong.....|
       0x0384 0000 0000 3230 392e 3133 312e 3632 2e31 3133 0000 185b cf4b 7474 7970 3500  |....209.131.62.113...[.Kttyp5.|
       0x03a2 0000 7663 6861 7661 6e00 0000 0000 0000 0000 3230 372e 3132 362e 3233 312e  |..vchavan.........207.126.231.|
       0x03c0 3133 3200 640b d64b 7474 7970 3600 0000 7663 6861 7661 6e00 0000 0000 0000  |132.d..Kttyp6...vchavan.......|
       0x03de 0000 3230 372e 3132 362e 3233 312e 3133 3200 cb0d d64b                      |..207.126.231.132....K|

 79697 syslogd  RET   read 1012/0x3f4
 79697 syslogd  CALL  read(0x8,0x512000,0x1000)
 79697 syslogd  GIO   fd 8 read 0 bytes
       ""
 79697 syslogd  RET   read 0
 79697 syslogd  CALL  close(0x8)
 79697 syslogd  RET   close 0
 79697 syslogd  CALL  writev(0x19,0x7fffffffcdc0,0x7)
 79697 syslogd  GIO   fd 25 wrote 110 bytes
       "Apr 26 15:39:56 <daemon.err> omega1 stunnel: LOG3[45483:198656]: connect_remote: connection request timed out
       "
 79697 syslogd  RET   writev 110/0x6e
 79697 syslogd  CALL  sendto(0x6,0x7fffffffc9b0,0x5d,0,0x50c130,0x10)
 79697 syslogd  GIO   fd 6 wrote 93 bytes
       "<27>Apr 26 15:39:56 stunnel: LOG3[45483:198656]: connect_remote: connection request timed out"
 79697 syslogd  RET   sendto 93/0x5d
 79697 syslogd  CALL  sigprocmask(0x3,0x7fffffffce40,0x7fffffffce30)
 79697 syslogd  RET   sigprocmask 0
 79697 syslogd  CALL  select(0x8,0x513180,0,0,0)
 79697 syslogd  RET   select 1
 79697 syslogd  CALL  read(0x7,0x7fffffffd420,0x3ff)
 79697 syslogd  GIO   fd 7 read 115 bytes
       "<118>Apr 26 15:39:56 <daemon.err> omega1 stunnel: LOG3[45483:198656]: connect_remote: connection request timed out
       "

 79697 syslogd  RET   read 1012/0x3f4
 79697 syslogd  CALL  read(0x8,0x512000,0x1000)
 79697 syslogd  GIO   fd 8 read 0 bytes
       ""
 79697 syslogd  RET   read 0
 79697 syslogd  CALL  close(0x8)
 79697 syslogd  RET   close 0
 79697 syslogd  CALL  writev(0x19,0x7fffffffcdc0,0x7)
 79697 syslogd  GIO   fd 25 wrote 110 bytes
       "Apr 26 15:40:00 <daemon.err> omega1 stunnel: LOG3[45484:198656]: connect_remote: connection request timed out
       "
 79697 syslogd  RET   writev 110/0x6e
 79697 syslogd  CALL  sendto(0x6,0x7fffffffc9b0,0x5d,0,0x50c130,0x10)
 79697 syslogd  GIO   fd 6 wrote 93 bytes
       "<27>Apr 26 15:40:00 stunnel: LOG3[45484:198656]: connect_remote: connection request timed out"
 79697 syslogd  RET   sendto 93/0x5d
 79697 syslogd  CALL  sigprocmask(0x3,0x7fffffffce40,0x7fffffffce30)
 79697 syslogd  RET   sigprocmask 0
 79697 syslogd  CALL  select(0x8,0x513180,0,0,0)
 79697 syslogd  RET   select 1
 79697 syslogd  CALL  read(0x7,0x7fffffffd420,0x3ff)
 79697 syslogd  GIO   fd 7 read 115 bytes
       "<118>Apr 26 15:40:00 <daemon.err> omega1 stunnel: LOG3[45484:198656]: connect_remote: connection request timed out
       "
 79697 syslogd  RET   read 115/0x73
 79697 syslogd  CALL  sigprocmask(0x1,0x7fffffffce40,0x7fffffffce30)
 79697 syslogd  RET   sigprocmask 0
 79697 syslogd  CALL  gettimeofday(0x7fffffffce50,0)
 79697 syslogd  RET   gettimeofday 0
 79697 syslogd  CALL  writev(0x16,0x7fffffffcdc0,0x7)
 79697 syslogd  GIO   fd 22 wrote 156 bytes
       "Apr 26 15:40:00 <console.info> omega1 kernel: Apr 26 15:40:00 <daemon.err> omega1 stunnel: LOG3[45484:198656]: conne\
        ct_remote: connection request timed out
       "
 79697 syslogd  RET   writev 156/0x9c
 79697 syslogd  CALL  writev(0x19,0x7fffffffcdc0,0x7)
 79697 syslogd  GIO   fd 25 wrote 156 bytes
       "Apr 26 15:40:00 <console.info> omega1 kernel: Apr 26 15:40:00 <daemon.err> omega1 stunnel: LOG3[45484:198656]: conne\
        ct_remote: connection request timed out
       "
 79697 syslogd  RET   writev 156/0x9c
 79697 syslogd  CALL  sendto(0x6,0x7fffffffc9b0,0x8a,0,0x50c130,0x10)
 79697 syslogd  GIO   fd 6 wrote 138 bytes
       "<118>Apr 26 15:40:00 kernel: Apr 26 15:40:00 <daemon.err> omega1 stunnel: LOG3[45484:198656]: connect_remote: connec\
        tion request timed out"
 79697 syslogd  RET   sendto 138/0x8a
 79697 syslogd  CALL  sigprocmask(0x3,0x7fffffffce40,0x7fffffffce30)
 79697 syslogd  RET   sigprocmask 0
 79697 syslogd  CALL  read(0x7,0x7fffffffd420,0x3ff)
 79697 syslogd  RET   read -1 errno 35 Resource temporarily unavailable
 79697 syslogd  CALL  select(0x8,0x513180,0,0,0x7fffffffe720)
 79697 syslogd  RET   select 0
 79697 syslogd  CALL  fsync(0x16)
 79697 syslogd  RET   fsync 0
 79697 syslogd  CALL  fsync(0x19)
 79697 syslogd  RET   fsync 0
 79697 syslogd  CALL  select(0x8,0x513180,0,0,0)
0
VlearnsAuthor Commented:
local0.*                                        /var/log/local0
local1.*                                        /var/log/local1
local2.*                                        /var/log/local2
local3.*                                        /var/log/local3
local4.*                                        /var/log/local4
local5.*                                        /var/log/local5
local6.*                                        /var/log/local6
local7.*                                        /var/log/local7
security.*                                      /var/log/security
auth.*                                          /var/log/auth
mail.*                                          /var/log/mail
cron.*                                          /var/log/cron
ntp.*                                           /var/log/ntp
console.*                                       /var/log/console
*.notice;kern.debug;local1.none                         /var/log/messages
*.err;kern.notice                               /dev/console
*.notice                                        root
*.emerg                                         *
*.*;local1.none                                         /var/log/all
*.*                                             @syslog1.ops.ac4.yahoo.com
0
VlearnsAuthor Commented:
That's what the syslogd.conf file looks like.
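
Since the ktrace above shows each stunnel `daemon.err` message also being written to /dev/console, it may help to list which rules send output there. A minimal sketch, assuming the config is readable at /etc/syslog.conf (adjust the path if yours differs); `console_rules` is a made-up helper name.

```shell
#!/bin/sh
# console_rules: read syslog.conf text on stdin and print the selector of
# every non-comment rule whose action (last field) is /dev/console.
console_rules() {
    awk '!/^[[:space:]]*#/ && NF >= 2 && $NF == "/dev/console" { print $1 }'
}

# Example use (path is an assumption):
#   console_rules < /etc/syslog.conf
```

Against the config pasted above this should report only the `*.err;kern.notice` selector.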
0
VlearnsAuthor Commented:
-bash-3.2$ pstree 79697
--- 79697 root /usr/sbin/syslogd -svv
0
VlearnsAuthor Commented:

    0 35381 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35382 69672 2303   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35383 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35384 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35385 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35388 69672 2303   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35389 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35390 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35391 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35392 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35393 69672 2303   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35394 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35395 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35396 69672 2303   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35397 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35398 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35399 69672 2303   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35400 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35401 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35402 69672 2302   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35403 69672 2303   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35404 69672 2310   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35405 69672 2309   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35408 69672 2309   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 35410 69672 2309   6  0  3764  1408 ttywri S     ??    0:00.00 /usr/sbin/syslogd -svv
    0 69672     1 2367 132  0  3764  1408 select Ss    ??    8:55.00 /usr/sbin/syslogd -svv
**************************** x^x^x ****************************


69672 seems to be the parent syslogd process, judging from the output.


#!/bin/bash
# Once per second, list every process whose "ps -l" line mentions the
# syslogd PID (the daemon itself and its children). Runs until interrupted.
# -------------------------------------------------------------------------
while true
do
        echo "**************************** x^x^x ****************************"
        ps -alxww | grep "$(sudo cat /var/run/syslog.pid)"
        echo "**************************** x^x^x ****************************"
        # pause for 1 second between samples
        sleep 1
done


0
VlearnsAuthor Commented:
gettimeofday({1272502630, 983645}, NULL) = 0
writev(22, [{"Apr 28 17:57:10", 15}, {" ", 1}, {"<daemon.err> ", 13}, {"omega1", 6}, {" ", 1}, {"stunnel: LOG3[45485:8658944]: co"..., 74}, {"\n", 1}], 7) = 111
open("/dev/console", O_WRONLY|O_NONBLOCK) = 0
writev(0, [{"Apr 28 17:57:10", 15}, {" ", 1}, {"<daemon.err> ", 13}, {"omega1", 6}, {" ", 1}, {"stunnel: LOG3[45485:8658944]: co"..., 74}, {"\r\n", 2}], 7) = 100
writev(0, [{" timed out", 10}, {"\r\n", 2}], 2) = -1 EAGAIN (Resource temporarily unavailable)
fork()                                  = 13278
close(0)                                = 0
open("/var/run/utmp", O_RDONLY)         = 0
fstat(0, {st_mode=S_IFREG|0644, st_size=1056, ...}) = 0
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 1056
read(0, "", 4096)                       = 0
close(0)    
0
VlearnsAuthor Commented:
Looks like the parent syslogd is spawning the child:

fork()                                  = 13278
0
VlearnsAuthor Commented:
but why?
0
tfewsterCommented:
Hi Vlearns - The "cat  /tmp/syslog_pid.out | while read PID restofline " scriptlet I suggested was to get a list of child PIDs and monitor them, but your looping script will do the job just as well. (Or just `pstree`!)  What I was hoping for was to see what the child processes "became" once the fork/exec was complete.

Throughout this thread, I'm slightly confused that the PID of the "real" syslogd appears to be changing - Are you having this problem on multiple systems or is syslogd being restarted?

From your latest output, I'm guessing that syslogd is trying to write to a "log device" (/dev/console or your remote logging server? Does "omega1" mean anything to you?) but is timing out, so it's creating a child process to retry and then moving on. If that's the case, then we need to work out why it's timing out and fix that.

More importantly, we need to check what these error messages are - They should be duplicated in /var/log/messages - and fix any underlying problem

Good luck!
tfewster
0
VlearnsAuthor Commented:


Throughout this thread, I'm slightly confused that the PID of the "real" syslogd appears to be changing - Are you having this problem on multiple systems or is syslogd being restarted?

== It's happening on multiple systems.

From your latest output, I'm guessing that syslogd is trying to write to a "log device" (/dev/console or your remote logging server? Does "omega1" mean anything to you?) but is timing out, so it's creating a child process to retry and then moving on. If that's the case, then we need to work out why it's timing out and fix that.

== omega1 is the host on which syslogd is running,

open("/dev/console", O_WRONLY|O_NONBLOCK) = 0
writev(0, [{"Apr 28 17:54:56", 15}, {" ", 1}, {"<daemon.err> ", 13}, {"omega1", 6}, {" ", 1}, {"stunnel: LOG3[45488:20372480]: c"..., 75}, {"\r\n", 2}], 7) = 100
writev(0, [{"t timed out", 11}, {"\r\n", 2}], 2) = -1 EAGAIN (Resource temporarily unavailable)
fork()                                  = 11621
close(0)                                = 0
open("/var/run/utmp", O_RDONLY)         = 0
fstat(0, {st_mode=S_IFREG|0644, st_size=1056, ...}) = 0

Every time a child is forked, I see the message above.
11621 is the child PID; the parent PID stays the same, as you suggested.
....
I see stunnel in the log message I pasted above, so I am assuming syslogd had a problem writing to/from stunnel? How do I debug this further?




0
VlearnsAuthor Commented:
writev(0, [{"Apr 28 17:54:56", 15}, {" ", 1}, {"<daemon.err> ", 13}, {"omega1", 6}, {" ", 1}, {"stunnel: LOG3[45480:5425152]: co"..., 74}, {"\r\n", 2}], 7) = -1 EAGAIN (Resource temporarily unavailable)

What does this mean? I see this before a child is forked in the strace on the parent.
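
For what it's worth, EAGAIN on a descriptor opened with O_NONBLOCK (as /dev/console is in the trace above) normally means the write would have blocked at that moment. To check how often a fork follows such a failure, something like this could be run over a saved strace log. It is a rough sketch: `eagain_forks` is a hypothetical name, and it assumes one syscall per line in the format pasted in this thread.

```shell
#!/bin/sh
# eagain_forks: read strace output on stdin and print how many fork() calls
# come directly after a writev() that failed with EAGAIN.
eagain_forks() {
    awk '
        /^fork\(\)/ && prev_eagain { n++ }
        { prev_eagain = (/^writev\(/ && /EAGAIN/) }
        END { print n + 0 }
    '
}

# Example use (the log file name is an assumption):
#   eagain_forks < syslogd.strace
```

If the count matches the number of children you see, the blocked console write is a likely trigger for the forks.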
0
VlearnsAuthor Commented:
i also see the following in strace logs

--- SIGCHLD (Child exited: 20) ---^M
--- SIGCHLD (Child exited: 20) ---^M
wait4(4294967295, [WIFEXITED(s) && WEXITSTATUS(s) == 0], WNOHANG, NULL) = 11623^M
wait4(4294967295, [WIFEXITED(s) && WEXITSTATUS(s) == 0], WNOHANG, NULL) = 11622^M
wait4(4294967295, [WIFEXITED(s) && WEXITSTATUS(s) == 0], WNOHANG, NULL) = 11621^M
wait4(4294967295, 0x7fffffffd3d4, WNOHANG, NULL) = -1 ECHILD (No child processes)^M
sigreturn(0x7fffffffd400)               = 132^M


11623, 11622 and 11621 are all child syslogd processes.
4294967295 is the largest number you can get with 32 bits (unsigned). What does this all mean? Any ideas?
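
A note on reading that trace: 4294967295 is what -1 looks like when a signed 32-bit value is printed as unsigned, and a pid argument of -1 to wait4() means "any child". The three successful calls followed by ECHILD look like an ordinary SIGCHLD handler collecting exited children until none are left. A small Python sketch of both points follows; the function name `reap_all_children` is made up for illustration and this is not syslogd's actual code.

```python
import ctypes
import os

# The pid argument is a signed 32-bit value; the trace printed it as unsigned.
raw = 4294967295
pid_arg = ctypes.c_int32(raw).value
print(pid_arg)  # -1, i.e. "wait for any child"


def reap_all_children():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            # Matches the final "-1 ECHILD (No child processes)" in the trace.
            break
        if pid == 0:
            # Children exist, but none of them has exited yet.
            break
        reaped.append(pid)
    return reaped
```

Read this way, the wait4() lines show normal cleanup of the forked writers rather than an error.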

VlearnsAuthor Commented:
bash-3.2$ sudo  pstree | grep  syslogd
 | |         \--- 72086 v grep syslogd
 |-+- 69672 root /usr/sbin/syslogd -svv
 | |--- 58008 root /usr/sbin/syslogd -svv
 | |--- 58010 root /usr/sbin/syslogd -svv
 | |--- 58012 root /usr/sbin/syslogd -svv
 | |--- 59228 root /usr/sbin/syslogd -svv
 | |--- 59459 root /usr/sbin/syslogd -svv
 | |--- 59473 root /usr/sbin/syslogd -svv
bash-3.2$ sudo strace -p 59473
Process 59473 attached - interrupt to quit
sigaction(SIGALRM, {SIG_DFL}, {0x402630, [], SA_RESTART}) = 0
sigaction(SIGTERM, {SIG_DFL}, {0x402620, [], SA_RESTART}) = 0
sigprocmask(SIG_SETMASK, [], [HUP ALRM]) = 0
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={10, 0}}, {it_interval={0, 0}, it_value={0, 0}}) = 0
fcntl(0, F_SETFL, O_RDONLY)             = 0
writev(0, [{"Apr 30 11:51:47", 15}, {" ", 1}, {"<daemon.err> ", 13}, {"omega1", 6}, {" ", 1}, {"stunnel: LOG3[45489:5550080]: co"..., 74}, {"\r\n", 2}], 7) = 112
close(0)                                = 0
exit(0)                                 = ?
cjl7freelance for hireCommented:
Hi,

A 'man syslog' tells me that -s is used to send data to the syslog daemon, remote or local. Might it be that it's sending data to the remote syslog host?

//jonas
tfewsterCommented:
OK, let's try to summarise what I think we're seeing from the output posted here and in http://www.experts-exchange.com/Programming/System/Linux/Q_26111251.html :

- You have a number of systems ("clients") copying syslogd messages  to a central logging server, syslog1.ops.ac4.yahoo.com (also known as omega1)

- The "client" syslogds copy ALL messages ("*.*") to the central server, including .info & .debug messages, so message volumes may be high

- The clients are showing multiple syslogd processes running at times, and we think the child processes are being spawned after the client's syslogd has failed to communicate with the logging server via an "stunnel" (secure tunnel); the child process will continue to try to relay that message and exit when complete.

- The syslog server, omega1, shows many SSL connection errors in the messages log

(When posting output, it's important to say what system you're getting that output on, so we know what we're looking at: dummy names like "webserver1" or "logging server" would be sufficient, if you don't want to post hostnames)

I suspect that it's traffic volumes, either the volume of messages being forwarded or network/firewall issues between the clients and the syslogd server, that are giving you these timeouts.  Check your network capacity between clients and syslogd server.

You also need to review the messages being logged, in case they're telling you about a problem.

If you can change the "level" of messages being forwarded from *.* to  *.info or *.notice to reduce the number of messages, that should also help. Then recheck that "appropriate" messages are getting through.
VlearnsAuthor Commented:
===The clients are showing multiple syslogd processes running at times, and we think the child processes are being spawned after the clients syslogd has failed to communicate with the logging server via an "stunnel" (Secure tunnel); The child process will continue to try to relay that message and exit when complete.
===

Do you mean that syslogd is using stunnel to log messages?

My understanding is that stunnel is used as a proxy to handle SSL traffic (external requests).



VlearnsAuthor Commented:
Are incomplete/bad SSL requests causing these syslog timeouts?

Why would syslogd use stunnel to log messages to the local server (omega)? [omega also handles external SSL user requests]

How do I verify the above?
VlearnsAuthor Commented:
If this is indeed a capacity issue (a single syslogd parent cannot handle the load?), do we change any syslogd settings?

Like spawning more syslogd parent processes?
VlearnsAuthor Commented:
It's been 2 days since the test.

Now, even with no load, I see syslogd processes hanging around:

-bash-3.2$ ps -auxw | grep syslogd
root    58008  0.0  0.0  3764  1176  ??  D    Fri11AM   0:00.00 /usr/sbin/syslogd -svv
root    58010  0.0  0.0  3764  1176  ??  D    Fri11AM   0:00.00 /usr/sbin/syslogd -svv
root    58012  0.0  0.0  3764  1176  ??  D    Fri11AM   0:00.00 /usr/sbin/syslogd -svv
root    59228  0.0  0.0  3764  1176  ??  D    Fri11AM   0:00.00 /usr/sbin/syslogd -svv
root    59459  0.0  0.0  3764  1176  ??  D    Fri11AM   0:00.00 /usr/sbin/syslogd -svv
root    69672  0.0  0.0  3764  1180  ??  Ds   27Apr10  31:04.03 /usr/sbin/syslogd -svv
vchavan  1015  0.0  0.0  5932  1384  pc  S+   11:27AM   0:00.01 grep syslogd
-bash-3.2$ pstree -p 69672
-+- 00001 root /sbin/init --
 \-+- 69672 root /usr/sbin/syslogd -svv
   |--- 58008 root /usr/sbin/syslogd -svv
   |--- 58010 root /usr/sbin/syslogd -svv
   |--- 58012 root /usr/sbin/syslogd -svv
   |--- 59228 root /usr/sbin/syslogd -svv
   |--- 59459 root /usr/sbin/syslogd -svv
   |--- 59473 root <defunct>
   |--- 71950 root <defunct>
   |--- 71955 root <defunct>
   |--- 71956 root <defunct>
   |--- 71957 root <defunct>
   |--- 71958 root <defunct>
   |--- 71959 root <defunct>
   |--- 71960 root <defunct>
   |--- 71961 root <defunct>
   |--- 71965 root <defunct>
   |--- 71968 root <defunct>
   |--- 71969 root <defunct>
   |--- 71970 root <defunct>
   |--- 71971 root <defunct>
   |--- 71973 root <defunct>
   |--- 71974 root <defunct>
   |--- 71975 root <defunct>
   |--- 71976 root <defunct>
   \--- 71977 root <defunct>
VlearnsAuthor Commented:
strace on the child... what is it doing?

-bash-3.2$ sudo strace -p 58008
Password:
Process 58008 attached - interrupt to quit
sigaction(SIGALRM, {SIG_DFL}, {0x402630, [], SA_RESTART}) = 0
sigaction(SIGTERM, {SIG_DFL}, {0x402620, [], SA_RESTART}) = 0
sigprocmask(SIG_SETMASK, [], [HUP ALRM]) = 0
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={10, 0}}, {it_interval={0, 0}, it_value={0, 0}}) = 0
fcntl(0, F_SETFL, O_RDONLY)             = 0
writev(0, [{"Apr 30 11:49:36", 15}, {" ", 1}, {"<daemon.err> ", 13}, {"omega1", 6}, {" ", 1}, {"stunnel: LOG3[45486:11417600]: c"..., 75}, {"\r\n", 2}], 7) = 113
close(0)                                = 0
exit(0)                                 = ?
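
Reading the child's trace line by line: it resets SIGALRM to the default action, arms a 10-second timer, clears the non-blocking flag on descriptor 0 with fcntl(), does one ordinary (blocking) writev(), then closes and exits. Put together with the parent's EAGAIN followed by fork(), that looks like a "try non-blocking, else hand the write to a short-lived child with a timeout" pattern. Below is a rough Python sketch of that pattern as I read it from the traces. The function name `write_with_fallback` and its parameters are invented for illustration, it leaves out the SIGTERM and sigprocmask calls seen in the trace, and it is not syslogd's real code.

```python
import fcntl
import os
import signal


def write_with_fallback(fd, data, timeout=10):
    """Try a non-blocking write; if it cannot complete, fork a child to finish it.

    Returns the child's pid if one was forked, otherwise None.
    """
    try:
        sent = os.write(fd, data)
    except BlockingIOError:          # the EAGAIN seen in the parent's trace
        sent = 0
    if sent == len(data):
        return None                  # everything went out, no child needed

    pid = os.fork()
    if pid != 0:
        # Parent: should close fd now (as the trace's close(0) does), because
        # the non-blocking flag is shared with the child through the same open file.
        return pid

    # Child: mirrors the calls visible in the strace of the child process.
    status = 1
    try:
        signal.signal(signal.SIGALRM, signal.SIG_DFL)  # default action: terminate
        signal.alarm(timeout)                           # give up after `timeout` seconds
        flags = fcntl.fcntl(fd, fcntl.F_GETFL)
        fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_NONBLOCK)  # writes now block
        rest = data[sent:]
        while rest:
            rest = rest[os.write(fd, rest):]
        status = 0
    finally:
        os._exit(status)             # never fall back into the parent's code path
```

If this reading is right, each extra syslogd in the ps listing is one console message waiting for the console to accept it. A child that is still around long after its 10-second timer should have fired (like the ones in state D) would then be stuck inside the write itself.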
VlearnsAuthor Commented:
The syslog.pid file shows just the parent:


bash-3.2$ sudo cat /var/run/syslog.pid | more
69672
VlearnsAuthor Commented:
hey experts....any updates?
tfewsterCommented:
=== Are incomplete/bad SSL requests causing these syslog timeouts?
If you have network issues between the clients and the syslog server, that would cause timeouts.

===Why would syslogd use stunnel to log messages to the local server (omega)? [omega also handles external SSL user requests]
stunnel is a general-purpose utility to send messages securely between two systems. As system messages could be sensitive, it makes sense to use stunnel (though I don't think it's done by default).

In your outputs in these two questions, I saw at least 4 IP address ranges being used, so I wouldn't say all the client systems are "local" even if they're in the same datacentre.

By the way, the syslog server really should be a secure system inside your firewall, so evil people can't attack it and hide their tracks by deleting logs!


===How do I verify the above?
As I said, it appears to me that the syslog clients are experiencing timeouts and therefore spawning child processes to retry, and also that they're using stunnel for communications. But rather than spend time trying to prove the mechanism I'd look at likely causes, i.e.:


- I suspect that it's traffic volumes, either the volume of messages being forwarded or network/firewall issues between the clients and the syslogd server, that are giving you these timeouts.  Check your network capacity between clients and syslogd server.

- You also need to review the messages being logged, in case they're telling you about a problem.

- If you can change the "level" of messages being forwarded from *.* to  *.info or *.notice to reduce the number of messages, that should also help. Then recheck that "appropriate" messages are getting through.

Regards,
tfewster
VlearnsAuthor Commented:
syslogd is timing out on writing to file descriptor 0 (which is stdin). Why is syslogd writing to stdin? How do I determine the setting that's causing it to do that?

writev(0, [{"Apr 30 11:49:36", 15}, {" ", 1}, {"<daemon.err> ", 13}, {"omega1", 6}, {" ", 1}, {"stunnel: LOG3[45486:11417600]: c"..., 75},
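
On the "descriptor 0 is stdin" question: the 0 is only a slot number. The earlier parent trace shows open("/dev/console", O_WRONLY|O_NONBLOCK) = 0, which suggests the daemon had no descriptor 0 open at that point, so the console simply received the lowest free number. The short Python sketch below shows that lowest-free-slot behaviour, using /dev/null as a harmless stand-in for the console.

```python
import os

# open() hands back the lowest descriptor number that is currently free.
first = os.open(os.devnull, os.O_WRONLY)
second = os.open(os.devnull, os.O_WRONLY)
os.close(first)                        # frees that slot again
reused = os.open(os.devnull, os.O_WRONLY)
print(first, second, reused)           # `reused` has the same number as `first`
os.close(second)
os.close(reused)
```

So writev(0, ...) here is a write to whatever was last opened into slot 0, which per the trace is /dev/console, not to the daemon's standard input.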
VlearnsAuthor Commented:
#       Spaces ARE valid field separators in this file. However,
#       other *nix-like systems still insist on using tabs as field
#       separators. If you are sharing this file between systems, you
#       may want to use only tabs as field separators here.
#       Consult the syslog.conf(5) manpage.
local0.*                                        /var/log/local0
local1.*                                        /var/log/local1
local2.*                                        /var/log/local2
local3.*                                        /var/log/local3
local4.*                                        /var/log/local4
local5.*                                        /var/log/local5
local6.*                                        /var/log/local6
local7.*                                        /var/log/local7
security.*                                      /var/log/security
auth.*                                          /var/log/auth
mail.*                                          /var/log/mail
cron.*                                          /var/log/cron
ntp.*                                           /var/log/ntp
console.*                                       /var/log/console
*.notice;kern.debug;local1.none                         /var/log/messages
*.err;kern.notice                               /dev/console
*.notice                                        root
*.emerg                                         *
*.*;local1.none                                         /var/log/all
*.*                                             @syslog1.ops.ac4.yahoo.com
VlearnsAuthor Commented:
syslog.conf pasted above. Guys, can you see anything suspicious?
tfewsterCommented:
===syslog.conf pasted above. Guys, can you see anything suspicious?

*.*                                             @syslog1.ops.ac4.yahoo.com

This copies ALL messages from debug level upwards from ALL subsystems on the clients to the remote syslog server. Unless you have a very good reason for that, change it to an appropriate level as I suggested. That will reduce network traffic.

You also need to follow up my other suggestions, on checking network capacity and what messages are actually getting logged. Forget the relay mechanism until you've eliminated more likely causes.
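
To make the effect of changing the level concrete, here is a rough Python sketch of how a plain `*.<level>` selector filters by severity. It assumes the default comparison ("this level or more urgent") with no `=`, `<` or `!` modifiers; the helper name `forwarded` is invented for illustration.

```python
# Syslog severities from most to least urgent (emerg = 0 ... debug = 7).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def forwarded(level, threshold):
    """True if a message at `level` passes a '*.<threshold>' selector."""
    if threshold == "*":
        return True
    return SEVERITIES.index(level) <= SEVERITIES.index(threshold)


for threshold in ("*", "info", "notice"):
    passed = [s for s in SEVERITIES if forwarded(s, threshold)]
    print(f"*.{threshold:<6} forwards {len(passed)} of {len(SEVERITIES)} levels: {', '.join(passed)}")
```

How much traffic this actually saves depends on how many of your messages are at debug or info level, which is another reason to look at what is being logged.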
VlearnsAuthor Commented:
Here's my conclusion based on what we have seen so far:

syslogd is trying to write to file descriptor 0 (console) and is timing out.

That brings me to two options:

1) Remove this line:
*.err;kern.notice                               /dev/console


2) Start syslogd in a non-forking mode, if possible. (Is there a way to change the behavior so that syslogd does not fork if there is a timeout?)
VlearnsAuthor Commented:
Is there a command-line option for syslogd to limit the number of forks, so the machine isn't brought down by a fork frenzy?
VlearnsAuthor Commented:


-bash-3.2$ ls -la /dev/console
crw-------  1 root  wheel    0,   8 May 12 18:18 /dev/console  (previously syslogd used to write here)
-bash-3.2$ ls -la /var/log/console
-rw-rw-r--  1 root  wheel  4581 May 12 18:17 /var/log/console(now it writes here)

do you see any permissions issue?

tfewsterCommented:
==Here's my conclusion based on what we have seen so far
syslogd is trying to write to file descriptor 0 (console) and is timing out

I disagree: syslogd is timing out trying to log to the remote server, causing an error-level daemon event; THAT daemon.err event is being logged to the console (as specified in syslog.conf). If you have a console attached, you can verify that those messages are actually appearing.
VlearnsAuthor Commented:
Problem:
syslogd is timing out while writing daemon.err messages to /dev/console because of a permissions or capacity issue.

Solution:

Write those messages to a file instead, by making the following change in syslog.conf:

console.*;*.err;kern.notice                      /var/log/console
#*.err;kern.notice                               /dev/console


This solves the problem.
Unix OS


Question has a verified solution.
