lkurugan
asked on
uptime not showing number of days it is up
Hi all,
uptime on a Solaris server doesn't show the number of days it has been up. The issue might be wtmpx/utmpx getting truncated, or it might be a bug. Has anyone had a similar issue? Please let me know.
xyz # uptime
4:01am 6 users, load average: 12.23, 4.17, 2.62
#
ASKER
uptime is looking at utmpx, and utmpx is getting truncated, I think...
ASKER
Something is clobbering utmpx; I'm not sure which script it is. A couple of my colleagues wrote DTrace scripts to see which one is modifying the files, but couldn't figure it out. I need to find out what process is modifying utmpx and wtmpx. Any ideas?
The most obvious place to check is the crontabs. Look in /var/spool/cron/crontabs to see all scheduled crons for all users.
Also check /etc/logadm.conf to see if anything happens to utmpx/wtmpx there, and check whether you're using runacct.
What's the oldest record in the utmpx database right now?
(BTW: you're right, I trussed uptime and it uses utmpx, not wtmpx.)
Some people (admins) try to keep utmpx small by truncating it.
Unfortunately, as it is a binary file and not a text file, it cannot
be truncated using simple text commands like "tail".
If you run
last
or
last | grep boot
what do you get as the most recent entries?
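Because wtmpx/utmpx are flat files of fixed-size binary records, any safe trim has to cut on record boundaries. Here is a minimal sketch of the idea using a synthetic file in /tmp rather than the live /var/adm/wtmpx; the 372-byte record size is an assumption, so verify sizeof(struct futmpx) against /usr/include/utmpx.h on your box before trying this for real:

```shell
RECSZ=372   # ASSUMED record size; check sizeof(struct futmpx) on your system
KEEP=2      # how many of the newest records to keep

# build a fake 5-record file instead of touching the real /var/adm/wtmpx
dd if=/dev/zero of=/tmp/wtmpx.demo bs=$RECSZ count=5 2>/dev/null

# keep only the last $KEEP records, never cutting a record in half
tail -c $((RECSZ * KEEP)) /tmp/wtmpx.demo > /tmp/wtmpx.trimmed

wc -c < /tmp/wtmpx.trimmed    # 744 bytes = 2 whole 372-byte records
```

The same `tail -c` trick applied with the wrong RECSZ is exactly how a well-meaning cleanup script can corrupt the file, which would explain uptime losing its boot record.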
ASKER
We are not running runacct. Is there a way on Solaris 9 to see which process is touching or modifying the utmpx and wtmpx files?
I verified cron entries and they seem to be fine.
Do you have any entries at all (use "last")?
If not, copy it away and "rewind" it using
> /var/adm/wtmpx
To find out what is modifying it:
a) When was it modified last (ls -l /var/adm/wtmpx)?
b) Run "last" against that date. If there is an entry with that
date, it's likely that this event changed it and not your script.
c) Check all (!) crontabs
grep -v '^#' /var/spool/cron/crontabs/*
d) Also, see if you have any at jobs scheduled:
at -l
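On the Solaris 9 question of catching the writer in the act: without DTrace you are mostly limited to seeing who has the file open at a given instant, e.g. with fuser; on Solaris 10 and later a DTrace one-liner can log every opener. A sketch of both approaches (the probe and predicate are illustrative and untested here, so treat them as a starting point):

```shell
# Solaris 9 (no DTrace): show processes that currently hold the file open.
# Note this only catches a writer while its open() is in flight.
fuser /var/adm/wtmpx

# Solaris 10+: log every process that opens a *tmpx file
dtrace -n 'syscall::open*:entry
    /strstr(copyinstr(arg0), "tmpx") != NULL/
    { printf("%s[%d] opened %s", execname, pid, copyinstr(arg0)); }'
```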
I doubt that this is really the reason, as it is a normal condition
that multiple users log out at the same time.
To make sure wtmpx does not get corrupted, only one entry
will get inserted at a time; if there are more, the others will get
queued.
Solaris 10 is a very well developed and mature OS.
Uptime seems to need the latest boot record from wtmpx to determine how long the system has been up.
You say wtmpx gets truncated, but you don't know why?
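If the immediate goal is just a trustworthy boot time while the tmpx files are suspect, the kernel keeps its own counter that file truncation cannot touch. A sketch, assuming Solaris-only commands (kstat is not available on the verifier here, so this is untested):

```shell
# Boot time from the utmpx BOOT_TIME record (what uptime depends on):
who -b

# Boot time straight from the kernel, independent of utmpx/wtmpx state:
kstat -p unix:0:system_misc:boot_time
```

If `who -b` prints nothing while the kstat value looks sane, that is further evidence the boot record was lost to truncation rather than the box having rebooted.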