FBMECS
Unable to log in to Solaris 10
Dear experts,
We have Solaris 10 installed on a SPARC machine, and every 2-3 months no one can log in.
The message we get is "Login failure: Error in Underlying Service module".
We can fix this by booting into single-user mode and removing the password from the shadow file.
Does anyone know why this is happening?
Is there any other way to fix this without booting into single-user mode?
You can make the password entry for the root user empty, log in to the machine as root, check /etc/pam.conf, and verify that the related PAM SPI modules are present and valid (i.e. /usr/lib/security/pam*).
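The module check can be scripted along these lines. This is only a sketch: the function name pam_missing_modules is made up for illustration, and it assumes a POSIX shell and awk (on Solaris 10 that probably means /usr/xpg4/bin/sh and nawk rather than /bin/sh and /usr/bin/awk). Relative module names are resolved against the directory you pass in, since Solaris treats them as relative to /usr/lib/security/$ISA.

```shell
# Hypothetical helper: print every module named in a pam.conf-style file
# that cannot be found as a regular file.
#   $1 = path to the pam.conf file
#   $2 = module directory used for relative module names
pam_missing_modules() {
    conf=$1
    dir=$2
    # Skip comment lines; field 4 of each entry is the module path.
    awk '!/^[ \t]*#/ && NF >= 4 { print $4 }' "$conf" | sort -u |
    while read -r mod; do
        case $mod in
            /*) path=$mod ;;        # absolute path: use as written
            *)  path=$dir/$mod ;;   # relative name: look in the module directory
        esac
        [ -f "$path" ] || echo "$mod"
    done
}

# Example call (prints nothing when every module is present):
#   pam_missing_modules /etc/pam.conf /usr/lib/security
```

An absolute path that contains a literal $ISA token is not expanded by this sketch, so such entries would be reported as missing.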
Does this problem happen on any connection (including console only) or only certain connections (e.g. ssh only, or restricted to some subnet)?
The system logs (/var/adm/messages*) may be quite useful for the analysis, but make sure you omit or mask any sensitive information.
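If it helps, masking can be done with a small filter before posting. This is a rough sketch, not a complete scrubber: mask_log is a made-up name, it only hides IPv4-looking strings and one hostname that you supply, and it assumes the hostname contains no "/" (it is used as a sed pattern).

```shell
# Hypothetical helper: read log lines on stdin, write masked lines to stdout.
#   $1 = hostname to hide (required)
mask_log() {
    host=$1
    # First expression hides dotted-quad addresses, second hides the hostname.
    sed -e 's/[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*/x.x.x.x/g' \
        -e "s/$host/HOST/g"
}

# Example call:
#   mask_log uni-test < /var/adm/messages > /tmp/messages.masked
```

User names, domain names and anything else site-specific still need a manual pass.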
ASKER
When this occurs, ssh connections all fail without any message.
Console login fails with the message described in the question.
It looks like this issue happens when passwords expire.
I have tried to disable password expiration, but it still happens.
As for ssh, can you enable debug output on your ssh client? Depending on your ssh version, it may be either of the following:
ssh -vvvv
ssh -v -v -v -v
(In my experience, at most four v's are sufficient.)
Also, as dfke recommended, can you please post your /etc/pam.conf so that we can take a look?
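The debug output goes to stderr, so it can be saved to a file and filtered afterwards. A small sketch, where ssh_auth_lines is a made-up helper name, "user" is a placeholder account, and the exact wording of the debug lines depends on the ssh implementation in use:

```shell
# Hypothetical helper: show only the authentication-related lines
# from a saved ssh debug log.
#   $1 = file containing the captured stderr of "ssh -v ..."
ssh_auth_lines() {
    grep -i 'authenticat' "$1"
}

# Example capture and filter:
#   ssh -vvvv user@uni-test 2> /tmp/ssh-debug.log
#   ssh_auth_lines /tmp/ssh-debug.log
```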
ASKER
This is my pam.conf file
#
#ident "@(#)pam.conf-winbind 1.1 07/05/15 SMI"
#
# Copyright 2007 Sun Microsystems, Inc. All rights reserved.
# Use is subject to license terms.
#
# PAM configuration
#
# Unless explicitly defined, all services use the modules
# defined in the "other" section.
#
# Modules are defined with relative pathnames, i.e., they are
# relative to /usr/lib/security/$ISA. Absolute path names, as
# present in this file in previous releases are still acceptable.
#
# Authentication management
#
# login service (explicit because of pam_dial_auth)
#
login auth requisite pam_authtok_get.so.1
login auth required pam_dhkeys.so.1
login auth required pam_unix_cred.so.1
login auth required pam_unix_auth.so.1
login auth required pam_dial_auth.so.1
#
# rlogin service (explicit because of pam_rhost_auth)
#
###rlogin auth sufficient pam_rhosts_auth.so.1
###rlogin auth requisite pam_authtok_get.so.1
###rlogin auth required pam_dhkeys.so.1
###rlogin auth required pam_unix_cred.so.1
###rlogin auth required pam_unix_auth.so.1
#
# Kerberized rlogin service
#
krlogin auth required pam_unix_cred.so.1
krlogin auth binding pam_krb5.so.1
krlogin auth required pam_unix_auth.so.1
#
# rsh service (explicit because of pam_rhost_auth,
# and pam_unix_auth for meaningful pam_setcred)
#
###rsh auth sufficient pam_rhosts_auth.so.1
rsh auth required pam_unix_cred.so.1
#
# Kerberized rsh service
#
krsh auth required pam_unix_cred.so.1
krsh auth binding pam_krb5.so.1
krsh auth required pam_unix_auth.so.1
#
# Kerberized telnet service
#
ktelnet auth required pam_unix_cred.so.1
ktelnet auth binding pam_krb5.so.1
ktelnet auth required pam_unix_auth.so.1
#
# PPP service (explicit because of pam_dial_auth)
#
ppp auth requisite pam_authtok_get.so.1
ppp auth required pam_dhkeys.so.1
ppp auth required pam_unix_cred.so.1
ppp auth required pam_unix_auth.so.1
ppp auth required pam_dial_auth.so.1
#
# Default definitions for Authentication management
# Used when service name is not explicitly mentioned for authentication
#
other auth requisite pam_authtok_get.so.1
other auth required pam_dhkeys.so.1
other auth required pam_unix_cred.so.1
other auth required pam_unix_auth.so.1
#
# passwd command (explicit because of a different authentication module)
#
passwd auth required pam_passwd_auth.so.1
#
# cron service (explicit because of non-usage of pam_roles.so.1)
#
cron account required pam_unix_account.so.1
#
# Default definition for Account management
# Used when service name is not explicitly mentioned for account management
#
other account requisite pam_roles.so.1
other account sufficient pam_unix_account.so.1
other account required pam_winbind.so
#
# Default definition for Session management
# Used when service name is not explicitly mentioned for session management
#
other session required pam_unix_session.so.1
#
# Default definition for Password management
# Used when service name is not explicitly mentioned for password management
#
other password required pam_dhkeys.so.1
other password requisite pam_authtok_get.so.1
other password requisite pam_authtok_check.so.1
other password sufficient pam_winbind.so
other password required pam_authtok_store.so.1
#
# Support for Kerberos V5 authentication and example configurations can
# be found in the pam_krb5(5) man page under the "EXAMPLES" section.
#
###krlogin auth required pam_krb5.so.1
###krsh auth required pam_krb5.so.1
###ktelnet auth required pam_krb5.so.1
And this is the message we get in /var/adm/messages:
Jan 3 09:54:49 uni-test login: [ID 468494 auth.crit] login account failure: Error in underlying service module
Jan 3 09:54:49 uni-test svc.startd[11]: [ID 694882 daemon.notice] instance svc:/system/console-login:default exited with status 1
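The log line says "login account failure", so the account stack that applies to the login service is the part of pam.conf to look at. Per the comment at the top of the file, a service without its own entries uses the "other" section. A sketch that prints the effective stack, assuming a POSIX shell and an awk that supports -v (nawk on Solaris 10); pam_stack is a made-up name:

```shell
# Hypothetical helper: print the pam.conf lines that apply to one service
# and one module type, falling back to the "other" entries when the service
# has none of its own.
#   $1 = pam.conf file, $2 = service name, $3 = type (auth, account, ...)
pam_stack() {
    conf=$1
    service=$2
    type=$3
    awk -v svc="$service" -v typ="$type" '
        /^[ \t]*#/ || NF < 4       { next }          # skip comments and short lines
        $2 == typ && $1 == svc     { own[++n] = $0 } # entries for the service itself
        $2 == typ && $1 == "other" { dflt[++m] = $0 } # default entries
        END {
            if (n > 0) for (i = 1; i <= n; i++) print own[i]
            else       for (i = 1; i <= m; i++) print dflt[i]
        }
    ' "$conf"
}

# Example call:
#   pam_stack /etc/pam.conf login account
```

With the file posted above, login has only auth entries of its own, so this call would list the three "other account" lines (pam_roles, pam_unix_account, pam_winbind).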
I can't see any obvious errors in your pam.conf, and the syslog contains the symptom, not the cause. Are there any entries above this one that may be related to the start of the problem, i.e. something related to the PAM service rather than the login service?
Also, please clarify "every 2-3 months no-one can log in":
- the problem appears every 2-3 months (exact time varies or not?) and it affects everyone at the same time? or
- the problem appears for everyone every 2-3 months, but not necessarily at the same time?
I assume the former and it doesn't sound like password expiration. What if you set password expiration for user longexp to 6 months and for user shortexp to 7 days? (or even 1 day) Just curious if the expiration of shortexp's password would trigger the problem.
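To pull out the log entries that precede the failure message, something like the following could be used. It is a sketch under assumptions: log_context is a made-up name, the awk must support -v (nawk on Solaris 10), and the ring buffer is used instead of grep context options in case the local grep does not provide them.

```shell
# Hypothetical helper: print each line matching a pattern together with
# the N lines that came before it.
#   $1 = log file, $2 = pattern (awk regular expression), $3 = N (default 10)
log_context() {
    file=$1
    pattern=$2
    n=${3:-10}
    [ "$n" -ge 1 ] || n=1
    awk -v pat="$pattern" -v n="$n" '
        $0 ~ pat {
            start = NR - n
            if (start < 1) start = 1
            # Replay the buffered lines that precede the match.
            for (i = start; i < NR; i++) print buf[i % n]
            print
            print "--"
        }
        { buf[NR % n] = $0 }   # remember the last n lines
    ' "$file"
}

# Example call:
#   log_context /var/adm/messages 'Error in underlying service module' 20
```

Matches that are close together will repeat some context lines, which is harmless for a quick look.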
One more thing: try adding "debug" to the module lines in pam.conf, and verify that syslog messages at level DEBUG are directed somewhere (it may be the usual /var/adm/messages, but that will probably add a lot of noise).
e.g. (snippet)
login auth requisite pam_authtok_get.so.1 debug
login auth required pam_dhkeys.so.1 debug
login auth required pam_unix_cred.so.1 debug
login auth required pam_unix_auth.so.1
login auth required pam_dial_auth.so.1 debug
Note: check man pages (e.g. "man pam_authtok_get") for details. The man page for pam_unix_auth doesn't mention the "debug" option but you may give it a try.
More info on PAM debug here:
http://blog.simplex-one.com/?p=515
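Whether syslog currently keeps debug-level messages can be checked by listing the active selectors in /etc/syslog.conf. A minimal sketch; debug_selectors is a made-up name, and the idea that PAM debug output needs an auth.debug or *.debug selector is my assumption rather than something confirmed in this thread:

```shell
# Hypothetical helper: print the non-comment syslog.conf lines that contain
# a ".debug" selector. Exit status is non-zero when there are none.
#   $1 = syslog.conf file
debug_selectors() {
    grep -v '^[[:space:]]*#' "$1" | grep '\.debug'
}

# Example call:
#   debug_selectors /etc/syslog.conf || echo "no debug selector configured"
#
# On Solaris 10 the selector and the action in syslog.conf are separated by
# tabs, and the service is re-read with: svcadm refresh system-log
```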
ASKER
I have password expiration set to 9999 in the /etc/default/passwd file.
But this does not work.
Also, this happens on the same day for all users every 2-3 months (I have not counted exactly).
When we reset the root password, the issue is solved.
I'm not sure, but I think /etc/default/passwd doesn't affect already existing users.
But more interestingly, are you saying that if you reset _only_ the root password, the issue is solved for _all_ users?
ASKER
Yes that is correct.
Once I boot into single-user mode, remove the root password, and then reset it to something else, the issue is resolved for all users.
So my last idea is the set of debug traces, but that is hard to justify if you need to wait 2-3 months until the next occurrence. I assume it's a production server, but if it's not: just to rule out the influence of the system date, could you change the system time to 4 Apr 2014 and see if the problem occurs "immediately"?
One more question: does the problem reoccur every 2-3 months even if there are reboots in-between? To be more exact:
- when was the last time the problem occurred before 3 Jan 2014?
- how many times and when was the system rebooted between these two dates?
Explanation of this question:
Another thing came to mind when thinking about timers: in the past, some commands (e.g. ps) printed a cryptic message that turned out to indicate an inconsistency in the system timer: "Unknown HZ value! (XX) Assume YY.", where XX and YY were two integers. It typically happened a number of weeks or maybe months after a reboot, because the server was apparently much faster than the kernel expected. This counter can't be cheated by setting the system time; it only advances by running without a reboot for several weeks or months (depending on the hardware).
ASKER
This is a production server, so I cannot change the date/time.
The server is never rebooted unless a major patch requires it.
The last time we rebooted the server was mid-October, after installing the latest Solaris 10 patches.
We had no issue until the root password expired. I am going to change the root password every month to check and confirm that this is what is causing the issue.
Is there a way to make the root password never expire?
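To see when root's password is actually due to expire, the aging fields in /etc/shadow can be read directly (field 3 is the last change in days since 1 Jan 1970, field 5 is the maximum age in days). A sketch only: shadow_aging is a made-up name, it needs an awk with -v support (nawk on Solaris 10), and the current day number is passed in by the caller because not every date command can print epoch seconds.

```shell
# Hypothetical helper: list accounts that have a maximum password age set,
# with the number of days left (negative means already expired).
#   $1 = shadow-format file
#   $2 = today as days since 1970-01-01,
#        e.g. from: perl -e 'print int(time / 86400)'
shadow_aging() {
    file=$1
    today=$2
    awk -F: -v today="$today" '
        # Only accounts with a last-change date and a non-negative max age.
        $3 != "" && $5 != "" && $5 >= 0 {
            left = $3 + $5 - today
            printf "%s lastchg=%s max=%s days_left=%d\n", $1, $3, $5, left
        }
    ' "$file"
}

# Example call (as root):
#   shadow_aging /etc/shadow "$(perl -e 'print int(time / 86400)')"
```

"passwd -s root" reports the same aging data, and the passwd(1) man page documents "passwd -x -1 root" as the way to turn aging off for an account. Whether that stops the lockout is not confirmed anywhere in this thread.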