Heartnet

Windows accounts getting locked all the time

I have two Windows 2003 DCs. Early this week, all user accounts suddenly started getting locked out. I unlock them, and they get locked again after a period of time. It hits a random group of users at a time, but eventually every single user account gets locked. I have tried gpupdate, rebooting, and scanning the DCs themselves for possible viruses.
I found nothing, and none of it helped.
As far as I know there was no policy change before January 5th, and then suddenly on January 5th all hell broke loose. It has been 3 days and I can't figure out what the issue is.

All users are running Windows XP SP2.

In the audit log there are no failure entries. All it shows is a Success Audit for User Account Locked Out. For example:
User Account Locked Out:
       Target Account Name:      <username>
       Target Account ID:      <domain\username>
       Caller Machine Name:      <user machinename>
       Caller User Name:      <dc machinename>$
       Caller Domain:      <dc name>
       Caller Logon ID:      (0x0,0x3E7)

Event ID: 644

In the Application log, there have been a bunch of warnings since August 6, 2008:
Security policies were propagated with warning. 0xd : The data is invalid.

Advanced help for this problem is available on http://support.microsoft.com. Query for "troubleshooting 1202 events".
Event ID: 1202

Any ideas?

Thanks.
TDKD

Until we fix this issue, I would suggest setting all users in AD to "Account Never Expires"

IMPORTANT: Do you currently have a Password policy enforced by way of GPO?
The reason I ask about the password policy is that I know admins who invoked a password policy on a domain and all hell broke loose (e.g. all domain accounts became locked out). The only quick resolution was to set the AD accounts to "Never Expire".
Heartnet

ASKER

Some accounts have been set to never expire, some have not. Both account types are affected by this issue.

Yes, there's a password GPO on this.
It is set as:
Account Policies/Password
Policy: Setting
Enforce password history: 10 passwords remembered
Maximum password age: 90 days
Minimum password age: 0 days
Minimum password length: 6 characters
Password must meet complexity requirements: Enabled

Account Policies/Account Lockout
Policy: Setting
Account lockout duration: 30 minutes
Account lockout threshold: 5 invalid logon attempts
Reset account lockout counter after: 30 minutes
Quote:
TDKD:
The reason I ask about the password policy is that I know admins who invoked a password policy on a domain and all hell broke loose (e.g. all domain accounts became locked out). The only quick resolution was to set the AD accounts to "Never Expire".


There have been no changes to any GPO at all, and everything was working fine until early this week.
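To make the lockout settings quoted above concrete, here is a minimal Python sketch of how a threshold of 5, a 30-minute counter reset, and a 30-minute lockout duration interact. It is a simplified model for illustration only, not the actual Windows implementation; the function name `lockout_times` and the retry pattern in the example are hypothetical.

```python
from datetime import datetime, timedelta

# Values taken from the Account Lockout policy quoted in the thread.
THRESHOLD = 5                             # invalid logon attempts
RESET_AFTER = timedelta(minutes=30)       # reset account lockout counter after
LOCKOUT_DURATION = timedelta(minutes=30)  # account lockout duration


def lockout_times(failed_attempts):
    """Return the times at which one account would lock.

    Simplified model (an assumption, not Windows internals): the bad-password
    counter restarts when the previous failure is at least RESET_AFTER old,
    and failures that arrive while the account is locked are ignored.
    """
    locks = []
    count = 0
    last_failure = None
    locked_until = None
    for t in sorted(failed_attempts):
        if locked_until is not None and t < locked_until:
            continue  # already locked, attempt does not count
        if last_failure is not None and t - last_failure >= RESET_AFTER:
            count = 0  # counter expired before this attempt
        count += 1
        last_failure = t
        if count >= THRESHOLD:
            locks.append(t)
            locked_until = t + LOCKOUT_DURATION
            count = 0
            last_failure = None
    return locks


# Hypothetical example: something retries a bad password every 20 seconds
# for 10 minutes, starting at 10:30.
START = datetime(2009, 1, 7, 10, 30)
RETRIES = [START + timedelta(seconds=20 * i) for i in range(30)]

if __name__ == "__main__":
    for when in lockout_times(RETRIES):
        print("locked at", when.time())
```

Under this model, any process that keeps presenting a wrong password (a stale cached credential, or malware guessing passwords) re-locks an account shortly after each unlock, which matches the looping behavior described in the thread.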
Any new trusts established between servers? Any new servers? Is this domain in mixed or native mode?
snusgubben: I already have that tool. It doesn't help much
TDKD: No new trusts, no new servers, they are in Native mode.
The LockOut tool is usually good for seeing at what time, on which server, and which individual user was locked out, but it's probably not the tool of choice for your issue.
I would check the event logs on the machines of a few of the affected users, determine whether there is an issue with how they are handling the GPO, and find out whether any update has been pushed out to all users that may have adverse effects such as these (any update, not just MS updates).
I just unlocked the accounts, and according to the audit log events the system seems to lock other accounts about every 20 seconds. It's going in a loop here. After a while, the accounts that I unlocked become locked again.
OK, is there any service that caches old credentials on your domain?
Have there been any changes (of any type) at the domain level?
Any new updates to your virus scanner?
You don't have GPOs competing against one another, do you?
Any lab servers that may be causing issues (e.g. a server set up to validate against AD using SiteMinder)?
Have you checked the AD sync logs to make sure all DCs are in sync? (This can happen if one DC is not up to date.)
OK, is there any service that caches old credentials on your domain?
Have there been any changes (of any type) at the domain level?
Any new updates to your virus scanner?

I will check on the services. There have been no changes at the domain level that I know of, and there should not have been any. I have the latest antivirus update.
You don't have GPOs competing against one another, do you?
I'll double-check, but I don't think there are.
Any lab servers that may be causing issues (e.g. a server set up to validate against AD using SiteMinder)?
We don't have any lab servers.
Also check the event logs on one of the affected computers. I will continue to brainstorm while you check these items...
Since you already have the tool, run EventCombMT.exe and pull out the "644" events from both DCs (if that's the only event that is logged, and not 529, 675, 676, or 681). You can then see whether only one DC logs 644 events.

Check the replication between the two DCs:

"repadmin /showreps"

Has there been a recent successful replication?


SG  
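Once the 644 events have been pulled, tallying them by caller machine can show whether a handful of workstations account for most of the lockouts. The sketch below is a minimal Python example that assumes the event descriptions have been saved as plain text in the same layout as the 644 event quoted in the question; the actual EventCombMT output format may differ, and the account and machine names in `SAMPLE` are made up.

```python
import re
from collections import Counter

# Only these two fields of the 644 event description are used.
FIELD = re.compile(r"^\s*(Target Account Name|Caller Machine Name):\s*(.+?)\s*$")


def parse_lockouts(text):
    """Return (account, caller_machine) pairs found in 644 event text."""
    events = []
    account = None
    for line in text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue
        key, value = match.groups()
        if key == "Target Account Name":
            account = value
        elif account is not None:
            # "Caller Machine Name" closes the pair for the current event.
            events.append((account, value))
            account = None
    return events


def lockouts_per_caller(text):
    """Count lockout events per caller machine."""
    return Counter(machine for _, machine in parse_lockouts(text))


# Hypothetical sample in the layout shown in the question.
SAMPLE = """\
User Account Locked Out:
       Target Account Name:      alice
       Target Account ID:      EXAMPLE\\alice
       Caller Machine Name:      PC-017
       Caller User Name:      DC01$
User Account Locked Out:
       Target Account Name:      bob
       Target Account ID:      EXAMPLE\\bob
       Caller Machine Name:      PC-017
       Caller User Name:      DC01$
User Account Locked Out:
       Target Account Name:      carol
       Target Account ID:      EXAMPLE\\carol
       Caller Machine Name:      PC-042
       Caller User Name:      DC01$
"""

if __name__ == "__main__":
    for machine, count in lockouts_per_caller(SAMPLE).most_common():
        print(machine, count)
```

Running it once per DC export would show whether only one DC logs 644 events, and a machine that appears against many different accounts is a natural first candidate to inspect.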
A great article on how to use these tools is located here...

http://searchwindowsserver.techtarget.com/tip/0,289483,sid68_gci1271584,00.html
Here is a great article on troubleshooting common causes of account lockouts, Heartnet:

http://technet.microsoft.com/en-us/library/cc773155.aspx
snusgubben: Yes, I ran that command and it shows the last successful attempt was today, about 20 minutes ago.
TDKD: I will read more on the tools as well as the troubleshooting guide. I'll check the AD synch record as well.
Sounds like a good starting point Heartnet. I will also keep checking my resources.
By the way, your users are not all connecting by way of VPN are they? Or are they local to where the DC is located?
TDKD: That article does point out a couple of interesting things. We have Exchange 2003; however, we don't have any legacy apps on any servers. WINS is not enabled either.
AD syncs properly.

I found this error under System log in Event Viewer:
A Kerberos Error Message was received:
on logon session
Client Time:
Server Time: 21:27:43.0000 1/7/2009 Z
Error Code: 0xd KDC_ERR_BADOPTION
Extended Error: 0xc00000bb KLIN(0)
Client Realm:
Client Name:
Server Realm: <domain>
Server Name: host/<computername.domain>
Target Name: host/<computername.domain>@<domain>

Error Text:
File: 9
Line: ae0
Error Data is in record data.
It started today when I enabled Kerberos logging using EnableKerbLog.vbs from the tool mentioned by SG. Not sure if it has anything to do with the problem. Any clue where I can find this log? I have never checked a Kerberos log before.

TDKD: Only a few users are using VPN; almost 90% of them are local to the DC.
Ok,

Have you verified that DNS is working properly? The DCDiag utility is good for this.


DCDIAG.exe (runs checks)

netdiag /test:dns (ensures DNS is ok)

Check the security event log to see if the DC is accepting logons.

If it's in a multi-DC environment, go into Sites and Services, go into the site where the DC sits, expand the server name, then expand NTDS Settings. Check that replication objects have been created (and also try to replicate: right-click -> Replicate Now).

If it's the only DC, try to log on to it with a client!

Check the event logs for errors that indicate something isn't working.
Sorry, I didn't see this question before...

It started today when I enabled Kerberos logging using EnableKerbLog.vbs from the tool mentioned by SG. Not sure if it has anything to do with the problem. Any clue where I can find this log? I have never checked a Kerberos log before.

Usually Kerberos-related events can be found in the system log.
TDKD: I'll be away for a moment, and then I'll give the DNS check a try to see what comes up. I'll check Kerberos again. Those messages were found in the System log, however.
Also, check whether local accounts on the users' computers are locked out. I want to make sure we are not dealing with an infection of some sort.

Replication and Account Lockout
Account lockout relies on the replication of lockout information between domain controllers to ensure that all domain controllers are notified of an account's status. In addition, password changes must be communicated to all domain controllers to ensure that a user's new password is not considered incorrect. This data replication is accomplished by the various replication features of Active Directory and is also discussed in this section.
Immediate Replication
When you change a password, it is sent over Netlogon's secure channel to the PDC operations master. Specifically, the domain controller makes a remote procedure call (RPC) to the PDC operations master that includes the user name and new password information. The PDC operations master then locally stores this value.
Immediate replication between Windows 2000 domain controllers is caused by the following events:
•      Lockout of an account
•      Modification of a Local Security Authority (LSA) secret
•      State changes of the Relative ID (RID) Manager
Urgent Replication
Active Directory replication occurs between domain controllers when directory data is updated on one domain controller and that update is replicated to all other domain controllers. When a change in directory data occurs, the source domain controller sends out a notice that its directory store now contains updated data. The domain controller's replication partners then send a request to the source domain controller to receive those updates. Typically, the source domain controller sends out a change notification after a delay. This delay is governed by a notification delay. (The Windows 2000 default notification delay is 5 minutes; the Windows Server 2003 default notification delay is 15 minutes.) However, any delay in replication can result in a security risk for certain types of changes. Urgent replication ensures that critical directory changes are immediately replicated, including account lockouts, changes in the account lockout policy, changes in the domain password policy, and changes to the password on a domain controller account. With urgent replication, an update notification is sent out immediately, regardless of the notification delay. This design allows other domain controllers to immediately request and receive the critical updates. Note, however, that the only difference between urgent replication and typical replication is the lack of a delay before the transmission of the change notification. If this does not occur, urgent replication is identical to standard replication. When replication partners request and subsequently receive the urgent changes, they receive, in addition, all pending directory updates from the source domain controller, and not only the urgent updates.
When either an administrator or a delegated user unlocks an account, manually sets password expiration on a user account by clicking User Must Change Password At Next Logon, or resets the password on an account, the modified attributes are immediately replicated to the PDC emulator operations master, and then they are urgently replicated to other domain controllers that are in the same site as the PDC emulator. By default, urgent replication does not occur across site boundaries. Because of this, administrators should make manual password changes and account resets on a domain controller that is in that user's site.
The following events are not urgent replications in Windows 2000 domains:
•      Changing the account lockout policy
•      Changing the domain password policy
•      Changing the password on a computer account
•      Domain trust passwords
For additional information about urgent and immediate replication, see "Urgent Replication Triggers in Windows 2000" in the Microsoft Knowledge Base.
That was for background reading. I have to go to a work site, but I will check in with you ASAP.
SOLUTION
snusgubben

This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Good suggestion, snusgubben. I would also check the users' local event logs; they will usually state something like "unable to contact a net logon server" or "unable to contact a domain controller".
Hi Heartnet,

I was just checking in to see how it's going. Your users don't all manually map network resources, do they? I know I should have asked this from the start, but I figured you use scripts to map all your AD users' drive mappings.
Hi TDKD,
Sorry for the delayed reply. Yes, they are using a KIXTART script, set in the logon section of a GPO, that automatically maps network resources for users.
Local accounts don't appear to be locked, at least not while I'm checking them. However, I found this thread on another forum. I don't know whether it is the same issue, but it happened at almost the same time as ours:
http://forum.kaspersky.com/lofiversion/index.php/t98887.html
From the server we can access the sites mentioned there, so I doubt this is the cause.
I'm attaching the output files from the scans SG suggested for review. From what I can see, it looks like a DNS issue. These results are from yesterday, before I left the office.

netdiagtest.txt
dcdiagtest.txt
SOLUTION
This solution is only available to members.
Could you also post an "ipconfig /all" from a client PC?


SG
SG:
I don't see any time difference between the two DCs, or between a DC and a client.
This is the event log from VNSV20004 at the same time. The test was run from VNS20005.
Event Type: Warning
Event Source: W32Time
Event Category: None
Event ID: 47
Date:  1/7/2009
Time:  1:53:03 PM
User:  N/A
Computer: VNSV20004
Description:
Time Provider NtpClient: No valid response has been received from  manually configured peer vnsv20004.van.am.arcint after 8 attempts to contact it. This peer will be discarded as a time source and NtpClient will attempt to discover a new peer  with this DNS name.  
For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.

Event Type: Error
Event Source: W32Time
Event Category: None
Event ID: 29
Date:  1/7/2009
Time:  1:53:03 PM
User:  N/A
Computer: VNSV20004
Description:
The time provider NtpClient is configured to acquire time from one or more time sources, however none of the sources are currently accessible.  No attempt to contact a source will be made for 120 minutes. NtpClient has no source of accurate time.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.


 ipconfig /all from a client:
<Attached> -- FYI: <Domain> is a placeholder I purposely put in the file in place of our public domain name.

ipconfig.txt
Not sure if this helps or not: I had AD Users and Computers open on the server when my own account got locked. However, I was able to unlock my own account, and it worked fine afterwards.
Do you guys by any chance know of a tool that lets me unlock ALL users with a couple of clicks?
Never mind about the unlocking tool. It seems http://www.wisesoft.co.uk/Products/PasswordControl/ does the job.
SOLUTION
This solution is only available to members.
I would have to agree with snusgubben: your logs don't seem to be related to the issue at hand. Are your users not seeing errors on the client side when this happens? Has auditing of user accounts been enabled on the client side?
Check the Security Logs in particular...
SOLUTION
This solution is only available to members.
I found a couple of interesting facts about this locking behavior:
Around 10:30 AM, accounts start getting locked every few minutes. It gets worse around noon. With the tool above, I can see an account being locked literally every 1-2 seconds.
Around 2:00 PM it seems to slow down, and I only saw 1 user account get locked at 2:15 PM.
Not sure whether this is the behavior of a timed virus or whether the DC is replicating during this time. How do I check when a DC does its synchronization/replication/etc.?
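A quick way to check a time-of-day pattern like this is to bucket the lockout events by hour. The following is a minimal Python sketch, assuming the event dates and times have been exported as strings in the "1/7/2009 1:53:03 PM" style shown in the event log excerpts in this thread; the timestamps in `SAMPLE_TIMES` are invented for illustration and the function name is hypothetical.

```python
from collections import Counter
from datetime import datetime

# Date/time layout as it appears in the event log excerpts (English locale assumed).
EVENT_TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def lockouts_by_hour(timestamps, fmt=EVENT_TIME_FORMAT):
    """Count lockout events per hour of the day (0-23)."""
    hours = Counter()
    for stamp in timestamps:
        hours[datetime.strptime(stamp, fmt).hour] += 1
    return hours


# Hypothetical lockout times, not taken from the real logs.
SAMPLE_TIMES = [
    "1/8/2009 10:31:05 AM",
    "1/8/2009 12:00:01 PM",
    "1/8/2009 12:00:03 PM",
    "1/8/2009 12:15:40 PM",
    "1/8/2009 2:15:00 PM",
]

if __name__ == "__main__":
    counts = lockouts_by_hour(SAMPLE_TIMES)
    for hour in sorted(counts):
        # One '#' per lockout gives a rough text histogram.
        print(f"{hour:02d}:00  {'#' * counts[hour]}")
```

A sharp peak during working hours, rather than an even spread, would point toward something running on client machines while users are logged on.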
OK, I will do that. I only enabled logging on the server side. Let me enable it on the client side and see what I find. I might have to wait until tomorrow for it. I did run a couple of scans on a couple of client computers but couldn't find anything. I'm using Avira with the latest database update.
It's not a replication error. If it were, it would have been listed in the dcdiag log.

Replication is not a "scheduled task" that runs at a specific time. When an object is changed, replication occurs (that's not entirely accurate, but close; it's a little more complex).


SG
I think it would be a good idea to get hold of a network analyzer tool to analyze the traffic towards your DCs.

Since it sounds like something happens around noon, maybe you could see whether any hosts generate a lot of traffic (TCP 445, 139).


SG
That's an excellent idea, SG. Any recommendation on the tool, or can I just grab any analyzer?
SOLUTION
This solution is only available to members.
Thank you, SG. I will grab that tool and test it out tomorrow. I'll post the updates afterwards.
ASKER CERTIFIED SOLUTION
This solution is only available to members.
For the first time this week, I didn't have to unlock any user accounts. It seems removing the trojan solved the problem. Nevertheless, I'll monitor the situation on Monday, and if everything works I'll close this case.
Before that, many thanks to TDKD and Snusgubben for the help. It was helpful, and I learnt a couple of useful things for working out problems. I'll update on Monday. This is my first time on EE, so if anyone else is having a similar issue, they might want to check for the possibility of a trojan before going through the steps.
Hi Heartnet,

That's great news!! I have been in meetings on and off today and haven't had much time to check in with you :-( But I am very excited about your find; it sure sounds like that was it. Keep us posted :-)

Warm Regards,
Tony D.
Hi Tony and SG,

I'm glad to inform you that the case is closed. The culprit was that trojan. I have had no more issues today.
Thank you for the help.
Hi Heartnet,

Please reward points to SG and myself for the efforts we made, and for pointing you in the right direction (points are an acknowledgement of our hard work to assist you)
>Notice: Heartnet has requested that this question be closed by accepting Heartnet's comment #23339200 (0 points) as the solution and snusgubben's comment #23320088 (50 points), snusgubben's comment #23327286 (50 points), snusgubben's comment #23330635 (50 points), TDKD's comment #23330693 (80 points) and snusgubben's comment #23331147 (50 points) as the assisted solutions for the following reason:

I noticed this, never mind...And I am very glad your sit is all set, take good care.