If you've spent any time administering Active Directory, you've probably come across the concept of Flexible Single Master Operations (FSMO) roles. Their introduction is arguably one of the most important but misunderstood changes to Active Directory in the last ten years.
Take a trip down memory lane
If you worked with Windows NT, you may recall the Primary Domain Controller (PDC) and Backup Domain Controller (BDC) concept. The directory was structured such that every DC, whether a PDC or a BDC, had a copy of the directory database, but only the PDC could make changes to that database. The model was inefficient, negatively impacted growth and desperately needed improving if the product had any chance of surviving.
Enter Windows 2000. The Directory Service went through one of its largest scale rebuilds to date. Replication and management were significantly improved and the concept of a multi-master directory was introduced. Although this design has been tweaked over the years, it has remained fundamentally the same through the versions - because it works. Any DC in the domain can execute virtually any update to the directory. This scales beautifully, even on large, geographically dispersed networks with many thousands of users.
However, notice I said virtually any change. Since a change can take effect at any DC, there is the possibility that a conflicting change will be made in two locations concurrently - or before replication can occur. Active Directory must ensure these situations are accounted for. In most cases, it applies its complex Multimaster Conflict Resolution Policy, which essentially says the last change wins. However, there are several procedures which simply cannot be allowed to conflict; these procedures are assigned to one of the five FSMO roles, which go on to be delegated to one or more Domain Controllers.
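To make "the last change wins" a little more concrete, here is a minimal Python sketch of how two conflicting writes to the same attribute might be compared. It assumes the commonly documented tie-breaking order (version number first, then timestamp, then the GUID of the originating DC); the class and function names are purely illustrative and are not part of any Active Directory API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AttributeStamp:
    """Illustrative replication metadata for one attribute value."""
    version: int           # bumped on every originating write
    changed_at: datetime   # time of the originating write
    originating_dc: str    # GUID of the DC where the write was made
    value: str


def resolve_conflict(a: AttributeStamp, b: AttributeStamp) -> AttributeStamp:
    """Pick the winner: highest version, then latest time, then highest GUID."""
    return max(a, b, key=lambda s: (s.version, s.changed_at, s.originating_dc))


if __name__ == "__main__":
    # Two admins edit the same attribute on different DCs before replication.
    london = AttributeStamp(3, datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc),
                            "11111111-aaaa", "Sales")
    tokyo = AttributeStamp(3, datetime(2024, 1, 1, 9, 5, tzinfo=timezone.utc),
                           "22222222-bbbb", "Marketing")
    print(resolve_conflict(london, tokyo).value)  # Marketing
```

Every DC applying the same deterministic comparison arrives at the same winner, which is why most updates can safely be multi-master. The FSMO roles exist for the operations where "pick a winner afterwards" is not good enough.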
What are the FSMO roles?
There are nominally five roles present in the directory which reside on DCs nominated specifically by the Administrator to perform these tasks. All the roles are very important and constitute a single point of failure in all Active Directory enterprises. If you have a complex topology with more than one domain, some roles are domain-specific, so you can expect to have duplicates of some roles in every domain in the enterprise.
Domain Naming Master: This role exists once per forest - in the forest root domain - and is rarely used. It is responsible for making changes to the Partitions container in the Configuration naming context at the root of the forest (CN=Partitions,CN=Configuration,DC=company,DC=com). The aforementioned path is the location of the objects used to identify child domains, application partitions and external cross-references (links out to other LDAP directories). Edits are strictly controlled and limited by FSMO role placement to one DC to ensure no objects of this type are duplicated with the same name.
For example, when using the Domain Controller Promotion tool (dcpromo.exe) to create a new child domain or domain tree, the process will contact the Domain Naming Master role holder in the forest root domain to determine whether the domain name provided is unique. Similarly, if demoting the last domain controller in a child domain, the Domain Naming Master will again be contacted to clean the metadata from Active Directory.
Infrastructure Master: If a user from a foreign domain within the same forest is added as a member of a compatible group in another domain, the DCs in the group’s domain must have some information about that user in their local databases in order to update the member attribute of the group. To do this, each DC adds a special record to its database called a phantom, which contains only the foreign user’s security identifier (SID), globally unique identifier (GUID) and distinguished name (DN). Like all objects in the database, this record is given a distinguished name tag, or DNT, an internal reference used solely in the low-level Active Directory database layer. In doing this, the directory service is able to add that user as a member of the group by referring to the phantom’s DNT, just like it would refer to a user’s own DNT if you added a user from the group’s own domain to the group.
That’s very clever, but what if something about the source user in their original domain changes? If the user is renamed, moved or deleted, the phantom in the group domain DC databases would lose its referential integrity with the source domain. This is a situation the infrastructure master aims to avoid. On a periodic basis (by default, every 2 days), the infrastructure master – an FSMO role present in every domain – compares its local database to a Global Catalog (GC) server to determine whether any changes have been made to the objects the phantoms were created to represent. A GC contains a partial replica of all objects in the forest, so replication means any GC would already know about this updated data. The phantom is then updated with new values or deleted from the domain’s database if the object has been removed from its source domain.
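The periodic comparison described above can be sketched in a few lines of Python. This is only a conceptual model under simplifying assumptions: phantoms are held in a dictionary keyed by GUID, and the Global Catalog is represented by a plain GUID-to-DN mapping. `Phantom`, `reconcile_phantoms` and `gc_lookup` are hypothetical names, not real directory service calls.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Phantom:
    """Minimal stand-in for a phantom record: GUID, SID and DN only."""
    guid: str
    sid: str
    dn: str


def reconcile_phantoms(phantoms: Dict[str, Phantom],
                       gc_lookup: Dict[str, str]) -> List[str]:
    """Refresh phantom DNs from GC data and drop phantoms whose source is gone.

    gc_lookup maps an object's GUID to its current DN as known to a GC.
    Returns a list of the actions taken, for illustration.
    """
    actions = []
    for guid in list(phantoms):          # copy keys: we may delete while looping
        current_dn = gc_lookup.get(guid)
        if current_dn is None:
            del phantoms[guid]           # source object was deleted
            actions.append(f"deleted {guid}")
        elif current_dn != phantoms[guid].dn:
            phantoms[guid].dn = current_dn   # source was renamed or moved
            actions.append(f"updated {guid}")
    return actions
```

The GUID is used as the lookup key because it is the one value that survives a rename or move of the source object.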
In a multi-domain forest, you must either locate this role on a Domain Controller which is not a Global Catalog or, if you must locate the role on a GC, ensure all DCs in that particular domain are GCs. A GC will never create phantoms because it already knows about users from other domains. If the infrastructure master is a GC, there will never be any phantoms in its local database to compare with the global catalog data, so no updates will be made, but other non-GC DCs in the domain would gradually become outdated. If all DCs in the domain are GCs, or you only have a single-domain forest, every DC knows enough about the security principal that it does not need to create a phantom, so this role is essentially redundant.
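The placement rule in the paragraph above reduces to a small decision function. The following Python sketch encodes it for a single domain; the function name and the `dc_is_gc` mapping (DC name to "is this DC a Global Catalog?") are assumptions made for the example.

```python
from typing import Dict


def infrastructure_master_placement_ok(dc_is_gc: Dict[str, bool],
                                       infra_master: str,
                                       multi_domain_forest: bool) -> bool:
    """Check Infrastructure Master placement for one domain.

    dc_is_gc maps each DC in the domain to whether it is a Global Catalog.
    """
    if not multi_domain_forest:
        return True                  # single domain: the role is effectively redundant
    if not dc_is_gc[infra_master]:
        return True                  # role holder is not a GC: recommended placement
    return all(dc_is_gc.values())    # holder is a GC: only fine if every DC is a GC
```

For example, `infrastructure_master_placement_ok({"DC1": True, "DC2": False}, "DC1", True)` returns `False`, because DC2 would hold phantoms that never get refreshed.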
Schema Master: As the name suggests, this role is the Master of the Schema, the information which contains the formal definitions of how Active Directory stores objects, what attributes are available on those objects and so on. This role exists once per forest, on a DC in the forest root domain. Any updates to the Schema must be tightly controlled, so one DC delegated as the Schema Master performs all such changes to the database. Schema updates are then replicated to other DCs on the network by standard Active Directory replication.
So far, three of the five roles have been covered. Those above are the ones I would consider the least critical FSMO roles in the forest. If you lose the DC holding one or more of these roles, it's no big deal -- it may prevent a network administrator taking an action, but it will not impact the usability of the network. Losing the Domain Naming Master or Schema Master would create problems when creating child domains or running schema updates, but these tasks occur very rarely, and checking that the relevant operations master DC is up would be part of the planned engineering works. Similarly, losing the Infrastructure Master may cause integrity issues in the database, but given that it only runs its scan every two days in the first place, a day or two of outage will not generally cause an issue.
RID Master: This role is one of the two which are important to the daily operation of Active Directory. Under the glossy GUI of Windows, security principals are identified and differentiated by use of two values - a Security Identifier (SID) and a Globally Unique Identifier (GUID).
A SID is an alphanumeric string which is unique throughout a forest. The SID is the actual value used internally by Windows to identify users and grant access to resources using Discretionary Access Control Lists (DACLs), for example, via the 'Security' tab on a file or directory. Have you ever deleted a user, recreated her, then wondered why she cannot access the same files and folders, despite having the same username? The new account would have a new SID and is therefore considered an entirely different security principal to the system.
Contrary to popular belief, the username, distinguished name or full name of a user are not internal tracking mechanisms within Windows as all these values could change.
The standard makeup of a SID might be as follows (this SID is purely random):
S-1-5-21-789336058-1123561945-725345543-10823
The nature and formation of a SID is beyond the scope of this article, but it is the very last component (in this instance, 10823) we are interested in. This figure represents a Relative Identifier (RID), an incremental value which actually makes the SIDs unique within a domain, ensuring no two users conflict in the database. When a security principal (user, computer, group etc.) is created, the domain SID (in this instance S-1-5-21-789336058-1123561945-725345543) has the next available RID appended to the end.
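As a quick illustration of the split described above, this Python sketch separates a SID string into its domain portion and its RID, using the example values from the text. `split_sid` is a hypothetical helper written for this article, not a Windows API, and it does only very light validation.

```python
from typing import Tuple


def split_sid(sid: str) -> Tuple[str, int]:
    """Split a SID string into (domain SID, RID)."""
    domain_sid, _, rid = sid.rpartition("-")
    if not domain_sid.startswith("S-") or not rid.isdigit():
        raise ValueError(f"not a recognisable SID string: {sid!r}")
    return domain_sid, int(rid)


if __name__ == "__main__":
    domain_sid, rid = split_sid("S-1-5-21-789336058-1123561945-725345543-10823")
    print(domain_sid)  # S-1-5-21-789336058-1123561945-725345543
    print(rid)         # 10823
```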
Each Domain Controller is initially allocated a pool of 500 RIDs. As security principals are created, RIDs are used up. The allocation of RIDs to DCs is a task delegated in the RID Master FSMO role to one DC in a domain. Placing the operation in an FSMO role ensures no DC obtains a duplicate RID pool, which would eventually lead to conflicts in SID values and a major problem in terms of SID-uniqueness within the domain.
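To show why a single allocator avoids duplicate pools, here is a toy Python model of the RID Master handing out blocks of 500 RIDs. It is a sketch only: the `RidMaster` class, the starting value of 1000 and the DC names are assumptions for illustration, and real pool requests involve replication and thresholds that are not modelled here.

```python
from typing import Dict, List


class RidMaster:
    """Toy model: one allocator hands out non-overlapping RID pools."""

    def __init__(self, first_rid: int = 1000, pool_size: int = 500) -> None:
        self._next_rid = first_rid      # illustrative starting point
        self._pool_size = pool_size     # 500 per pool, as described in the text
        self.allocations: Dict[str, List[range]] = {}

    def allocate_pool(self, dc_name: str) -> range:
        """Give dc_name the next unused block of RIDs."""
        pool = range(self._next_rid, self._next_rid + self._pool_size)
        self._next_rid += self._pool_size
        self.allocations.setdefault(dc_name, []).append(pool)
        return pool


if __name__ == "__main__":
    master = RidMaster()
    print(master.allocate_pool("DC1"))  # range(1000, 1500)
    print(master.allocate_pool("DC2"))  # range(1500, 2000)
    print(master.allocate_pool("DC1"))  # range(2000, 2500)
```

Because only one object advances the counter, no two pools can overlap. If every DC kept its own counter, two DCs could mint the same RID before replication caught up.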
PDC Emulator: This is the most complicated and least understood role, for it runs a diverse range of critical tasks. It is a domain-specific role, so exists in the forest root domain and every child domain. Its original conception was for backwards compatibility with legacy systems, such as Windows NT BDCs. However, the role is also responsible for keeping the domain time in sync, given that the DC holding this role in the forest root domain is the most authoritative time source in the forest. Password changes and account lockouts are immediately processed at the PDC Emulator for a domain, to ensure such changes do not prevent a user logging on as a result of multi-master replication delays, such as across Active Directory sites.
It should be noted that the PDC Emulator does not act in the same fashion as a PDC on a Windows NT network. Cast your eye back to the top of this article and note the section regarding a multi-master directory -- for multi-master aware applications, most updates can be made at any DC on the network. However, if an application (or Operating System) is not multi-master aware, the PDC Emulator acts as if it were the PDC on the Windows NT network. One of these older applications would most probably single out the PDC Emulator and write all its changes there.
The latter two roles are much more crucial to the daily operation of the network and could very quickly become a limiting factor in its growth, usability or even the logon process if the DC(s) holding the roles are offline for any period of time. If the RID Master is lost, impact will only be felt by the Network Administrator if a DC depletes its pool of RIDs. On busy networks, this could potentially occur in a matter of days through the creation of new security principals. However, loss of the PDC Emulator could directly affect your users -- you'd better have a substantial help desk ready for a spike in call volume if this DC is down for an extended period of time. For example, with the most authoritative source of time unavailable, time skew could eventually occur between DCs and computers in the enterprise and/or domain, lending itself to Kerberos authentication errors and ultimately, failed logons. While it would not be an immediate issue to take this server offline (provided you do not have any legacy applications), this would be the role I would be most concerned about in the event of a DC failure.
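The time skew problem mentioned above comes down to a simple comparison. The Python sketch below assumes the default Kerberos clock skew tolerance of five minutes; the function name is illustrative, and your domain policy may set a different limit.

```python
from datetime import datetime, timedelta, timezone

# Default maximum tolerance for clock synchronisation in Kerberos policy.
MAX_SKEW = timedelta(minutes=5)


def within_kerberos_skew(client_time: datetime, dc_time: datetime,
                         max_skew: timedelta = MAX_SKEW) -> bool:
    """Return True if the two clocks are close enough for Kerberos to work."""
    return abs(client_time - dc_time) <= max_skew


if __name__ == "__main__":
    dc = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(within_kerberos_skew(dc + timedelta(minutes=3), dc))  # True
    print(within_kerberos_skew(dc - timedelta(minutes=7), dc))  # False
```

Clocks drift slowly, which is why losing the authoritative time source is not an immediate problem but becomes one if the outage is prolonged.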
If you are still reading, well done! This article covers several aspects of Active Directory in detail, including low-level database processes unseen at the surface - particularly via the GUI. However, FSMO roles are a crucial component of your deployment -- having an understanding of the underpinning concepts will help with their placement, deployment and high availability concerns within your enterprise.
I have a question.
Let's say we have 2 sites, SITE A and SITE B.
The PDC emulator in Site A goes down.
A user in Site A asks the helpdesk to change their password.
The helpdesk person, seeing that the PDC emulator in Site A is down, tries to reset the password on a DC in Site B - hoping replication will communicate the new password.
The user tries the new password in Site A - and cannot log in.
My question is;
a) What happens when the PDC emulator in site A comes-up, will take the previous password / or take the new one from Site B.
b) Is there anyway we can avoid PDC emulators from going-down. What do you suggest for business continuity purposes.
c) Is there some configuration required of InterSite Links (DEFAULTIPSITELINK) or does the AD take care of it by itself.
Thanks for a really comprehensive article.
>> What happens when the PDC emulator in site A comes-up, will take the previous password / or take the new one from Site B.
If the PDCe is offline, the password change cannot be replicated to it. In this case, a changed password will be replicated around other DCs by standard non-urgent replication, which could cause a delay in the user using the updated password due to replication latency between sites. You would need to be careful in circumstances without a functioning PDCe that you make the password change on a DC in the same site as the user; if you don't, there is no PDC to act as, if you like, the "arbitrator", and it could take several hours for the change to reach the user's site depending on your replication topology.
If your site links are good and across strong connections, you could potentially enable change notifications between sites to have changes replicated without the replication delay. This is not something you should enable without careful planning and consideration of the bandwidth and connection reliability requirements though.
When the PDCe is back online, the update sequence number (USN) of the change will be used by standard Active Directory replication to ensure the PDCe replicates the new password back from another Domain Controller which already has the new password. The PDCe certainly won't come back online and replicate the old password out to Domain Controllers.
>> Is there anyway we can avoid PDC emulators from going-down. What do you suggest for business continuity purposes.
A PDCe is just as liable to go down as any other server, but you can take some standard steps to protect yourself: use RAID in some sort of redundancy configuration (RAID 1/10 etc), redundant power supplies, ECC RAM and uninterruptible power supplies (UPS) to protect from the most likely things which are going to fail. However, if you do lose a PDCe, it's not THAT big a deal; hopefully you don't run any other roles on that server, so you can simply write the box off, seize the operations role(s) it held and run a metadata cleanup of the old DC to remove it from the network. Just make very sure you don't bring the failed box back online without formatting it first.
>> Is there some configuration required of InterSite Links (DEFAULTIPSITELINK) or does the AD take care of it by itself.
If you want to enable features such as change notifications between sites -- that is, to replicate changes to other sites with the same frequency as they replicate within a site by intra-site replication -- you would need to enable that on your IP site links.
Otherwise, no, you don't need to worry about this. Passwords will ALWAYS be replicated to the PDCe if it is available, no matter what the configuration on the site links is. Note that Domain Controllers can be prevented from checking with the PDCe when there is a password mismatch, but this is not a default configuration.
Thanks for the education :-)
Him: You're the president, I'm the vice president.
Him: You have all the stuff you can do (the FSMO roles) that I can't.
Him: BANG! You're dead, and I get to take over.
Me: Close - what allows the vice-president to take over?
Him: He gets sworn in
Me: Correct - and that's what seizing the roles does - it swears in another domain controller as the FSMO role holder.
Him: So, Nixon leaving office was a role transfer?
Me: And Kennedy being killed was a role seizure.
I have to hand it to him - it's brilliant.