Solved

Forced DC1 Demotion, now DC2 has issues

Posted on 2004-10-27
Last Modified: 2012-08-14
I had two DCs in 1 domain.

DC1 = Win2000 SP4 [FSMO roles / GC]
DC2 = Win2003

DC1 used to be a standalone DC, but about 4 months ago I brought DC2 online (using ADMT) as a companion DC.  Never gave it GC functions, just DNS, DHCP, and AD functions along with DC1.  Decided we wanted to remove DC1 and convert it to a Novell file server.  Changed DC2 to be Global Catalog and hold all FSMO roles.  Tried to gracefully demote DC1, but it wouldn't budge.  So I did DCPROMO /FORCEREMOVAL and then the metadata cleanup.  

...now I get "domain does not exist" issues and my sysvol is completely empty(!).  AD info is still available from DC2, but I have to get it by doing "connect to domain controller" in AD Users & Computers, then manually entering in DC2.  Logging now reveals that DC2 is being prevented from becoming full DC because of SYSVOL issue.  

...now my dilemma:  DC1 has been totally blown away as it relates to Active Directory (i.e., no sysvol to recover with).  It hasn't even rejoined the domain because, obviously, the domain quit working once DC1 got demoted.  DC2 appears to have the AD info, but Sysvol is in "struggle mode."  DC2 has all appropriate FSMO roles, GC roles, and DNS/DHCP roles given and verified, so that probably isn't it.   How do I make DC2 produce a SYSVOL?  

...in the meantime, we've been scrounging up our tape backups for system state info.  Might not have a recent DC1 System State because of a recent tape issue.  Any idea on how to make this happen, possibly w/o system state info?  SS info would be ideal because DC1 is still active.  Restoring SS would be a good start, but might not be possible.

...i realize there are quite a few precautions that weren't taken like always having more than 1 DC at any given time, checking File Replication event logs more closely, not just assuming I had been making system state backups, etc etc.  ...I know.  But those comments won't help me fix the problem, so trust me I'm very cognizant of these things.

Thanks and looking forward to EE's famously helpful answers.
Question by:Colebert
    15 Comments
     

    Expert Comment

    by:WeHe
     

    Author Comment

    by:Colebert
    Found a DC1 SS from 10/14.  Loaded that onto DC1 and it appears I have regained some domain functionality (e.g., logging in).  But I'm still left with the problem of DC2 and its inability to replicate.  This is now compounded by the fact that I have it thinking it's all the FSMOs while DC1 is back online as FSMOs.  I think it's probably not causing a problem yet because it can be FSMOs all it wants, but if it is prevented from even being a full DC, it ain't gonna matter.  

    What's the best way to bring DC2 back into the fold so I can try (again) to make it the sole DC (for now)?  DCPROMO it out of the AD and Domain, then ADMT it back in with DC1 as the supplying AD source?

    Expert Comment

    by:Netman66
    Seize the roles to server 2 with NTDSUTIL.

    http://support.microsoft.com/default.aspx?scid=kb;en-us;255504
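
For reference, a minimal sketch of what that seizure looks like at the ntdsutil prompt, written as a small Python helper that only prints the lines to type (it runs nothing). The role names are the ones used at the fsmo maintenance prompt on Windows 2000/2003; newer releases rename some of them. The helper name seize_session is made up for illustration, and the KB above remains the reference for the actual procedure.

```python
# FSMO role commands as typed at the ntdsutil "fsmo maintenance" prompt
# on Windows 2000/2003 (assumption: later versions rename some of these).
SEIZE_COMMANDS = [
    "seize schema master",
    "seize domain naming master",
    "seize RID master",
    "seize PDC",
    "seize infrastructure master",
]


def seize_session(target_dc):
    """Return the lines to type, in order, to seize all five roles to target_dc."""
    lines = [
        "ntdsutil",
        "roles",
        "connections",
        f"connect to server {target_dc}",
        "quit",  # leave the connections menu, back to fsmo maintenance
    ]
    lines += SEIZE_COMMANDS
    lines += ["quit", "quit"]  # leave fsmo maintenance, then ntdsutil
    return lines


# Print the session for the server that should end up holding the roles.
for line in seize_session("DC2"):
    print(line)
```

Seizing is the last resort compared with a normal transfer, so it is worth double-checking which server name goes into the connect line before typing any of it.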

    If you have SYSVOL contents now, but the server won't advertise because of SYSVOL inconsistencies, then we need to D4 that server.

    http://support.microsoft.com/default.aspx?scid=kb;en-us;315457

    Once SYSVOL starts advertising then you'll need to identify and correct (manually) all your GPOs that are in an incomplete or non-existent state.


    Author Comment

    by:Colebert
    Since I restored the system state of DC1 (the original 2000 server I was trying to demote and ended up forcibly demoting), shouldn't the best course be to just delete the AD on DC2 (the 2003 server) and rejoin it to the 2000 server?  Since the system state on DC2 says it's the sole DC w/ full FSMO roles in the domain but it isn't becoming a DC b/c of the SYSVOL/replication issues, wouldn't it be better to just forget trying to repair the AD on DC2, demote it, and promote it back against DC1?

    Also, I've got another server we were going to promote to DC3 (win2k server) after we removed DC1.  Should my first step be to set up DC3 and DCPROMO it before I do anything else, just for the sheer safety of having two DCs?


    Thanks!

    Expert Comment

    by:Netman66
    Is AD functioning properly on DC1?

    If so, you can D4-D2 the servers to make DC1 authoritative.  This way DC2 will discard SYSVOL and count on replication to pull DC1's SYSVOL in.

    If that fails, then DCPROMO out DC2 then back in - but seize back the roles to DC1 and make it a GC first.



    Author Comment

    by:Colebert
    can you clarify what happens when i change burflags to D4?  (i'm assuming that's what you're referring to.)


    AD is functioning properly on DC1.  Actually, the restored SystemState on DC1 was a SystemState from before transferring the FSMO roles to DC2.  So it had full FSMO roles as well as GC function.  DC2 (which, again, isn't really a DC because the FRS logs say it didn't get a replication, which "prevents it from becoming a DC") has those roles, too, because you can change the roles, make it a GC, etc etc, but the domain doesn't know it because the FRS is preventing it from assuming its rightful place as a DC for the domain.  

    Thats where I was before the sys state restore of DC1.  My whole domain was in complete limbo because I believed DC2 was fully functional, but in reality it was just taking the FSMO transfer and a GC role but was not in a position for any objects in the domain to link to it.

    So, as I understand it, I've got my DC1 back to a state before I made the changes, but a DC2 thats still in struggle mode because of FRS issues.  I know the DC1 is running the domain again because before i made the DC1 system state restore, all my users couldn't log in on clients except where their login info was cached (thank god for the caching or i'd have shot myself).  Now my users can log back in everywhere.  

    ..which leaves me with DC2 in AD limbo.  


    If I go the DCPROMO route on DC2, should I check the box that indicates it's the last DC in the domain?  (Since I don't want it to try to transfer out any AD info.)

    Expert Comment

    by:Netman66
    Hmmm... good questions.

    On DC1, does DC2 show up as a DC or even a member?

    If not, I would trash the DC2 installation and start over with that box.  There's way too much at stake if you force remove it off the network - it's very possible that any further AD installations will have big issues.

    I understand what you are getting at with the last DC question, but again, it's part of the original domain with that domain SID, whether DC1 knows about it or not there is effectively an identical AD installed on DC2.  Removing it forcefully while not on the production LAN might work, but I would not trust that box again in the future to provide proper AD functionality for any domain.

    My opinion - reinstall the OS on DC2 and do a full backup prior to rejoining the domain - that way you can roll back.  It shouldn't take more than a few hours to redo things.  It's time well spent up front compared to the time you might have to spend fixing a corrupted production AD.

    Author Comment

    by:Colebert
    well, i've been staggering along for the last few days without really taking any action.  network runs, but has serious issues.  

    For starters, I can't add any object to the domain at all.  If I try to add a user, I get an error about some network resource (i forget at the moment) not being available.  (i.e., there is not any available resource in a pool.)  MS indicates this is caused by the situation I indicated above (having two servers in a domain with FSMOs).  

    I don't have a system state for DC2 from prior to making it the FSMO only to find it can't replicate.  And I'd love to blow away DC2 and do a fresh install, except that it has a specialized accounting program on it that we did not have the expertise in-house to install and contracted out for, so a fresh install isn't my first choice.  

    I'm leaning towards trying to DCPROMO DC2 out of the domain and indicate its the last DC.  Then DCPROMO it back up to DC status.   I've got SysState backups of both servers now and figure worst case I just put DC2 pre-DCPROMO system state back on and try something different.  

    Any thoughts about that solution?

    Expert Comment

    by:Netman66
    I think the error you are encountering is a RID master error stating there are no more RID numbers left in the pool.  This is because the AD has no idea where the RID master is.

    I'd have to begin to think that you have a fairly significant problem right now with your AD - one that you should stop trying to guess your way through with our help.  Since I can't remote in - board policy - I can't do more without actually seeing what's happening myself.  

    If things are as critical as they appear and you cannot reinstall the new server because of the software installed - your best approach now, before it is completely trashed, is to get Microsoft PSS involved.  It's going to cost you - but it's a small price to pay to save your infrastructure.

    The more you try unsuccessful things, the worse this will get - perhaps beyond what MS can fix.

    Let us know how you make out.


    Author Comment

    by:Colebert
    worst case i have to redo my entire active directory of 40 users, 80 computers, and 4 printers.  File server is unaffected.  

    probably will just really brush up on my MSKB articles, follow some of their solutions for these problems, employ liberal system state backups, and hope it works out.

    let you know how it goes.  

    Expert Comment

    by:Netman66
    Well, if you're brave, try this.

    Using the last backup you had before adding the 2003 server (where AD was functioning) - do an Authoritative Restore of Active Directory.

    Here's a doc:

    http://support.microsoft.com/default.aspx?scid=kb;en-us;241594

    Once this is complete, your first server should be working properly.  The next step would be to remove AD from the 2003 box - while OFF the LAN, but attached to a hub or switch so the network stack initializes during the boot.  Do a DCPROMO /forceremoval:  http://support.microsoft.com/default.aspx?scid=kb;en-us;332199

    Let me know.


    Author Comment

    by:Colebert
    everything seems to come back to this file replication issue.  

    i killed all the replication links from DC1 to DC2.  Then did a metadata cleanup on DC1 to remove all instances of DC2.  Then I DCPROMOed DC2 out of the domain.   Finally, I restarted DC1 and voilà, I can now join objects to the domain.  No more RID Master Pool errors.

    ...Then I took a fresh 2k machine and joined it to the AD.  Still no file replication across the servers.  REPADMIN /SHOWREPS reports everything a success.  EventLogging says otherwise.

    ...then everything completely fell apart.  My domain went back to being nonexistent and nothing can join the domain again.  Can't automatically connect to it in AD Domains&Trusts.  Says no domain exists.  DC1 says it can't find the global catalog again, and DCDIAG says no server is advertising, but most everything else completes successfully.  
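
To keep track of which dcdiag tests are failing between attempts, the summary lines can be filtered with a few lines of Python. This is only a sketch: the sample text below is illustrative (not captured from the servers in this thread), and it assumes the usual summary shape of a run of dots, the server name, then "passed test" or "failed test" and the test name.

```python
import re

# Assumed shape of a dcdiag summary line:
#   ......................... DC1 failed test Advertising
RESULT_RE = re.compile(r"^\s*\.+\s+(\S+)\s+(passed|failed)\s+test\s+(\S+)\s*$")


def failed_tests(dcdiag_output):
    """Return (server, test_name) pairs for every 'failed test' summary line."""
    failures = []
    for line in dcdiag_output.splitlines():
        match = RESULT_RE.match(line)
        if match and match.group(2) == "failed":
            failures.append((match.group(1), match.group(3)))
    return failures


# Illustrative sample only, not real output from this domain.
SAMPLE = """\
   ......................... DC1 passed test Connectivity
   ......................... DC1 failed test Advertising
   ......................... DC1 passed test Replications
"""

for server, test in failed_tests(SAMPLE):
    print(f"{server} is failing {test}")
```

Saving the dcdiag output to a text file after each change and running it through something like this makes it easier to see whether the Advertising failure actually went away.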

    Gonna probably restore SS on DC1 and start over.  This time not joining any server to the domain and just leave it where everything was working.  Then make a new test domain out of DC3 (a new 2k box) and get it ready for production.  Then manually move over all the users and turn it into a production box.  Join DC2 to DC3 and blow away DC1, as originally intended, for my new novell server.

    thoughts?
     
     

    Expert Comment

    by:Netman66
    System state restore is not enough.  Do an Authoritative AD restore - then report back what you see.


    Author Comment

    by:Colebert
    fixed the problem.

    i always sort of thought that my sysvol was damaged, corrupt, or (at the very least) 'just ain't right.'   but since i didn't understand it well enough, I didn't want to mess with it.  I tried everything under the sun to get replication working because it was apparent that was the main and original problem independent of my foolish /FORCEREMOVALs, etc.  I went back a year through my file replication logs on DC1 and found that everything went swimmingly until one day back in 2003 it started throwing up warnings.  

    I found this article (315457) on MS about repairing SYSVOLs: http://support.microsoft.com/default.aspx?kbid=315457.  

    It says to set the BURFLAGS key in your master DC's HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Cumulative Replica Sets\[Specific GUID] to D4 (from D0).  Then I restarted the FRS service in the services list.  The event log showed that windows completely rebuilt my sysvol from given active directory info.  Then after a minute or two, the AD log showed that it had kicked in as a Global Catalog server.  Ran a dcdiag and the ADVERTISING failure went away.
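
For anyone who would rather script the change than click through regedit, here is a rough sketch that only builds the command lines (it does not run them). It assumes reg.exe is present on the DC, takes the replica set GUID as a parameter since it differs per domain, and writes the values in decimal (0xD4 = 212, 0xD2 = 210). The function name burflags_commands is made up for illustration, and the stop/set/start order is my reading of the KB article rather than exactly what I clicked through.

```python
# Registry key that holds one subkey per FRS replica set.
NTFRS_KEY = (
    r"HKLM\SYSTEM\CurrentControlSet\Services\NtFrs"
    r"\Parameters\Cumulative Replica Sets"
)

BURFLAGS_AUTHORITATIVE = 0xD4      # this DC's SYSVOL becomes the source copy
BURFLAGS_NONAUTHORITATIVE = 0xD2   # this DC discards its copy and re-syncs


def burflags_commands(replica_set_guid, value):
    """Build (but do not run) the commands to set BurFlags for one replica set."""
    key = NTFRS_KEY + "\\" + replica_set_guid
    return [
        ["net", "stop", "ntfrs"],
        ["reg", "add", key, "/v", "BurFlags",
         "/t", "REG_DWORD", "/d", str(value), "/f"],
        ["net", "start", "ntfrs"],
    ]


# "<replica-set-GUID>" is a placeholder; look up the real subkey name on the DC.
for command in burflags_commands("<replica-set-GUID>", BURFLAGS_AUTHORITATIVE):
    print(" ".join(command))
```

The printed reg line would still need quotes around the key when pasted into a command prompt, because the path contains spaces.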

    ...the only thing left was to get DC2 and DC3 replicating with DC1.   So I changed their HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Cumulative Replica Sets\[Specific GUID] to D2 (not D4!) and restarted their FRS services.  They started replicating with DC1 and immediately became legitimate domain controllers as well.  (the replication issue has kept them from ascending to working DCs.)
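
The ordering is the part that is easy to get wrong, so here it is as a tiny sketch: exactly one DC gets D4, it goes first, and every other DC gets D2. The function name and the server names passed in are just for illustration.

```python
def sysvol_restore_plan(authoritative_dc, other_dcs):
    """Return (server, BurFlags) steps: the single D4 first, then D2 elsewhere."""
    others = list(other_dcs)
    if authoritative_dc in others:
        # A DC cannot be both the source (D4) and a re-syncing partner (D2).
        raise ValueError("the authoritative DC must not also be listed for D2")
    return [(authoritative_dc, "D4")] + [(dc, "D2") for dc in others]


for server, flag in sysvol_restore_plan("DC1", ["DC2", "DC3"]):
    print(f"{server}: set BurFlags to {flag}, then restart FRS")
```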

    I kept a watch all day on the AD and FRS event logs for errors.  Restarted all the DCs and things kept working fine.  Since the whole point of this was to down DC1 and turn it into a Novell server, at the end of the day I shut it down completely and plan to leave it off all weekend since I made DC2 the FSMO master and both DC2 and DC3 GCs.  It should be noted that I had to add myself as a SCHEMA ADMIN to transfer the schema master role and ENTERPRISE ADMIN to transfer the rest.

    finally, I had a little OU consisting of 15 lab computers operating under a semi-kiosk style group policy.  after the SYSVOL restore, its group policy got deleted entirely.  It still showed up in the GPMC, but upon trying to edit it, it said it didn't exist.   So keep that in mind as a possible issue of a D4/D2 SYSVOL restore.

    ...all for now.  

    Thanks Netman!



    Accepted Solution

    by:Netman66
    Glad to see you got it working - now, if you look back in the posts to : Date: 10/28/2004 02:07AM ADT, you'll see that I mentioned this was something to do back then.

    "If you have SYSVOL contents now, but the server won't advertise because of SYSVOL inconsistencies, then we need to D4 that server.

    http://support.microsoft.com/default.aspx?scid=kb;en-us;315457

    Once SYSVOL starts advertising then you'll need to identify and correct (manually) all your GPOs that are in an incomplete or non-existent state."


    In a sense it's good you went through the motions because (IMHO) that's how people learn and actually retain what they've learned - by fixing issues directly.  I can tell you how to do something until I'm blue in the face, but until you actually have to use it you won't retain it.

