Solved

DFS and remote sites - part II

Posted on 2004-03-28
Last Modified: 2010-05-18
This thread was born of http://www.experts-exchange.com/Operating_Systems/Windows_Server_2003/Q_20931263.html#10699678
 
It was created because the original focus of the thread has shifted, and a lot of time and effort has already gone into the previous thread.  I felt that it warranted a new thread...
Question by:theamzngq
32 Comments
 
LVL 51

Expert Comment

by:Netman66
I'll be with you shortly.
 
LVL 51

Expert Comment

by:Netman66
Post the errors you're getting.

As for the third-party app - I'm not following you.  I think you can restore to the pre-staging area and start DFS; it will then use the pre-stage area to fill the real DFS share.  However, it still verifies every file against the root before it moves it from pre-stage.

 
LVL 2

Author Comment

by:theamzngq
The article mentions restoring from NTbackup or a comparable 3rd party backup solution as a way of pre-staging the DFS shared folder.  Then, DFS will move the restored backup to the pre-existing folder, and then once it gets the MD5 checksum, will choose to move files from the pre-existing folder if the checksum matches.
 
LVL 51

Expert Comment

by:Netman66
Didn't the replication complete the other day?

 
LVL 2

Author Comment

by:theamzngq
I'm sorry, I am speaking in reference to Server3!  I wanted to add it to the DFS root (having previously removed it), but the frs-staging folder filled up the C drive, causing the server to act 'weird': not printing, not responding to IIS stuff, etc.  I was looking for a way to move the staging folder to the drive that has enough space, or a way to pre-populate the shared folder, i.e., restore from backup.

Sorry for the confusion....

It turns out that my attempt to restore from backup and start the process over has failed again.  The frs-staging folder has filled up the C drive again, causing the aforementioned issues.  Dang.
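
One way to catch this before the drive fills is to watch free space on the volume that holds the staging folder. The following is a minimal Python sketch that knows nothing about FRS itself; the function name `low_on_space` and the 10% threshold are illustrative assumptions, not anything from this thread.

```python
import shutil


def low_on_space(path, min_free_fraction=0.10):
    """Return True when the volume holding `path` has less free space
    than `min_free_fraction` of its total size."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free < min_free_fraction * usage.total


if __name__ == "__main__":
    # "." is a placeholder; point it at the drive that holds the staging folder.
    print("low on space:", low_on_space("."))
```

Run on a schedule, a check like this would only give early warning; it does not move or shrink the staging folder.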
 
LVL 51

Expert Comment

by:Netman66
I have to ask this...both Server2 and Server3 are at the same location - correct?

Forget Server3 as a DFS replica - it's not worth the effort since they're both on the same LAN.  You can normally bring up a dead server fairly quickly unless it's toast.  Failing that, a backup of the DFS folder can be restored manually to Server3 in a crisis.

 
LVL 2

Author Comment

by:theamzngq
Yes, that's correct, they are on the same LAN.  

The full restore of all our data that I did to Server3 just recently took over 5 hours.  5.5 hours x 100 people x about $125/hour/person = over $68,000 that we cannot bill to clients for the time it takes to restore all the data.  My only goal was to give us some redundancy and immediate failover in the case that our ever-getting-older main fileserver bites it.  One thing the boss hates the most is wasted time...
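
The arithmetic above can be checked with a short Python sketch. The function name `lost_billing` is made up for illustration; the inputs are the figures quoted in this comment.

```python
def lost_billing(hours_down, people, rate_per_hour):
    """Unbillable cost of an outage: hours x people x hourly rate."""
    return hours_down * people * rate_per_hour


if __name__ == "__main__":
    # 5.5 hours, 100 people, about $125 per hour per person
    print(lost_billing(5.5, 100, 125))  # 68750.0
```

That works out to $68,750, consistent with the "over $68,000" estimate.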

I'm just about to take Irvine off-wire and change its IP....
 
LVL 2

Author Comment

by:theamzngq
Once I move Irvine into the Irvine Site, what should the DNS settings on Irvine and on the client machines in Irvine be?
 
LVL 51

Expert Comment

by:Netman66
DNS for Irvine server should be itself with the ISP as a Forwarder.

All clients there should point to Irvine only.


 
LVL 2

Author Comment

by:theamzngq
As far as the DNS for Irvine as you mentioned before, since Irvine should update its own DNS entries, does that mean I shouldn't go in and manually update them in Server2's DNS?
 
LVL 11

Expert Comment

by:ewtaylor
I think if you require zero downtime then you should look into an active/passive cluster solution. http://www.microsoft.com/technet/prodtechnol/windows2000serv/evaluate/featfunc/clustovw.mspx
 
LVL 2

Author Comment

by:theamzngq
That would be nice, and actually is where I would like to head.  One of the prevailing reasons for that is that I am developing a Cold Fusion based intranet site which will eventually become the nerve center of our business operations, serving up all pertinent data like client info, billing, payroll, timesheets, invoicing, memos, etc.  Once it gets to the point where it is indispensable, it would be nice to have a cluster solution so that if one web server goes down, the system will be able to keep on humming.
 
LVL 51

Expert Comment

by:Netman66
Irvine should - in the perfect world.

You can let it try and just monitor it to confirm whether it has or has not.  At least you know what to look for now!

Clustering is an idea, but keep in mind you need the Enterprise versions of the OS (either W2K Adv. Server or 2003 Enterprise) to cluster without a third party tool.

Also, in clustering you'll need a shared data array - something I was hinting at earlier.  You could certainly start there.

 
LVL 2

Author Comment

by:theamzngq
What kind of device is a shared data array?  Is it a SCSI kind of thing, or a NAS kind of thing?  Are there any performance issues?
 
LVL 11

Expert Comment

by:ewtaylor
The one I have is a shared SCSI bus that connects to both computers. They run a continuous ping on the second, private NIC; if there is no reply, the software fails control of the SCSI array over to the inactive cluster node and activates it. I was running this with Windows NT 4.0 and MSCS 1.0 and had to work out a few bugs. I was actually running it on a file server and had another one running for Exchange, and after getting the initial bugs worked out they worked really well. Windows 2000 and 2003 both have clustering technology built in (from Advanced Server up).
 
LVL 2

Author Comment

by:theamzngq
OK, Irvine is up and running.  I have NetSupport access to it on its new IP, 192.168.111.2.  What should I check first?  Shall I do the Sysvol/scripts test you mentioned first, or should I wait for Event Viewer to give me a 13509?
 
LVL 51

Expert Comment

by:Netman66
For the shared storage - we have a Fibre channel SAN with a dedicated fibre switch - but you don't need to go that high end.

You'll definitely need to use a SAN that is capable of being accessed from two servers.


Now, with respect to the testing - do the SYSVOL test to make sure the AD replication is happening.  Also, try creating a user from Irvine and see if it replicates to Server2.

 
LVL 51

Expert Comment

by:Netman66
Check DNS on Server2 for Irvine and check Irvine's DNS for Irvine.

 
LVL 2

Author Comment

by:theamzngq
I have checked the DNS on both, and they both reflect the correct addresses.  I created a text document in the winnt\sysvol\sysvol\wse.com\scripts folder.  Repadmin /showreps shows that replication inbound from Server2 was good:

C:\Documents and Settings\Administrator.WSE>repadmin /showreps
Irvine\IRVINE
DC Options: IS_GC
Site Options: (none)
DC object GUID: 51f814c3-f364-482a-8553-72a476a41261
DC invocationID: ba8b3fc4-dd78-4614-8bf1-0e933e7450e5

==== INBOUND NEIGHBORS ======================================

DC=wse,DC=com
    Vegas\SERVER2 via RPC
        DC object GUID: 6233f4eb-40c9-47a7-9096-2f1e88d0c8b1
        Last attempt @ 2004-03-30 12:13:47 was successful.

CN=Configuration,DC=wse,DC=com
    Vegas\SERVER2 via RPC
        DC object GUID: 6233f4eb-40c9-47a7-9096-2f1e88d0c8b1
        Last attempt @ 2004-03-30 12:13:47 was successful.

CN=Schema,CN=Configuration,DC=wse,DC=com
    Vegas\SERVER2 via RPC
        DC object GUID: 6233f4eb-40c9-47a7-9096-2f1e88d0c8b1
        Last attempt @ 2004-03-30 12:13:47 was successful.

But, the file (which was there before then) did not appear on Irvine in the same location.  What do you think?
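
Output like the above can also be checked mechanically. The following is a rough Python sketch that scans saved `repadmin /showreps` text for the "Last attempt" lines; it assumes only the layout shown in this post, and the function name `parse_showreps` is made up for illustration. Note that it reports on directory replication only; it says nothing about whether FRS has copied the SYSVOL files.

```python
# Inbound-neighbor section copied from the output above.
SAMPLE = r"""
DC=wse,DC=com
    Vegas\SERVER2 via RPC
        DC object GUID: 6233f4eb-40c9-47a7-9096-2f1e88d0c8b1
        Last attempt @ 2004-03-30 12:13:47 was successful.

CN=Configuration,DC=wse,DC=com
    Vegas\SERVER2 via RPC
        DC object GUID: 6233f4eb-40c9-47a7-9096-2f1e88d0c8b1
        Last attempt @ 2004-03-30 12:13:47 was successful.

CN=Schema,CN=Configuration,DC=wse,DC=com
    Vegas\SERVER2 via RPC
        DC object GUID: 6233f4eb-40c9-47a7-9096-2f1e88d0c8b1
        Last attempt @ 2004-03-30 12:13:47 was successful.
"""


def parse_showreps(text):
    """Return a list of (naming_context, partner, succeeded) tuples."""
    results = []
    context = partner = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not raw[0].isspace() and "DC=" in line:
            context = line  # unindented line naming a directory partition
        elif line.endswith("via RPC"):
            partner = line[: -len("via RPC")].strip()
        elif line.startswith("Last attempt @"):
            results.append((context, partner, "was successful" in line))
    return results


if __name__ == "__main__":
    for context, partner, ok in parse_showreps(SAMPLE):
        print("OK  " if ok else "FAIL", partner, context)
```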
 
LVL 2

Author Comment

by:theamzngq
I created a user on Irvine, which did eventually appear on Server2, but it took a couple of cycles.  Still no test files in sysvol appearing on Irvine.

Irvine's event viewer just came up with a couple of 13508's, one for sysvol, and one for datastore
 
LVL 2

Author Comment

by:theamzngq
Looks like the hosts file was the culprit.  I remembered that we had added entries for all the servers to the hosts files, so I went back into them and corrected them.  The files in Server2's sysvol ended up on Irvine.  I put a file in Datastore on Irvine, and it showed up on Server2.  Things are looking good, but there are still a couple of things that I'm not sure of:

1) The ghost IrvineServer still comes up when I do repadmin /showreps from Server2 only.

2) In Sites and Services, should there be two connections showing in NTDS for all three servers?  That is NOT the case currently.  
    Under Server2 there is a connection to Server3;
    under Irvine there is a connection to Server2;
    under Server3 there is a connection to Irvine AND a connection to Server2.  That seems strange...

3)  I have the replication interval set to 15 minutes.  Does that mean that replication begins every 15 minutes and keeps going until it has caught up with all files that need replicating?  Or does it mean that it checks every 15 minutes and only replicates for the next 15 minutes?

4) We went through so much so quickly, I can't remember if there is something else I can check, or even where some of the tools were.  Part of our interaction, Netman66, was via Netsupport chat, which did not get recorded.  What do you think?
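
For anyone hitting the same hosts-file problem, a stale entry can be spotted by comparing the file against the addresses the servers are supposed to have. The following is a minimal Python sketch; the function names and the old address 10.0.0.5 are made up for illustration, and only 192.168.111.2 comes from this thread.

```python
def parse_hosts(text):
    """Map each host name in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table[name.lower()] = ip
    return table


def stale_entries(hosts_text, expected):
    """Return {name: (ip_in_hosts, expected_ip)} for entries that disagree."""
    table = parse_hosts(hosts_text)
    return {
        name: (table[name.lower()], ip)
        for name, ip in expected.items()
        if name.lower() in table and table[name.lower()] != ip
    }


if __name__ == "__main__":
    # Hypothetical hosts content still carrying an old address for Irvine.
    hosts_text = """
    # made-up old address, for illustration only
    10.0.0.5    irvine    irvine.wse.com
    """
    print(stale_entries(hosts_text, {"irvine": "192.168.111.2"}))
```

Since hosts entries take precedence over DNS lookups, a leftover entry like this keeps pointing at the old address even when DNS is correct.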
 
LVL 51

Accepted Solution

by:
Netman66 earned 500 total points
Nice troubleshooting - I forgot we did that.

1) Ignore it.  To do a metadata cleanup now is going to cause a fair bit of work.

2) No, not really.  KCC has determined the best topology based on connectivity.  No need to worry.

3) If your interval is too small, a cycle that hasn't finished will overlap the next one.  Every 15 minutes it will poll its rep partner for the USN of the AD to see if it has changed.  If so, it pulls the updates.  Critical replication (such as a password change) is done pretty much immediately.

4) No need to check much more - it sounds like things are functioning properly.

Good work.  Now you can plan out your AD cleanup - carefully!
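
The polling described in point 3 can be pictured with a toy model. This is a simplified Python sketch of pull replication driven by update sequence numbers, not the actual AD implementation; the `Replica` class and its fields are assumptions made for illustration.

```python
class Replica:
    """Toy replica that tracks changes with an update sequence number (USN)."""

    def __init__(self, name):
        self.name = name
        self.usn = 0          # highest USN this replica has issued
        self.changes = []     # (usn, key, value) for every local write
        self.data = {}
        self.watermark = {}   # partner name -> highest partner USN already pulled

    def write(self, key, value):
        """Record a local change and stamp it with the next USN."""
        self.usn += 1
        self.data[key] = value
        self.changes.append((self.usn, key, value))

    def poll(self, partner):
        """Pull changes the partner made since the last poll; return the count.

        Simplification: pulled changes are applied without issuing a local USN.
        """
        seen = self.watermark.get(partner.name, 0)
        if partner.usn == seen:
            return 0          # nothing new, so nothing to transfer
        pulled = [c for c in partner.changes if c[0] > seen]
        for _, key, value in pulled:
            self.data[key] = value
        self.watermark[partner.name] = partner.usn
        return len(pulled)


if __name__ == "__main__":
    server2, irvine = Replica("Server2"), Replica("Irvine")
    server2.write("user:test", "created on Server2")
    print(irvine.poll(server2))  # first poll pulls the new change
    print(irvine.poll(server2))  # second poll finds the USN unchanged
```

The point of the model is that each poll is cheap when nothing has changed, and the amount of work in a cycle depends on how many changes have accumulated since the last one.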

 
LVL 11

Expert Comment

by:ewtaylor
Great job Netman66, I learned a lot following these threads. Thanks to both of you.
 
LVL 2

Author Comment

by:theamzngq
Is it your opinion that I need to extend the interval beyond 15 minutes?

By the way, I have had several people come up to me this morning telling me that 'the file they worked all day on and saved before leaving work yesterday' did not have all the changes on it that they made.  Their changes were not saved somehow.  I didn't fix the hosts file until about 9 pm, after which I noticed that the replication started working.  Could replication have taken the older file on Irvine and overwritten the one on Server2?  Isn't replication supposed to move last-saved, or most recently changed files (last writer wins)?  Or is there some kind of priority setting that needs to be adjusted?  If I check the replication topology on Irvine, it says 'full mesh'.  I assume that means that the most recent file from anywhere gets replicated everywhere, right?

I was able to restore these couple files from the backup last night, which happens at 8 pm.  I fixed the hosts file at 9 pm....seems fishy...

I wonder how many other people today will be coming up to me saying that their files didn't get saved somehow.

Could this just be some 'settling in' that AD and FRS have to do before they're in good sync?
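
The "last writer wins" idea raised above can be illustrated with a toy model. This Python sketch shows the general rule only and is not FRS's actual conflict-resolution logic; the `Version` class, the `resolve` function, and the timestamps are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Version:
    content: str
    saved_at: float  # timestamp the replication engine sees for this copy


def resolve(a, b):
    """Last writer wins: keep whichever copy carries the later timestamp."""
    return a if a.saved_at >= b.saved_at else b


if __name__ == "__main__":
    edited = Version("edited all day on Server2", saved_at=1700.0)
    stale = Version("older copy on Irvine", saved_at=900.0)
    print(resolve(edited, stale).content)

    # If the stale copy somehow carries a later stamp, it wins instead.
    restamped = Version("older copy on Irvine", saved_at=2100.0)
    print(resolve(edited, restamped).content)
```

The rule trusts whatever timestamp it is given, so an older copy that ends up with a newer stamp would overwrite newer work. Whether that is what happened here is not established by anything in this thread.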
 
LVL 2

Author Comment

by:theamzngq
I just looked in the event viewer and saw event 13503, that FRS had stopped, and so, I went into services and started it.  Then came up 13521, which says:
________________________________________________________

The File Replication Service cannot enable replication on the comptuer SERVER2 until a backup/restore application completes.
 
A backup/restore application has set a registry key that prevents the File Replication Service from starting until the registry key is deleted or the system is rebooted.
 
The backup/restore application may still be running. Check with your local administrator before proceeding further.
 
The computer can be rebooted by clicking on Start, Shutdown, and selecting Restart.
 
WARNING - DELETING THE REGISTRY KEY IS NOT RECOMMENDED! Applications may fail in unexpected ways.
 
The registry key can be deleted by running regedit.
 
Click on Start, Run, and type regedit.
 
Expand HKEY_LOCAL_MACHINE, SYSTEM, CurrentControlSet, Services, NtFrs, Parameters, Backup/Restore,"Stop NtFrs from Starting". On the toolbar, click on Edit and select Delete. Be careful! Deleting a key other than "Stop NtFrs From Starting" can have unexpected sideeffects.
___________________________________________________________________________________________________

There are no backups currently running, only scheduled for later tonight.  I did, however, just restore a couple of files (as I mentioned above), but those have been done for a while.  Uh....help?  We are using Backup Exec 9.
 
LVL 51

Expert Comment

by:Netman66
The Restore is likely what set the key so that replication won't continually change files during the backup state.

This should clear itself.  I would certainly watch it.  If it doesn't clear then fix the key - but I think it should.

In terms of replication - if the file was saved into a rep partner with a lower USN (which in all likelihood couldn't happen), then during replication it would get overwritten by the copy with the higher USN.  Windows is "supposed" to prevent this kind of thing.

Full Mesh is a great sign - it means KCC has reconfigured the topology and it has converged.

 
LVL 2

Author Comment

by:theamzngq
Full Mesh is how I set it up to begin with, actually.  

So, how can I avoid this problem in the future?  Is there a way?  I have a steady stream of people coming to me about files.  If I restore them, will they be restored with the same USN value, and will I then have the same problem all over again?  At the moment, FRS is not running, presumably because of all the restores I'm doing.
 
LVL 2

Author Comment

by:theamzngq
No sign of that registry key being removed yet.  I wonder how long I should wait before deleting it?
 
LVL 2

Author Comment

by:theamzngq
Ok.  Things are getting weird.  This whole replication thing seems way too unpredictable/unstable/unmonitorable.  I had to restart Server2 because an update installer I ran caused the system to freeze.  It took over 20 minutes, but I let it restart by itself.  When it came up, FRS started again, but then moved some of the folders out of DataStore into the 'pre-existing' folder, making them inaccessible to my users, of course.

I don't get it!  I thought the replication was done with Irvine before it left the building?  I CANNOT have DFS moving my files/folders around!!! This is killing me!!  Is there anything I can do, any decent monitoring tool that will show me EXACTLY what is happening, a list of files that are being compared with USN numbers, files that are about to be replicated to and from where and when, etc?  All of this seems way too out of my ability to control and monitor...

It's starting to feel like I'm either in way over my head or I need to consider switching to another OS...(feeling exasperated!)
 
LVL 51

Expert Comment

by:Netman66
Take a deep breath and count to ten..........

Replication needs to be configured for every hour or two - 15 minutes is too short.

FRS is likely choking on the whole restructure thing.

I think you'll need to reconfigure DFS from scratch.  Take the time to set up things correctly and prestage the data from what is already there.  

Use the documents we looked at earlier and see if you can find a TechNet article that steps you through the setup from the start.


Personally, 120GB over the WAN is a big deal for anyone.  However, technically, it should work.

Post some netdiag /v and dcdiag logs for me or send them to work so I can look at them.

It probably wouldn't hurt to do those large logs for me too.

 
LVL 2

Author Comment

by:theamzngq
Thank you for your words of wisdom...

I think you're right.  I will tear down the DFS root and recreate it, letting it run over the weekend.  I've been doing some hardware and software updates and such on Server2 tonight, and it seems that every time I restart, DFS moves files into the pre-existing folder.  It seems as if it's almost starting all over with every restart.

Also, that registry key that prevents FRS from starting never did get removed automatically.  I am going to call Veritas and ask them about that; perhaps they know something about it.

I've also been thinking that there has to be a better plan than this; there has to be a better way to accomplish the kind of data availability that we're trying to achieve.  Perhaps we need to break up the data into groups and decide how to reduce it to the bare minimum?

--I have deleted the DFS root.  I will send you the logs you have requested.

What do you think about upgrading Server2 and Server3 to 2003 Standard?  I mentioned it to the boss and he went for it.  It would be nice if our DFS issues got better, but I also want to stay up-to-date.  Windows 2000 Server is now 4 years old.  What do you think?

 
LVL 51

Expert Comment

by:Netman66
Good idea.

I also think you need to determine what data needs to be where.  Shared data is fine to have accessed across the WAN depending on the size.  We run an entire Province (think State) from one location - so, it's possible.

You really should look at a fibre-channel SAN - it's not cheap, but if you move that kind of data then purchasing one is inevitable.
