Solved

NetWare 5.1 - Replicas not syncing in a timely manner

Posted on 2004-09-07
Last Modified: 2010-08-05
I have a four-server NetWare 5.1 setup.  All servers have been patched with SP6e and are running DS.NLM 8.85.

The servers are SERVER1 through SERVER4.  SERVER1 holds the Master replica, and servers 2, 3 and 4 each hold a R/W replica.

When I run DSRepair's 'Report Sync status...' I get results like this (taken at a current time of 8:45):

Replica     Server1        9-07-2004        8:27:34
Replica     Server2        9-07-2004        8:39:34
Replica     Server3        9-07-2004        8:43:08
Replica     Server4        9-07-2004        8:44:43
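
To put numbers on that report, here is a rough sketch in Python that works out how far behind each replica is. It assumes "a current time of 8:45" means 08:45:00 on 2004-09-07, and the name `sync_lag_seconds` is made up for illustration.

```python
from datetime import datetime

# Assumption: "current time of 8:45" is taken as 08:45:00 on the same day.
NOW = datetime(2004, 9, 7, 8, 45, 0)

# Timestamps copied from the DSRepair sync status report above.
LAST_SYNC = {
    "Server1": datetime(2004, 9, 7, 8, 27, 34),
    "Server2": datetime(2004, 9, 7, 8, 39, 34),
    "Server3": datetime(2004, 9, 7, 8, 43, 8),
    "Server4": datetime(2004, 9, 7, 8, 44, 43),
}


def sync_lag_seconds(now, last_sync):
    """Return how many whole seconds each replica is behind `now`."""
    return {
        server: int((now - stamp).total_seconds())
        for server, stamp in last_sync.items()
    }


lags = sync_lag_seconds(NOW, LAST_SYNC)

# Print the replicas from most stale to least stale.
for server, seconds in sorted(lags.items(), key=lambda item: item[1], reverse=True):
    minutes, secs = divmod(seconds, 60)
    print(f"{server}: {minutes}m {secs:02d}s behind")
```

Under that assumption the Master on Server1 is a little over 17 minutes behind, while Server4 is only seconds old.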

My biggest problem is that when I create users, we have to wait 15-30 minutes for everything to sync before they can log in.

My biggest personal quandary is why the Master replica is always the oldest.

I would assume that, with all servers connected so closely, there shouldn't be any lag (esp. with high-priority items like password changes and user creation).  There are no communication problems between servers, and workstations can access all of them without a problem.  Time is reported in sync.

Any insight, thoughts, opinions welcome.

Thanks,

TN
Question by:tnorman
 
LVL 34

Expert Comment

by:PsiCop
ID: 11997893
Hmmm....well, you do have more than the recommended number of replicas in your replica ring. Novell recommends 3, not 4: 1 Master and 2 Read/Write. The extra replica can slow things down.

NDS is an actual multi-master replication environment (AD claims it is, but it's really master-slave), so the Master replica isn't a whole lot different from a R/W in most respects. The point I'm trying to make is that there's no special reason it should be the first one updated.
 

Author Comment

by:tnorman
ID: 11997920
I could certainly remove the R/W replica from one of the servers.  Any 'forewarnings' about doing this?
 
LVL 34

Accepted Solution

by:PsiCop (earned 250 total points)
ID: 11997969
It IS a little unusual for there to be such a spread in sync times across the replicas.

Try this

1) At the server console, enter --> SET DSTRACE=ON
2) Then enter --> SET DSTRACE=+s
3) Then enter --> SET DSTRACE=+h

This will activate the Trace to Screen feature in NDS, and specifically tell it to report synchronization and heartbeat activity. A new screen will be created on the NetWare console; you access it the same way you access any other screen on the NetWare console. There you can watch the NDS activity.

You can do this on each server if you like.

Look for errors like "Replica in skulk". This is usually a transitory error message: it means that when a server went to synch NDS with another server, that server was already busy performing another synch, and so couldn't synch with the particular server on which you see the error message.

My suspicion, given what you've reported, is that your environment has too much NDS synchronization going on, and it's delaying the propagation of changes, because everyone is trying to synch with everyone else (each server has to synch with 3 others). The result is a mild case of gridlock.
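
A back-of-the-envelope way to see the load, sketched in Python. This assumes a simplified model in which every replica holder contacts every other one directly, which is not necessarily how DS.NLM actually schedules its work; `directed_sync_pairs` is a hypothetical helper, not anything from NDS.

```python
def directed_sync_pairs(replica_count: int) -> int:
    """Count ordered (source, target) server pairs in a full-mesh replica ring.

    Simplifying assumption: each server synchronizes with every other
    server, so each of the n servers has n - 1 partners.
    """
    if replica_count < 1:
        raise ValueError("a replica ring needs at least one replica")
    return replica_count * (replica_count - 1)


# Compare the current ring (4 replicas) with the recommended size (3).
for count in (4, 3):
    print(f"{count} replicas -> {directed_sync_pairs(count)} directed sync pairs")
```

In this model, dropping from four replicas to three cuts the number of server-to-server pairings from 12 to 6.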

I'd suggest deleting one R/W replica and seeing if the situation doesn't improve.

 
LVL 34

Expert Comment

by:PsiCop
ID: 11998092
Forewarnings. Hmmm.

If the server is a busy server, don't do it during the business day unless your replica is small. Don't want to drag the server's performance down re-working its NDS database while the CEO is trying to arrange a golf date with his buddies. This is strictly a performance issue (I know a moron who removed a 2000+ object R/W replica in a 400+ server NDS tree from a server performing massive file service for over 800 client machines in the middle of the business day - the server's performance almost ground to a halt - NetWare v5.1 and an older NDS version, might have even been a late v7.x).

Do make sure the replicas, all of them, are healthy. Use DSREPAIR, get it down to 0 errors, start with the Master, then the other ones. 95% of minor NDS errors will have no effect in your situation, doing what you want to do; but a few runs of DSREPAIR are very cheap insurance.

I want to make certain that time really is in synch. I say this because I got burned early in my NDS career by not distinguishing between "Time Synchronization is Active" and "Time is Synchronized to the Network" when I entered "TIME" on the server console. The former statement merely means that the server has TIMESYNC.NLM loaded and that it is performing synchronization activities. The latter is what's *really* important - it means that the server is in agreement with the other servers as to what time it is. The latter is what you should see on all servers.
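
If you paste the TIME output from each console into a text file, a small Python sketch like this can flag the servers that only show the first message. The sample console text below is hypothetical (I have not copied it from a real server), and `time_is_in_sync` is just an illustrative name.

```python
# The phrase that actually matters, compared case-insensitively.
IN_SYNC_PHRASE = "time is synchronized to the network"


def time_is_in_sync(time_output: str) -> bool:
    """Return True only if the pasted TIME output contains the in-sync phrase.

    "Time Synchronization is Active" alone is deliberately not enough.
    """
    normalized = " ".join(time_output.lower().split())
    return IN_SYNC_PHRASE in normalized


# Hypothetical pasted console text, keyed by server name.
samples = {
    "SERVER1": "Time synchronization is active.\nTime is synchronized to the network.",
    "SERVER4": "Time synchronization is active.\nTime is NOT synchronized to the network.",
}

not_in_sync = [name for name, text in samples.items() if not time_is_in_sync(text)]
print("Servers to look at:", not_in_sync)
```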

I would also make sure that the server that holds your Master replica is the Primary timeserver, and that the other ones are Secondary timeservers.

Removing an NDS replica should be fairly quick and painless in your environment. As I recall, that was accomplished using PARTMGR.EXE on a workstation logged in with admin-level privileges. Do NOT use NWCONFIG to "Remove Directory Services" from the server. :-)
 

Author Comment

by:tnorman
ID: 11998194
I think we are on to something here.  Of the four servers, this is what is happening:

SERVER 1 - OK
Server 2 & 3 - error, failed Replica in Skulk 698
Server 4 - Very odd:

Sync - [0000841b] <usrcat.wic.wic> [2002/03/28 16:19:45 ,2 ,1]

(this is scrolling/repeating about once a second on the screen)

('usrcat' is our Catalog for NDS login/wildcard searching.)

I am guessing I have some corruption somewhere here, and it is tying up the NDS update.  The server that is having this repeating sync message is the one that is always behind in the sync.
 
LVL 34

Expert Comment

by:PsiCop
ID: 11998319
Hmmm....yes, I think you're onto something. Definitely should not be seeing those messages constantly like that.

I confess to never having worked hands-on with NDS Catalogs, so I'm at a loss to suggest specific remedies. I would probably try deleting and re-creating the catalog (assuming there are no repair utilities to try first).
 

Author Comment

by:tnorman
ID: 11998405
I think I will close this question, and start off (at least) one new one with that specific error message going on.  I will also post the message in Novell support forums.

Thanks for your help on this one.

TN