NetWare 5.1 - Replicas not syncing in a timely manner

I have a four-server NetWare 5.1 setup.  All servers have been patched with SP6e, and are running DS NLM 8.85.

The servers are SERVER1 through SERVER4.  SERVER1 holds the Master replica, with a R/W replica on each of servers 2, 3, and 4.

When I do a DSRepair, 'Report Sync status...' I get results like this (going with a current time of 8:45)

Replica     Server1        9-07-2004        8:27:34
Replica     Server2        9-07-2004        8:39:34
Replica     Server3        9-07-2004        8:43:08
Replica     Server4        9-07-2004        8:44:43
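
To put numbers on that spread, here is a minimal Python sketch that computes how far each replica's last sync time trails the stated current time of 8:45. The timestamps are copied from the report above; reading `9-07-2004` as September 7 is an assumption (it does not affect the lag), and `sync_lag` is just an illustrative helper name.

```python
from datetime import datetime

# Stated "current time" from the post (8:45); date order month-day is assumed.
now = datetime(2004, 9, 7, 8, 45, 0)

# Last-sync times copied from the DSRepair sync status report.
last_sync = {
    "Server1": datetime(2004, 9, 7, 8, 27, 34),
    "Server2": datetime(2004, 9, 7, 8, 39, 34),
    "Server3": datetime(2004, 9, 7, 8, 43, 8),
    "Server4": datetime(2004, 9, 7, 8, 44, 43),
}

def sync_lag(now, last_sync):
    """Return {server: whole seconds between its last sync and now}."""
    return {name: int((now - ts).total_seconds()) for name, ts in last_sync.items()}

lags = sync_lag(now, last_sync)

# List the replicas from most stale to least stale.
for name, secs in sorted(lags.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {secs // 60} min {secs % 60} s behind")
```

By this arithmetic the Master on Server1 is a little over 17 minutes behind, while Server4 is only seconds behind.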

My biggest problem is when I create users, we have to wait 15-30 minutes for everything to sync, and then they can log in.

My biggest personal quandary is why the Master replica is always the oldest.

I would assume, with all servers connected so closely, there shouldn't be any lag (esp. with high-priority items like password changes and user creation).  There are no communication problems between servers, and workstations can access all no problem.  Time is reported in sync.

Any insight, thoughts, opinions welcome.

Thanks,

TN
Asked by: tnorman
 
PsiCop commented:
It IS a little unusual for there to be such a spread in sync times across the replicas.

Try this

1) At the server console, enter --> SET DSTRACE=ON
2) Then enter --> SET DSTRACE=+s
3) Then enter --> SET DSTRACE=+h

This will activate the Trace to Screen feature in NDS, and specifically tell it to report synchronization and heartbeat activity. A new screen will be created on the NetWare console; you access it the same way you access any other screen on the console. There you can watch the NDS activity.

You can do this on each server if you like.

Look for errors like "Replica in skulk", which is usually a transitory error message that means when a server went to synch NDS with another server, that server was already busy performing another synch, and so couldn't synch with the particular server on which you see the error message.

My suspicion, given what you've reported, is that your environment has too much NDS synchronization going on, and it's delaying the propagation of changes, because everyone is trying to synch with everyone else (each server has to synch with 3 others). The result is a mild case of gridlock.
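
As a rough illustration of that "everyone synchs with everyone" point, the sketch below counts server-to-server sync relationships in a replica ring. It assumes a simple model in which every replica holder initiates a sync to every other holder; that model and the function name `sync_relationships` are illustrative assumptions, not a description of NDS internals.

```python
def sync_relationships(replicas: int) -> int:
    """Count directed sync relationships if each replica syncs to every other one."""
    return replicas * (replicas - 1)

# Compare the current four-replica ring with a three-replica ring.
for n in (4, 3):
    print(f"{n} replicas -> {sync_relationships(n)} sync relationships")
```

Under that assumption, dropping from four replicas to three halves the count, from 12 to 6.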

I'd suggest deleting one R/W replica and seeing if the situation doesn't improve.
 
PsiCop commented:
Hmmm....well, you do have more than the recommended number of replicas in your replica ring. Novell recommends 3, not 4: 1 Master and 2 Read/Write. The extra replica can slow things down.

NDS is an actual multi-master replication environment (AD claims it is, but it's really master-slave), so the Master replica isn't a whole lot different from a R/W in most respects. The point I'm trying to make is there's no special reason he should be the first one updated.
 
tnorman (author) commented:
I could certainly remove a r/w off of one of the servers.  Any 'forewarnings' about doing this?  
 
PsiCop commented:
Forewarnings. Hmmm.

If the server is a busy server, don't do it during the business day unless your replica is small. Don't want to drag the server's performance down re-working its NDS database while the CEO is trying to arrange a golf date with his buddies. This is strictly a performance issue (I know a moron who removed a 2000+ object R/W replica in a 400+ server NDS tree from a server performing massive file service for over 800 client machines in the middle of the business day - the server's performance almost ground to a halt - NetWare v5.1 and an older NDS version, might have even been a late v7.x).

Do make sure the replicas, all of them, are healthy. Use DSREPAIR, get it down to 0 errors, start with the Master, then the other ones. 95% of minor NDS errors will have no effect in your situation, doing what you want to do; but a few runs of DSREPAIR are very cheap insurance.

I want to make certain that time really is in synch. I say this because I got burned early in my NDS career by not distinguishing between "Time Synchronization is Active" and "Time is Synchronized to the Network" when I entered "TIME" on the server console. The former statement merely means that the server has TIMESYNC.NLM loaded and that it is performing synchronization activities. The latter is what's *really* important - it means that the server is in agreement with the other servers as to what time it is. The latter is what you should see on all servers.

I would also make sure that the server that holds your Master replica is the Primary timeserver, and that the other ones are Secondary timeservers.

Removing an NDS replica should be fairly quick and painless in your environment. As I recall, that was accomplished using PARTMGR.EXE on a workstation logged in with admin-level privileges. Do NOT use NWCONFIG to "Remove Directory Services" from the server. :-)
 
tnorman (author) commented:
I think we are on to something here.  Of the four servers, this is what is happening:

SERVER 1 - OK
Server 2 & 3 - error, failed Replica in Skulk 698
Server 4 - Very odd:

Sync - [0000841b] <usrcat.wic.wic> [2002/03/28 16:19:45 ,2 ,1]

(this is scrolling/repeating about once a second on the screen)

('usrcat' is our Catalog for NDS login/wildcard searching.)

I am guessing I have some corruption somewhere here, and it is tying up the NDS update.  The server that is having this repeating sync message is the one that is always behind in the sync.
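
For anyone wanting to tally these repeating lines from a saved trace, here is a minimal Python sketch that pulls the fields out of the exact line quoted above. The regular expression is built only from that one sample, so other trace formats may not match. Treating the trailing `,2 ,1` as a replica number and an event counter is an assumption, and `parse_sync_line` is a hypothetical helper name.

```python
import re
from datetime import datetime

# The trace line exactly as quoted in the post.
TRACE_LINE = "Sync - [0000841b] <usrcat.wic.wic> [2002/03/28 16:19:45 ,2 ,1]"

# Pattern derived from that single sample line.
PATTERN = re.compile(
    r"Sync - \[(?P<entry_id>[0-9a-fA-F]+)\]\s+"
    r"<(?P<name>[^>]+)>\s+"
    r"\[(?P<stamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})"
    r"\s*,\s*(?P<replica>\d+)\s*,\s*(?P<event>\d+)\]"
)

def parse_sync_line(line):
    """Return the fields of a 'Sync - ...' trace line, or None if it does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return {
        "entry_id": m.group("entry_id"),
        "name": m.group("name"),
        "stamp": datetime.strptime(m.group("stamp"), "%Y/%m/%d %H:%M:%S"),
        # Assumed meaning of the two trailing numbers: replica number, event counter.
        "replica": int(m.group("replica")),
        "event": int(m.group("event")),
    }

parsed = parse_sync_line(TRACE_LINE)
print(parsed["name"], parsed["stamp"].isoformat())
```

One thing the parsed fields make easy to see: the timestamp on the object is from March 2002, more than two years before the September 2004 sync report.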
 
PsiCop commented:
Hmmm....yes, I think you're onto something. Definitely should not be seeing those messages constantly like that.

I confess to never having worked hands-on with NDS Catalogs, so I'm at a loss to suggest specific remedies. I would probably try deleting and re-creating the catalog (assuming there are no repair utilities to try first).
 
tnorman (author) commented:
I think I will close this question, and start off (at least) one new one with that specific error message going on.  I will also post the message in Novell support forums.

Thanks for your help on this one.

TN