IDMU in Windows Server 2008 SP2 suddenly unable to authenticate

We are running Identity Management for UNIX (IDMU) on Windows Server 2008 SP2 with one Windows master and one Windows subordinate NIS server.  After working for over two years, our NIS domain is suddenly unable to authenticate users on any of our Linux/UNIX boxes.  Restarting services did not help.  Neither Event Viewer nor c:\Windows\idmu\logs yielded any information.

I did not set up our NIS configuration, and in fact my knowledge of NIS is rather slim.  What I do know is that the IDMU configuration had not been touched for many months before this happened.  At one point I did try to get NFS file sharing on a separate 2008 R2 server to authenticate by pointing it at the AD domain as its identity mapping source, but that was several weeks before this breakdown.

Here are the only potential problem indicators I can see:

1)  The ypcat commands sometimes display the appropriate information and sometimes return the error "NIS Service is not running on the host '<servername>' in domain '<domainname>'".  It's as though Server for NIS is constantly starting and stopping, but Event Viewer records no such activity: no start or stop entries for Server for NIS appear unless I manually turn it off and on.

1a)  Likewise, Linux and UNIX servers that run ypwhich will attempt to contact the appropriate server and will sometimes get a response and sometimes not.  (I think that's the command; again, my knowledge of NIS and these commands is minimal.)
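To put a number on how intermittent the responses in 1) and 1a) are, a small probe along these lines could be run from one of the Linux/UNIX clients. This is only a sketch: the function names `probe` and `summarize` are made up for illustration, and it assumes the command being probed (for example `ypwhich`) exits with a non-zero status when it gets no answer.

```python
import subprocess
import time


def probe(command, attempts=20, delay=1.0, timeout=5.0):
    """Run `command` repeatedly; return a list of (timestamp, succeeded) pairs.

    An attempt counts as failed if the command exits non-zero, does not
    finish within `timeout` seconds, or cannot be started at all.
    """
    results = []
    for _ in range(attempts):
        try:
            completed = subprocess.run(
                command,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            ok = completed.returncode == 0
        except (subprocess.TimeoutExpired, OSError):
            ok = False
        results.append((time.strftime("%H:%M:%S"), ok))
        if delay:
            time.sleep(delay)
    return results


def summarize(results):
    """Return (successes, failures) for a list produced by probe()."""
    failures = sum(1 for _, ok in results if not ok)
    return len(results) - failures, failures
```

For example, `summarize(probe(["ypwhich"], attempts=60, delay=1.0))` would give the success and failure counts over roughly a minute, and the timestamps in the raw results can be lined up against a packet capture or the Windows event log.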

2)  In ADSI Edit I see duplicate container entries for defaultMigrationContainer30 and ypServ30 that have the objectGUID tacked onto the container name, like so:

CN=defaultMigrationContainer30CNF:2bedf883-f6b4-4650-a2fa-cddf7d03dcdc

CN=ypServ30CNF:be1e659e-9fbc-4daf-9d98-c0e63a8ad4d4
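Names of this shape, the original name followed by `CNF:` and a GUID, are what Active Directory generates when two objects with the same name are created on different domain controllers and then collide during replication. As a rough sketch, a helper like the one below could pick such entries out of an exported list of RDNs and recover the original name. The function name `parse_conflict_rdn` is hypothetical, and the pattern assumes the separator before `CNF:` is either absent (as displayed above), a raw newline, or the escaped form `\0A`.

```python
import re

# Original name, an optional separator, then "CNF:" and a GUID.
CNF_PATTERN = re.compile(
    r"^(?P<name>.+?)(?:\\0A|\n)?CNF:"
    r"(?P<guid>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})$"
)


def parse_conflict_rdn(rdn):
    """Return (original_name, guid) if `rdn` looks like a conflict object.

    Accepts values with or without a leading "CN=". Returns None when the
    value does not match the conflict naming pattern.
    """
    value = rdn[3:] if rdn.upper().startswith("CN=") else rdn
    match = CNF_PATTERN.match(value)
    if match is None:
        return None
    return match.group("name"), match.group("guid").lower()
```

Run against the two entries above, this should return `defaultMigrationContainer30` and `ypServ30` as the original names, which makes it easy to confirm that a non-conflict container with each of those names also exists before anything is deleted.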

Having said all that, my first question is obvious:  Can anyone shed some light as to what might have happened?  Secondly, are those duplicate containers safe to flat-out delete through ADSI edit?
Asked by: kjw_pkw

David Piniella commented:
Are the login auth attempts getting to the Windows box? Wireshark (or another packet capture tool) will tell you whether the packets are reaching the Windows boxes, and also whether they are leaving the UNIX boxes properly. From the sporadic nature of it, I suspect you're having network problems of some sort (routing, bad DNS or something), or possibly the service on the Windows box is failing in some really weird way. Maybe the Windows boxes are under high load and it's causing timeouts? I would set up a packet capture and see if I can find where it's breaking for sure.
kjw_pkw (author) commented:
A packet capture shows that the packets are getting to the Windows boxes, but the Windows boxes are not answering.  The performance counters on the Windows servers show no high load on network, memory, disk, processor, or anything else that matters.  So it definitely seems like a problem developed with the service itself.

Could it be related to the odd duplicate containers I mentioned?  For grins, I tried renaming the CN=<domain> entry in the mapping containers under the duplicate ypServ30CNF* container, and it did not seem to affect the NIS Server (one way or the other) once it was restarted.

David Piniella commented:
I find that pretty unlikely, and in any case, your renaming should have obviated that. As far as the how/why that happened, I would guess some sort of automated tool -- anything for migration or possibly something installed on the DC(s) that would modify the schema.

You're going to need to turn on logging (or make the logging more explicit) in order to find out WTH is actually going on inside the service to make it fail.