mikhael

Clients unable to login - NETLOGON and SYSVOL shares not present

Hi experts

The scenario is that my client's Windows Small Business Server 2000 (it's the PDC, with no other server) had a power failure (or something) a few days ago, resulting in an unexpected shutdown. Since then, users cannot log on as before: they can ALL log on, but only with cached credentials. Most of the clients cannot access the server's drives, except for 2 XP machines, which can access the mapped drives and other resources!

Anyway, after some digging, I found 3 Exchange services not running (resulting in everyone's Outlook not working) and the SYSVOL and NETLOGON shares have disappeared.

I found some references to an old server that was on the network a while ago. Believing this might be interfering, I deleted references to this server using LDP.exe.

DHCP is working, and the clients use the server as their gateway and their DNS server and their WINS server.

I have rebuilt DNS back to default, with new AD-Integrated zones. I believe DNS is working OK.

DCDIAG reveals...

Starting test: kccevent
         * The KCC Event log test
         An Error Event occured.  EventID: 0xC0000466
            Time Generated: 01/03/2006   17:39:38
            Event String: Unable to establish connection with global catalog.
         ......................... <SERVER_NAME_HERE_DELETED> failed test kccevent

And...

Starting test: FsmoCheck
         Warning: DcGetDcName(GC_SERVER_REQUIRED) call failed, error 1355
         A Global Catalog Server could not be located - All GC's are down.
  PDC Name: \\<SERVER_NAME_HERE_DELETED>
         Locator Flags: 0xe00001fd
         Warning: DcGetDcName(TIME_SERVER) call failed, error 1355
         A Time Server could not be located.
         The server holding the PDC role is down.
         Warning: DcGetDcName(GOOD_TIME_SERVER_PREFERRED) call failed, error 1355
         A Good Time Server could not be located.
         Warning: DcGetDcName(KDC_REQUIRED) call failed, error 1355
         A KDC could not be located - All the KDCs are down.
         .........................  failed test FsmoCheck

And...

 Starting test: Advertising
         Fatal Error:DsGetDcName (<SERVER_NAME_HERE_DELETED>) call failed, error 1355
         The Locator could not find the server.
         ......................... <SERVER_NAME_HERE_DELETED> failed test Advertising

NETDIAG reveals...

    Testing Kerberos authentication... Failed

And...

Domain membership test . . . . . . : Failed
    [WARNING] Ths system volume has not been completely replicated to the local machine. This machine is not working properly as a DC.
    Machine is a . . . . . . . . . : Primary Domain Controller Emulator

And...

DC discovery test. . . . . . . . . : Failed

    Find DC in domain '<DOMAIN_NAME_HERE_DELETED>':
        [FATAL] Cannot find DC in domain '<DOMAIN_NAME_HERE_DELETED>'. [ERROR_NO_SUCH_DOMAIN]

DC list test . . . . . . . . . . . : Failed
        '<DOMAIN_NAME_HERE_DELETED>': Cannot find DC to get DC list from [test skipped].
    List of DCs in Domain '<DOMAIN_NAME_HERE_DELETED>':

And...

LDAP test. . . . . . . . . . . . . : Failed
    Cannot find DC to run LDAP tests on. The error occurred was: The specified domain either does not exist or could not be contacted.

Event Viewer...

Every 5 mins or so I get an error in the App'n Log (Source Userenv, event ID 1000) saying...

Windows cannot determine the user or computer name. Return value (1355).

And...

Every so often (couple hours) I get a warning in the FRS Log (Source NTFRS, event ID 13566) saying...

File Replication Service is scanning the data in the system volume. Computer SSRSYDSVR cannot become a domain controller until this process is complete. The system volume will then be shared as SYSVOL.
 
To check for the SYSVOL share, at the command prompt, type:
net share
 
When File Replication Service completes the scanning process, the SYSVOL share will appear.
 
The initialization of the system volume can take some time. The time is dependent on the amount of data in the system volume.

And the above never seems to finish.

Well that's my situation. I hope you guys and girls can help.
Glenn Abelson

What I would try, in order:
1. Bottle of Beer
2. Tape restore of \Windows and sub folders and any other folders of import.
3. In Place Windows Repair from CD and reinstall of service packs.
4. Pizza and Beer for consolation or celebration.

The shares can be recreated.

We had a similar, though not as extensive problem with a power loss.
All we had to do was shut down everything (server and all work stations) and restart everything.
mikhael

ASKER

Thanks Glenn :)

I like the sound of 1) and 4). Not real excited about 3) and especially not 2)!

I thought about an in-place repair, but am always worried about what changes are being done (DHCP, DNS, Exchange, RRAS, etc etc.). This is a production server, and their only server (i.e. no redundancy).

BTW, have already done several restarts (incl the workstations and switch).

Any other ideas guys?

Cheers
Michael
mikhael

ASKER

One more thing...

An error pops up in the Directory Service event log every hour (Source NTDS General, and ID 1126)

"Unable to establish connection with global catalog"

Thanks
pubmarc

Active directory is toast.  You will need boot into active directory restore mode and revert to your tape backups to restore system state.
mikhael

ASKER

Thanks Pub (great name!)

I had a feeling this might be the case (esp. after Glenn's comments above). But please tell me, how sure are you that I need to do the AD restore thing and restore the SS from tape? Because I keep thinking it's DNS (for AD) that's causing me the prob. Any easy way to test DNS?

Ta
Michael
Can you ping the domain from a workstation?

I.e., if your internal FQDN is domain.local, type ping domain.local from a command prompt.

If you can ping the FQDN from a workstation, then DNS is OK, assuming your workstations are using your server for DNS.
>>>DcGetDcName(GC_SERVER_REQUIRED)

An API call is executed to find domain controllers and Global Catalog servers, and it is answered using the DNS Server service. DNS contains SRV records for all domain controllers and GC servers.

Make sure all DC SRV records are registered properly and that this server is also a Global Catalog server.

You can use dcdiag /fix or netdiag /fix to correct SRV problems. The SRV records should re-register as long as the zone is accepting dynamic updates.

Also try restarting Netlogon service and rebooting the server.

Let us know.
mikhael

ASKER

Could be the problem Pub!

When pinging <Domain_Name>.local the workstation can't find the host. (I didn't know you could ping a domain!)

When pinging <Server_Name>.<Domain_Name>.local the workstation quickly pings the right IP.

Systm - It IS a GC server (The event viewer told me so).

How do I ensure all DC SRVs are registered properly ?

And I will try the 2 /fix  'es.

I have restarted the netlogon service and the server itself SO many times, but will do so again.
mikhael

ASKER

Also in the NTDS properties it says it IS a Global Catalog server.

And the failed tests of the 2 /FIX 'es are....

DCDIAG /FIX...

Starting test: Advertising
         Fatal Error:DsGetDcName (SSRSYDSVR) call failed, error 1355
         The Locator could not find the server.
         ......................... SSRSYDSVR failed test Advertising

Starting test: FsmoCheck
         Warning: DcGetDcName(GC_SERVER_REQUIRED) call failed, error 1355
         A Global Catalog Server could not be located - All GC's are down.
         Warning: DcGetDcName(TIME_SERVER) call failed, error 1355
         A Time Server could not be located.
         The server holding the PDC role is down.
         Warning: DcGetDcName(GOOD_TIME_SERVER_PREFERRED) call failed, error 1355
         A Good Time Server could not be located.
         Warning: DcGetDcName(KDC_REQUIRED) call failed, error 1355
         A KDC could not be located - All the KDCs are down.
         ......................... shire.local failed test FsmoCheck

And the NETDIAG /FIX...

Global results:
Domain membership test . . . . . . : Failed
    [WARNING] Ths system volume has not been completely replicated to the local machine. This machine is not working properly as a DC.

DC discovery test. . . . . . . . . : Failed
        [FATAL] Cannot find DC in domain 'SHIRE'. [ERROR_NO_SUCH_DOMAIN]

DC list test . . . . . . . . . . . : Failed
        'SHIRE': Cannot find DC to get DC list from [test skipped].

LDAP test. . . . . . . . . . . . . : Failed
    Cannot find DC to run LDAP tests on. The error occurred was: The specified domain either does not exist or could not be contacted.
[WARNING] Cannot find DC in domain 'SHIRE'. [ERROR_NO_SUCH_DOMAIN]

Guys...

I am thinking I need to run DCPROMO (to demote and then promote). What will I lose? Accounts? ACLs? Emails? Data?

Thanks
Jeffrey Kane - TechSoEasy
You don't have to do a complete reinstall, just restore your System State and Active Directory.  

Instructions are here:
http://www.microsoft.com/technet/prodtechnol/sbs/2000/reskit/sbrk0026.mspx

If you've never used your 2 free incidents with Microsoft PSS (included with SBS 2000 retail version - not OEM, but nobody tends to realize it), then you can also have them walk you through it.  Call them at 800.936.4900.

Jeff
TechSoEasy


Jeff is correct. You can try restoring your last system state backup.

Have a look at this article on how to verify registration of DC SRV records:

http://support.microsoft.com/default.aspx?scid=kb;en-us;816587&Product=winsvr2003

My only concern is DNS, which is not working properly. I can see from the netdiag /fix log, and from the first one, that the DcGetDcName API call fails. That is because all AD tools are DNS-aware: they always query DNS to find domain controllers, and that's what is happening with your clients. Clients also send DNS queries for SRV records.
mikhael

ASKER

OK thanks SYSTM,

I checked the MS web page, and tested the DNS.

Using the NSLOOKUP method noted, and got...

Server:  ssrsydsvr.shire.local
Address:  192.168.16.1

_ldap._tcp.dc._msdcs.shire.local        SRV service location:
          priority       = 0
          weight         = 100
          port           = 389
          svr hostname   = ssrsydsvr.shire.local
ssrsydsvr.shire.local   internet address = 192.168.16.1

According to MS I should have...

Server: localhost
Address:  127.0.0.1
_ldap._tcp.dc._msdcs.Domain_Name
SRV service location:
      priority      = 0
      weight            = 100
      port            = 389
      srv hostname      = Server_Name.Domain_NameServer_Name.Domain_Name      internet address = Server_IP_Address

Is the fact that it's pointing to localhost and 127.0.0.1 important?
Also, my svr hostname   = ssrsydsvr.shire.local  
MS says differently. Could this be my problem?????

One more thing, from the MS article, it passes the DNS MMC test and the NETLOGON.DNS test.
mikhael

ASKER

Another thing...

According to http://support.microsoft.com/kb/241515/EN-US/  (which is the Win2000 version of the previous web page),

my NSLOOKUP result looks correct...

> _ldap._tcp.dc._msdcs.shire.local
Server:  ssrsydsvr.shire.local
Address:  192.168.16.1

_ldap._tcp.dc._msdcs.shire.local        SRV service location:
          priority       = 0
          weight         = 100
          port           = 389
          svr hostname   = ssrsydsvr.shire.local
ssrsydsvr.shire.local   internet address = 192.168.16.1
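For anyone who wants to check this kind of NSLOOKUP output without eyeballing it, here is a minimal Python sketch that pulls the SRV fields out of the pasted text. The function name `parse_srv` and the regular expressions are illustrative assumptions based only on the output format shown in this thread; other nslookup versions may print the fields differently.

```python
import re

# Sample text copied from the NSLOOKUP result posted above.
SAMPLE = """\
_ldap._tcp.dc._msdcs.shire.local        SRV service location:
          priority       = 0
          weight         = 100
          port           = 389
          svr hostname   = ssrsydsvr.shire.local
ssrsydsvr.shire.local   internet address = 192.168.16.1
"""


def parse_srv(text):
    """Return one dict per 'SRV service location' block found in the text."""
    records = []
    current = None
    for line in text.splitlines():
        header = re.match(r"^(\S+)\s+SRV service location:", line)
        if header:
            # A new SRV record starts here; later fields attach to it.
            current = {"name": header.group(1)}
            records.append(current)
            continue
        field = re.match(
            r"^\s+(priority|weight|port|svr hostname)\s*=\s*(\S+)", line
        )
        if field and current is not None:
            key, value = field.groups()
            # Numeric fields become ints; the host name stays a string.
            current[key] = int(value) if value.isdigit() else value
    return records
```

Calling `parse_srv(SAMPLE)` gives a list with one record, so the port and the `svr hostname` target can be compared against the server's real FQDN in a script.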
mikhael

ASKER

Guys, how invasive is an in-place repair? (I assume this is from the SBS2000 CD - right?)

I am always worried about what changes are being done (DHCP, DNS, Exchange, RRAS, etc etc.).

Thanks, I appreciate your efforts.
Michael

p.s I have upped the points to 500
It's no problem that it has your domain instead of localhost... I'm not sure why they put that there, if you look at the win2k article on the same subject it doesn't even mention it:  http://support.microsoft.com/kb/241515

Just make sure that if you are going to delve further into MS articles that you use the win2k ones because Win2K3's DNS uses AD partitions, while Win2k doesn't.

Jeff
TechSoEasy
That's fine.

How many NICs do you have in this system?

Post your Ipconfig /all result. (ipconfig /all > ip.txt)
That's correct Jeff.
Your DNS zone should look like below:

DNS
   |--ServerName
   |-----Forward Lookup Zones
   |----------domain_name.local
   |             |   _sites
   |             |     |    |
   |             |     |   Default-First-Site-Name
   |             |     |         |
   |             |     |       _tcp--------------- _ldap [SRV]: 0:100:389: server_name.domain_name.com.
   |             |     |                                  _gc [SRV]: 0:100:3268: server_name.domain_name.com
   |             |     |                                  _kerberos [SRV]: 0:100:88: server_name.domain_name.com
   |             |     |      
   |             |    _tcp---------------------- _ldap [SRV]: 0:100:389: server_name.domain_name.com.
   |             |     |                                 _gc [SRV]: 0:100:3268: server_name.domain_name.com
   |             |     |                                 _kerberos [SRV]: 0:100:88: server_name.domain_name.com
   |             |     |                                 _kpasswd [SRV]: 0:100:464: server_name.domain_name.com
   |             |     |        
   |             |    _udp--------------------  _kpasswd [SRV]: 0:100:464: server_name.domain_name.com.
   |             |     |                                _kerberos [SRV]: 0:100:88: server_name.domain_name.com.

You must have the above SRV records registered in the DNS zone so that clients (via the Netlogon service) can get the list of domain controllers available in the domain by executing the DcGetDcName API call.
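As a rough aid, the record names in the tree above can be generated for a given domain and then checked one by one with NSLOOKUP. This is only a sketch: the function name `expected_srv_names` is made up for illustration, the site name defaults to the Default-First-Site-Name shown in the tree, and the list covers only the nine records drawn above, not everything a domain controller may register.

```python
def expected_srv_names(domain, site="Default-First-Site-Name"):
    """Build the SRV record names shown in the zone tree for one domain."""
    site_tcp = ["_ldap", "_gc", "_kerberos"]        # under _sites/<site>/_tcp
    tcp = ["_ldap", "_gc", "_kerberos", "_kpasswd"]  # under _tcp
    udp = ["_kpasswd", "_kerberos"]                  # under _udp

    names = [f"{svc}._tcp.{site}._sites.{domain}" for svc in site_tcp]
    names += [f"{svc}._tcp.{domain}" for svc in tcp]
    names += [f"{svc}._udp.{domain}" for svc in udp]
    return names
```

For example, `expected_srv_names("shire.local")` returns nine names, including `_gc._tcp.shire.local`, each of which can be queried with `set type=srv` in NSLOOKUP.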

These are the basic guidelines for DNS and TCP/IP Configuration on a server:

1. On DC or DNS server: Make sure DNS server is pointing to server IP address.

2. Client machines must use this IP address (Preferred DNS server).

3. Configure Forwarders on DNS server to forward DNS query requests to other DNS servers such as ISP DNS Server or any other DNS server in your domain or forest. Do not put ISP DNS Server in there. You need to delete root zone (".") to configure forwarders.

4. Make sure Dynamic or Secure Dynamic update is enabled on authoritative Zone.

5. Make sure SOA record in DNS zone is pointing to correct DNS server IP Address.

6. Issue Ipconfig /registerdns from command prompt to register A records of server in zone.

7. If there are two LAN cards make sure Internal NIC of the server is listed first in Binding Order.

Let us know.
mikhael

ASKER

2 NIC's and I also connect by VPN normally (hence there is a VPN adaptor in there as well)
*****************
ip.txt

Windows 2000 IP Configuration



      Host Name . . . . . . . . . . . . : ssrsydsvr
      Primary DNS Suffix  . . . . . . . : shire.local
      Node Type . . . . . . . . . . . . : Hybrid

      IP Routing Enabled. . . . . . . . : Yes

      WINS Proxy Enabled. . . . . . . . : No

      DNS Suffix Search List. . . . . . : shire.local

Ethernet adapter Internal LAN:



      Connection-specific DNS Suffix  . : shie.local
      Description . . . . . . . . . . . : HP NC3163 Fast Ethernet NIC
      Physical Address. . . . . . . . . : 00-02-A5-EA-63-12

      DHCP Enabled. . . . . . . . . . . : No

      IP Address. . . . . . . . . . . . : 192.168.16.1

      Subnet Mask . . . . . . . . . . . : 255.255.255.0

      Default Gateway . . . . . . . . . :

      DNS Servers . . . . . . . . . . . : 192.168.16.1
      Primary WINS Server . . . . . . . : 192.168.16.1


Ethernet adapter External LAN:



      Connection-specific DNS Suffix  . :
      Description . . . . . . . . . . . : NETGEAR FA311/FA312 PCI Adapter
      Physical Address. . . . . . . . . : 00-09-5B-08-AF-B6

      DHCP Enabled. . . . . . . . . . . : No

      IP Address. . . . . . . . . . . . : 192.168.0.2

      Subnet Mask . . . . . . . . . . . : 255.255.255.0

      Default Gateway . . . . . . . . . : 192.168.0.1

      DNS Servers . . . . . . . . . . . : 192.168.16.1
                                          192.168.0.2

PPP adapter RAS Server (Dial In) Interface:



      Connection-specific DNS Suffix  . :
      Description . . . . . . . . . . . : WAN (PPP/SLIP) Interface

      Physical Address. . . . . . . . . : 00-53-45-00-00-00

      DHCP Enabled. . . . . . . . . . . : No

      IP Address. . . . . . . . . . . . : 192.168.16.200

      Subnet Mask . . . . . . . . . . . : 255.255.255.255

      Default Gateway . . . . . . . . . :

      DNS Servers . . . . . . . . . . . : 127.0.0.1
      NetBIOS over Tcpip. . . . . . . . : Disabled
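
The DNS server settings in output like this can also be checked with a short script. The following Python sketch lists the DNS servers per adapter and flags any adapter that points at something other than the internal server IP. The function names and the parsing rules are assumptions based on the Windows 2000 ipconfig /all layout pasted above; the sample is a trimmed copy of that output.

```python
import re

# Trimmed copy of the ipconfig /all output above (adapter and DNS lines only).
SAMPLE = """\
Ethernet adapter Internal LAN:
      IP Address. . . . . . . . . . . . : 192.168.16.1
      DNS Servers . . . . . . . . . . . : 192.168.16.1
Ethernet adapter External LAN:
      IP Address. . . . . . . . . . . . : 192.168.0.2
      DNS Servers . . . . . . . . . . . : 192.168.16.1
                                          192.168.0.2
"""


def dns_servers_by_adapter(ipconfig_text):
    """Map each adapter heading to the DNS servers listed under it."""
    adapters = {}
    current = None
    in_dns = False
    for line in ipconfig_text.splitlines():
        if not line.strip():
            continue
        # Adapter headings start in column 0 and end with a colon.
        if not line[0].isspace() and line.rstrip().endswith(":"):
            current = line.strip().rstrip(":")
            adapters[current] = []
            in_dns = False
            continue
        if current is None:
            continue
        first = re.match(r"^\s+DNS Servers[ .]*:\s*(\S+)", line)
        if first:
            adapters[current].append(first.group(1))
            in_dns = True
        elif in_dns and re.match(r"^\s+\d+\.\d+\.\d+\.\d+\s*$", line):
            # Continuation line: an extra DNS server on a line of its own.
            adapters[current].append(line.strip())
        else:
            in_dns = False
    return adapters


def extra_dns_servers(adapters, internal_ip):
    """Return adapters that list any DNS server other than internal_ip."""
    flagged = {}
    for name, servers in adapters.items():
        extras = [s for s in servers if s != internal_ip]
        if extras:
            flagged[name] = extras
    return flagged
```

With the sample above, `extra_dns_servers(dns_servers_by_adapter(SAMPLE), "192.168.16.1")` flags only the External LAN adapter, because of its second entry 192.168.0.2.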


And SYSTM, I'll get to your comments and suggestions soon...

Can you check that the GC function is being advertised in DNS correctly? It has the same Service Record structure as the LDAP Server:

_ldap._tcp.gc._msdcs.shire.local

Can you also verify the Port connection from a client to the server on Port 3268:

Telnet <your server> 3268

If it works you'll just get a blank screen, if not it'll tell you the connection failed.
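
If Telnet is not handy on the client, the same connection test can be sketched in Python with the standard socket module. The function name `port_open` and the 3 second default timeout are illustrative choices, not anything prescribed for this check.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For the Global Catalog check this would be called as `port_open("<your server>", 3268)`. Like the Telnet test, it only shows that something is listening on the port, not that the GC is answering queries correctly.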

If DNS looks to be the problem it's possible to entirely recreate the AD Integrated Zone which may flush out any problematic entries. Since this step is easy to do and carries little risk it would be well worth trying:

http://support.microsoft.com/?kbid=305967

Chris

mikhael

ASKER

SYSTM, more info for you...

In the DNS map above, my setup is nearly identical. The only difference is where you have
" server_name.domain_name.com " 9 times, I have " server_name.domain_name.local " 9 times; and under
" domain_name.local ", I also have an entry " _msdcs "

Then, re. your numbered points...

1) The DC and DNS server are one and the same machine. In the Network properties (as per the IPCONFIG), the DNS is set as 192.168.16.1 on the LAN NIC and as 192.168.16.1 AND 192.168.0.2 on the WAN NIC. (I normally have the WAN's DNS not set at all but was desperate)

2) The clients (per DHCP service on this server) use 192.168.16.1 as their DNS.

3) Forwarders are NOT enabled (I normally have this set to the ISP's DNS servers, but following another post on EE I let the root hint servers look after this. The server can certainly ping an external host name.). There are no other DNS servers on the network - or any other servers at all!

4) On the shire.local (in Fwd Lookup Zones), Secure Dynamic Updates are allowed. Zone transfers are NOT allowed. Under "Name Servers" this same server's LAN IP is entered !?!?

5) There is an SOA record, an A and a NS record. (I am wondering about this NS record)

6) I have issued the ipconfig /registerdns many times. Does it matter how often or when? I must admit I do it freely (and the flushdns as well)

7) There ARE 2 NICs. Do you mean in Network Properties? If so, see the IPCONFIG above. Or do you mean in the DNS MMC, under server --> Properties --> Interfaces? There I have 3 IPs listed: LAN NIC, then WAN NIC, then 192.168.16.200 (the PPP adapter - see IPCONFIG above). I would normally only enter the LAN NIC, and that's what I had, but I got desperate.

And Chris...

It certainly DOES respond to port 3268. And using NSLOOKUP, I get...

> _ldap._tcp.gc._msdcs.shire.local
Server:  ssrsydsvr.shire.local
Address:  192.168.16.1

_ldap._tcp.gc._msdcs.shire.local        SRV service location:
          priority       = 0
          weight         = 100
          port           = 3268
          svr hostname   = ssrsydsvr.shire.local
ssrsydsvr.shire.local   internet address = 192.168.16.1
>

I will try the MS article re clearing out the DNS shortly. Just wanted to post this first.

FINALLY, how invasive is a repair installation from the CD? I am worried about what changes are going to be done (DHCP, DNS, Exchange, RRAS, Service Packs etc etc.). (My client is getting desperate, and the backup is not up to date because Backup Exec hasn't worked since the problem began)

Thanks guys (really is appreciated)
Michael
mikhael

ASKER

Chris and others,

I did as per the MS article (305967). When re-creating the zones, I chose AD-integrated.

I'm not sure if the NETLOGON and SYSVOL shares should have been miraculously re-created. Anyway they were not. Should this take some time? Anyway I chose to restart the server. No different after.

Am starting to think that DNS is not the problem but rather GC or AD. What do you think, guys?

Is it better to do a CD reinstall? or do a DCPROMO demotion and then promotion?

(This server looks after about 20-30 users with several security groups and various folders and drives allocated to certain groups. As well Exchange with numerous public folders with permissions set. The 1st DCPROMO screen tells me it will delete accounts. Will it also delete the mailboxes?)
mikhael

ASKER

Hi guys

Well the problem is now fixed.

I resorted to MS Prof. Help. (and $AU297 !!). Basically I was forced to by my client who was getting quite impatient!

First of all, he looked at all the logs via the cab file generated by ( http://www.microsoft.com/downloads/info.aspx?na=46&p=5&SrcDisplayLang=en&SrcCategoryId=&SrcFamilyId=cebf3c7c-7ca5-408f-88b7-f9c79b7306c0&u=http%3a%2f%2fdownload.microsoft.com%2fdownload%2fb%2fb%2f1%2fbb139fcb-4aac-4fe5-a579-30b0bd915706%2fMPSRPT_DirSvc.EXE )

He had me do a reg hack :-

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Netlogon\DependOnService --> I had to add DNS

And restart

And then the BURFLAGS to D4 (i.e. authoritative restore -  see http://support.microsoft.com/kb/290762 )

Then another flushdns, registerdns, net stop netlogon, net stop ntfrs and a restart of both, and then the SYSVOL share came back up !!

The NETLOGON share wouldn't come back up (coz the "scripts" and "policies" folders were missing - apparently deleted at some stage with all our "surgery")

He had me download and run the Windows 2000 Default Group Policy Restore Tool...

http://www.microsoft.com/downloads/details.aspx?FamilyID=b5b685ae-b7dd-4bb5-ab2a-976d6873129d&DisplayLang=en

Some more flushdns, registerdns, net stop netlogon, net stop ntfrs and a restart of both, and then the NETLOGON share came back up !! But alas, empty!

Anyway it wasn't too hard to re-create the login scripts from backup.

Certainly the trick was the first reg hack.
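
For future readers, the first registry change and the service restart sequence described above can be sketched in Python as plain data handling. This does not touch the registry or run anything: the function name `with_dependency`, the example starting values for DependOnService, and the order in which the two services are started again are all assumptions for illustration. Check the real value on your own server before changing it.

```python
def with_dependency(depend_on, service="DNS"):
    """Return a copy of a DependOnService list with `service` added if missing."""
    # Service names are compared case-insensitively to avoid duplicates.
    if any(name.lower() == service.lower() for name in depend_on):
        return list(depend_on)
    return list(depend_on) + [service]


# Assumed example of an existing Netlogon DependOnService value.
EXAMPLE_BEFORE = ["LanmanWorkstation", "LanmanServer"]

# Commands named in the post, in the order they were mentioned. The start
# order of the two services is an assumption; the post only says "restart both".
RESTART_SEQUENCE = [
    "ipconfig /flushdns",
    "ipconfig /registerdns",
    "net stop netlogon",
    "net stop ntfrs",
    "net start ntfrs",
    "net start netlogon",
]
```

Here `with_dependency(EXAMPLE_BEFORE)` yields the list with DNS appended, which is the value that would be written to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Netlogon\DependOnService so that Netlogon waits for the DNS service at startup.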

Thanks very much to all of you guys. I'll give the points to SYSTMPROG.

Cheers
Michael
mikhael

ASKER

So SYSTMPROG, please post and I'll award the points.
SOLUTION
Jeffrey Kane - TechSoEasy

This solution is only available to members.
mikhael

ASKER

Yep I knew that Jeff. I just thought it would be easier for future viewers of this thread to see that the answer is right near the bottom  :)
ASKER CERTIFIED SOLUTION
This solution is only available to members.
mikhael

ASKER

Thanks again all