JeffJCA
asked on
SQL Authentication and Win NT Authentication Issues
Okay, I am at a loss on this issue. I have a client who has three servers. A 2012R2 DC, A 2008R2 SQL Server, and a 2012R2 Terminal Server. The SQL Server is running SQL 2008
The VAR software vendor that runs a SQL app on the server server did an upgrade to their software. During the upgrade process someone evidently logged into the SQL server app. Everything on the network, domain and SQL server and this app was working fine with no issues before this upgrade. After the upgrade many things with their app and SQL stopped working. They blame it on the users logging into the SQL app during the upgrade and are saying it is a domain, DNS or AD issue even though everything worked before their upgrade and nothing was changed at all with the domain, AD or DNS. The main issue is you can no longer log into their SQL app using Win NT Authentication from anywhere other than on the SQL server itself. You can use their app to login to the SQL server on the physical SQL server using Win NT Auth just fine. If you try to do it from any pc or the other two servers you get "could not establish a connection to the database" error. You can use their SQL app to login from anywhere if you use SQL auth and the sa credentials.
You can log into the SQL server with domain credentials just fine, just as you can any pc or the other servers. Users have no issues logging into the domain with their domain credentials, locally or even remotely. You can add new users, new computers, etc to the domain just fine. You can ping the SQL server with ip and name no problems from the other servers and all pc's. DNS and nslookup commands on the SQL server work from the other servers and all pc's. SQL server has named pipes and IP enabled. Remote connections is enabled. The ip's are correct. Using the port query tool and running query on SQL service of the SQL server from either of the other servers all pass.
Looking at permissions on the database the domain users account has permissions to the db.
The SQL server service is using the Local System account for login. There is only one instance - the default
I have another client using this same setup, same VAR software and version who got the upgrade but they work fine. I have tried to compare all settings in the domain and SQL between the two to find a difference but have not yet.
Any suggestions on more things I can look at or try? Anything I can run to do more tests? I am not a SQL expert by any means and have looked at everything I know. Thanks.
The VAR software vendor that runs a SQL app on the server server did an upgrade to their software. During the upgrade process someone evidently logged into the SQL server app. Everything on the network, domain and SQL server and this app was working fine with no issues before this upgrade. After the upgrade many things with their app and SQL stopped working. They blame it on the users logging into the SQL app during the upgrade and are saying it is a domain, DNS or AD issue even though everything worked before their upgrade and nothing was changed at all with the domain, AD or DNS. The main issue is you can no longer log into their SQL app using Win NT Authentication from anywhere other than on the SQL server itself. You can use their app to login to the SQL server on the physical SQL server using Win NT Auth just fine. If you try to do it from any pc or the other two servers you get "could not establish a connection to the database" error. You can use their SQL app to login from anywhere if you use SQL auth and the sa credentials.
You can log into the SQL server with domain credentials just fine, just as you can any pc or the other servers. Users have no issues logging into the domain with their domain credentials, locally or even remotely. You can add new users, new computers, etc to the domain just fine. You can ping the SQL server with ip and name no problems from the other servers and all pc's. DNS and nslookup commands on the SQL server work from the other servers and all pc's. SQL server has named pipes and IP enabled. Remote connections is enabled. The ip's are correct. Using the port query tool and running query on SQL service of the SQL server from either of the other servers all pass.
Looking at permissions on the database the domain users account has permissions to the db.
The SQL server service is using the Local System account for login. There is only one instance - the default
I have another client using this same setup, same VAR software and version who got the upgrade but they work fine. I have tried to compare all settings in the domain and SQL between the two to find a difference but have not yet.
Any suggestions on more things I can look at or try? Anything I can run to do more tests? I am not a SQL expert by any means and have looked at everything I know. Thanks.
What's the SQL Server authentication mode?
ASKER
Sorry, thought I put that in there... it is mixed, both SQL and WIN NT auth
ASKER
In talking with someone tonight I was referred to this instance of another case... this sounds very much like the issue the VAR vendor had when they upgraded their SQL app and the problems started. I am thinking this is most likely our issue. What do others think?
>>>>
Here, I talk about one extreme situation: SQL server was running under Local System and was shutdown accidentally. The user then decides to run SQL server under a different account, e.g local account, domain account etc., for whatever reasons. Then he/she hit this “Cannot Generate SSPI Context” error when the client tries to connect the server. Keep in mind this only happens when TCP is enabled for the SQL server and is used by the client to connect the server.
What happened here is: When SQL server ran under Local System, it had successfully registered the Service Principle Name (SPN) for the service. The SPN is kept in the Active Directory and should be de-registered when the server is shutdown. Due to the accidental shutdown, SQL server failed to de-register the SPN. When the client connects to the server using TCP, it can find the SPN in the Active Directory and Kerberos will be used to perform the security delegation. However, the new account is not the correct container of the SPN, and Kerberos will fail.
When this happens, some people may choose to reinstall SQL Server or even the whole OS. They may be frustrated by the fact that the problem is still there if local or domain account is again chosen as the service account. The SPN in the Active Directory won’t go away even if you reinstall the OS.
Setspn.exe can be used to register/de-register SPNs. One can register the same SPN for the same container more than one time. The registration beyond the first registration may not do anything. One de-registration will remove the SPN from Active Directory totally. Because of this, the easiest first step to troubleshoot “Cannot Generate SSPI Context” is to run SQL server under Local System account and gracefully shut it down. You can then change your service account to whatever you want. SPN will not be registered and clients will fallback to use NTLM.
Also note that, if you made any change related to SPN or service account on the server, the cached information on the clients may need a couple of minutes to go away. You may see some inconsistent information during this period. Just wait several minutes in this case.
>>>
>>>>
Here, I talk about one extreme situation: SQL server was running under Local System and was shutdown accidentally. The user then decides to run SQL server under a different account, e.g local account, domain account etc., for whatever reasons. Then he/she hit this “Cannot Generate SSPI Context” error when the client tries to connect the server. Keep in mind this only happens when TCP is enabled for the SQL server and is used by the client to connect the server.
What happened here is: When SQL server ran under Local System, it had successfully registered the Service Principle Name (SPN) for the service. The SPN is kept in the Active Directory and should be de-registered when the server is shutdown. Due to the accidental shutdown, SQL server failed to de-register the SPN. When the client connects to the server using TCP, it can find the SPN in the Active Directory and Kerberos will be used to perform the security delegation. However, the new account is not the correct container of the SPN, and Kerberos will fail.
When this happens, some people may choose to reinstall SQL Server or even the whole OS. They may be frustrated by the fact that the problem is still there if local or domain account is again chosen as the service account. The SPN in the Active Directory won’t go away even if you reinstall the OS.
Setspn.exe can be used to register/de-register SPNs. One can register the same SPN for the same container more than one time. The registration beyond the first registration may not do anything. One de-registration will remove the SPN from Active Directory totally. Because of this, the easiest first step to troubleshoot “Cannot Generate SSPI Context” is to run SQL server under Local System account and gracefully shut it down. You can then change your service account to whatever you want. SPN will not be registered and clients will fallback to use NTLM.
Also note that, if you made any change related to SPN or service account on the server, the cached information on the clients may need a couple of minutes to go away. You may see some inconsistent information during this period. Just wait several minutes in this case.
>>>
When SQL server ran under Local System, it had successfully registered the Service Principle Name (SPN) for the service.That's not correct. SPN should only work with AD accounts so when Local System is used (a very bad option, I may say) a SPN can't be set.
When this happens, some people may choose to reinstall SQL Server or even the whole OS.That's wrong. When that happens you must set the SPN correctly. Remove old ones and add the new one.
ASKER
Thanks for the response Vitor, so you think since SQL Server is running under Local System account that the SPN is not the issue? What do you think I should look at or test?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
The odd thing is I have another client that they also have this same setup that is having no issues at all and working. SQL there is also running under local system but working fine. I have compared everything between the two but cannot find anything different yet. The SQL servers are both 2008 and 2008R2 servers. The only difference is the one working the DC is 2003 Server still where this one not working has a 2012R2 DC.
ASKER
The VAR is saying and blaming the DC saying there is a problem with the DC causing it but I cannot find any issues with it as everything else works fine but their SQL app.
The only difference is the one working the DC is 2003 Server still where this one not working has a 2012R2 DC.This may justify the difference. I'm not sure if Kerberos authentication were already available in 2003. Can you check the SPN for that client?
ASKER
If both of the clients SQL servers service account is set to local system, will there be an SPN for it?
ASKER
I had asked them to install a new SQL server on another server with a blank DB to see if that worked. They said it did not but with different errors which I am looking to see what the other error is.
Interesting enough they did not use local system on that install instead used the Domain admin account.
When run the SPN query on the two different servers I get this...
C:\Users\Administrator.CFP >setspn -L CFP-EMR
Registered ServicePrincipalNames for CN=CFP-EMR,OU=servers,OU=C FP,DC=cfp, DC=local:
TERMSRV/CFP-EMR
TERMSRV/CFP-EMR.cfp.local
WSMAN/CFP-EMR
WSMAN/CFP-EMR.cfp.local
RestrictedKrbHost/CFP-EMR
HOST/CFP-EMR
RestrictedKrbHost/CFP-EMR. cfp.local
HOST/CFP-EMR.cfp.local
C:\Users\Administrator.CFP >setspn -L CFPserver2
Registered ServicePrincipalNames for CN=CFPSERVER2,CN=Computers ,DC=cfp,DC =local:
MSSQLSvc/CFPServer2.cfp.lo cal:1433
MSSQLSvc/CFPServer2.cfp.lo cal
TERMSRV/CFPSERVER2
TERMSRV/CFPServer2.cfp.loc al
WSMAN/CFPServer2
WSMAN/CFPServer2.cfp.local
RestrictedKrbHost/CFPSERVE R2
HOST/CFPSERVER2
RestrictedKrbHost/CFPServe r2.cfp.loc al
HOST/CFPServer2.cfp.local
Interesting enough they did not use local system on that install instead used the Domain admin account.
When run the SPN query on the two different servers I get this...
C:\Users\Administrator.CFP
Registered ServicePrincipalNames for CN=CFP-EMR,OU=servers,OU=C
TERMSRV/CFP-EMR
TERMSRV/CFP-EMR.cfp.local
WSMAN/CFP-EMR
WSMAN/CFP-EMR.cfp.local
RestrictedKrbHost/CFP-EMR
HOST/CFP-EMR
RestrictedKrbHost/CFP-EMR.
HOST/CFP-EMR.cfp.local
C:\Users\Administrator.CFP
Registered ServicePrincipalNames for CN=CFPSERVER2,CN=Computers
MSSQLSvc/CFPServer2.cfp.lo
MSSQLSvc/CFPServer2.cfp.lo
TERMSRV/CFPSERVER2
TERMSRV/CFPServer2.cfp.loc
WSMAN/CFPServer2
WSMAN/CFPServer2.cfp.local
RestrictedKrbHost/CFPSERVE
HOST/CFPSERVER2
RestrictedKrbHost/CFPServe
HOST/CFPServer2.cfp.local
ASKER
Here is the SPN on the DC...
C:\Users\Administrator.CFP >setspn -L CFPDomain
Registered ServicePrincipalNames for CN=CFPDOMAIN,OU=Domain Controllers,DC=cfp,D
C=local:
DNS/CFPDomain.cfp.local
HOST/CFPDomain.cfp.local/C FP
RPC/40f22dbb-52d9-46c8-97b 3-cd6dfbd1 64a2._msdc s.cfp.loca l
GC/CFPDomain.cfp.local/cfp .local
HOST/CFPDomain.cfp.local/c fp.local
HOST/CFPDOMAIN/CFP
ldap/CFPDOMAIN/CFP
ldap/40f22dbb-52d9-46c8-97 b3-cd6dfbd 164a2._msd cs.cfp.loc al
ldap/CFPDomain.cfp.local/C FP
ldap/CFPDOMAIN
ldap/CFPDomain.cfp.local
ldap/CFPDomain.cfp.local/F orestDnsZo nes.cfp.lo cal
ldap/CFPDomain.cfp.local/D omainDnsZo nes.cfp.lo cal
ldap/CFPDomain.cfp.local/c fp.local
E3514235-4B06-11D1-AB04-00 C04FC2DCD2 /40f22dbb- 52d9-46c8- 97b3-cd6df bd164a
2/cfp.local
NtFrs-88f5d2bd-b646-11d2-a 6d3-00c04f c9b232/CFP Domain.cfp .local
Dfsr-12F9A27C-BF97-4787-93 64-D31B6C5 5EB04/CFPD omain.cfp. local
TERMSRV/CFPDOMAIN
TERMSRV/CFPDomain.cfp.loca l
WSMAN/CFPDomain
WSMAN/CFPDomain.cfp.local
RestrictedKrbHost/CFPDOMAI N
HOST/CFPDOMAIN
RestrictedKrbHost/CFPDomai n.cfp.loca l
HOST/CFPDomain.cfp.local
I noticed that for some reason when they created the network and setup the DC they created the OU=CFP with additional OU's under it and the DC is under the OU=Domain Controllers as seen in attachment... wondering if that is also creating any issues
C:\Users\Administrator.CFP
Registered ServicePrincipalNames for CN=CFPDOMAIN,OU=Domain Controllers,DC=cfp,D
C=local:
DNS/CFPDomain.cfp.local
HOST/CFPDomain.cfp.local/C
RPC/40f22dbb-52d9-46c8-97b
GC/CFPDomain.cfp.local/cfp
HOST/CFPDomain.cfp.local/c
HOST/CFPDOMAIN/CFP
ldap/CFPDOMAIN/CFP
ldap/40f22dbb-52d9-46c8-97
ldap/CFPDomain.cfp.local/C
ldap/CFPDOMAIN
ldap/CFPDomain.cfp.local
ldap/CFPDomain.cfp.local/F
ldap/CFPDomain.cfp.local/D
ldap/CFPDomain.cfp.local/c
E3514235-4B06-11D1-AB04-00
2/cfp.local
NtFrs-88f5d2bd-b646-11d2-a
Dfsr-12F9A27C-BF97-4787-93
TERMSRV/CFPDOMAIN
TERMSRV/CFPDomain.cfp.loca
WSMAN/CFPDomain
WSMAN/CFPDomain.cfp.local
RestrictedKrbHost/CFPDOMAI
HOST/CFPDOMAIN
RestrictedKrbHost/CFPDomai
HOST/CFPDomain.cfp.local
I noticed that for some reason when they created the network and setup the DC they created the OU=CFP with additional OU's under it and the DC is under the OU=Domain Controllers as seen in attachment... wondering if that is also creating any issues
When run the SPN query on the two different servers I get this...So no SPN entry for SQL Server service (MSSQLSvc).
ASKER
Correct, but that is due to the SQL service logging in as LOCAL SYSTEM. Correct?
I just changed the SQL services to logon as the domain admin account and now most of the authentications work. However, if I run the setspn -L CFP-EMR command it still does not show an SPN for SQL server.
I just changed the SQL services to logon as the domain admin account and now most of the authentications work. However, if I run the setspn -L CFP-EMR command it still does not show an SPN for SQL server.
ASKER
ASKER
Also, just fyi - both named pipes and TCP/IP are enabled and in this order
if I run the setspn -L CFP-EMR command it still does not show an SPN for SQL server.You'll need to add it manually with SETSPN -A parameter.
Jeff, any update on this question?
ASKER
Thanks for the help. Changing the account SQL logs in with has fixed the issue. It must be something with having a 2012 DC since they had this same setup with a 2003 DC and it worked. Thanks again.
ASKER
I did this and found something interesting and not sure if this has anything to do with it.
I found several MSSQL user accounts that for some reason say SQL 2005 and have the old domain controller name instead of the new domain controller name. We removed the old domain controller which was 2003 and installed a new DC that is 2012R2. Now we did this over 6 months before the issues started so that is why I am not sure this is part of the issue. The old DC was named CFP_Server1 and the old SQL was 2005. These were both replaced 6 months before the issue started. The DC has a new name and SQL is 2008 now. I have attached a screen shot.