how to force a HASH JOIN in SQL 2008

--On Server 2008
FROM Table1
---execution time  7min  
table1  ID is PK and table has 10177 records
--On Server 2005
FROM  [SERVER2008].[DataBase1].DBO.Table1
--less than 1second.
table2 no indexes and has 103180 records.

in execution plan SERVER2008 use NestedLoops cost 96%
In     ---  ||  ----      SERVER2005 use Hash Match  cost 10%

Question is :
How to force a hash join in this case.
Or why the optimizer is choosing NestedLoops.
Who is Participating?
vasureddymConnect With a Mentor Commented:
I'm successfully able to replicate the issue on my local box.

Server 1: SQL server 2008 express instance
Server 2: SQL server 2008 enterprise instance (trail version)
These servers have linked servers setup between them

execute the below script on the server 1
--== server 1 script (DESKTOP68\SQLEXPRESS1)
create table dbo.TestMasterTable (id int identity Primary Key, nme varchar(100))
insert into TestMasterTable values('A')
insert into TestMasterTable values('B')
insert into TestMasterTable values('D')
go 3000

Open in new window

execute the below script on the server 2
--== server 2 script (DESKTOP68)
create table dbo.TestChildTableTemp (id int identity, nme varchar(100), id2 uniqueidentifier default(newid()))

insert into TestChildTableTemp(nme) values('A')
insert into TestChildTableTemp(nme) values('B')
insert into TestChildTableTemp(nme) values('C')
go 30000

-- loading from the TestChildTableTemp instead of directly loading, to generate random values in 'id' column
create table dbo.TestChildTable (id int, nme varchar(100))
insert into TestChildTable
select id, nme from TestChildTableTemp
order by id2

-- delete few ids so that the NOT IN query has some results
delete from TestChildTable where id between 100 and 200

Open in new window

Now execute the below query on each of the servers and check the execution time and execution plan. (Note: change the servers names to match your environment)
--== Query 1: using subquery
select id
from [DESKTOP68\SQLEXPRESS1].test.dbo.TestMasterTable t1
where id not in (select id from DESKTOP68.test.dbo.TestChildTable)

Open in new window

You will note that the query executed on server 2 uses Hash Match join and runs fast (just like in your case). And the query executed on server 1 uses the Nested Loop join and takes long time.

I have spent a lot of time and here is my conclusions.

1) The original culprit is not just the Nested Loop join, but is the Table Spool operator placed just before the join. The Actual Rows returned by the Table Spool operator are many hundred times the actual data in the table. I'm still not sure why that is so, but the amount of data for the Nested Loop Join is huge (can note that the CPU utilization amount in the execution plan) and is taking a lot of time to execute.

2) On the other hand, for the same query when executed on the server 1, Row Count Spool is used which has resulted in less number of records. When you carefully observe the query, it is slow when we the query in the NOT IN subquery is from different server. To confirm this, I have swapped the servers and recreated the scenario.
In other words, if the NOT IN subquery is a remote query, we have the trouble.

3) If you actually use the LEFT OUTER JOIN (even without using the HASH JOIN hint), the optimizer is able to make better decision, and is actually avoiding the SPOOL operation, and using the MERGE join (in my case). Results are also faster. code below
from [DESKTOP68\SQLEXPRESS1].test.dbo.TestMasterTable t1
left join DESKTOP68.test.dbo.TestChildTable t2 on =
where is null

Open in new window

Note: these are the observations that I made based on the scenario that I replicated on my local box. I tried to make it as close as that of yours based on the details you have provided. I could actually get the my execution plans same as that of the snapshots you sent.

Hope this explanation helps you understanding the issue. Please get back to me if you need any clarification, or if you have any further questions.
here is an example of how to force a hash join:

USE AdventureWorks2008R2;
SELECT p.Name, pr.ProductReviewID
FROM Production.Product p
LEFT OUTER HASH JOIN Production.ProductReview pr
ON p.ProductID = pr.ProductID
ORDER BY ProductReviewID DESC;
This should be a working solution :
  FROM Table1
  LEFT OUTER HASH JOIN [SERVER2005].[DataBase2].DBO.Table2 Table2 ON Table1.ID = Table2.FNAME

Open in new window

Easily Design & Build Your Next Website

Squarespace’s all-in-one platform gives you everything you need to express yourself creatively online, whether it is with a domain, website, or online store. Get started with your free trial today, and when ready, take 10% off your first purchase with offer code 'EXPERTS'.

Hash join runs faster then Neste loop join in case no of record in table are more and columns do not have indexes. But HASH join uses the processing power a lot and you have to consider it before changing queries on production server.
I'm actually thinking why does SQL Server 2008 (in your case) has chosen less efficient plan? Is this a hardware issue? subhashpunia has actually raised a good point.

can you please send (attach) the 2 execution plans (.sqlplan files)? and also the hardware details of the 2 servers.
PDFAuthor Commented:
Thanks for example.I guess i was'not clear i need to now reason way is that hapening.

@ vasureddym
here are  execution plans  attached and Servers\Hardwares details .
Server 2005:
Runing on:
M.Windows Server2003 Standard Edition SP1
Processor: Intel(R) Xeon(R) CPU 3.00GHz proc.2.99GHz
Mem. 4 GB
64-bit Operating System

Microsoft SQL Server 2005 - 9.00.1399.06 (Intel X86)  
Standard Edition on Windows NT 5.2 (Build 3790: Service Pack 1)

Server 2008:
Runing on:
Windows Server@Standard SP2
Processor: Intel(R) Xeon(R) CPU E5420 @ 2.50GHz 2.49GHz (2 processor)
Mem. 16 GB
64-bit Operating System

Microsoft SQL Server 2008 (RTM) - 10.0.1600.22 (X64)
Standard Edition (64-bit) on Windows NT 6.0 <X64> (Build 6002: Service Pack 2)

Thanks a lot.
PDFAuthor Commented:
Thank you very much.
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.