SQL Query Challenge

gdemaria
gdemaria used Ask the Experts™
on
SQL Server 2012 -

I have a challenging query... trying to best optimize it.   Would love some help!

I have two tables (called BEFORE and AFTER) with the same structure; two columns, a GROUP_ID and a VALUE.
I need to compare the VALUES for each GROUP_ID and if they are DIFFERENT, add the GROUP_ID to another table.

So given the BEFORE and AFTER tables below.
GROUP ID = 1 is different between the two tables because value 11 and 12 are not in AFTER
GROUP ID = 2 is different between the two tables because value 23 is not in BEFORE
GROUP ID = 3 is the same in both tables

Therefore, the RESULT table should contain  GROUP ID  with 1 and 2

BEFORE
Group ID     Value
    1                 10
    1                 11
    1                 12
    1                 13
    2                 21
    2                 22
    2                 24
    2                 25
    3                 30
    3                 31
    3                 32


AFTER
Group ID     Value
    1                 10
    1                 13
    2                 21
    2                 22
    2                 23
    2                 24
    2                 25
    3                 30
    3                 31
    3                 32

RESULT table
GROUP ID   (unique key, no duplicates)
     1
     2  


SO basically, it would have to be something like this...


INSERT INTO RESULTS (GROUP_ID)
select distinct GROUP_ID  
  from BEFORE
    ( join ) AFTER ON   B.GROUP_ID = A.GROUP_ID
                                   and B.VALUE = A.VALUE

WHERE the BEFORE and AFTER tables do not contain EXACTLY the same VALUES


So to recap, I need to compare the values of each GROUP_ID
If any thing is in BEFORE and not in AFTER or visa versa, I need to put the GROUP_ID into RESULTS
If they are exactly the same, the GROUP_ID does not go into RESULTS.

What's the best way to do this??
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
EE Topic Advisor
Most Valuable Expert 2014
Awarded 2013
Commented:
Full outer join works for this
SELECT DISTINCT 
  coalesce(b.group_id, a.group_id) group_id
from [BEFORE] b
full outer join [AFTER] a on b.group_id = a.group_id and b.value = a.value 
where b.group_id is null
or a.group_id is null

Open in new window

Thanks very much Paul!  
This solution does work, so that's great!
Because my tables are big, it takes 45 seconds to run.   I do have good indexes, so I was just wondering if you had an ideas for another format that might run faster?  

What I'm thinking is that this needs to match every record and the reduce the record set with a distinct.   Is there a way to just find a non-matching record and then stop processing using perhaps TOP 1 in a sub query or maybe with a CTE or something else?

Just wondering about improving performance; otherwise it's great!
PortletPaulEE Topic Advisor
Most Valuable Expert 2014
Awarded 2013

Commented:
full outer joins are expensive
can you attach a .sqlplan file (execution plan) perhaps?
Ensure you’re charging the right price for your IT

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden using our free interactive tool and use it to determine the right price for your IT services. Start calculating Now!

PortletPaulEE Topic Advisor
Most Valuable Expert 2014
Awarded 2013

Commented:
You could try this alternative:

select
     group_id
from before b
where not exists (
  select null from after a
  where b.group_id = a.group_id and b.value = a.value 
  )
union
select
     group_id
from after a
where not exists (
  select null from before b
  where b.group_id = a.group_id and b.value = a.value 
  )

Open in new window

Scott PletcherSenior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
I do have good indexes

How are the tables clustered?  In particular, for this specific query, best would be clustered on ( GroupID, Value ).
Hey Scott - thanks for joining.    Right, clustered index on the group and then value
Scott PletcherSenior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
Great, glad it helped.

Btw, if you have a non-clus index(es) that start with column "GroupID", you can get rid of it(them) now, because now the clus index starts with GroupID.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial