Link to home
Start Free TrialLog in
Avatar of gdemaria
gdemariaFlag for United States of America

asked on

SQL Query Challenge

SQL Server 2012 -

I have a challenging query... trying to best optimize it.   Would love some help!

I have two tables (called BEFORE and AFTER) with the same structure; two columns, a GROUP_ID and a VALUE.
I need to compare the VALUES for each GROUP_ID and if they are DIFFERENT, add the GROUP_ID to another table.

So given the BEFORE and AFTER tables below.
GROUP ID = 1 is different between the two tables because value 11 and 12 are not in AFTER
GROUP ID = 2 is different between the two tables because value 23 is not in BEFORE
GROUP ID = 3 is the same in both tables

Therefore, the RESULT table should contain  GROUP ID  with 1 and 2

BEFORE
Group ID     Value
    1                 10
    1                 11
    1                 12
    1                 13
    2                 21
    2                 22
    2                 24
    2                 25
    3                 30
    3                 31
    3                 32


AFTER
Group ID     Value
    1                 10
    1                 13
    2                 21
    2                 22
    2                 23
    2                 24
    2                 25
    3                 30
    3                 31
    3                 32

RESULT table
GROUP ID   (unique key, no duplicates)
     1
     2  


SO basically, it would have to be something like this...


INSERT INTO RESULTS (GROUP_ID)
select distinct GROUP_ID  
  from BEFORE
    ( join ) AFTER ON   B.GROUP_ID = A.GROUP_ID
                                   and B.VALUE = A.VALUE

WHERE the BEFORE and AFTER tables do not contain EXACTLY the same VALUES


So to recap, I need to compare the values of each GROUP_ID
If any thing is in BEFORE and not in AFTER or visa versa, I need to put the GROUP_ID into RESULTS
If they are exactly the same, the GROUP_ID does not go into RESULTS.

What's the best way to do this??
ASKER CERTIFIED SOLUTION
Avatar of PortletPaul
PortletPaul
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of gdemaria

ASKER

Thanks very much Paul!  
This solution does work, so that's great!
Because my tables are big, it takes 45 seconds to run.   I do have good indexes, so I was just wondering if you had an ideas for another format that might run faster?  

What I'm thinking is that this needs to match every record and the reduce the record set with a distinct.   Is there a way to just find a non-matching record and then stop processing using perhaps TOP 1 in a sub query or maybe with a CTE or something else?

Just wondering about improving performance; otherwise it's great!
full outer joins are expensive
can you attach a .sqlplan file (execution plan) perhaps?
You could try this alternative:

select
     group_id
from before b
where not exists (
  select null from after a
  where b.group_id = a.group_id and b.value = a.value 
  )
union
select
     group_id
from after a
where not exists (
  select null from before b
  where b.group_id = a.group_id and b.value = a.value 
  )

Open in new window

I do have good indexes

How are the tables clustered?  In particular, for this specific query, best would be clustered on ( GroupID, Value ).
Hey Scott - thanks for joining.    Right, clustered index on the group and then value
Great, glad it helped.

Btw, if you have a non-clus index(es) that start with column "GroupID", you can get rid of it(them) now, because now the clus index starts with GroupID.