Solved

Is there any other way to rewrite below query without GROUP BY HAVING and  NOT EXISTS clauses ?

Posted on 2013-10-31
11
259 Views
Last Modified: 2013-11-12
I heard that GROUP BY HAVING and  NOT EXISTS clauses  will give performance bottleneck s when we do multiple joins .Is there any other way to rewrite below query without these clauses ?

SELECT A.ID
  FROM EMPLOYEES E
INNER JOIN SITES S     ON S.ID = PS.SITE_ID 
INNER JOIN LOCATIONS SL  ON S.ID = SL.SITE_ID
WHERE 
SL.ID =  221 AND

NOT EXISTS(
SELECT 1
  FROM EMPLOYEE_INSTANCES EMP_INNER
  ,POLICY_SITES PS
  WHERE 
  EMP_INNER.EMP_ID =E.ID AND 
 EMP_INNER.STATUS ='V' AND
 E.POLICY_ID=PS.POLICY_ID AND
 PS.SITE_ID=SL.SITE_ID AND
 EMP_INNER.LOCATION_ID =SL.ID
 GROUP BY SITE_ID,PS.TOTAL_COPIES
 HAVING PS.TOTAL_COPIES = COUNT(DISTINCT EMP_INNER.LOCATION_ID)
)

Open in new window

0
Comment
Question by:chaitu chaitu
11 Comments
 
LVL 28

Expert Comment

by:sammySeltzer
ID: 39613960
That depends on how large your data is.

You can also try using the except clause
0
 
LVL 20

Author Comment

by:chaitu chaitu
ID: 39614031
we have 6 million records in EMPLOYEES and EMPLOYEE_INSTANCES tables.
you mean to say we can replace NOT EXISTS with EXCEPT clause.

what abt GROUP BY HAVING clause?
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39614066
I'm not surprised you're asking the question
- trouble is I have no idea what the functionality is e.g.

what "business rule" does this ensure?
HAVING PS.TOTAL_COPIES = COUNT(DISTINCT EMP_INNER.LOCATION_ID)

you might save some cycles by placing that awkward exists query into a CTE (maybe)

BUT, the query you have given us is incomplete:

SELECT
        A.ID
FROM EMPLOYEES E
        INNER JOIN SITES S
                ON S.ID = PS.SITE_ID --<< where is PS ???
        INNER JOIN LOCATIONS SL
                ON S.ID = SL.SITE_ID

have you "simplified" it?
0
 
LVL 20

Author Comment

by:chaitu chaitu
ID: 39614079
we mention the value in total_copies in this column and that should be equal to the no of coont in  EMP_INSTANCES table.


SELECT A.ID
  FROM EMPLOYEES E
INNER JOIN SITES S     ON S.ID = PS.SITE_ID
INNER JOIN POLICY_SITES PS1  ON PS1.POLICY_ID= E.POLICY_ID
INNER JOIN LOCATIONS SL  ON S.ID = SL.SITE_ID
WHERE
SL.ID =  221 AND

NOT EXISTS(
SELECT 1
  FROM EMPLOYEE_INSTANCES EMP_INNER
  ,POLICY_SITES PS
  WHERE
  EMP_INNER.EMP_ID =E.ID AND
 EMP_INNER.STATUS ='V' AND
 E.POLICY_ID=PS.POLICY_ID AND
 PS.SITE_ID=SL.SITE_ID AND
 EMP_INNER.LOCATION_ID =SL.ID
 GROUP BY SITE_ID,PS.TOTAL_COPIES
 HAVING PS.TOTAL_COPIES = COUNT(DISTINCT EMP_INNER.LOCATION_ID)
)
0
 
LVL 31

Expert Comment

by:awking00
ID: 39614659
Still incomplete -
>>SELECT A.ID   ==> What is A the alias of?
  FROM EMPLOYEES E
INNER JOIN SITES S     ON S.ID = PS.SITE_ID
INNER JOIN POLICY_SITES PS1  ON PS1.POLICY_ID= E.POLICY_ID
INNER JOIN LOCATIONS SL  ON S.ID = SL.SITE_ID<<
Rather than trying to re-write your query, perhaps you can just provide some sample data for your tables and the expected output with the criteria you're using for that output. We might then see a better approach that can help you.
0
Highfive + Dolby Voice = No More Audio Complaints!

Poor audio quality is one of the top reasons people don’t use video conferencing. Get the crispest, clearest audio powered by Dolby Voice in every meeting. Highfive and Dolby Voice deliver the best video conferencing and audio experience for every meeting and every room.

 
LVL 48

Expert Comment

by:PortletPaul
ID: 39615127
Please provide the complete query

or, at least all of the table and join details without simplification
0
 
LVL 35

Expert Comment

by:David Todd
ID: 39616174
Hi Folks,

The PS is used further down, inside the subquery, but wont be visible outside that, so yes I agree that the query is broken.

Also please use inner joins in the subquery as its quite hard to read and guess at what you are trying to achieve.

Regards
  David
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39616260
I'm sorry, but what about A.ID ??

please provide an unbroken query - is there a problem with doing so?
having all the tables and joins is quite important to meeting your request.

when you say please use inner joins do you mean ansi syntax? that's not a problem and will happen most of the time anyway. but I await that corrected query.
0
 
LVL 20

Author Comment

by:chaitu chaitu
ID: 39616576
This is the sample data.
as per the policy_id ,the total number copies must be maintained in the employees_instances based on the same site id.

for example for policy=123,the total copies for site 11111 is 1 and for site-55555 is 2.
if you see same number of copies is maintained in employees_instances(records 1,2,3) for policy.

for 567 policy then number of copies for site-55555 is 1.so one copy is there for that site in employees_instances(see record id=4).

insert into employees(id,name,policy_id) values(1,'test',123)
insert into employees(id,name,policy_id) values(2,'test',345)
insert into employees(id,name,policy_id) values(3,'test',567)

insert into employees_instances(id,name,emp_id,location_id,status) values(1,'test',1,111111,'V')
insert into employees_instances(id,name,emp_id,location_id,status) values(2,'test',1,555551,'V')
insert into employees_instances(id,name,emp_id,location_id,status) values(3,'test',1,555552,'V')

insert into employees_instances(id,name,emp_id,location_id,status) values(4,'test',2,555552,'V')

insert into employees_instances(id,name,emp_id,location_id,status) values(5,'test',3,111111,'V')

insert into sites(id,name) values(11111,'testsite1')
insert into sites(id,name) values(55555,'testsite5')

insert into policy_sites(policy_id,site_id,TOTAL_COPIES) values(123,11111,1)
insert into policy_sites(policy_id,site_id,TOTAL_COPIES) values(123,55555,2)

insert into policy_sites(policy_id,site_id,TOTAL_COPIES) values(345,11111,0)
insert into policy_sites(policy_id,site_id,TOTAL_COPIES) values(345,55555,1)

insert into policy_sites(policy_id,site_id,TOTAL_COPIES) values(567,11111,1)
insert into policy_sites(policy_id,site_id,TOTAL_COPIES) values(567,55555,0)

insert into locations(id,site_id) values(111111,11111)
insert into locations(id,site_id) values(555551,55555)
insert into locations(id,site_id) values(555552,55555)


SELECT E.ID
  FROM EMPLOYEES E
INNER JOIN SITES S     ON S.ID = PS1.SITE_ID 
INNER JOIN POLICY_SITES PS1  ON PS1.POLICY_ID= E.POLICY_ID
INNER JOIN LOCATIONS SL  ON S.ID = SL.SITE_ID
WHERE 
SL.ID =  555551 AND
NOT EXISTS(
SELECT 1
  FROM EMPLOYEE_INSTANCES EMP_INNER
  ,POLICY_SITES PS
  WHERE 
  EMP_INNER.EMP_ID =E.ID AND 
 EMP_INNER.STATUS ='V' AND
 E.POLICY_ID=PS.POLICY_ID AND
 PS.SITE_ID=SL.SITE_ID AND
 EMP_INNER.LOCATION_ID =SL.ID
 GROUP BY SITE_ID,PS.TOTAL_COPIES
 HAVING PS.TOTAL_COPIES = COUNT(DISTINCT EMP_INNER.LOCATION_ID)
)

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39616687
Thanks, it's working.

You are aware that the following snippet is Causing a Cartesian product I hope:

AND NOT EXISTS (
        SELECT
                1
        FROM EMPLOYEE_INSTANCES EMP_INNER
           , POLICY_SITES PS


i.e. take the numbers of rows in employee_instances
and MULTIPLY by numbers of rows in policy_sites
= scale of Cartesian product

Should I assume this is deliberate?
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 500 total points
ID: 39619626
I'm not convinced the following is an improvement,
SELECT
        E.ID
FROM EMPLOYEES E
        INNER JOIN POLICY_SITES PS
                ON E.POLICY_ID = PS.POLICY_ID
        INNER JOIN SITES S
                ON PS.SITE_ID = S.ID
        INNER JOIN LOCATIONS L
                ON S.ID = L.SITE_ID
        INNER JOIN (
                    SELECT DISTINCT
                          emp_id
                    FROM (
                          SELECT
                                  EI.emp_id
                                , ps1.total_copies
                          FROM EMPLOYEE_INSTANCES EI
                          INNER JOIN EMPLOYEES E ON EI.emp_id = e.id
                          INNER JOIN POLICY_SITES PS1 ON E.POLICY_ID = PS1.POLICY_ID
                          GROUP BY
                                  EI.emp_id
                                , ps1.total_copies
                          HAVING PS1.TOTAL_COPIES <> COUNT(DISTINCT EI.LOCATION_ID)
                          ) as tc
                     ) as dtc
                ON E.ID = dtc.emp_id
WHERE L.ID = 555551
;

Open in new window

It's as far as I could take for now. Perhaps it will help.

{+edit} flipping the joins in the innermost subquery may also help, it appears to alter the execution plan:
SELECT
        E.ID
FROM EMPLOYEES E
        INNER JOIN POLICY_SITES PS ON E.POLICY_ID = PS.POLICY_ID
        INNER JOIN SITES S ON PS.SITE_ID = S.ID
        INNER JOIN LOCATIONS L ON S.ID = L.SITE_ID
        INNER JOIN (
                    SELECT DISTINCT
                          emp_id
                    FROM (
                          SELECT
                                  EI.emp_id
                                , ps1.total_copies
                          FROM EMPLOYEES E 
                          INNER JOIN EMPLOYEE_INSTANCES EI ON E.ID = EI.emp_id
                          INNER JOIN POLICY_SITES PS1 ON E.POLICY_ID = PS1.POLICY_ID
                          GROUP BY
                                  EI.emp_id
                                , ps1.total_copies
                          HAVING PS1.TOTAL_COPIES <> COUNT(DISTINCT EI.LOCATION_ID)
                          ) as tc
                     ) as dtc
                ON E.ID = dtc.emp_id
WHERE L.ID = 555551
;

-- http://sqlfiddle.com/#!3/031a5/20

Open in new window

0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
mySql Syntax 7 33
Is there a canned solution for a custom genealogy website 2 40
Row-Level Security 2 20
SqlAdvisor 2016 3 11
How to leverage one TLS certificate to encrypt Microsoft SQL traffic and Remote Desktop Services, versus creating multiple tickets for the same server.
This article describes how to use the timestamp of existing data in a database to allow Tableau to calculate the prior work day instead of relying on case statements or if statements to calculate the days of the week.
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now