Link to home
Start Free TrialLog in
Avatar of Camillia
CamilliaFlag for United States of America

asked on

Different between these 2 sql statements

I had this question after viewing Simple SQL - Only select row with specific status.

I got help from the experts in the link above. I'm running the sql against the real database and I get a difference in number of rows I get back.

This is SQL 2014

This gives me 59152 rows


 WITH cteLoadS AS
(
	SELECT  CommSalesOrder_CommSalesOrderId, loadStatus, COUNT(CommSalesOrder_CommSalesOrderId) OVER(PARTITION BY CommSalesOrder_CommSalesOrderId, loadStatus) AS CountByStatus,
		COUNT(CommSalesOrder_CommSalesOrderId) OVER(PARTITION BY CommSalesOrder_CommSalesOrderId) AS CountByCommSalesId
	FROM loads
)
SELECT DISTINCT C.*, Loadstatus
FROM cteLoadS AS cte
INNER JOIN CommSalesOrders AS C
	ON cte.CommSalesOrder_CommSalesOrderId = C.CommSalesOrderId
WHERE cte.CountByCommSalesId = cte.CountByStatus
	AND cte.loadStatus = 4

Open in new window


This one gives me 64449 rows. I'd think they should return the same number of rows. Just looking at the 2 sql statements, why would be different as far as the number of rows it's bringing?

select c.*
from (
    select CommSalesOrder_CommSalesOrderId
    from loads
    group by CommSalesOrder_CommSalesOrderId, loadstatus
    having max(case when loadStatus !=4 then 0 else 1 end) = 1
) as l
inner join CommSalesOrders c on c.CommSalesOrderId = l.CommSalesOrder_CommSalesOrderId

Open in new window

Avatar of Arana (G.P.)
Arana (G.P.)

Have you identified any different result ? you can compare both queries to find out which ones are not showing in the other query.

I see you are using the HAVING clause, I personally try to avoid it, for performance reasons , that is why in your previous question i used EXISTS  (did you try it?)  it is way faster than 'having' as it doesnt process all the results, uses indexes and other optimizations.

Can you try it with the example I gave you and see how many rows you get?
Avatar of Camillia

ASKER

Your sql in the related question didn't work...it didn't run. I can try it again but I got an error when I tried it with the sample table I posted.

I'll dump the results in 2 different tables and compare and see what's missing
hmm weird as i tested it in ideone.com as i just took away the #'s and it worked for me
 
create table commsales
(
  id int,
  Seller varchar(20)

);


Create table loads
(
  Id int,
  CommSalesId int,
  loadStatus varchar(1)

);


insert into commsales
select  1, 'ABC';

insert into commsales
select  2, 'XYZ';

insert into loads
select  1, 1, 'S';

insert into loads
select  2, 1, 'S';

insert into loads
select  3, 1, 'N';

insert into loads
select  1, 2, 'S';

insert into loads
select  1, 2, 'S';
go

select distinct * from
(
select *  from commsales c
 inner join loads l on c.id = l.CommSalesId
  and not exists
(select 1 from  loads where loads.loadstatus='N' and c.id=loads.commsalesid));

Open in new window

I'll try it in the morning when I go to work. Thanks
It's for sure the combination of DISTINCT and the INNER JOIN. Cause you run DISTINCT over *. The separate GROUP BY is reducing the number of rows, this behavior is not the same with your DISTINCT.

As a rule of thumb: Don't use DISTINCT on complex queries.

So as a test: Just use a GROUP BY instead of the DISTINCT.
Camillia, in my previous code I tried now in sql server and i get errors, I dont know why it works on ideone.com but not on sqlserver itself, so i modfied this part a bit:

select distinct *  from #commsales c
 inner join #loads l on c.id = l.CommSalesId
  and not exists
(select 1 from  #loads where loadstatus='N' and c.id=commsalesid)

it gives the result you expected and should be faster and give consistent results with the first example you have.
ste5an  -

So as a test: Just use a GROUP BY instead of the DISTINCT.

So, change this to use "Group by" as a test?

 WITH cteLoadS AS
(
	SELECT  CommSalesOrder_CommSalesOrderId, loadStatus, COUNT(CommSalesOrder_CommSalesOrderId) OVER(PARTITION BY CommSalesOrder_CommSalesOrderId, loadStatus) AS CountByStatus,
		COUNT(CommSalesOrder_CommSalesOrderId) OVER(PARTITION BY CommSalesOrder_CommSalesOrderId) AS CountByCommSalesId
	FROM loads
)
SELECT DISTINCT C.*, Loadstatus
FROM cteLoadS AS cte
INNER JOIN CommSalesOrders AS C
	ON cte.CommSalesOrder_CommSalesOrderId = C.CommSalesOrderId
WHERE cte.CountByCommSalesId = cte.CountByStatus
	AND cte.loadStatus = 4

Open in new window

all that stuff about distinct and group by differences are true and part of your problem here
but forget all those and try this query, much simpler and will give you want you wanted from your original question.  There you wanted to only output orders and sellers with all load statuses = 'S' which I take is equivalent to the 4 here?  But I may have translated your samples incorrectly from the old question to this one.  Anyway, the query needed should be much easier to do than any of those complex examples above.
select CSO.*, L.* 
FROM dbo.CommSalesOrders CSO
OUTER APPLY (SELECT TOP 1 'anything' unfilledLoads FROM dbo.loads L WHERE L.CommSalesOrder_CommSalesOrderId = CSO.CommSalesOrderId AND L.loadStatus <> 4) Loads
INNER JOIN dbo.loads L ON L.CommSalesOrder_CommSalesOrderId = CSO.CommSalesOrderId
WHERE Loads.unfilledLoads IS NULL

Open in new window

Results like this:
User generated image
Thanks, Chris. I'll try it in the morning.
SOLUTION
Avatar of ste5an
ste5an
Flag of Germany image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thanks, Paul. I'm at work now. Let me see.