Request a query with having statement.

jl66
jl66 used Ask the Experts™
on
Have a data set
Col1   Col2
1         A
1         B
2         AA
2         AC
2         BB
3         X
4         Y
....

Want to have any duplicate records having count(1) >1 which contain name = 'AC'? That is
2         AA
2         AC
2         BB

Is there any way to use a single sql statement to achieve it???

Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
jl66Consultant

Author

Commented:
name = 'AC' should be Col2='AC'
Sorry for that.
select count(1),"AC" from table
where col2="AC"
having count(1)>1

if that's what you want
flow01IT-specialist

Commented:
2         AA
2         AC
2         BB
is this the expected result ?  only 1 row contains AC

or

select col1,col2,count(1) from table
where col2="AC"
group by col1,col2
having count(1) > 1
jl66Consultant

Author

Commented:
Thanks alot for the info.

aiklamha: the query yields nothing.
flow01: Yes. How to get them with your query?
jl66Consultant

Author

Commented:
Want to get the duplicate records with at least one of them with the same col1 whose col2 = 'AC', that is in the test data:
2         AA
2         AC
2         BB
Top Expert 2009
Commented:
Try single quotes.


select col1, col2, count(1)
   from table
   where col2 = 'AC'
group by col1, col2
  having count(1) > 1
HainKurtSr. System Analyst

Commented:
here
with myData as (
select 1 as col1, 'A' as col2
union select 1, 'B'
union select 2, 'AA'
union select 2, 'AC'
union select 2, 'BB'
union select 3, 'X'
union select 4, 'Y'
union select 5, 'AB'
union select 5, 'AC'
)
select * from mydata t inner join 
(
select t1.col1 from myData t1 inner join (select distinct col1 from myData where col2='AC') t2 on t1.col1=t2.col1
group by t1.col1 
having COUNT(1)>1
) x on t.col1=x.col1

col1	col2	col1
2	AA	2
2	AC	2
2	BB	2
5	AB	5
5	AC	5

Open in new window

HainKurtSr. System Analyst

Commented:
actually this, replace table name with yours
select t.* from mydata t inner join 
(
select t1.col1 from myData t1 inner join (select distinct col1 from myData where col2='AC') t2 on t1.col1=t2.col1
group by t1.col1 
having COUNT(1)>1
) x on t.col1=x.col1

Open in new window

Top Expert 2009

Commented:
>>aiklamha: the query yields nothing.

Because Oracle interprets double quotes as an identifier (column) and not a column value.

Try mine above.
HainKurtSr. System Analyst
Commented:
oracle version ;)

ignore "with myData as (...)" part and replace myData with your table name
with myData as (
select 1 as col1, 'A' as col2 from dual
union select 1, 'B' from dual
union select 2, 'AA' from dual
union select 2, 'AC' from dual
union select 2, 'BB' from dual
union select 3, 'X' from dual
union select 4, 'Y' from dual
union select 5, 'AB' from dual
union select 5, 'AC' from dual
)
SELECT t.*
  FROM    mydata t
       INNER JOIN
          (  SELECT t1.col1
               FROM    myData t1
                    INNER JOIN
                       (SELECT DISTINCT col1
                          FROM myData
                         WHERE col2 = 'AC') t2
                    ON t1.col1 = t2.col1
           GROUP BY t1.col1
             HAVING COUNT (1) > 1) x
       ON t.col1 = x.col1

2	AA
2	AC
2	BB
5	AB
5	AC

Open in new window

Most Valuable Expert 2011
Top Expert 2012
Commented:
no need for a join and double io hits to the source table
 a couple of analytics and one pass are sufficient


SELECT col1, col2
  FROM (SELECT yourtable.*,
               COUNT(*) OVER (PARTITION BY col1) cnt,
               MIN(DECODE(col2, 'AC', 'x')) OVER (PARTITION BY col1) x
          FROM yourtable)
 WHERE cnt > 1 AND x = 'x'
Most Valuable Expert 2011
Top Expert 2012

Commented:
note,  depending on indexing and the data distribution, it might still be more efficient to do multiple queries

as with all performance claims,  test for yourself,  don't trust us  :)
HainKurtSr. System Analyst

Commented:
I tried mine & sdstuber on oracle with my sample data

Mine: cost:36 Bytes:9 Cardinality:1
His: cost:28 Bytes:189 Cardinality:9

??? not sure what it means :)
HainKurtSr. System Analyst

Commented:
with a bigger set

Mine: cost:3180 Bytes:9 Cardinality:1
His: cost:3172 Bytes:22197 Cardinality:1057

???
Most Valuable Expert 2011
Top Expert 2012

Commented:
assuming both return the correct results it means the optimizer doesn't know that they will be returning the same number of rows.

since we know the cardinality is NOT 1 for the small sample above, I assume the optimizer is making a mistake parsing yours
HainKurtSr. System Analyst
Commented:
I used this


Plan
SELECT STATEMENT  ALL_ROWSCost: 99  Bytes: 9  Cardinality: 1  										
	47 TEMP TABLE TRANSFORMATION  									
		33 LOAD AS SELECT  								
			32 SORT UNIQUE  Cost: 90  Cardinality: 30  							
				31 UNION-ALL  						
					1 FAST DUAL  Cost: 2  Cardinality: 1  					
					2 FAST DUAL  Cost: 2  Cardinality: 1  					
					3 FAST DUAL  Cost: 2  Cardinality: 1  					
					4 FAST DUAL  Cost: 2  Cardinality: 1  					
					5 FAST DUAL  Cost: 2  Cardinality: 1  					
					6 FAST DUAL  Cost: 2  Cardinality: 1  					
					7 FAST DUAL  Cost: 2  Cardinality: 1  					
					8 FAST DUAL  Cost: 2  Cardinality: 1  					
					9 FAST DUAL  Cost: 2  Cardinality: 1  					
					10 FAST DUAL  Cost: 2  Cardinality: 1  					
					11 FAST DUAL  Cost: 2  Cardinality: 1  					
					12 FAST DUAL  Cost: 2  Cardinality: 1  					
					13 FAST DUAL  Cost: 2  Cardinality: 1  					
					14 FAST DUAL  Cost: 2  Cardinality: 1  					
					15 FAST DUAL  Cost: 2  Cardinality: 1  					
					16 FAST DUAL  Cost: 2  Cardinality: 1  					
					17 FAST DUAL  Cost: 2  Cardinality: 1  					
					18 FAST DUAL  Cost: 2  Cardinality: 1  					
					19 FAST DUAL  Cost: 2  Cardinality: 1  					
					20 FAST DUAL  Cost: 2  Cardinality: 1  					
					21 FAST DUAL  Cost: 2  Cardinality: 1  					
					22 FAST DUAL  Cost: 2  Cardinality: 1  					
					23 FAST DUAL  Cost: 2  Cardinality: 1  					
					24 FAST DUAL  Cost: 2  Cardinality: 1  					
					25 FAST DUAL  Cost: 2  Cardinality: 1  					
					26 FAST DUAL  Cost: 2  Cardinality: 1  					
					27 FAST DUAL  Cost: 2  Cardinality: 1  					
					28 FAST DUAL  Cost: 2  Cardinality: 1  					
					29 FAST DUAL  Cost: 2  Cardinality: 1  					
					30 FAST DUAL  Cost: 2  Cardinality: 1  					
		46 HASH JOIN  Cost: 9  Bytes: 9  Cardinality: 1  								
			43 VIEW ISSP_OWNER. Cost: 7  Bytes: 3  Cardinality: 1  							
				42 FILTER  						
					41 HASH GROUP BY  Cost: 7  Bytes: 6  Cardinality: 1  					
						40 HASH JOIN  Cost: 6  Bytes: 6  Cardinality: 1  				
							37 VIEW ISSP_OWNER. Cost: 3  Bytes: 3  Cardinality: 1  			
								36 HASH UNIQUE  Cost: 3  Bytes: 6  Cardinality: 1  		
									35 VIEW ISSP_OWNER. Cost: 2  Bytes: 180  Cardinality: 30  	
										34 TABLE ACCESS FULL TABLE (TEMP) SYS.SYS_TEMP_0FD9D660D_FA94724F Cost: 2  Bytes: 480  Cardinality: 30  
							39 VIEW ISSP_OWNER. Cost: 2  Bytes: 90  Cardinality: 30  			
								38 TABLE ACCESS FULL TABLE (TEMP) SYS.SYS_TEMP_0FD9D660D_FA94724F Cost: 2  Bytes: 480  Cardinality: 30  		
			45 VIEW ISSP_OWNER. Cost: 2  Bytes: 180  Cardinality: 30  							
				44 TABLE ACCESS FULL TABLE (TEMP) SYS.SYS_TEMP_0FD9D660D_FA94724F Cost: 2  Bytes: 480  Cardinality: 30

Open in new window

with myData as (
select 1 as col1, 'A' as col2 from dual
union select 1, 'B' from dual
union select 2, 'AA' from dual
union select 2, 'AC' from dual
union select 2, 'BB' from dual
union select 3, 'X' from dual
union select 4, 'Y' from dual
union select 5, 'AB' from dual
union select 5, 'AC' from dual
union select 6, 'B' from dual
union select 6, 'AA' from dual
union select 6, 'AC' from dual
union select 7, 'BB' from dual
union select 7, 'X' from dual
union select 7, 'Y' from dual
union select 8, 'AB' from dual
union select 8, 'AC' from dual
union select 8, 'B' from dual
union select 8, 'AA' from dual
union select 9, 'AC' from dual
union select 10, 'BB' from dual
union select 11, 'X' from dual
union select 11, 'Y' from dual
union select 11, 'AB' from dual
union select 12, 'AC' from dual
union select 13, 'B' from dual
union select 14, 'AA' from dual
union select 15, 'AC' from dual
union select 15, 'BB' from dual
union select 15, 'X' from dual
)
SELECT t.*
  FROM    mydata t
       INNER JOIN
          (  SELECT t1.col1
               FROM    myData t1
                    INNER JOIN
                       (SELECT DISTINCT col1
                          FROM myData
                         WHERE col2 = 'AC') t2
                    ON t1.col1 = t2.col1
           GROUP BY t1.col1
             HAVING COUNT (1) > 1) x
       ON t.col1 = x.col1

Open in new window

HainKurtSr. System Analyst

Commented:
and this

Plan
SELECT STATEMENT  ALL_ROWSCost: 91  Bytes: 630  Cardinality: 30  						
	35 VIEW ISSP_OWNER. Cost: 91  Bytes: 630  Cardinality: 30  					
		34 WINDOW SORT  Cost: 91  Bytes: 180  Cardinality: 30  				
			33 VIEW ISSP_OWNER. Cost: 90  Bytes: 180  Cardinality: 30  			
				32 SORT UNIQUE  Cost: 90  Cardinality: 30  		
					31 UNION-ALL  	
						1 FAST DUAL  Cost: 2  Cardinality: 1  
						2 FAST DUAL  Cost: 2  Cardinality: 1  
						3 FAST DUAL  Cost: 2  Cardinality: 1  
						4 FAST DUAL  Cost: 2  Cardinality: 1  
						5 FAST DUAL  Cost: 2  Cardinality: 1  
						6 FAST DUAL  Cost: 2  Cardinality: 1  
						7 FAST DUAL  Cost: 2  Cardinality: 1  
						8 FAST DUAL  Cost: 2  Cardinality: 1  
						9 FAST DUAL  Cost: 2  Cardinality: 1  
						10 FAST DUAL  Cost: 2  Cardinality: 1  
						11 FAST DUAL  Cost: 2  Cardinality: 1  
						12 FAST DUAL  Cost: 2  Cardinality: 1  
						13 FAST DUAL  Cost: 2  Cardinality: 1  
						14 FAST DUAL  Cost: 2  Cardinality: 1  
						15 FAST DUAL  Cost: 2  Cardinality: 1  
						16 FAST DUAL  Cost: 2  Cardinality: 1  
						17 FAST DUAL  Cost: 2  Cardinality: 1  
						18 FAST DUAL  Cost: 2  Cardinality: 1  
						19 FAST DUAL  Cost: 2  Cardinality: 1  
						20 FAST DUAL  Cost: 2  Cardinality: 1  
						21 FAST DUAL  Cost: 2  Cardinality: 1  
						22 FAST DUAL  Cost: 2  Cardinality: 1  
						23 FAST DUAL  Cost: 2  Cardinality: 1  
						24 FAST DUAL  Cost: 2  Cardinality: 1  
						25 FAST DUAL  Cost: 2  Cardinality: 1  
						26 FAST DUAL  Cost: 2  Cardinality: 1  
						27 FAST DUAL  Cost: 2  Cardinality: 1  
						28 FAST DUAL  Cost: 2  Cardinality: 1  
						29 FAST DUAL  Cost: 2  Cardinality: 1  
						30 FAST DUAL  Cost: 2  Cardinality: 1

Open in new window

with myData as (
select 1 as col1, 'A' as col2 from dual
union select 1, 'B' from dual
union select 2, 'AA' from dual
union select 2, 'AC' from dual
union select 2, 'BB' from dual
union select 3, 'X' from dual
union select 4, 'Y' from dual
union select 5, 'AB' from dual
union select 5, 'AC' from dual
union select 6, 'B' from dual
union select 6, 'AA' from dual
union select 6, 'AC' from dual
union select 7, 'BB' from dual
union select 7, 'X' from dual
union select 7, 'Y' from dual
union select 8, 'AB' from dual
union select 8, 'AC' from dual
union select 8, 'B' from dual
union select 8, 'AA' from dual
union select 9, 'AC' from dual
union select 10, 'BB' from dual
union select 11, 'X' from dual
union select 11, 'Y' from dual
union select 11, 'AB' from dual
union select 12, 'AC' from dual
union select 13, 'B' from dual
union select 14, 'AA' from dual
union select 15, 'AC' from dual
union select 15, 'BB' from dual
union select 15, 'X' from dual
)
SELECT col1, col2
  FROM (SELECT mydata.*,
               COUNT(*) OVER (PARTITION BY col1) cnt,
               MIN(DECODE(col2, 'AC', 'x')) OVER (PARTITION BY col1) x
          FROM mydata)
 WHERE cnt > 1 AND x = 'x'

Open in new window

HainKurtSr. System Analyst

Commented:
and I am using TOAD :)
Most Valuable Expert 2011
Top Expert 2012
Commented:
I get the same plan.  Cardinalities are estimates,  Oracle is wrong about the counts returned by my query too


there are 15 rows returned, but your count of 1 and my count of 30 aren't correct.

15 and 30 are at along the same order of magnitude even if one is double

check out the stats(using toad, right click, select Auto Trace, run the queries, check the Auto Trace tab)

mine does about 48 gets total (consistent and current) with 2 sorts and 42 recursive calls
yours does about 70 gets total (consistent and current) with 1 sort and 44 recursive calls.

which is as I would expect,  for small, unindexed data, mine is more efficient due to less io
as the volumes increase mine should scale better

however, if there are indexes with low cardinality on the AC rows and indexes with low cardnality on the corresponding col1 rows, then it's possible yours will scale better than mine.


With the dearth of info about the asker's data, neither of us can prove either one of our queries is universally better for the asker, he must test for himself and check the stats/time to confirm.  Also, the results might vary based on the input.  maybe AC is rare, so yours scales very well with indexes. but maybe XY is common so it fails miserably on that one.
jl66Consultant

Author

Commented:
Thank everyone so much.
sdstuber:and HainKurt: you are excellent!!
I am checking everyone's solutions.
jl66Consultant

Author

Commented:
Thanks for everyone, esp. thanks for sdstuber:and HainKurt. Without changing anything(50Mil with current indexes), sdstuber's query is a bit quicker than the HainKurt's. Excellent job!

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial