Solved

SQL Query to find different values in field2 that have the same value in field1

Posted on 2011-03-18
17
1,264 Views
Last Modified: 2012-05-11
Hello Experts,

I am trying to write a SQL Query that will find rows in a table, where there are different
values found in field two that the same value in field1.  But, only when the inner query
results, show that the field1value appears more than once.

For example, if we have the value below:

field1   field2
------    ------
AD1     001
AD1     002
AB2     001
AB3     001
AB2     002

Expected Results:

field1  
------  
AD1
AB2

I would like to be able to identify those rows, where AD1 has more than one value in field2.
And, I know that I can write a SQLquery as follows, but I don't wan't to group the rows in the
outside query.

SELECT
 FIELD1
,FIELD2,
,FIELD3,
,COUNT(*)
FROM TABLE1
WHERE FIELD1IN (SELECT FIELD1
                              FROM TABLE1
                              GROUP BY FIELD1, FIELD2)
GROUP BY FIELD1,FIELD2,FIELD3
HAVING COUNT(*)>1

Big Thanks in advance!

Studious1
0
Comment
Question by:studious01
  • 6
  • 4
  • 4
  • +1
17 Comments
 
LVL 51

Expert Comment

by:HainKurt
ID: 35167944
this will give you field1 with multiple values field2

select field1  
from mytable
group by field1  
having count(1)>1

now this one will give you field1  & min(field2)

select field1 , min(field2) field2
from mytable
group by field1  
having count(1)>1

if you wish to have these rows (with all columns in the table)

SELECT *
  FROM    mytable t1
       INNER JOIN
          (  SELECT field1, MIN (field2) field2
               FROM mytable
           GROUP BY field1
             HAVING COUNT (1) > 1) t2
       ON t1.field1 = t2.field1 AND t1.field2 = t2.field2
0
 
LVL 51

Expert Comment

by:HainKurt
ID: 35167957
or same result with using row_number

SELECT *
  FROM (SELECT t.*,
               ROW_NUMBER () OVER (PARTITION BY field1 ORDER BY field2) rn
          FROM mytable) x
 WHERE rn = 1
0
 
LVL 51

Expert Comment

by:HainKurt
ID: 35167977
using row number will give you this result

field1  
------  
AD1
AB2

if you want all rows with field1 from above result

SELECT *
  FROM    mytable t1
       INNER JOIN
          (  SELECT field1
               FROM mytable
           GROUP BY field1
             HAVING COUNT (1) > 1) t2
       ON t1.field1 = t2.field1
0
 
LVL 34

Accepted Solution

by:
johnsone earned 500 total points
ID: 35168016
I seem to have a different understanding of this than HainKurt.

I believe that you are only looking for values where there is a duplicate in field1 but a difference in field2.

I believe that you can accomplish this with either of these queries:


select distinct a.field1 from mytable a, mytable b where a.field1 = b.field1 and a.field2 != b.field2;

select field1 from (select field1, field2, rank() over (partition by field1 order by field1, field2) rnk from tab) t where rnk > 1;

The last query posted by HainKurt will give you records where there is a duplicate in field1, but if field2 is the same it will also be shown.
0
 
LVL 34

Expert Comment

by:johnsone
ID: 35168023
You will probably need a distinct on my second query if there can be more than one different value.
0
 
LVL 51

Expert Comment

by:HainKurt
ID: 35168121
fix for my post 35167977

using row number will give you this result

field1  
------  
AD1
AB2
AB3

select field1  
from mytable
group by field1  
having count(1)>1

field1  
------  
AD1
AB2

and finally this

SELECT *
  FROM    mytable t1
       INNER JOIN
          (  SELECT field1
               FROM mytable
           GROUP BY field1
             HAVING COUNT (1) > 1) t2
       ON t1.field1 = t2.field1

field1   field2
------    ------
AD1     001
AD1     002
AB2     001
AB2     002

0
 

Author Comment

by:studious01
ID: 35168253
Wow. I really appreciate all of the responses.  But, I decided to do some Googling and came across a website that got me thinking of a possible solution.  

SELECT
a.field1
a.field2
a.field3
a.field4
FROM table1 A
INNER JOIN (SELECT field1, count(*) FROM table1 where field1
                    IN (SELECT field1
                         FROM table1
                         WHERE table1.field4='test'
                         GROUP BY table1.field1, table.field2)
                    GROUP BY field1
                    HAVING COUNT(*)>1)B
ON a.field1=b.field2
AND table1.field4='test'

Please let me know if you see any possible flaws in this query.
0
 
LVL 34

Expert Comment

by:johnsone
ID: 35168679
I believe that the solutions given by HainKurt, as well as the one offer by the author will give you a false positive in the case where this exists:

field1      field2
AB4        003
AB4        003

You need to account for a case where the values in both fields are the same.  The way I interpret the original question, you would not want this to be returned.

The 2 solutions that I posted will not show this duplicate case.
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 

Author Comment

by:studious01
ID: 35168753
Yes, I agree that there will be a false positive, because field2 isn't included in the group by clause.
But, I don't understand how this could happen in my query, when the similiar values within field 1 and field 2 are being rolled up using "GROUP BY table1.field1, table.field"

So in the case of:

field1      field2
AB4        003
AB4        003

the result will be just one row with the values of AB4  and 003.

Please explain.

Thanks!
0
 
LVL 34

Expert Comment

by:johnsone
ID: 35169026
Set up the test.
create table table1 (field1 varchar2(10), field2 varchar2(10));
insert into table1 values ('AD1','001');
insert into table1 values ('AD1','002');
insert into table1 values ('AB2','001');
insert into table1 values ('AB3','001');
insert into table1 values ('AB2','002');
insert into table1 values ('AB4','003');
insert into table1 values ('AB4','003');

SELECT
a.field1,
a.field2
FROM table1 A
INNER JOIN (SELECT field1, count(*) FROM table1 where field1
                    IN (SELECT field1
                         FROM table1
                         GROUP BY table1.field1, table1.field2)
                    GROUP BY field1
                    HAVING COUNT(*)>1)B
ON a.field1=b.field1;

This yields the result:


 FIELD1 | FIELD2
--------+--------
 AB2    | 002
 AB2    | 001
 AD1    | 001
 AD1    | 002
 AB4    | 003
 AB4    | 003

Open in new window

0
 

Author Comment

by:studious01
ID: 35169781
OK, I see what you are saying.  Good point!  

And, it is pretty interesting how this query works.

select distinct a.field1 from mytable a, mytable b where a.field1 = b.field1 and a.field2 != b.field2;

It is similar to this query:
select distinct a.field1
from table1a
inner join table1b
on a.field1=b.field1
and a.field2<>b.field2

So, in essence, what is happening is that the value in a.field2 will never return a matching value in b.field2.  But, will return the non-matching value for each row. Nice!

Also, I had to add a outer query, because I still need to return the other fields from the table in my result set.  Another important thing that I learned is that it is better to start with a smaller set of data.

new sql query
select field1,field2,field3, field4
from table1
where field1 in
(select distinct a.field1
from table1a
inner join table1b
on a.field1=b.field1
and a.field2<>b.field2)
order by field1

Outstanding!  

Thanks again!

P.S. Can you recommend one or more SQL or PL/SQL books that you have read?





0
 
LVL 73

Expert Comment

by:sdstuber
ID: 35170339
I know this question is closed but thought you should consider analytics

SELECT field1, field2
  FROM (SELECT field1,
               field2,
               COUNT(DISTINCT field1) OVER (PARTITION BY field2) cnt
          FROM table1)
 WHERE cnt > 1


This requires only one pass through the table so it's much more efficient.
Even for the small test case above.

Running the accepted query required 21 consistent gets and 8 sorts  (as reported by autotrace)
Running the query posted here only needs 7 gets and 3 sorts to return the same results.

As the data volume increases the difference will be more dramatic.
Of course, test for yourself, don't trust me.  :)
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 35170345
As for books,  the Oracle SQL and PL/SQL manuals are pretty good.

Also, anything by Tom Kyte

Steven Fuererstein has great books on PL/SQL
0
 
LVL 34

Expert Comment

by:johnsone
ID: 35171543
Actually, I believe the accepted solution is not the correct one.

Post 35168016 contains the correct queries, including one with an analytic answer, however, I like your count one better.
0
 
LVL 34

Expert Comment

by:johnsone
ID: 35171549
To correct the accepted query, you would need to change it to:


SELECT
a.field1,
a.field2
FROM table1 A
INNER JOIN (SELECT field1, count(*) FROM  (SELECT field1
                         FROM table1
                         GROUP BY table1.field1, table1.field2)
                    GROUP BY field1
                    HAVING COUNT(*)>1)B
ON a.field1=b.field1;
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 35172142
oops, I looked for previous analytics before posting and must have missed it,  but I agree the COUNT should work better than the RANK
0
 

Author Comment

by:studious01
ID: 35172171
sdstuber - I really appreciate your feedback. And, thanks for the book author recommendations.
johnsone - I have requested for the selected answer to be changed to ID 35168016.

Thanks Experts!
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Join & Write a Comment

Suggested Solutions

Why doesn't the Oracle optimizer use my index? Querying too much data Most Oracle developers know that an index is useful when you can use it to restrict your result set to a small number of the total rows in a table. So, the obvious side…
Working with Network Access Control Lists in Oracle 11g (part 2) Part 1: http://www.e-e.com/A_8429.html Previously, I introduced the basics of network ACL's including how to create, delete and modify entries to allow and deny access.  For many…
Via a live example show how to connect to RMAN, make basic configuration settings changes and then take a backup of a demo database
This video shows how to configure and send email from and Oracle database using both UTL_SMTP and UTL_MAIL, as well as comparing UTL_SMTP to a manual SMTP conversation with a mail server.

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now