Link to home
Start Free TrialLog in
Avatar of Jim Youmans
Jim YoumansFlag for United States of America

asked on

Query to find rows with same id in column1 and different values in column2

DB2 11.5 on windows

This should be easy but for some reason I just can't wrap my head around it.  I have a table that has credit card info.  cc_first6, cc_last4, cc_hash

I want to query the table to find records that have the same first6 and last4 with different cc_hash values.

I wrote a query that finds the rows with same hash multiple times but I need the opposite.  Here is the query I have.

select
cc_first6, cc_last4, cc_hash, count(*) TheCount from cc_table where LASTTIME > current_date - 1 month group by CC_FIRST6,cc_last4,cc_hash having count(*) > 1 order by cc_first6,cc_last4

Open in new window

I don't have to have the cc_hash in the select list but I do need a count of different hashes for same first6 and last4. 

Any suggestions?


Jim

Avatar of slightwv (䄆 Netminder)
slightwv (䄆 Netminder)

First though:
Get the count of the card parts, then count the card parts with the hash.  If the two counts don't match, there is a different hash.

with cte as (
	select cc_first6, cc_last4, cc_hash,
		count(*) over(partition by cc_first6, cc_last4) card_count,
		count(*) over(partition by cc_first6, cc_last4, cc_hash) card_hash_count
	from cc_table
)
select cc_first6, cc_last4, cc_hash
from cte
where card_count != card_hash_count

Open in new window


Sample fiddle:
https://dbfiddle.uk/?rdbms=db2_11.1&fiddle=883b64d715b9d37bb82a4c11a32d8001
ASKER CERTIFIED SOLUTION
Avatar of Scott Pletcher
Scott Pletcher
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Updated fiddle to show Scott's method.  Really nice approach.

Note:  I added a new row to the previous fiddle to show 3 different hashes for the same card info.

Depends on what you would like as results.  The min/max with the different counts or all three differences?

No idea which one would be more efficient for larger tables.

https://dbfiddle.uk/?rdbms=db2_11.1&fiddle=bca489b2c7dee2fff161e49941956563
Just had another thought:  return the hashes as a CSV or array.

CSV example:
with cte as (
	select cc_first6, cc_last4, cc_hash,
		count(*) over(partition by cc_first6, cc_last4) card_count,
		count(*) over(partition by cc_first6, cc_last4, cc_hash) card_hash_count
	from cc_table
)
select cc_first6, cc_last4, listagg(cc_hash,',') hash_list
from cte
where card_count != card_hash_count
group by cc_first6, cc_last4

Open in new window


New fiddle with all 3 examples:
https://dbfiddle.uk/?rdbms=db2_11.1&fiddle=58b587a1235fcf9eefefd8b170b0ad91
Avatar of Jim Youmans

ASKER

Thank you so much.  Scott's query worked great!
Thank you to slightwv!!!