troubleshooting Question

Using SQLSAS or SAS to count duplicates in the same dataset

Avatar of perrysf25
perrysf25 asked on
DatabasesProgramming
2 Comments1 Solution696 ViewsLast Modified:
Hi,
I have an output dataset that I now need to count and identify duplicates in the same dataset. The dataset only has two variables .. ID_1 and ID_2. I have already run a proc summary on the dataset and classed it by IC_1 ID_2. So there are distinct ID_1 / ID_2 matches. But there are multiple ID_2's that can be associated with an ID_1. So for example, if ID_1 = 001 there could be ID_1 = 001, ID_2= AA; ID_1=001, ID_2=BB; ID_1=001, ID_2=CC etc etc..So some ID_1's have only one ID_2 associated with it and some have multiple ID_2's.
Ok so here is what I need to do. I need to be able to take a ID_1 / ID_2 match and search the rest of the ID_2s in the dataset to see if it is associated with any other ID_1s...excluding the initial ID_1 / ID_2 match. For example, if ID_1=001 and ID_2 = AA, then I want to find all the other ID_1s that have an ID_2=AA associated with it excluding ID_1=001.

Can I use proc sql to easily go through the dataset and count the number of ID_1's that each ID_2 is associated with. And also output each ID_1 that is associated with the ID_2 being looked at.

Appeciate any help!!
ASKER CERTIFIED SOLUTION
tobey1

Our community of experts have been thoroughly vetted for their expertise and industry experience.

Join our community to see this answer!
Unlock 1 Answer and 2 Comments.
Start Free Trial
Learn from the best

Network and collaborate with thousands of CTOs, CISOs, and IT Pros rooting for you and your success.

Andrew Hancock - VMware vExpert
See if this solution works for you by signing up for a 7 day free trial.
Unlock 1 Answer and 2 Comments.
Try for 7 days

”The time we save is the biggest benefit of E-E to our team. What could take multiple guys 2 hours or more each to find is accessed in around 15 minutes on Experts Exchange.

-Mike Kapnisakis, Warner Bros