troubleshooting Question

Using SQLSAS or SAS to count duplicates in the same dataset

Avatar of perrysf25
perrysf25 asked on
2 Comments1 Solution696 ViewsLast Modified:
I have an output dataset that I now need to count and identify duplicates in the same dataset. The dataset only has two variables .. ID_1 and ID_2. I have already run a proc summary on the dataset and classed it by IC_1 ID_2. So there are distinct ID_1 / ID_2 matches. But there are multiple ID_2's that can be associated with an ID_1. So for example, if ID_1 = 001 there could be ID_1 = 001, ID_2= AA; ID_1=001, ID_2=BB; ID_1=001, ID_2=CC etc etc..So some ID_1's have only one ID_2 associated with it and some have multiple ID_2's.
Ok so here is what I need to do. I need to be able to take a ID_1 / ID_2 match and search the rest of the ID_2s in the dataset to see if it is associated with any other ID_1s...excluding the initial ID_1 / ID_2 match. For example, if ID_1=001 and ID_2 = AA, then I want to find all the other ID_1s that have an ID_2=AA associated with it excluding ID_1=001.

Can I use proc sql to easily go through the dataset and count the number of ID_1's that each ID_2 is associated with. And also output each ID_1 that is associated with the ID_2 being looked at.

Appeciate any help!!

Our community of experts have been thoroughly vetted for their expertise and industry experience.

Join our community to see this answer!
Unlock 1 Answer and 2 Comments.
Start Free Trial
Learn from the best

Network and collaborate with thousands of CTOs, CISOs, and IT Pros rooting for you and your success.

Andrew Hancock - VMware vExpert
See if this solution works for you by signing up for a 7 day free trial.
Unlock 1 Answer and 2 Comments.
Try for 7 days

”The time we save is the biggest benefit of E-E to our team. What could take multiple guys 2 hours or more each to find is accessed in around 15 minutes on Experts Exchange.

-Mike Kapnisakis, Warner Bros