I have an output dataset that I now need to count and identify duplicates in the same dataset. The dataset only has two variables .. ID_1 and ID_2. I have already run a proc summary on the dataset and classed it by IC_1 ID_2. So there are distinct ID_1 / ID_2 matches. But there are multiple ID_2's that can be associated with an ID_1. So for example, if ID_1 = 001 there could be ID_1 = 001, ID_2= AA; ID_1=001, ID_2=BB; ID_1=001, ID_2=CC etc etc..So some ID_1's have only one ID_2 associated with it and some have multiple ID_2's.
Ok so here is what I need to do. I need to be able to take a ID_1 / ID_2 match and search the rest of the ID_2s in the dataset to see if it is associated with any other ID_1s...excluding the initial ID_1 / ID_2 match. For example, if ID_1=001 and ID_2 = AA, then I want to find all the other ID_1s that have an ID_2=AA associated with it excluding ID_1=001.
Can I use proc sql to easily go through the dataset and count the number of ID_1's that each ID_2 is associated with. And also output each ID_1 that is associated with the ID_2 being looked at.
Appeciate any help!!