Link to home
Start Free TrialLog in
Avatar of endrec
endrec

asked on

How to find duplicate items in a self-referential table?

I am trying to find which items in the column Parent have more than one occurance and do not appear at all in the column GrandParent, however I can't seem to get it.

I would like to return the columns Parent and GrandParent with the following criteria.
Parents that do not have a GrandParent and occur more than once in the ParentTable column.

Fake Data
Parent      GrandParent
1                 NULL
2                 1
3                 NULL
3                 NULL

Desired Result
 - Exclude the row with the number 1 because it has children.
 - Only return 3 because it occurs twice with no parent number.

Number      NumofOccur
3                 2

Code That Does Not Work
 
SELECT A.PARENT, COUNT(PARENT) NDANOCOUNT
From TABLETEST A
WHERE A.GRANDPARENT IS NOT NULL AND A.PARENT NOT IN
 
                                    ( 
 
                                          SELECT B.GRANDPARENT
                                          FROM TABLETEST B
                                          WHERE B.GRANDPARENT IS NOT NULL
 
                                    )
 
GROUP BY A.PARENT
HAVING COUNT(PARENT) > 1
ORDER BY COUNT(PARENT) DESC

Open in new window

Avatar of TextReport
TextReport
Flag of United Kingdom of Great Britain and Northern Ireland image

Your test data doesn't inclyude any records that match your criteria

The only Grandparent is excluded by the first condition WHERE A.GRANDPARENT IS NOT NULL

Cheers, Andrew
ASKER CERTIFIED SOLUTION
Avatar of Sean Stuber
Sean Stuber

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
As stated the check against grandparent is not required.
Cheers, Andrew
SELECT A.PARENT, COUNT(PARENT) NDANOCOUNT
From TABLETEST A
WHERE  A.PARENT NOT IN 
	( 
	SELECT B.GRANDPARENT
        FROM TABLETEST B
        WHERE B.GRANDPARENT IS NOT NULL
 	)
 
GROUP BY A.PARENT
HAVING COUNT(PARENT) > 1
ORDER BY COUNT(PARENT) DESC

Open in new window

Avatar of Sean Stuber
Sean Stuber

TextReport, was your "as stated"  comment directed toward my suggestion?

If so, why?  I see our queries as equivalent, except you added an extra sort
We both check grandparent not null and compare parents against grandparents.  (I use not exists, you use not in)
Different syntax but same operations.

If not directed at my suggestion, then what did you mean?
I'm not offended, just curious.
thanks
No not at all, just wanted to point out that endrec was almost there.
Cheers, Andrew