Link to home
Start Free TrialLog in
Avatar of Yaniv Schiff
Yaniv SchiffFlag for United States of America

asked on

find duplicate values in two tables and only show unique results

I have two tables in Access that have the exact same columns but different data. I want to query Table A and Table B to find all records with duplicate values in Column A (lets say there are Columns A-Z). I want to perform a Group By function so that i don't see multiple results for the same duplicate records. I have successfully done this by joining Table A and Table be ON Column A and only showing Column A with a Group By applied. However, i also need to show Columns B-Z. How do i add these columns and still keep the Group By on Column A.

I also tried using the DISTINCT operator in the Select clause, but again, i don't know how to only apply the DISTINCT to Column A.

SELECT Distinct Table_A.[Column1]
FROM Table_A. INNER JOIN Table_B ON Table_A.[Column1] =Table_B.[Column1]


Result = 3647


Do i need to make this query in two passes, 1 to find the duplicates, 2 to perform the group by?
Avatar of UnifiedIS
UnifiedIS

If I understand correctly, you shouldn't include a group by
Distinct will cover all columns in the result so add the rest of your columns to your select and remove the group by
SELECT Distinct Table_A.[Column1], add the rest of the columns here
FROM Table_A. INNER JOIN Table_B ON Table_A.[Column1] =Table_B.[Column1]
Avatar of Yaniv Schiff

ASKER

I tried that, when i add Column B from Table A to the Select statement, my results stay the same, but as soon as i add Column B from Table B, my results quadruple.

Here is the actuall code:
Query1:

SELECT DISTINCT Item003_AllFilesPresent_all.[Hash Value]
FROM Item003_AllFilesPresent_all INNER JOIN Item004_AllFilesPresent_all ON Item003_AllFilesPresent_all.[Hash Value] = Item004_AllFilesPresent_all.[Hash Value]
WHERE (((Item003_AllFilesPresent_all.[Hash Category]) Is Null) AND ((Item003_AllFilesPresent_all.[Physical Size])>"48"));

Returns 1043 hits

Query 2:
SELECT DISTINCT Item003_AllFilesPresent_all.[Hash Value], Item003_AllFilesPresent_all.[Full Path], Item004_AllFilesPresent_all.[Full Path]
FROM Item003_AllFilesPresent_all INNER JOIN Item004_AllFilesPresent_all ON Item003_AllFilesPresent_all.[Hash Value] = Item004_AllFilesPresent_all.[Hash Value]
WHERE (((Item003_AllFilesPresent_all.[Hash Category]) Is Null) AND ((Item003_AllFilesPresent_all.[Physical Size])>"48"));

Returns 24165 hits

Avatar of Lowfatspread
so please explainwhat you're trying to achieve...

tableB obvious has many rows for the same hash_value...
what data are you trying to display from it?  


ps please alias you statements
 
IS THIS MORE LIKE WHAT YOU WANT?
 
SELECT I3.[Hash Value]
     , I3.[Full Path]
     , MAX(I4.[Full Path]) AS B_fullpath
 FROM Item003_AllFilesPresent_all as I3
INNER JOIN Item004_AllFilesPresent_all as I4
   ON I3.[Hash Value] = I4.[Hash Value]
WHERE I3.[Hash Category] Is Null
  AND I3.[Physical Size]>"48"
group by I3.[Hash_Value]
        ,I3.[Full Path] 
;

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of Jinesh Kamdar
Jinesh Kamdar
Flag of India image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
@Forensicon: Thx for the points, but i would have appreciated at least a split with LFS.