asked on

find duplicate values in two tables and only show unique results

I have two tables in Access that have the exact same columns but different data. I want to query Table A and Table B to find all records with duplicate values in Column A (lets say there are Columns A-Z). I want to perform a Group By function so that i don't see multiple results for the same duplicate records. I have successfully done this by joining Table A and Table be ON Column A and only showing Column A with a Group By applied. However, i also need to show Columns B-Z. How do i add these columns and still keep the Group By on Column A.

I also tried using the DISTINCT operator in the Select clause, but again, i don't know how to only apply the DISTINCT to Column A.

SELECT Distinct Table_A.[Column1]
FROM Table_A. INNER JOIN Table_B ON Table_A.[Column1] =Table_B.[Column1]

Result = 3647

Do i need to make this query in two passes, 1 to find the duplicates, 2 to perform the group by?

UnifiedIS

If I understand correctly, you shouldn't include a group by
Distinct will cover all columns in the result so add the rest of your columns to your select and remove the group by
SELECT Distinct Table_A.[Column1], add the rest of the columns here
FROM Table_A. INNER JOIN Table_B ON Table_A.[Column1] =Table_B.[Column1]

Yaniv Schiff

ASKER

I tried that, when i add Column B from Table A to the Select statement, my results stay the same, but as soon as i add Column B from Table B, my results quadruple.

Here is the actuall code:
Query1:

SELECT DISTINCT Item003_AllFilesPresent_all.[Hash Value]
FROM Item003_AllFilesPresent_all INNER JOIN Item004_AllFilesPresent_all ON Item003_AllFilesPresent_all.[Hash Value] = Item004_AllFilesPresent_all.[Hash Value]
WHERE (((Item003_AllFilesPresent_all.[Hash Category]) Is Null) AND ((Item003_AllFilesPresent_all.[Physical Size])>"48"));

Returns 1043 hits

Query 2:
SELECT DISTINCT Item003_AllFilesPresent_all.[Hash Value], Item003_AllFilesPresent_all.[Full Path], Item004_AllFilesPresent_all.[Full Path]
FROM Item003_AllFilesPresent_all INNER JOIN Item004_AllFilesPresent_all ON Item003_AllFilesPresent_all.[Hash Value] = Item004_AllFilesPresent_all.[Hash Value]
WHERE (((Item003_AllFilesPresent_all.[Hash Category]) Is Null) AND ((Item003_AllFilesPresent_all.[Physical Size])>"48"));

Returns 24165 hits

Lowfatspread

so please explainwhat you're trying to achieve...

tableB obvious has many rows for the same hash_value...
what data are you trying to display from it?

ps please alias you statements
 
IS THIS MORE LIKE WHAT YOU WANT?
 
SELECT I3.[Hash Value]
     , I3.[Full Path]
     , MAX(I4.[Full Path]) AS B_fullpath
 FROM Item003_AllFilesPresent_all as I3
INNER JOIN Item004_AllFilesPresent_all as I4
   ON I3.[Hash Value] = I4.[Hash Value]
WHERE I3.[Hash Category] Is Null
  AND I3.[Physical Size]>"48"
group by I3.[Hash_Value]
        ,I3.[Full Path] 
;

Open in new window

ASKER CERTIFIED SOLUTION

Jinesh Kamdar

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Jinesh Kamdar

@Forensicon: Thx for the points, but i would have appreciated at least a split with LFS.