Yaniv Schiff
asked on
find duplicate values in two tables and only show unique results
I have two tables in Access that have the exact same columns but different data. I want to query Table A and Table B to find all records with duplicate values in Column A (lets say there are Columns A-Z). I want to perform a Group By function so that i don't see multiple results for the same duplicate records. I have successfully done this by joining Table A and Table be ON Column A and only showing Column A with a Group By applied. However, i also need to show Columns B-Z. How do i add these columns and still keep the Group By on Column A.
I also tried using the DISTINCT operator in the Select clause, but again, i don't know how to only apply the DISTINCT to Column A.
SELECT Distinct Table_A.[Column1]
FROM Table_A. INNER JOIN Table_B ON Table_A.[Column1] =Table_B.[Column1]
Result = 3647
Do i need to make this query in two passes, 1 to find the duplicates, 2 to perform the group by?
I also tried using the DISTINCT operator in the Select clause, but again, i don't know how to only apply the DISTINCT to Column A.
SELECT Distinct Table_A.[Column1]
FROM Table_A. INNER JOIN Table_B ON Table_A.[Column1] =Table_B.[Column1]
Result = 3647
Do i need to make this query in two passes, 1 to find the duplicates, 2 to perform the group by?
ASKER
I tried that, when i add Column B from Table A to the Select statement, my results stay the same, but as soon as i add Column B from Table B, my results quadruple.
Here is the actuall code:
Query1:
SELECT DISTINCT Item003_AllFilesPresent_al l.[Hash Value]
FROM Item003_AllFilesPresent_al l INNER JOIN Item004_AllFilesPresent_al l ON Item003_AllFilesPresent_al l.[Hash Value] = Item004_AllFilesPresent_al l.[Hash Value]
WHERE (((Item003_AllFilesPresent _all.[Hash Category]) Is Null) AND ((Item003_AllFilesPresent_ all.[Physi cal Size])>"48"));
Returns 1043 hits
Query 2:
SELECT DISTINCT Item003_AllFilesPresent_al l.[Hash Value], Item003_AllFilesPresent_al l.[Full Path], Item004_AllFilesPresent_al l.[Full Path]
FROM Item003_AllFilesPresent_al l INNER JOIN Item004_AllFilesPresent_al l ON Item003_AllFilesPresent_al l.[Hash Value] = Item004_AllFilesPresent_al l.[Hash Value]
WHERE (((Item003_AllFilesPresent _all.[Hash Category]) Is Null) AND ((Item003_AllFilesPresent_ all.[Physi cal Size])>"48"));
Returns 24165 hits
Here is the actuall code:
Query1:
SELECT DISTINCT Item003_AllFilesPresent_al
FROM Item003_AllFilesPresent_al
WHERE (((Item003_AllFilesPresent
Returns 1043 hits
Query 2:
SELECT DISTINCT Item003_AllFilesPresent_al
FROM Item003_AllFilesPresent_al
WHERE (((Item003_AllFilesPresent
Returns 24165 hits
so please explainwhat you're trying to achieve...
tableB obvious has many rows for the same hash_value...
what data are you trying to display from it?
tableB obvious has many rows for the same hash_value...
what data are you trying to display from it?
ps please alias you statements
IS THIS MORE LIKE WHAT YOU WANT?
SELECT I3.[Hash Value]
, I3.[Full Path]
, MAX(I4.[Full Path]) AS B_fullpath
FROM Item003_AllFilesPresent_all as I3
INNER JOIN Item004_AllFilesPresent_all as I4
ON I3.[Hash Value] = I4.[Hash Value]
WHERE I3.[Hash Category] Is Null
AND I3.[Physical Size]>"48"
group by I3.[Hash_Value]
,I3.[Full Path]
;
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
@Forensicon: Thx for the points, but i would have appreciated at least a split with LFS.
Distinct will cover all columns in the result so add the rest of your columns to your select and remove the group by
SELECT Distinct Table_A.[Column1], add the rest of the columns here
FROM Table_A. INNER JOIN Table_B ON Table_A.[Column1] =Table_B.[Column1]