Find unmatched records between 2 datasets

Hello,

Am having a 'memory is full' moment.

I have 2 datasets with the exact same columns and data format, but one contains more records than the other. I am attempting to identify the records that are present in one but not the other.

My data is an output from Alteryx, but have brought it into Access to attempt to resolve this.

There are 32 columns in both tables. First one is Compare1 with 26364 records, the second is Compare 2 with 26260 records.

There is not a primary key, nor any unique set of columns to do a comparison on.

Your assistance is greatly appreciated.
SASnewbieAsked:
Who is Participating?
 
SimonConnect With a Mentor Commented:
Do a left join on all of the 32 colums then a right join.

This assumes you have a column that doesn't allow nulls that you can use to check for existence of a row in the tables.

SELECT * FROM COMPARE1 C1 LEFT JOIN COMPARE2 C2 ON C1.COL1=C2.COL1 AND C1.COL2=C2.COL2 (etc)
WHERE C2.COL1 IS NULL.

then

SELECT * FROM COMPARE1 C1 RIGHT JOIN COMPARE2 C2 ON C1.COL1=C2.COL1 AND C1.COL2=C2.COL2 (etc)
WHERE C1.COL1 IS NULL.
0
 
Rey Obrero (Capricorn1)Commented:
how many field does your table have?
0
 
SASnewbieAuthor Commented:
Hi and thank you for responding so quickly. There are 62 fields in each, not 32 as I listed before.
0
Cloud Class® Course: Certified Penetration Testing

This CPTE Certified Penetration Testing Engineer course covers everything you need to know about becoming a Certified Penetration Testing Engineer. Career Path: Professional roles include Ethical Hackers, Security Consultants, System Administrators, and Chief Security Officers.

 
SASnewbieAuthor Commented:
Also, I do have NULLS
0
 
Rey Obrero (Capricorn1)Commented:
can you pick / select (3 or more) the fields so that when joined together will comprise as a unique field?
0
 
DatabaseMX (Joe Anderson - Microsoft Access MVP)Database ArchitectCommented:
"memory is full' moment."
LOL ... good one!

Humm .... maybe create a UNION query with UNION ALL ... which will include duplicates ... a and maybe from that you can deduce what is not duplicated, ie missing from the one recordset.

mx
0
 
SASnewbieAuthor Commented:
no.... but I am going to rename the fields col1-col62
0
 
SASnewbieAuthor Commented:
:) Hi mx. I'll try that. Thank you!
0
 
SASnewbieAuthor Commented:
mx, that returns the 26260 records. Worth a try!!

Sure seems like a method would be available by now to bump one table against another to find differences without having to identify unique fields.
0
 
SASnewbieAuthor Commented:
Sorry mx, I should have mentioned that all of the 26260 records are contained in the dataset with 26364 records.
0
 
PatHartmanCommented:
If you don't want to specify the fields to match on, how would you expect the magic process to figure it out?

Start with a select distinct on each table to see if there are duplicates that will be filtered away.

If that doesn't solve the problem, then you can write VBA code if you don't want to create a query.  Without a unique identifier, you are reduced to comparing every column in set a to every column in set b.  Rey's suggestion to choose a few of the most likely candidates for uniqueness is also a good start.

What do you hope to accomplish by renaming all the columns?

I was just reminded that FMS offers a product that compares tables.  You might start with that.  

http://www.fmsinc.com/microsoftaccess/difference/properties-fields/tables.asp
0
 
SASnewbieAuthor Commented:
by the order of the names and order of fields
0
 
SASnewbieAuthor Commented:
renaming the columns shortens and simplifies the left join and then right join sql that Rey provided:
SELECT * FROM COMPARE1 C1 LEFT JOIN COMPARE2 C2
ON C1.Col1=C2.Col1
AND C1.Col2=C2.Col2
AND C1.Col3=C2.Col3
AND C1.Col4=C2.Col4
AND C1.Col5=C2.Col5
AND C1.Col6=C2.Col6
AND C1.Col7=C2.Col7
AND C1.Col8=C2.Col8
AND C1.Col9=C2.Col9
AND C1.Col10=C2.Col10
AND C1.Col11=C2.Col11
AND C1.Col12=C2.Col12
AND C1.Col13=C2.Col13
AND C1.Col14=C2.Col14
AND C1.Col15=C2.Col15
AND C1.Col16=C2.Col16
AND C1.Col17=C2.Col17
AND C1.Col18=C2.Col18
AND C1.Col19=C2.Col19
AND C1.Col20=C2.Col20
AND C1.Col21=C2.Col21
AND C1.Col22=C2.Col22
AND C1.Col23=C2.Col23
AND C1.Col24=C2.Col24
AND C1.Col25=C2.Col25
AND C1.Col26=C2.Col26
AND C1.Col27=C2.Col27
AND C1.Col28=C2.Col28
AND C1.Col29=C2.Col29
AND C1.Col30=C2.Col30
AND C1.Col31=C2.Col31
AND C1.Col32=C2.Col32
AND C1.Col33=C2.Col33
AND C1.Col34=C2.Col34
AND C1.Col35=C2.Col35
AND C1.Col36=C2.Col36
AND C1.Col37=C2.Col37
AND C1.Col38=C2.Col38
AND C1.Col39=C2.Col39
AND C1.Col40=C2.Col40
AND C1.Col41=C2.Col41
AND C1.Col42=C2.Col42
AND C1.Col43=C2.Col43
AND C1.Col44=C2.Col44
AND C1.Col45=C2.Col45
AND C1.Col46=C2.Col46
AND C1.Col47=C2.Col47
AND C1.Col48=C2.Col48
AND C1.Col49=C2.Col49
AND C1.Col50=C2.Col50
AND C1.Col51=C2.Col51
AND C1.Col52=C2.Col52
AND C1.Col53=C2.Col53
AND C1.Col54=C2.Col54
AND C1.Col55=C2.Col55
AND C1.Col56=C2.Col56
AND C1.Col57=C2.Col57
AND C1.Col58=C2.Col58
AND C1.Col59=C2.Col59
AND C1.Col60=C2.Col60
AND C1.Col61=C2.Col61
AND C1.Col62=C2.Col62

WHERE C2.Col1 IS NULL
0
 
DatabaseMX (Joe Anderson - Microsoft Access MVP)Database ArchitectCommented:
"mx, that returns the 26260 r"
Did you use UNION ALL

Here is an example
Table aTblA (odd and even numbers)
F1
1
2
3
4
5
6
7
8
9

Table aTblA  (odd numbers only)
F1
1
3
5
7
9


Union SQL
SELECT aTblA.*
FROM aTblA
UNION ALL SELECT aTblB.*
FROM aTblB

ORDER BY 1

Returns

F1
1
1
2  **
3
3
4  **
5
5
6 **
7
7
8  **
9
9

Notice that only the Even Numbers are NOT duplicated, so this shows  you what is not in the other table.

Save the UNION query as some name ( query83 in my example)

Then create this query

SELECT Query83.F1, Count(Query83.F1) AS CountOfF1
FROM Query83
GROUP BY Query83.F1
HAVING (((Count(Query83.F1))=1))
ORDER BY Query83.F1;

Result is records NOT in the other table.

F1      CountOfF1
2      1
4      1
6      1
8      1
0
 
SimonCommented:
@SASNewbie, what result did you get from the left join on all columns?

You'd need to pick a column without nulls for this clause

WHERE C2.Col1 IS NULL

i.e. it could be C2.Col2 or C2.Col3...
0
 
DatabaseMX (Joe Anderson - Microsoft Access MVP)Database ArchitectCommented:
Of course ... I forgot you have so MANY columns :-)
0
 
SASnewbieAuthor Commented:
Thanks everyone! I am working through your suggestions.

@mx I did use Union All, and am attempting your last example. Can you suggest how to determine the column to choose. I used Col1 and returned 4 records (which is a great start), but I know there are a total of 104 records that are different.

@simon, I used the date fieild, Col1, which contains no nulls.
0
 
SASnewbieAuthor Commented:
@mx, well it did return 4 the first time I ran it... hmm...
0
 
SimonConnect With a Mentor Commented:
@SAS, I tested this in case NULLs messed it up.

SELECT tbl1.*
FROM tbl1 LEFT JOIN tbl2 ON (tbl1.col5 = tbl2.col5) AND (tbl1.col4 = tbl2.col4) AND (tbl1.col1 = tbl2.col1) AND (tbl1.col2 = tbl2.col2) AND (tbl1.col3 = tbl2.col3)
WHERE (((tbl2.col1) Is Null));

I can confirm that it does return every row from tbl1 that is not exactly duplicated in tbl2.

One other thought though is that if there are a number of exact duplicate rows within each table the above query will not show that one of the tables had 4 copies of that row whilst the other table only had 1 copy of it.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.