Solved

Find unmatched records between 2 datasets

Posted on 2015-01-27
19
183 Views
Last Modified: 2015-04-09
Hello,

Am having a 'memory is full' moment.

I have 2 datasets with the exact same columns and data format, but one contains more records than the other. I am attempting to identify the records that are present in one but not the other.

My data is an output from Alteryx, but have brought it into Access to attempt to resolve this.

There are 32 columns in both tables. First one is Compare1 with 26364 records, the second is Compare 2 with 26260 records.

There is not a primary key, nor any unique set of columns to do a comparison on.

Your assistance is greatly appreciated.
0
Comment
Question by:SASnewbie
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 10
  • 3
  • 3
  • +2
19 Comments
 
LVL 18

Accepted Solution

by:
Simon earned 500 total points
ID: 40573845
Do a left join on all of the 32 colums then a right join.

This assumes you have a column that doesn't allow nulls that you can use to check for existence of a row in the tables.

SELECT * FROM COMPARE1 C1 LEFT JOIN COMPARE2 C2 ON C1.COL1=C2.COL1 AND C1.COL2=C2.COL2 (etc)
WHERE C2.COL1 IS NULL.

then

SELECT * FROM COMPARE1 C1 RIGHT JOIN COMPARE2 C2 ON C1.COL1=C2.COL1 AND C1.COL2=C2.COL2 (etc)
WHERE C1.COL1 IS NULL.
0
 
LVL 120

Expert Comment

by:Rey Obrero (Capricorn1)
ID: 40573851
how many field does your table have?
0
 

Author Comment

by:SASnewbie
ID: 40573865
Hi and thank you for responding so quickly. There are 62 fields in each, not 32 as I listed before.
0
What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

 

Author Comment

by:SASnewbie
ID: 40573871
Also, I do have NULLS
0
 
LVL 120

Expert Comment

by:Rey Obrero (Capricorn1)
ID: 40573878
can you pick / select (3 or more) the fields so that when joined together will comprise as a unique field?
0
 
LVL 75
ID: 40573882
"memory is full' moment."
LOL ... good one!

Humm .... maybe create a UNION query with UNION ALL ... which will include duplicates ... a and maybe from that you can deduce what is not duplicated, ie missing from the one recordset.

mx
0
 

Author Comment

by:SASnewbie
ID: 40573883
no.... but I am going to rename the fields col1-col62
0
 

Author Comment

by:SASnewbie
ID: 40573893
:) Hi mx. I'll try that. Thank you!
0
 

Author Comment

by:SASnewbie
ID: 40573901
mx, that returns the 26260 records. Worth a try!!

Sure seems like a method would be available by now to bump one table against another to find differences without having to identify unique fields.
0
 

Author Comment

by:SASnewbie
ID: 40573914
Sorry mx, I should have mentioned that all of the 26260 records are contained in the dataset with 26364 records.
0
 
LVL 38

Expert Comment

by:PatHartman
ID: 40573926
If you don't want to specify the fields to match on, how would you expect the magic process to figure it out?

Start with a select distinct on each table to see if there are duplicates that will be filtered away.

If that doesn't solve the problem, then you can write VBA code if you don't want to create a query.  Without a unique identifier, you are reduced to comparing every column in set a to every column in set b.  Rey's suggestion to choose a few of the most likely candidates for uniqueness is also a good start.

What do you hope to accomplish by renaming all the columns?

I was just reminded that FMS offers a product that compares tables.  You might start with that.  

http://www.fmsinc.com/microsoftaccess/difference/properties-fields/tables.asp
0
 

Author Comment

by:SASnewbie
ID: 40573969
by the order of the names and order of fields
0
 

Author Comment

by:SASnewbie
ID: 40573976
renaming the columns shortens and simplifies the left join and then right join sql that Rey provided:
SELECT * FROM COMPARE1 C1 LEFT JOIN COMPARE2 C2
ON C1.Col1=C2.Col1
AND C1.Col2=C2.Col2
AND C1.Col3=C2.Col3
AND C1.Col4=C2.Col4
AND C1.Col5=C2.Col5
AND C1.Col6=C2.Col6
AND C1.Col7=C2.Col7
AND C1.Col8=C2.Col8
AND C1.Col9=C2.Col9
AND C1.Col10=C2.Col10
AND C1.Col11=C2.Col11
AND C1.Col12=C2.Col12
AND C1.Col13=C2.Col13
AND C1.Col14=C2.Col14
AND C1.Col15=C2.Col15
AND C1.Col16=C2.Col16
AND C1.Col17=C2.Col17
AND C1.Col18=C2.Col18
AND C1.Col19=C2.Col19
AND C1.Col20=C2.Col20
AND C1.Col21=C2.Col21
AND C1.Col22=C2.Col22
AND C1.Col23=C2.Col23
AND C1.Col24=C2.Col24
AND C1.Col25=C2.Col25
AND C1.Col26=C2.Col26
AND C1.Col27=C2.Col27
AND C1.Col28=C2.Col28
AND C1.Col29=C2.Col29
AND C1.Col30=C2.Col30
AND C1.Col31=C2.Col31
AND C1.Col32=C2.Col32
AND C1.Col33=C2.Col33
AND C1.Col34=C2.Col34
AND C1.Col35=C2.Col35
AND C1.Col36=C2.Col36
AND C1.Col37=C2.Col37
AND C1.Col38=C2.Col38
AND C1.Col39=C2.Col39
AND C1.Col40=C2.Col40
AND C1.Col41=C2.Col41
AND C1.Col42=C2.Col42
AND C1.Col43=C2.Col43
AND C1.Col44=C2.Col44
AND C1.Col45=C2.Col45
AND C1.Col46=C2.Col46
AND C1.Col47=C2.Col47
AND C1.Col48=C2.Col48
AND C1.Col49=C2.Col49
AND C1.Col50=C2.Col50
AND C1.Col51=C2.Col51
AND C1.Col52=C2.Col52
AND C1.Col53=C2.Col53
AND C1.Col54=C2.Col54
AND C1.Col55=C2.Col55
AND C1.Col56=C2.Col56
AND C1.Col57=C2.Col57
AND C1.Col58=C2.Col58
AND C1.Col59=C2.Col59
AND C1.Col60=C2.Col60
AND C1.Col61=C2.Col61
AND C1.Col62=C2.Col62

WHERE C2.Col1 IS NULL
0
 
LVL 75
ID: 40574046
"mx, that returns the 26260 r"
Did you use UNION ALL

Here is an example
Table aTblA (odd and even numbers)
F1
1
2
3
4
5
6
7
8
9

Table aTblA  (odd numbers only)
F1
1
3
5
7
9


Union SQL
SELECT aTblA.*
FROM aTblA
UNION ALL SELECT aTblB.*
FROM aTblB

ORDER BY 1

Returns

F1
1
1
2  **
3
3
4  **
5
5
6 **
7
7
8  **
9
9

Notice that only the Even Numbers are NOT duplicated, so this shows  you what is not in the other table.

Save the UNION query as some name ( query83 in my example)

Then create this query

SELECT Query83.F1, Count(Query83.F1) AS CountOfF1
FROM Query83
GROUP BY Query83.F1
HAVING (((Count(Query83.F1))=1))
ORDER BY Query83.F1;

Result is records NOT in the other table.

F1      CountOfF1
2      1
4      1
6      1
8      1
0
 
LVL 18

Expert Comment

by:Simon
ID: 40574062
@SASNewbie, what result did you get from the left join on all columns?

You'd need to pick a column without nulls for this clause

WHERE C2.Col1 IS NULL

i.e. it could be C2.Col2 or C2.Col3...
0
 
LVL 75
ID: 40574086
Of course ... I forgot you have so MANY columns :-)
0
 

Author Comment

by:SASnewbie
ID: 40575033
Thanks everyone! I am working through your suggestions.

@mx I did use Union All, and am attempting your last example. Can you suggest how to determine the column to choose. I used Col1 and returned 4 records (which is a great start), but I know there are a total of 104 records that are different.

@simon, I used the date fieild, Col1, which contains no nulls.
0
 

Author Comment

by:SASnewbie
ID: 40575037
@mx, well it did return 4 the first time I ran it... hmm...
0
 
LVL 18

Assisted Solution

by:Simon
Simon earned 500 total points
ID: 40575661
@SAS, I tested this in case NULLs messed it up.

SELECT tbl1.*
FROM tbl1 LEFT JOIN tbl2 ON (tbl1.col5 = tbl2.col5) AND (tbl1.col4 = tbl2.col4) AND (tbl1.col1 = tbl2.col1) AND (tbl1.col2 = tbl2.col2) AND (tbl1.col3 = tbl2.col3)
WHERE (((tbl2.col1) Is Null));

I can confirm that it does return every row from tbl1 that is not exactly duplicated in tbl2.

One other thought though is that if there are a number of exact duplicate rows within each table the above query will not show that one of the tables had 4 copies of that row whilst the other table only had 1 copy of it.
0

Featured Post

Do you have a plan for Continuity?

It's inevitable. People leave organizations creating a gap in your service. That's where Percona comes in.

See how Pepper.com relies on Percona to:
-Manage their database
-Guarantee data safety and protection
-Provide database expertise that is available for any situation

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Traditionally, the method to display pictures in Access forms and reports is to first download them from URLs to a folder, record the path in a table and then let the form or report pull the pictures from that folder. But why not let Windows retr…
It is possible to export the data of a SQL Table in SSMS and generate INSERT statements. It's neatly tucked away in the generate scripts option of a database.
Show developers how to use a criteria form to limit the data that appears on an Access report. It is a common requirement that users can specify the criteria for a report at runtime. The easiest way to accomplish this is using a criteria form that a…
In Microsoft Access, when working with VBA, learn some techniques for writing readable and easily maintained code.

630 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question