# Find difference between data in rows

I have a table  with Name, Address, City, Sate, Zip, Phone, and email address, that has duplicate rows caused by differences in the data, for example Bob Smith has 3 rows because the address is (in row 1) 123 My St (in row 2) 123 My St. (Note the period) and (in Row 3) 123 My Street I used a CTE to find the duplicates but I need a way to determine the differences in the rows. another example is the the name column had Bill and the other had Bill Smith but the address and email is the same. I have attached the CTE I use to find the DUPs. any help would be appreciated.

``````WITH ATI_ML_D
AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY buyer_email ORDER BY buyer_email DESC) as RowNumber
AS E)
FROM ATI_ML_D
WHERE RowNumber >1
``````
###### Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Microsoft SQL Server Developer, Architect, and AuthorCommented:
Give us a 'for example' of 'find the differences', and what you envision as your overall requirement.
Lots of possibilities here..
Return a set with the column_name that's different?
Automatically delete one row if a value is NULL, leaving the other(s)?
Logic that determines that St is correct and St. is not?

>any help would be appreciated.
I have an article out there called SQL Server Delete Duplicate Rows Solutions, but it looks like you've got that angle covered.
0
Author Commented:
Jim, I think  Return a set with the column_name that's different is the approach I am looking for, i would like to be able to see the differences in the result and flag the ones to delete. as for the logic to determines that St is correct and St. how would I go about that and I would be concerned more with the name issue where the difference is that the last name is missing but the other data matches
0
Database and Application DeveloperCommented:
I have used some sample data I had.
``````create table #w  (
account     varchar(4)      NOT NULL,
Item        varchar(2)      NOT NULL,
[month]     varchar(2)      NOT NULL,
amount      smallint        NOT NULL,
[date]      date            NOT NULL )

INSERT INTO #w VALUES ('ac1', 'i1', 'm1', 100, '2015-02-01')
INSERT INTO #w VALUES ('ac1', 'i1', 'm1', 100, '2015-02-01')
INSERT INTO #w VALUES ('ac1', 'i2', 'm1', 200, '2015-02-01')
INSERT INTO #w VALUES ('ac1', 'i2', 'm1', 100, '2015-02-01');
``````

Now starting with:
ac1      i1      m1      100      2015-02-01     \   these two records are identical
ac1      i1      m1      100      2015-02-01    /
ac1      i2      m1      200      2015-02-01   \   these two are not
ac1      i2      m1      100      2015-02-01   /

The solution below returns...
ac1      i2      m1      100      2015-02-01   \  unmatched records are displayed
ac1      i2      m1      200      2015-02-01  /

``````Select account, item, [month], amount, [date]
From (SELECT  t1.account, t1.item, t1.[month], t1.amount
, t1.[date], Count(*) As Qty
FROM    #w t1
Group By t1.account, t1.item,  t1.[month], t1.amount, t1.[date]) d
Where qty =1;
``````
0
Author Commented:
eghtebas,
This is displaying everything, I only want it to display the mismatched columns, another issue is there are duplicate names but the have different address. This table is a warranty registration table with 30k plus rows and it would stand to reason that there would be duplicate names, I would also like to separate the entries that have the same name but different address's
0
Database and Application DeveloperCommented:
I am going to use something like:

http://www.experts-exchange.com/Programming/Languages/SQL_Syntax/Q_28663431.html

but I need the original table to test it. Optionally, please take a look at this example to work it out yourself. True your query producing some result but not exactly what you want.

Mike
0
Author Commented:
That example is concatenating the results set of 2 tables into one, that's even close to what I want to do, unless i am missing something in that example.
0
Database and Application DeveloperCommented:
I am so sorry. My last post was by mistake has been added to wrong question.
0
Author Commented:
Anyone have a solution
0
Database and Application DeveloperCommented:
skull52

re:> This is displaying everything

There was two solutions. One displayed everything but the second one also shown below shows mismatches.

``````Select account, item, [month], amount, [date]
From (SELECT  t1.account, t1.item, t1.[month], t1.amount
, t1.[date], Count(*) As Qty
FROM    #w t1
Group By t1.account, t1.item,  t1.[month], t1.amount, t1.[date]) d
Where qty =1;
``````

The solution below returns...
ac1      i2      m1      100      2015-02-01   \  unmatched records are displayed
ac1      i2      m1      200      2015-02-01  /
0

Experts Exchange Solution brought to you by

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

MSSQL Senior EngineerCommented:
skull52, do you still need help with this question?
0
###### It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.