Solved

How to display the difference found between same-formated-tables of different Databases in Ms SQL 2005 and 2000

Posted on 2012-03-14
24
209 Views
Last Modified: 2012-03-19
We have 4 databases.  Each database has the exact same formatted tables, just with different data.  

There is a specific table (CustTable) that within all 4 databases, has the same data; that is an AccountNumber and Description.  In all 4 DB there are the same AccountNumber.  We have found that within the Description column, there are differences, not many, but does exist.

We are trying to do is prepare a script to display only the differences in description between all database of the same Table.

See pic included for an example of our situation and what we are trying to do.

We want to compare the following manner:

Comp1 >> Comp2, Comp2 >> Comp3 and Comp3 >> Comp4.

The script we are using, upto now is the following
0
Comment
Question by:rayluvs
  • 14
  • 5
  • 5
24 Comments
 

Author Comment

by:rayluvs
ID: 37720683
Keep forgetting to include attachments... please reference the included data...

SELECT 'DB' 'COMP1-COMP2',(rtrim(a.AccountNumberSegment2) + '-' + rtrim(a.AccountNumberSegment3)), a.actdescr, b.actdescr 
       FROM COMP1..CustTable a
       inner JOIN COMP2..gl00100 b ON a.AccountNumberSegment2=b.AccountNumberSegment2 AND a.AccountNumberSegment3=b.AccountNumberSegment3 
WHERE a.actdescr <> b.actdescr ORDER BY 1

Open in new window

DifferencesinDB-Table.jpg
0
 

Author Comment

by:rayluvs
ID: 37720692
Forgot to mention that the AccountNumber is actually 3 segment number; that is 3 columns to represent an AccountNumber.
0
 
LVL 25

Assisted Solution

by:jogos
jogos earned 100 total points
ID: 37720856
EXCEPT is also an option
http://msdn.microsoft.com/en-us/library/ms188055.aspx

But don't you want 1 list with things that don't exists in all databases

select AccNbr, actdescr, count(*) , min(db), max(db)
from 
(select 'comp1' as db , (rtrim(AccountNumberSegment2) + '-' + rtrim(AccountNumberSegment3))  as AccNbr, actdescr
 from comp1.
 union all 
  select 'comp2' as db , (rtrim(AccountNumberSegment2) + '-' + rtrim(AccountNumberSegment3))  as AccNbr, actdescr
  from comp2
 union all 
  select 'comp3' as db , (rtrim(AccountNumberSegment2) + '-' + rtrim(AccountNumberSegment3))  as AccNbr, actdescr
  from comp3
 union all 
  select 'comp4' as db , (rtrim(AccountNumberSegment2) + '-' + rtrim(AccountNumberSegment3))  as AccNbr, actdescr
  from comp4
 ) 
 group by AccNbr,Actdescr
 having count(*) < 4

Open in new window

0
Complete VMware vSphere® ESX(i) & Hyper-V Backup

Capture your entire system, including the host, with patented disk imaging integrated with VMware VADP / Microsoft VSS and RCT. RTOs is as low as 15 seconds with Acronis Active Restore™. You can enjoy unlimited P2V/V2V migrations from any source (even from a different hypervisor)

 
LVL 42

Expert Comment

by:dqmq
ID: 37720864
SELECT (rtrim(a.AccountNumberSegment2) + '-' + rtrim(a.AccountNumberSegment3)), a.actdescr as Desc1, b.actdescr as Desc2,  c.actdescr as Desc3, d.actdescr as Desc4

FROM
COMP1..CustTable a
inner JOIN COMP2..gl00100 b ON a.AccountNumberSegment2=b.AccountNumberSegment2 AND a.AccountNumberSegment3=b.AccountNumberSegment3
inner JOIN COMP3..gl00100 c ON
a.AccountNumberSegment2=c.AccountNumberSegment2 AND a.AccountNumberSegment3=c.AccountNumberSegment3
inner JOIN COMP4..gl00100 d ON
a.AccountNumberSegment2=d.AccountNumberSegment2 AND a.AccountNumberSegment3=d.AccountNumberSegment3
WHERE a.actdescr <> b.actdescr or b.actdescr <> c.actdescr or c.actdescr <> d.actdescr
0
 

Author Comment

by:rayluvs
ID: 37721078
will try
0
 

Author Comment

by:rayluvs
ID: 37722175
Ust tried it and some rows were not shown.

To compare I used the script inincluded in the question and compare from comp1 to comp2, comp1 to comp3, comp1 to comp4 and found rows not displayed by your script.
0
 
LVL 42

Expert Comment

by:dqmq
ID: 37723471
hmmm...a couple things come to mind.

First, you said Account Number is really a composite of three columns, but your query (and mine, which is modeled after it) only joins on two of the three columns.  

However, I expect that the real issue is that some of the tables have Account Number combinations that do not exist in others.  My query only returns results where the Account Number exists in all 4 tables.  What does these return:

Select count(*) from comp1 a  where not exists (select * from comp2  b where  a.AccountNumberSegment2=b.AccountNumberSegment2 AND a.AccountNumberSegment3=b.AccountNumberSegment3)


Select count(*) from comp1 a  where not exists (select * from comp3  b where  a.AccountNumberSegment2=b.AccountNumberSegment2 AND a.AccountNumberSegment3=b.AccountNumberSegment3)


Select count(*) from comp1 a  where not exists (select * from comp4  b where  a.AccountNumberSegment2=b.AccountNumberSegment2 AND a.AccountNumberSegment3=b.AccountNumberSegment3)
0
 
LVL 25

Expert Comment

by:jogos
ID: 37723484
<<However, I expect that the real issue is that some of the tables have Account Number combinations that do not exist in others. >>
That's where the 'group by AccNbr,Actdescr  having count(*) < 4' sollution is identifying them.
0
 

Author Comment

by:rayluvs
ID: 37723518
To try to simplify it is display the accounts that are found in all Comps and also that at least one of the Description within the 4 are not the same.

The script I provided in ID: 37720683, is just a search of One Comp "compare" to the rest  (Comp1 comparing to Comp2,Comp3, etc)

In reality, since we haven't come up with a single script, I have at least 20 of those script comparing the following manner (it is 5 database, not 4):

         Comp1 compare Comp2, Comp3, Comp4, Comp5
         Comp2 compare Comp3, Comp4, Comp5
         Comp3 compare Comp1, Comp2, Comp4, Comp5
         Comp4 compare Comp1, Comp2, Comp3, Comp5
         Comp5 compare Comp1, Comp2, Comp3, Comp4

Can a single script do this compare and display the result as indicated in  ID: 37720683?
0
 
LVL 25

Expert Comment

by:jogos
ID: 37723541
Again comment  #a37720856 identifies the items that are not the same in all databases. Did you test this?

But it gives both the good count(*) = 3 and the bad count(*) = 1.
Problem is to identify the one and only good description , if you don't have certainty there is one database that must be the reference.

So 4 logical steps
1) list of not fully the same (my union ... having count(*) <)
2) (perfectly integratable in step 1)  match the good description with it (always that of COMP1?)
3) With row_number() over() http://msdn.microsoft.com/en-us/library/ms189461.aspx give for each db a rank on the 'different' records
4) join all different records together with a FULL JOIN on rank (step 3) to get exactly the result you want
http://msdn.microsoft.com/en-us/library/ms187518.aspx

Steps 3 and 4 are for having that formated as in your example
0
 

Author Comment

by:rayluvs
ID: 37723641
actually, no.  Since we saw the next entry, we though it was your and discard your entry.  Will test tomorrow.

Also to comment that yes your are correct, we have to have a DB as references; this is Comp1
0
 
LVL 25

Expert Comment

by:jogos
ID: 37723777
<<Comp1 >> Comp2, Comp2 >> Comp3 and Comp3 >> Comp4.>>
was not showing as comp1 was the reference, but with reference it's easier

Without testing (no sql here) i tried to build something going in the direction you want


Step 1, 2 and 3 combined in the cte-definition (http://msdn.microsoft.com/en-us/library/ms175972.aspx). Test that first, see if you get out of it what you want and see what the row_number has done.

Step 4 the allignement

;with cte_descr (AccNbr, actdescr,refactdescr, rownbr)
as 
(
select 'comp2' as db , (rtrim(t1.AccountNumberSegment2) + '-' + rtrim(t1.AccountNumberSegment3))  as AccNbr, actdescr
, ROW_NUMBER() OVER(ORDER BY rtrim(t1.AccountNumberSegment2) + '-' + rtrim(t1.AccountNumberSegment3) ) as rownbr
 from comp1 as t1
 left join comp2 as t2 
    on ( rtrim(t2.AccountNumberSegment2) + '-' + rtrim(t2.AccountNumberSegment3) 
       =rtrim(t1.AccountNumberSegment2) + '-' + rtrim(t1.AccountNumberSegment3)
        and t2.actdescr <> t1.actdescr
        )
 union all 
select 'comp3' as db , (rtrim(t1.AccountNumberSegment2) + '-' + rtrim(t1.AccountNumberSegment3))  as AccNbr, actdescr
, ROW_NUMBER() OVER(ORDER BY rtrim(t1.AccountNumberSegment2) + '-' + rtrim(t1.AccountNumberSegment3) ) as rownbr
 from comp1 as t1
 left join comp3 as t2 
    on ( rtrim(t2.AccountNumberSegment2) + '-' + rtrim(t2.AccountNumberSegment3) 
       =rtrim(t1.AccountNumberSegment2) + '-' + rtrim(t1.AccountNumberSegment3)
        and t2.actdescr <> t1.actdescr
        )
 union all 
select 'comp4' as db , (rtrim(t1.AccountNumberSegment2) + '-' + rtrim(t1.AccountNumberSegment3))  as AccNbr, actdescr
, ROW_NUMBER() OVER(ORDER BY rtrim(t1.AccountNumberSegment2) + '-' + rtrim(t1.AccountNumberSegment3) ) as rownbr
 from comp1 as t1
 left join comp4 as t2 
    on ( rtrim(t2.AccountNumberSegment2) + '-' + rtrim(t2.AccountNumberSegment3) 
       =rtrim(t1.AccountNumberSegment2) + '-' + rtrim(t1.AccountNumberSegment3)
        and t2.actdescr <> t1.actdescr
        )
 union all 
select 'comp5' as db , (rtrim(t1.AccountNumberSegment2) + '-' + rtrim(t1.AccountNumberSegment3))  as AccNbr, actdescr
, ROW_NUMBER() OVER(ORDER BY rtrim(t1.AccountNumberSegment2) + '-' + rtrim(t1.AccountNumberSegment3) ) as rownbr
 from comp1 as t1
 left join comp5 as t2 
    on ( rtrim(t2.AccountNumberSegment2) + '-' + rtrim(t2.AccountNumberSegment3) 
       =rtrim(t1.AccountNumberSegment2) + '-' + rtrim(t1.AccountNumberSegment3)
        and t2.actdescr <> t1.actdescr
        )
 ) 
-- exploit the cte for your output-format (step 4)
select x2.acccdescr as c2descr ,x2.refaccescr as c2ref
,x3.acccdescr as c3descr ,x3.refaccescr as c3ref
,x4.acccdescr as c4descr ,x4.refaccescr as c4ref
,x5.acccdescr as c5descr ,x5.refaccescr as c5ref
from cte_descr  as x2
FULL join cte_descr as x3 on x3.rownbr = x2.rownbr and x3.db = 'comp3'
FULL join cte_descr as x4 on x4.rownbr = x2.rownbr and x4.db = 'comp4'
FULL join cte_descr as x5 on x4.rownbr = x2.rownbr and x5.db = 'comp5'
where isnull(x2.db,'comp2')='comp2'

Open in new window

0
 

Author Comment

by:rayluvs
ID: 37724598
Will check today
0
 
LVL 42

Expert Comment

by:dqmq
ID: 37725624
Now that you've identified Comp1 as the reference table, try this:


SELECT (rtrim(a.AccountNumberSegment2) + '-' + rtrim(a.AccountNumberSegment3)), a.actdescr as Desc1, b.actdescr as Desc2,  c.actdescr as Desc3, d.actdescr as Desc4

FROM
COMP1..CustTable a
left JOIN COMP2..gl00100 b ON a.AccountNumberSegment2=b.AccountNumberSegment2 AND a.AccountNumberSegment3=b.AccountNumberSegment3
left JOIN COMP3..gl00100 c ON
a.AccountNumberSegment2=c.AccountNumberSegment2 AND a.AccountNumberSegment3=c.AccountNumberSegment3
left JOIN COMP4..gl00100 d ON
a.AccountNumberSegment2=d.AccountNumberSegment2 AND a.AccountNumberSegment3=d.AccountNumberSegment3
WHERE a.actdescr <> b.actdescr or a.actdescr <> c.actdescr or a.actdescr <> d.actdescr
or b.actdescr is null or c.actdescr is null or d.actdescr is null

ps.  Still don't understand why AccountNumberSegment1 is not part of the join expresssion????
0
 

Author Comment

by:rayluvs
ID: 37725645
Becuase AccountNumberSegment1 is the same value in all Comps.
0
 
LVL 42

Accepted Solution

by:
dqmq earned 400 total points
ID: 37725684
Or you might try this style:

Select AccountNumber, max(comp1) comp1, max(comp2) comp2, max(comp3) comp3, max(comp4) comp4
from
(
SELECT (rtrim(a.AccountNumberSegment2) + '-' + rtrim(a.AccountNumberSegment3)) as AccountNumber, a.actdescr as comp1, b.actdescr as comp2, null as comp3, null as comp4
       FROM COMP1..CustTable a
       inner JOIN COMP2..gl00100 b ON a.AccountNumberSegment2=b.AccountNumberSegment2 AND a.AccountNumberSegment3=b.AccountNumberSegment3
WHERE a.actdescr <> b.actdescr
union all
SELECT (rtrim(a.AccountNumberSegment2) + '-' + rtrim(a.AccountNumberSegment3)), null, null,  b.actdescr, null  
       FROM COMP1..CustTable a
       inner JOIN COMP3..gl00100 b ON a.AccountNumberSegment2=b.AccountNumberSegment2 AND a.AccountNumberSegment3=b.AccountNumberSegment3
WHERE a.actdescr <> b.actdescr
union all
SELECT (rtrim(a.AccountNumberSegment2) + '-' + rtrim(a.AccountNumberSegment3)), null, null, null, b.actdescr
       FROM COMP1..CustTable a
       inner JOIN COMP4..gl00100 b ON a.AccountNumberSegment2=b.AccountNumberSegment2 AND a.AccountNumberSegment3=b.AccountNumberSegment3
WHERE a.actdescr <> b.actdesc
)
group by AccountNumber
order by AccountNumber
0
 

Author Comment

by:rayluvs
ID: 37731261
Just tried it.  It gave me a error since "AccountNumber" doesnt exist.  The column we are working with is "AccountNumber2" and "AccountNumber3".  Column "AccountNumber1" has the same value in all DB.

What should replace "AccountNumber" on the first select and group by and order by?
0
 
LVL 42

Expert Comment

by:dqmq
ID: 37731421
If you run the inner (union) query by itself, the first column it returns is named AccountNumber. That's what you should use in the outer select, group by, and order by.  

You may need to do this:

Select AccountNumber
...
WHERE a.actdescr <> b.actdesc
) as tmp             --table alias may be required                  
group by AccountNumber
order by AccountNumber
0
 

Author Comment

by:rayluvs
ID: 37731520
You are totally correct.  Replaced it and gave us a result.  We will verify contents.... sorry about that...
0
 

Author Comment

by:rayluvs
ID: 37731571
We still testing but its five databases (ID 37723518).

I have fixed the script and added the additional Union All and Select, but how can add the column for the additional DB?
0
 

Author Comment

by:rayluvs
ID: 37731575
Ok... figured it out.
0
 

Author Comment

by:rayluvs
ID: 37736575
We have incorporating your script in ID: 37725684, and validating at least 20% of the rows and has worked excellently.

Thank you very much!!!!
0
 
LVL 25

Expert Comment

by:jogos
ID: 37736600
And the way you wanted to format it in the output?
0
 

Author Closing Comment

by:rayluvs
ID: 37736684
Thanx.
0

Featured Post

VMware Disaster Recovery and Data Protection

In this expert guide, you’ll learn about the components of a Modern Data Center. You will use cases for the value-added capabilities of Veeam®, including combining backup and replication for VMware disaster recovery and using replication for data center migration.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this article we will get to know that how can we recover deleted data if it happens accidently. We really can recover deleted rows if we know the time when data is deleted by using the transaction log.
Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
Via a live example, show how to shrink a transaction log file down to a reasonable size.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.

777 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question