jmanGJHS97
asked on
Column-by-column comparison of the data in two SQL Server 2000 databases
Can anyone give me a SQL script that takes the tables in two databases and compares the data column-by-column to see where the differences are? I have an application that I want to verify that production database contains exactly the same data as the development database.
I'd be willing to run such a script once per table and input the table name manually; that's no big deal. The comparison of the values in every row and column is the thing I'm most interested in.
Thanks.
I'd be willing to run such a script once per table and input the table name manually; that's no big deal. The comparison of the values in every row and column is the thing I'm most interested in.
Thanks.
on the other hand: if you want to have the 2 "identical", why not just do a backup+restore? :)
ASKER
I've got a utility to use, but the issue is that the production database has a bunch of clustered primary keys, and the development database doesn't, so the utility won't run because the schemas aren't the same.
Also, it's not that I want to make an identical copy of the database. What I want is to verify whether they already are identical.
We had an attack on our site, and we need to see what changed on the production database, if anything, so we want to compare it to dev.
Also, it's not that I want to make an identical copy of the database. What I want is to verify whether they already are identical.
We had an attack on our site, and we need to see what changed on the production database, if anything, so we want to compare it to dev.
You can run following statement for each table on dev and prod db's and see if tables contain same data or not:
select CHECKSUM_AGG(BINARY_CHECKS UM(*)) FROM [Type Your Table Name Here]
If you want to see row differences, and have SQL Server 2005 on your development platform, you can compare tables and generate script for equalizing data using the utility called tablediff.exe located at
\Program Files\Microsoft SQL Server\90\COM
directory. It can be used compare both SQL Server 2000 and 2005 databases. The syntax is simple and like this:
tablediff -sourceserver "InstanceName" -sourcedatabase "DBName" -sourcetable "TableName" -destinationserver "InstanceName" -destinationdatabase "DBName" -destinationtable "DN_TableName"
-o "out.txt"
-o switch creates a text file from which you can see differences.
Following article gives details on the utility.
http://msdn.microsoft.com/en-us/library/ms162843.aspx
select CHECKSUM_AGG(BINARY_CHECKS
If you want to see row differences, and have SQL Server 2005 on your development platform, you can compare tables and generate script for equalizing data using the utility called tablediff.exe located at
\Program Files\Microsoft SQL Server\90\COM
directory. It can be used compare both SQL Server 2000 and 2005 databases. The syntax is simple and like this:
tablediff -sourceserver "InstanceName" -sourcedatabase "DBName" -sourcetable "TableName" -destinationserver "InstanceName" -destinationdatabase "DBName" -destinationtable "DN_TableName"
-o "out.txt"
-o switch creates a text file from which you can see differences.
Following article gives details on the utility.
http://msdn.microsoft.com/en-us/library/ms162843.aspx
ASKER
Ok, for example, I did this:
select CHECKSUM_AGG(BINARY_CHECKS UM(*)) FROM [prod_db].[dbo].[Customer]
select CHECKSUM_AGG(BINARY_CHECKS UM(*)) FROM [development_db].[dbo].[Cu stomer]
And the results were:
1749344427
300041798
What does this tell me?
select CHECKSUM_AGG(BINARY_CHECKS
select CHECKSUM_AGG(BINARY_CHECKS
And the results were:
1749344427
300041798
What does this tell me?
It means the tables are different. If they contain exactly same data, binary checksum gives same results. In order to check row equality by just using SQL, you can remove CHECKSUM_AGG, then the query returns a checksum for each row that can be compared againts other table.
My question is that, do every table of your db has a primary key (even if it is not defined). If not, how do we know if a row corresponds to other?
My question is that, do every table of your db has a primary key (even if it is not defined). If not, how do we know if a row corresponds to other?
ASKER
Ok, so there's several things going on here. First, the table schemas are not the same. For example, one table in the prod db would be this:
CREATE TABLE [tblAffiliate] (
[AffiliateID] [int] IDENTITY (1, 1) NOT NULL ,
[strEmail] [nvarchar] (50) COLLATE SQL_Latin1_General_CP1_CI_ AS NULL ,
[dtmDate] [nvarchar] (50) COLLATE SQL_Latin1_General_CP1_CI_ AS NULL ,
CONSTRAINT [PK_tblAffiliate] PRIMARY KEY CLUSTERED
(
[AffiliateID]
) ON [PRIMARY]
) ON [PRIMARY]
GO
The dev equivalent is:
CREATE TABLE [tblAffiliate] (
[AffiliateID] [int] NOT NULL ,
[strEmail] [nvarchar] (50) COLLATE SQL_Latin1_General_CP1_CI_ AS NULL ,
[dtmDate] [nvarchar] (50) COLLATE SQL_Latin1_General_CP1_CI_ AS NULL
) ON [PRIMARY]
GO
So, the fields are the same, but the prod db has that constraint that establishes a clustered primary key, and the dev db doesn't have a primary key at all (unless I'm missing something). Would this alone be enough for the SQL statement you gave me to return that they are not the same?
Given that the tables are known to not have the exact same schema, I need to know if the data contained in each table is the same in both databases.
CREATE TABLE [tblAffiliate] (
[AffiliateID] [int] IDENTITY (1, 1) NOT NULL ,
[strEmail] [nvarchar] (50) COLLATE SQL_Latin1_General_CP1_CI_
[dtmDate] [nvarchar] (50) COLLATE SQL_Latin1_General_CP1_CI_
CONSTRAINT [PK_tblAffiliate] PRIMARY KEY CLUSTERED
(
[AffiliateID]
) ON [PRIMARY]
) ON [PRIMARY]
GO
The dev equivalent is:
CREATE TABLE [tblAffiliate] (
[AffiliateID] [int] NOT NULL ,
[strEmail] [nvarchar] (50) COLLATE SQL_Latin1_General_CP1_CI_
[dtmDate] [nvarchar] (50) COLLATE SQL_Latin1_General_CP1_CI_
) ON [PRIMARY]
GO
So, the fields are the same, but the prod db has that constraint that establishes a clustered primary key, and the dev db doesn't have a primary key at all (unless I'm missing something). Would this alone be enough for the SQL statement you gave me to return that they are not the same?
Given that the tables are known to not have the exact same schema, I need to know if the data contained in each table is the same in both databases.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
If the schema are different, then, the checksum route is not for you - they are different structures and so will be different...
can you "see" the prod and dev databases from the same server ?
There is a manual way that could be scripted fairly easily, of recursing the information_schema.column_ names on one table and checking if that column name exists on the other table and then checking content. Does that sound like the type of thing you are looking for ?
can you "see" the prod and dev databases from the same server ?
There is a manual way that could be scripted fairly easily, of recursing the information_schema.column_
ASKER
inter,
I'll give that a try. Thanks.
mark_wills,
Yes, the prod and dev databases can be seen from the same server. Yes, that's what I'd be interested in.
I'll give that a try. Thanks.
mark_wills,
Yes, the prod and dev databases can be seen from the same server. Yes, that's what I'd be interested in.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Great, I'll give it a whirl.
ASKER
I ended up adding the contstraints to the tables in order to use the 3rd party utility we purchased previously. I did try both of your solutions, and they seem to work fine as well, so I'm going to credit you both for your help.
Thanks a lot.
Thanks a lot.
http://www.red-gate.com/products/SQL_Compare/index.htm