Solved

Adding an Aggregate function to a Query

Posted on 2003-12-01
17
358 Views
Last Modified: 2010-08-05
I have a replicated SQLServer2000 DataBase using merge replication.  replicated tables have automatically managed identity ranges.

I have a query (see below) which already gives me some valuable info about the identity values at a database. My desire is to include the maximum value if the identity column for each table at the queried database.  Essentially, I would like to include the following function in the query: MAX(IDENTITYCOL).  Obviously I am getting errors since I am not properly grouping my results.

I thought I would create a separate SELECT statement to collect this info and join them, but I don't know how.  The bulk of the query was taken from this site, and I'm not 100% on how it works.  Here's the query:

----------------------------------------------------------------
SELECT Q2.SERVER_NAME, Q1.*, Q2.[NEXT_SEED], Q2.[RANGE], Q2.[CURRENT_MAX] FROM
(SELECT  
t.TABLE_NAME,
c.name AS COLUMN_NAME,
IDENT_SEED(t.TABLE_NAME) As [SEED],
IDENT_INCR(t.TABLE_NAME) AS [INCREMENT],
IDENT_CURRENT(t.TABLE_NAME) AS [CURRENT]
FROM
INFORMATION_SCHEMA.TABLES AS t INNER JOIN sysobjects AS s ON s.name = t.TABLE_NAME
INNER JOIN syscolumns AS c ON c.id = s.id
WHERE
    ((t.TABLE_NAME LIKE '%') And
    (OBJECTPROPERTY(OBJECT_ID(t.TABLE_NAME), 'TableHasIdentity') = 1)
    AND (t.TABLE_TYPE = 'BASE TABLE'))
    AND c.autoval IS NOT NULL )  AS Q1
INNER JOIN

(SELECT @@SERVERNAME as [SERVER_NAME], [name] AS [TABLE_NAME], [NEXT_SEED], [RANGE], [CURRENT_MAX]
FROM MSrepl_identity_range
INNER JOIN sysobjects ON
sysobjects.id = MSrepl_identity_range.objid) AS Q2

ON Q1.[TABLE_NAME] = Q2.[TABLE_NAME]
----------------------------------------------------------------

Thanks for any help!


0
Comment
Question by:youngrascal
  • 9
  • 6
17 Comments
 
LVL 9

Expert Comment

by:PaulBarbin
ID: 9852070
Just for kicks try running this query.  
I made some assumptions that I need you to confirm.  First, the db that this is run from is NOT the distribution database, its one of the databases that is published for merge replication (and that replication uses identity ranges).  Second, your distribution database is on the same server and its called "Distribution".
If it doesn't work, let me know what the problem is and we can take it from here.


SELECT       Q2.SERVER_NAME,
            Q1.*,
            Q2.[NEXT_SEED],
            Q2.[RANGE],
            Q2.[CURRENT_MAX]
FROM      (      SELECT  t.TABLE_NAME,
                  c.name AS COLUMN_NAME,
                  IDENT_SEED(t.TABLE_NAME) As [SEED],
                  IDENT_INCR(t.TABLE_NAME) AS [INCREMENT],
                  IDENT_CURRENT(t.TABLE_NAME) AS [CURRENT]
            FROM       INFORMATION_SCHEMA.TABLES AS t
                 JOIN sysobjects AS s ON
                  s.name = t.TABLE_NAME
                 JOIN syscolumns AS c ON
                  c.id = s.id
            WHERE     t.TABLE_NAME LIKE '%' AND
                      OBJECTPROPERTY(OBJECT_ID(t.TABLE_NAME), 'TableHasIdentity') = 1 AND
                  t.TABLE_TYPE = 'BASE TABLE'  AND
                      c.autoval IS NOT NULL )  AS Q1
     JOIN
            (      SELECT       @@SERVERNAME as [SERVER_NAME],
                        [name] AS [TABLE_NAME],
                        [NEXT_SEED],
                        [RANGE],
                        [CURRENT_MAX]
                  FROM       distribution..MSrepl_identity_range mi
                       JOIN sysobjects so ON
                        so.name = mi.tablename AND
                        db_name() = mi.publisher_db      ) AS Q2 ON
                      Q1.[TABLE_NAME] = Q2.[TABLE_NAME]


Paul
0
 

Author Comment

by:youngrascal
ID: 9853685
Paul,

Thanks for the quick reply.  In fact, those assumptions are untrue.  I have about five replicated [pull] subscriber databases each running on its own MSDE server throughout my WAN plus the distribution database on SQL Server 2000.  

The query you gave me, although it executed, gave me data results different than the original query...?


I would like to run the query on all databases.  I like to check the MAX(IDENTITYCOL) for each table and compare it with the IDENT_CURRENT().  This gives me an idea of which replica has added the last record in each table (that database would have the its max identity value equal to the current identity value).

Previously, I accomplished this with the following query:

------------------------------------------------------------------
SELECT Q2.[SERVER], Q1.[TABLE], [MAX], [CURRENT], ([MAX]-[CURRENT]) AS [DIFFERENCE], NEXT_SEED, RANGE, CURRENT_MAX, AVAILABLE  FROM (
SELECT 'MYTABLE' AS [TABLE], MAX(IDENTITYCOL) AS [MAX], IDENT_CURRENT('[MYTABLE]') AS [CURRENT], (2147483648 - MAX(IDENTITYCOL)) AS [AVAILABLE] FROM [MYTABLE]
) AS Q1

INNER JOIN

(SELECT @@SERVERNAME as [SERVER], [name] AS [TABLE], next_seed, range, current_max
FROM MSrepl_identity_range
INNER JOIN sysobjects ON
sysobjects.id = MSrepl_identity_range.objid) AS Q2

ON Q1.[TABLE] = Q2.[TABLE]
------------------------------------------------------------------

I would mostly bee interested in the DIFFERENCE column; that is, MAX(IDENTITYCOL) - IDENT_CURRENT('[MYTABLE]').  The down fall is that I have to explicitly type in the table name MYTABLE for each table I wanted to examine.  And since I compare the DIFFERENCE value between servers (to find the one where the difference is zero), this can get pretty tedious.

Really what I am looking for is a dynamic query that can tell me for every table that is replicated, what is the maximum value in the identity column and what is the current identity value for that table at that particular database.  The rationale is that I can find out which database is the last database to add to that table.  (This requirement is in addition to being able to view the other identity info as well, such as the range size.)

Thanks Paul, I hope you understand what I am looking for here.
0
 

Author Comment

by:youngrascal
ID: 9853949
Maybe what I need is a way to
a) dynamically determine the max value of each idenity column in the database and
b) join it in a meaningful way to the original query above

0
 
LVL 9

Expert Comment

by:PaulBarbin
ID: 9859839
I think we are close, have you verified that the query that we started with doesn't work?  It seems like it would.  Maybe if you could identify where its not working, we could make slight changes to it.  This is definitely the way to get the dynamic table name in your query....

Paul

0
 
LVL 9

Expert Comment

by:PaulBarbin
ID: 9859870
btw, when I look in BOL, it doesn't show a column called objid in the MSrepl_identity_range table.  Should the link be done with tablename?

Paul
0
 

Author Comment

by:youngrascal
ID: 9860292
Paul,

Your query works when I run it on my publisher database, but when I run it on any of the replicas, no rows are returned (different servers).

But the data that your query returns is somehow different from the data that I expect to get.  I ran your query against my [old, non-dynamic] query:

----------------------------------------------------------------
-- Paul's Query:
SELECT Q2.[SERVER], Q1.[TABLE], [MAX], [CURRENT], ([MAX]-[CURRENT]) AS [DIFFERENCE], NEXT_SEED, RANGE, CURRENT_MAX, AVAILABLE  FROM (
SELECT 'OWNER' AS [TABLE], MAX(IDENTITYCOL) AS [MAX], IDENT_CURRENT('[OWNER]') AS [CURRENT], (2147483648 - MAX(IDENTITYCOL)) AS [AVAILABLE] FROM [OWNER]
) AS Q1

INNER JOIN

(SELECT @@SERVERNAME as [SERVER], [name] AS [TABLE], next_seed, range, current_max
FROM MSrepl_identity_range
INNER JOIN sysobjects ON
sysobjects.id = MSrepl_identity_range.objid) AS Q2

ON Q1.[TABLE] = Q2.[TABLE]

GO

-- My Query:
SELECT      Q2.SERVER_NAME,
          Q1.*,
          Q2.[NEXT_SEED],
          Q2.[RANGE],
          Q2.[CURRENT_MAX]
FROM     (     SELECT  t.TABLE_NAME,
               c.name AS COLUMN_NAME,
               IDENT_SEED(t.TABLE_NAME) As [SEED],
               IDENT_INCR(t.TABLE_NAME) AS [INCREMENT],
               IDENT_CURRENT(t.TABLE_NAME) AS [CURRENT]
          FROM      INFORMATION_SCHEMA.TABLES AS t
               JOIN sysobjects AS s ON
               s.name = t.TABLE_NAME
               JOIN syscolumns AS c ON
               c.id = s.id
          WHERE     t.TABLE_NAME LIKE 'OWNER' AND
                   OBJECTPROPERTY(OBJECT_ID(t.TABLE_NAME), 'TableHasIdentity') = 1 AND
               t.TABLE_TYPE = 'BASE TABLE'  AND
                   c.autoval IS NOT NULL )  AS Q1
     JOIN
          (     SELECT      @@SERVERNAME as [SERVER_NAME],
                    [name] AS [TABLE_NAME],
                    [NEXT_SEED],
                    [RANGE],
                    [CURRENT_MAX]
               FROM      distribution..MSrepl_identity_range mi
                    JOIN sysobjects so ON
                    so.name = mi.tablename AND
                    db_name() = mi.publisher_db     ) AS Q2 ON
                      Q1.[TABLE_NAME] = Q2.[TABLE_NAME]
                     
----------------------------------------------------------------
With the following results:
                     
                     
SERVER      TABLE      MAX      CURRENT      DIFFERENCE      NEXT_SEED      RANGE      CURRENT_MAX      AVAILABLE
SQLP1      OWNER      2146521458      2146521458      0      2146522000      100      2146522000      962190

(1 row(s) affected)

SERVER_NAME      TABLE_NAME      COLUMN_NAME      SEED      INCREMENT      CURRENT      NEXT_SEED      RANGE      CURRENT_MAX
SQLP1      OWNER      RecordID      1      1      2146521458      2146522500      100      2146521999

(1 row(s) affected)
                     

I wonder why the Next_Seed value would be different (2146522000 vs 2146522500).  Funny thing though too is that the identity range for this table is 500 at the publisher and 100 at the subscribers.  I wonder why both queries return 100.  The above statement batch was executed on the publisher database.

Thanks for any help.
0
 
LVL 9

Expert Comment

by:PaulBarbin
ID: 9860587
btw, I think you are mistaken on which one is mine and which is yours.  The first one in your last post is your query and mine is the second, right?

The reason that I think you aren't getting any data at the other servers is that there is no distribution database on the other servers, right?  Or if there is, it doesn't have any data in it, because that data is stored on the published server.

I still don't see the column called objid in the msrepl_identity_range table, but if you're not getting an error message, then change my query to match yours and see how it works:


SELECT      Q2.SERVER_NAME,
          Q1.*,
          Q2.[NEXT_SEED],
          Q2.[RANGE],
          Q2.[CURRENT_MAX]
FROM     (     SELECT  t.TABLE_NAME,
               c.name AS COLUMN_NAME,
               IDENT_SEED(t.TABLE_NAME) As [SEED],
               IDENT_INCR(t.TABLE_NAME) AS [INCREMENT],
               IDENT_CURRENT(t.TABLE_NAME) AS [CURRENT]
          FROM      INFORMATION_SCHEMA.TABLES AS t
               JOIN sysobjects AS s ON
               s.name = t.TABLE_NAME
               JOIN syscolumns AS c ON
               c.id = s.id
          WHERE     t.TABLE_NAME LIKE 'OWNER' AND
                   OBJECTPROPERTY(OBJECT_ID(t.TABLE_NAME), 'TableHasIdentity') = 1 AND
               t.TABLE_TYPE = 'BASE TABLE'  AND
                   c.autoval IS NOT NULL )  AS Q1
     JOIN
          (     SELECT      @@SERVERNAME as [SERVER_NAME],
                    [name] AS [TABLE_NAME],
                    [NEXT_SEED],
                    [RANGE],
                    [CURRENT_MAX]
               FROM      distribution..MSrepl_identity_range mi
                    JOIN sysobjects so ON
                    so.id = mi.objid AND
                    db_name() = mi.publisher_db     ) AS Q2 ON
                      Q1.[TABLE_NAME] = Q2.[TABLE_NAME]

Paul
0
 

Author Comment

by:youngrascal
ID: 9862569
Paul,

Yes, i'm sorry I mislabelled the queries in that batch.

You are also correct in that I do not have distribution databases on my other servers.  I only have one distribution server which also houses my publication database.  This publication database replicates to five other dbs, all on their own MSDE server.

These are the columns in my table msrepl_identity_range:
 objid, int
 next_seed, bigint
 pub_range, bigint
 range, bigint
 max_identity, bigint
 threshold, int
 current_max, bigint

I guess they are different from yours, go figure.

I tried your query and I received the following error:
Server: Msg 207, Level 16, State 3, Line 1
Invalid column name 'objid'.

I surmise that the difference between your query and mine is that I am querying a local table msrepl_identity_range, where as you are querying  distribution..MSrepl_identity_range.  

Is distribution..MSrepl_identity_range.current_max equivalent to MAX(IDENTITYCOL)? I thought that current_max is the highest value that can be assigned in the table's current identity range.
0
Get up to 2TB FREE CLOUD per backup license!

An exclusive Black Friday offer just for Expert Exchange audience! Buy any of our top-rated backup solutions & get up to 2TB free cloud per system! Perform local & cloud backup in the same step, and restore instantly—anytime, anywhere. Grab this deal now before it disappears!

 
LVL 9

Expert Comment

by:PaulBarbin
ID: 9862627
I get it! I didn't actually setup a publication that uses identity ranges (as I didn't know how to do that without looking it up).  So, I didn't have a local table and you do.  So, does removing the "distribution.." from my query, gives you the same results as with your original query?

According to BOL, Current_Max column is: "Current max that can be assigned but not necessarily be assigned"  which I THINK means the same thing as you said in your last sentence.

Are we there yet?

Paul

0
 

Author Comment

by:youngrascal
ID: 9875711
Paul,

Your query works (after removing references to the distribution db), but it does not return what I am looking for, which is MAX(IDENTITYCOL).

I still need a way to dynamically get the max values for the idenity columns for each replicated table in my database.  Theoretically, these values should be the same in the same table across each replicated db.

Thanks for your help, but we're not quite there yet.
0
 
LVL 9

Expert Comment

by:PaulBarbin
ID: 9875759
Sorry for my confusion, but why is that not the same as current identity?

Paul
0
 

Author Comment

by:youngrascal
ID: 9883484
As I understand it,

Because I have an identity range for each database replica, the current Identity is specific to each replica.  Whereas the maximum value for the idenity column should be the same across all replicas (assuming that the replicas are synched).

i.e.

DB1 range: 100-199
DB2 range: 200-299
DB3 range: 300-399

The current identity at DB1 will fall between 100-199, the Max value for that column can be much higher if DBab or DBc has added a record.  If DB2 adds a record, the Max value will be 200.  If DB1 then adds a record, the max value will still be 200, although the current identity @ DB1 will be 100 (or 101?).

This is useful to be for a few reasons, but mostly I keep track of the highest identity value because some of my identity columns have been getting close to the 2^31 integer limit.

0
 

Author Comment

by:youngrascal
ID: 9885505
Oops, that should read:

"the Max value for that column can be much higher if DB2 or DB3 has added a record."
0
 

Author Comment

by:youngrascal
ID: 10112340
I have had someone else suggest to me to use the undocumented stored procedure sp_MSforeachtable to go through each table in the database.  This seems to solve my problem.

0
 

Author Comment

by:youngrascal
ID: 10116263
I appreciate the Expert's time and effort.  But I have not been able to resolve my problem of using a single query to obtain both the MAX(IDENTITYCOL) and other replication data from all users tables.

I have been able to produce the desired results by creating a query that returns MAX(IDENTITYCOL) for each table in the database using the undocumented stored procedure "sp_MSforeachtable"  I will exapnd the query to include the other replicated information.

Although, I would still prefer to be able to solve the problem as I had outlined at the beginning of this question, with a single query ratehr than with a stored procedure.

Thanks!
0
 

Accepted Solution

by:
SpazMODic earned 0 total points
ID: 10137721
PAQed, with points refunded (250)

SpazMODic
EE Moderator
0

Featured Post

Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
ssms - object execution statistics 12 37
Help with SQL Query 23 39
Backup Job question 4 18
Test a query 23 11
Slowly Changing Dimension Transformation component in data task flow is very useful for us to manage and control how data changes in SSIS.
International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now