• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 413
  • Last Modified:

Adding an Aggregate function to a Query

I have a replicated SQLServer2000 DataBase using merge replication.  replicated tables have automatically managed identity ranges.

I have a query (see below) which already gives me some valuable info about the identity values at a database. My desire is to include the maximum value if the identity column for each table at the queried database.  Essentially, I would like to include the following function in the query: MAX(IDENTITYCOL).  Obviously I am getting errors since I am not properly grouping my results.

I thought I would create a separate SELECT statement to collect this info and join them, but I don't know how.  The bulk of the query was taken from this site, and I'm not 100% on how it works.  Here's the query:

----------------------------------------------------------------
SELECT Q2.SERVER_NAME, Q1.*, Q2.[NEXT_SEED], Q2.[RANGE], Q2.[CURRENT_MAX] FROM
(SELECT  
t.TABLE_NAME,
c.name AS COLUMN_NAME,
IDENT_SEED(t.TABLE_NAME) As [SEED],
IDENT_INCR(t.TABLE_NAME) AS [INCREMENT],
IDENT_CURRENT(t.TABLE_NAME) AS [CURRENT]
FROM
INFORMATION_SCHEMA.TABLES AS t INNER JOIN sysobjects AS s ON s.name = t.TABLE_NAME
INNER JOIN syscolumns AS c ON c.id = s.id
WHERE
    ((t.TABLE_NAME LIKE '%') And
    (OBJECTPROPERTY(OBJECT_ID(t.TABLE_NAME), 'TableHasIdentity') = 1)
    AND (t.TABLE_TYPE = 'BASE TABLE'))
    AND c.autoval IS NOT NULL )  AS Q1
INNER JOIN

(SELECT @@SERVERNAME as [SERVER_NAME], [name] AS [TABLE_NAME], [NEXT_SEED], [RANGE], [CURRENT_MAX]
FROM MSrepl_identity_range
INNER JOIN sysobjects ON
sysobjects.id = MSrepl_identity_range.objid) AS Q2

ON Q1.[TABLE_NAME] = Q2.[TABLE_NAME]
----------------------------------------------------------------

Thanks for any help!


0
youngrascal
Asked:
youngrascal
  • 9
  • 6
1 Solution
 
PaulBarbinCommented:
Just for kicks try running this query.  
I made some assumptions that I need you to confirm.  First, the db that this is run from is NOT the distribution database, its one of the databases that is published for merge replication (and that replication uses identity ranges).  Second, your distribution database is on the same server and its called "Distribution".
If it doesn't work, let me know what the problem is and we can take it from here.


SELECT       Q2.SERVER_NAME,
            Q1.*,
            Q2.[NEXT_SEED],
            Q2.[RANGE],
            Q2.[CURRENT_MAX]
FROM      (      SELECT  t.TABLE_NAME,
                  c.name AS COLUMN_NAME,
                  IDENT_SEED(t.TABLE_NAME) As [SEED],
                  IDENT_INCR(t.TABLE_NAME) AS [INCREMENT],
                  IDENT_CURRENT(t.TABLE_NAME) AS [CURRENT]
            FROM       INFORMATION_SCHEMA.TABLES AS t
                 JOIN sysobjects AS s ON
                  s.name = t.TABLE_NAME
                 JOIN syscolumns AS c ON
                  c.id = s.id
            WHERE     t.TABLE_NAME LIKE '%' AND
                      OBJECTPROPERTY(OBJECT_ID(t.TABLE_NAME), 'TableHasIdentity') = 1 AND
                  t.TABLE_TYPE = 'BASE TABLE'  AND
                      c.autoval IS NOT NULL )  AS Q1
     JOIN
            (      SELECT       @@SERVERNAME as [SERVER_NAME],
                        [name] AS [TABLE_NAME],
                        [NEXT_SEED],
                        [RANGE],
                        [CURRENT_MAX]
                  FROM       distribution..MSrepl_identity_range mi
                       JOIN sysobjects so ON
                        so.name = mi.tablename AND
                        db_name() = mi.publisher_db      ) AS Q2 ON
                      Q1.[TABLE_NAME] = Q2.[TABLE_NAME]


Paul
0
 
youngrascalAuthor Commented:
Paul,

Thanks for the quick reply.  In fact, those assumptions are untrue.  I have about five replicated [pull] subscriber databases each running on its own MSDE server throughout my WAN plus the distribution database on SQL Server 2000.  

The query you gave me, although it executed, gave me data results different than the original query...?


I would like to run the query on all databases.  I like to check the MAX(IDENTITYCOL) for each table and compare it with the IDENT_CURRENT().  This gives me an idea of which replica has added the last record in each table (that database would have the its max identity value equal to the current identity value).

Previously, I accomplished this with the following query:

------------------------------------------------------------------
SELECT Q2.[SERVER], Q1.[TABLE], [MAX], [CURRENT], ([MAX]-[CURRENT]) AS [DIFFERENCE], NEXT_SEED, RANGE, CURRENT_MAX, AVAILABLE  FROM (
SELECT 'MYTABLE' AS [TABLE], MAX(IDENTITYCOL) AS [MAX], IDENT_CURRENT('[MYTABLE]') AS [CURRENT], (2147483648 - MAX(IDENTITYCOL)) AS [AVAILABLE] FROM [MYTABLE]
) AS Q1

INNER JOIN

(SELECT @@SERVERNAME as [SERVER], [name] AS [TABLE], next_seed, range, current_max
FROM MSrepl_identity_range
INNER JOIN sysobjects ON
sysobjects.id = MSrepl_identity_range.objid) AS Q2

ON Q1.[TABLE] = Q2.[TABLE]
------------------------------------------------------------------

I would mostly bee interested in the DIFFERENCE column; that is, MAX(IDENTITYCOL) - IDENT_CURRENT('[MYTABLE]').  The down fall is that I have to explicitly type in the table name MYTABLE for each table I wanted to examine.  And since I compare the DIFFERENCE value between servers (to find the one where the difference is zero), this can get pretty tedious.

Really what I am looking for is a dynamic query that can tell me for every table that is replicated, what is the maximum value in the identity column and what is the current identity value for that table at that particular database.  The rationale is that I can find out which database is the last database to add to that table.  (This requirement is in addition to being able to view the other identity info as well, such as the range size.)

Thanks Paul, I hope you understand what I am looking for here.
0
 
youngrascalAuthor Commented:
Maybe what I need is a way to
a) dynamically determine the max value of each idenity column in the database and
b) join it in a meaningful way to the original query above

0
VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

 
PaulBarbinCommented:
I think we are close, have you verified that the query that we started with doesn't work?  It seems like it would.  Maybe if you could identify where its not working, we could make slight changes to it.  This is definitely the way to get the dynamic table name in your query....

Paul

0
 
PaulBarbinCommented:
btw, when I look in BOL, it doesn't show a column called objid in the MSrepl_identity_range table.  Should the link be done with tablename?

Paul
0
 
youngrascalAuthor Commented:
Paul,

Your query works when I run it on my publisher database, but when I run it on any of the replicas, no rows are returned (different servers).

But the data that your query returns is somehow different from the data that I expect to get.  I ran your query against my [old, non-dynamic] query:

----------------------------------------------------------------
-- Paul's Query:
SELECT Q2.[SERVER], Q1.[TABLE], [MAX], [CURRENT], ([MAX]-[CURRENT]) AS [DIFFERENCE], NEXT_SEED, RANGE, CURRENT_MAX, AVAILABLE  FROM (
SELECT 'OWNER' AS [TABLE], MAX(IDENTITYCOL) AS [MAX], IDENT_CURRENT('[OWNER]') AS [CURRENT], (2147483648 - MAX(IDENTITYCOL)) AS [AVAILABLE] FROM [OWNER]
) AS Q1

INNER JOIN

(SELECT @@SERVERNAME as [SERVER], [name] AS [TABLE], next_seed, range, current_max
FROM MSrepl_identity_range
INNER JOIN sysobjects ON
sysobjects.id = MSrepl_identity_range.objid) AS Q2

ON Q1.[TABLE] = Q2.[TABLE]

GO

-- My Query:
SELECT      Q2.SERVER_NAME,
          Q1.*,
          Q2.[NEXT_SEED],
          Q2.[RANGE],
          Q2.[CURRENT_MAX]
FROM     (     SELECT  t.TABLE_NAME,
               c.name AS COLUMN_NAME,
               IDENT_SEED(t.TABLE_NAME) As [SEED],
               IDENT_INCR(t.TABLE_NAME) AS [INCREMENT],
               IDENT_CURRENT(t.TABLE_NAME) AS [CURRENT]
          FROM      INFORMATION_SCHEMA.TABLES AS t
               JOIN sysobjects AS s ON
               s.name = t.TABLE_NAME
               JOIN syscolumns AS c ON
               c.id = s.id
          WHERE     t.TABLE_NAME LIKE 'OWNER' AND
                   OBJECTPROPERTY(OBJECT_ID(t.TABLE_NAME), 'TableHasIdentity') = 1 AND
               t.TABLE_TYPE = 'BASE TABLE'  AND
                   c.autoval IS NOT NULL )  AS Q1
     JOIN
          (     SELECT      @@SERVERNAME as [SERVER_NAME],
                    [name] AS [TABLE_NAME],
                    [NEXT_SEED],
                    [RANGE],
                    [CURRENT_MAX]
               FROM      distribution..MSrepl_identity_range mi
                    JOIN sysobjects so ON
                    so.name = mi.tablename AND
                    db_name() = mi.publisher_db     ) AS Q2 ON
                      Q1.[TABLE_NAME] = Q2.[TABLE_NAME]
                     
----------------------------------------------------------------
With the following results:
                     
                     
SERVER      TABLE      MAX      CURRENT      DIFFERENCE      NEXT_SEED      RANGE      CURRENT_MAX      AVAILABLE
SQLP1      OWNER      2146521458      2146521458      0      2146522000      100      2146522000      962190

(1 row(s) affected)

SERVER_NAME      TABLE_NAME      COLUMN_NAME      SEED      INCREMENT      CURRENT      NEXT_SEED      RANGE      CURRENT_MAX
SQLP1      OWNER      RecordID      1      1      2146521458      2146522500      100      2146521999

(1 row(s) affected)
                     

I wonder why the Next_Seed value would be different (2146522000 vs 2146522500).  Funny thing though too is that the identity range for this table is 500 at the publisher and 100 at the subscribers.  I wonder why both queries return 100.  The above statement batch was executed on the publisher database.

Thanks for any help.
0
 
PaulBarbinCommented:
btw, I think you are mistaken on which one is mine and which is yours.  The first one in your last post is your query and mine is the second, right?

The reason that I think you aren't getting any data at the other servers is that there is no distribution database on the other servers, right?  Or if there is, it doesn't have any data in it, because that data is stored on the published server.

I still don't see the column called objid in the msrepl_identity_range table, but if you're not getting an error message, then change my query to match yours and see how it works:


SELECT      Q2.SERVER_NAME,
          Q1.*,
          Q2.[NEXT_SEED],
          Q2.[RANGE],
          Q2.[CURRENT_MAX]
FROM     (     SELECT  t.TABLE_NAME,
               c.name AS COLUMN_NAME,
               IDENT_SEED(t.TABLE_NAME) As [SEED],
               IDENT_INCR(t.TABLE_NAME) AS [INCREMENT],
               IDENT_CURRENT(t.TABLE_NAME) AS [CURRENT]
          FROM      INFORMATION_SCHEMA.TABLES AS t
               JOIN sysobjects AS s ON
               s.name = t.TABLE_NAME
               JOIN syscolumns AS c ON
               c.id = s.id
          WHERE     t.TABLE_NAME LIKE 'OWNER' AND
                   OBJECTPROPERTY(OBJECT_ID(t.TABLE_NAME), 'TableHasIdentity') = 1 AND
               t.TABLE_TYPE = 'BASE TABLE'  AND
                   c.autoval IS NOT NULL )  AS Q1
     JOIN
          (     SELECT      @@SERVERNAME as [SERVER_NAME],
                    [name] AS [TABLE_NAME],
                    [NEXT_SEED],
                    [RANGE],
                    [CURRENT_MAX]
               FROM      distribution..MSrepl_identity_range mi
                    JOIN sysobjects so ON
                    so.id = mi.objid AND
                    db_name() = mi.publisher_db     ) AS Q2 ON
                      Q1.[TABLE_NAME] = Q2.[TABLE_NAME]

Paul
0
 
youngrascalAuthor Commented:
Paul,

Yes, i'm sorry I mislabelled the queries in that batch.

You are also correct in that I do not have distribution databases on my other servers.  I only have one distribution server which also houses my publication database.  This publication database replicates to five other dbs, all on their own MSDE server.

These are the columns in my table msrepl_identity_range:
 objid, int
 next_seed, bigint
 pub_range, bigint
 range, bigint
 max_identity, bigint
 threshold, int
 current_max, bigint

I guess they are different from yours, go figure.

I tried your query and I received the following error:
Server: Msg 207, Level 16, State 3, Line 1
Invalid column name 'objid'.

I surmise that the difference between your query and mine is that I am querying a local table msrepl_identity_range, where as you are querying  distribution..MSrepl_identity_range.  

Is distribution..MSrepl_identity_range.current_max equivalent to MAX(IDENTITYCOL)? I thought that current_max is the highest value that can be assigned in the table's current identity range.
0
 
PaulBarbinCommented:
I get it! I didn't actually setup a publication that uses identity ranges (as I didn't know how to do that without looking it up).  So, I didn't have a local table and you do.  So, does removing the "distribution.." from my query, gives you the same results as with your original query?

According to BOL, Current_Max column is: "Current max that can be assigned but not necessarily be assigned"  which I THINK means the same thing as you said in your last sentence.

Are we there yet?

Paul

0
 
youngrascalAuthor Commented:
Paul,

Your query works (after removing references to the distribution db), but it does not return what I am looking for, which is MAX(IDENTITYCOL).

I still need a way to dynamically get the max values for the idenity columns for each replicated table in my database.  Theoretically, these values should be the same in the same table across each replicated db.

Thanks for your help, but we're not quite there yet.
0
 
PaulBarbinCommented:
Sorry for my confusion, but why is that not the same as current identity?

Paul
0
 
youngrascalAuthor Commented:
As I understand it,

Because I have an identity range for each database replica, the current Identity is specific to each replica.  Whereas the maximum value for the idenity column should be the same across all replicas (assuming that the replicas are synched).

i.e.

DB1 range: 100-199
DB2 range: 200-299
DB3 range: 300-399

The current identity at DB1 will fall between 100-199, the Max value for that column can be much higher if DBab or DBc has added a record.  If DB2 adds a record, the Max value will be 200.  If DB1 then adds a record, the max value will still be 200, although the current identity @ DB1 will be 100 (or 101?).

This is useful to be for a few reasons, but mostly I keep track of the highest identity value because some of my identity columns have been getting close to the 2^31 integer limit.

0
 
youngrascalAuthor Commented:
Oops, that should read:

"the Max value for that column can be much higher if DB2 or DB3 has added a record."
0
 
youngrascalAuthor Commented:
I have had someone else suggest to me to use the undocumented stored procedure sp_MSforeachtable to go through each table in the database.  This seems to solve my problem.

0
 
youngrascalAuthor Commented:
I appreciate the Expert's time and effort.  But I have not been able to resolve my problem of using a single query to obtain both the MAX(IDENTITYCOL) and other replication data from all users tables.

I have been able to produce the desired results by creating a query that returns MAX(IDENTITYCOL) for each table in the database using the undocumented stored procedure "sp_MSforeachtable"  I will exapnd the query to include the other replicated information.

Although, I would still prefer to be able to solve the problem as I had outlined at the beginning of this question, with a single query ratehr than with a stored procedure.

Thanks!
0
 
SpazMODicCommented:
PAQed, with points refunded (250)

SpazMODic
EE Moderator
0

Featured Post

Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

  • 9
  • 6
Tackle projects and never again get stuck behind a technical roadblock.
Join Now