Link to home
Start Free TrialLog in
Avatar of Jim Youmans
Jim YoumansFlag for United States of America

asked on

DB2 HADR question

DB2 10.5 on Windows

We moved some large tables to new tablespaces in production and then restarted the server.  Now we can't log into our HADR server due to reason code 4.  From what I see reason code 4 is

Reason Code 4 = Connection requests to an HADR standby database are not allowed while the replay-only window is active. The replay-only window is active when DDL or maintenance operations are being replayed on the standby.

If the primary db was rebooted before the HADR changes were complete, could it get stuck like this?  I would assume it would just start up again when the primary came back online.

Thanks!
Avatar of Kent Olsen
Kent Olsen
Flag of United States of America image

Hi Jim,

Can you provide a little more detail?  The message suggests that you're trying to connect to the standby database (though IBM messages can be notoriously misleading).

If you are trying to connect to the standby, can you stop/restart the connecting agent?  


Kent
Avatar of Jim Youmans

ASKER

Yes I am trying to connect to the standby database. I am finding that the replay-only windows is switching on and off.  I think it is still trying to catch up from the table moves.  When I run this on the primary db

select * from table (mon_get_hadr(NULL))

I get the stats for the current change I think.  But I would like to know where I am at overall.  

Is there any way to tell how far behind it is?
Is there a way to see what command is being executed on the standby db while the replay-only window is active?
You can monitor the replay-only window from the primary or standby with db2pd

  db2pd -hadr
When I try it tells me I need an active database.  Since the HADR database is in Replay-Only mode, it won't let me connect.  Is there some way to see what is being done on the standby database?  I need to know if it is just catching up or if it is stuck in some kind of loop?
Hmmm....

You can't run that command on either server?  It should run on either server, and it is the tool that should tell you what's happening.

As much as I hate posting links...

  https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.admin.ha.doc/doc/c0056190.html

About 1/2 way down the page is the heading "Monitoring the replay-only window" and a sample of what db2pd reports.
That is where I found the query

select * from table (mon_get_hadr(NULL))

It shows me the activity but not what is actually being done.  Do you know of any way to see what is being ran on the HADR (standby) server when the replay only window is active?  Or compare logs from primary to secondary?

My guess is there is a back log of commands that need to be handled but I can't see the backlog, I can only see the one being ran.

Does that make sense?
If communications between the servers is normal, the standby will update the primary with its status at every ping (heartbeat) so you should be able to monitor the standby without having to connect to it directly.  But since that seems to be fighting back...

Can you connect to the standby server (not the database)?  An RDP session in Windows, Xterm connection in linux, etc.  If so, let's see what is running there and how much of the system resources are being tied up.

On Windows, Task Manager and Resource Monitor are the right places to start.  Process Explorer (from the Windows utility archive) if you have it installed.

On linux, run db2top.  It will show the db2 tasks, sorted by activity, so if the replay is actually doing anything, it should be at the top of the list and active.
I don't think I am being clear.  Sorry.  

I can RDP to the windows server and see that it is running OK.
I cannot connect the the db2 database, that is where the replay only window message comes into play.
When I run the select statement above, I can see the status of the current replay.  

What I am trying to find out is what DDL command is being replayed on the standby server and if there is a queue of commands.  I need to know where we are in the "catch up queue".  I am trying to find out the status of replication (in SQL Server lingo).  

Thanks!
Without connecting to the database, I don't know of any way to see that kind of detail.

From the command line, can you run db2pd -hadr?  These 4 lines should give you a feel for how close to synced the databases are:

              PRIMARY_LOG_FILE,PAGE,POS = S0000009.LOG, 1, 49262315
              STANDBY_LOG_FILE,PAGE,POS = S0000009.LOG, 1, 49262315
              HADR_LOG_GAP(bytes) = 0
              STANDBY_REPLAY_LOG_FILE,PAGE,POS = S0000009.LOG, 1, 49262315
Here is what I get when I run it on the primary database server.

User generated image
ASKER CERTIFIED SOLUTION
Avatar of Kent Olsen
Kent Olsen
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
The issue seems to have cleared up.  I am pretty sure it was catching up on the move tables commands from this weekend.  It is so frustrating to know exactly what I want to see but not know how to see it.  

Thank you so much for the help and your time!!!!

If you ever need SQL server help just ask. :)
Maybe we should introduce our bosses.  Maybe they'll let us swap responsibilities....  ;)