DB2 LUW HADR primary did not establish connection with standby within timeout and will shut down. BY FORCE option required to start primary without standby.

Hi

I am seeing the message below in the db2diag log and cannot start HADR.

MESSAGE : HADR primary did not establish connection with standby within timeout
          and will shut down. BY FORCE option required to start primary without
          standby. Timeout seconds =

Can you please share some inputs?
Prardhan N asked:

Tomas Helgi Johannsson commented:
Hi!


Check for HADR errors in the db2diag.log on both standby and primary.
Also check if you have configured the HADR parameters correctly.

In some cases, when HADR communication between the servers is in a strange state, I stop HADR on both the primary and the standby servers, then restart HADR in the correct order: first the standby, then the primary.
http://www-01.ibm.com/support/docview.wss?uid=swg21410648
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/DB2HADR/page/HADR%20Tutorial

Regards,
    Tomas Helgi
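
The restart order described above can be sketched as a small shell helper. This is only a dry-run sketch: it prints the commands instead of running them, SAMPLE is a placeholder database name, and stopping the standby before the primary is an assumption, not something stated in this thread.

```shell
# Print (do not run) an HADR stop/restart sequence for a database.
# The [standby]/[primary] tags say on which server each command belongs.
hadr_restart_plan() {
    db=$1
    # Stop phase (the stop order shown here is an assumption).
    echo "[standby] db2 deactivate db $db"
    echo "[standby] db2 stop hadr on db $db"
    echo "[primary] db2 stop hadr on db $db"
    # Start phase: the standby must be started before the primary.
    echo "[standby] db2 start hadr on db $db as standby"
    echo "[primary] db2 start hadr on db $db as primary"
}

hadr_restart_plan SAMPLE
```

Review the printed plan and run each command by hand on the server named in the tag.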

Prardhan N (Author) commented:
I stopped and restarted in the correct order, but I still have the same issue.

Prardhan N (Author) commented:
I am also seeing the following in the db2diag log:

RETCODE : ZRC=0x8280001A=-2105540582=HDR_ZRC_NO_STANDBY
          "Comm time-out in unforced HADR primary start, to avoid split-brain"

Prardhan N (Author) commented:
I am also seeing the following in the db2diag log:

 MESSAGE : HADR EDU sqlcode:
DATA #1 : Hexdump, 4 bytes
0x0780000003CD5C1C : FFFF F918
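
The four bytes in that hexdump can be read as a signed 32-bit SQLCODE. The sketch below assumes the bytes are shown in big-endian order (AIX on POWER is big-endian) and needs a shell with 64-bit arithmetic such as bash or ksh93; the helper name hex_to_sqlcode is made up for this example.

```shell
# Interpret a 4-byte big-endian hexdump as a signed 32-bit integer
# (two's complement). Spaces in the input are ignored.
hex_to_sqlcode() {
    hex=$(printf '%s' "$1" | tr -d ' ')
    val=$((0x$hex))
    # Values with the top bit set are negative in two's complement.
    if [ "$val" -ge 2147483648 ]; then
        val=$((val - 4294967296))
    fi
    echo "$val"
}

hex_to_sqlcode "FFFF F918"   # prints -1768
```

SQLCODE -1768 corresponds to message SQL1768N ("Unable to start HADR").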

Tomas Helgi Johannsson commented:
Hi!

Is the standby in rollforward pending mode or standard?
Issue
db2 get db cfg for yourdbname | grep "Rollforward pending"
It should say NO if the HADR standby is running correctly.

Check whether a firewall is blocking your HADR ports.
Which OS are you running DB2 on?

Regards,
     Tomas Helgi
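
A small helper can pull a single value out of the db cfg output instead of reading it by eye. This is a sketch only: cfg_value is a hypothetical name, and it simply takes the text after the first "=" on the first matching line.

```shell
# Read "db2 get db cfg" output on stdin and print the value of the first
# line containing the given label or parameter name.
cfg_value() {
    grep -F -- "$1" | head -n 1 |
        sed 's/^[^=]*=[[:space:]]*//; s/[[:space:]]*$//'
}

# Demo on a captured line; on a live system the input would come from
#   db2 get db cfg for yourdbname | cfg_value "Rollforward pending"
printf ' Rollforward pending                                     = NO\n' |
    cfg_value "Rollforward pending"
```

The same helper works for the HADR parameters, for example cfg_value HADR_TIMEOUT.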

Prardhan N (Author) commented:
I am using AIX

Prardhan N (Author) commented:
Yes, on the standby DB:

 Rollforward pending                                     = NO

Tomas Helgi Johannsson commented:
Hi!

Have you checked whether a firewall is blocking the HADR ports between the primary and the standby?
Also, what is your HADR_TIMEOUT value?
db2 get db cfg for yourdbname | grep "HADR_TIMEOUT"

Regards,
    Tomas Helgi

Prardhan N (Author) commented:
HADR timeout value                       (HADR_TIMEOUT) = 120
 HADR log write synchronization mode     (HADR_SYNCMODE) = NEARSYNC
 HADR peer window duration (seconds)  (HADR_PEER_WINDOW) = 300

I could not check the port availability

Prardhan N (Author) commented:
$ netstat -an | grep 50051
$ netstat -an | grep db2c_db2inst1_hadr

I am not getting any output on the primary or the standby.

Prardhan N (Author) commented:
Does this mean it is a port issue?

Tomas Helgi Johannsson commented:
Hi!

You could also try stopping HADR on both servers, then issuing db2stop/db2start on the standby and restarting HADR.
If your HADR configuration is correct and this doesn't work, then the firewall is the most likely cause.
What version is your DB2?
Regards,
    Tomas Helgi

Prardhan N (Author) commented:
Version 9.7

Prardhan N (Author) commented:
I restarted HADR and checked; it is not starting.

SQL1768N  Unable to start HADR. Reason code = "7".

I get the above error when starting HADR on the primary.

Can you help me check for a firewall issue, and tell me how to check whether a port is free or not?

$ netstat -an | grep 50051
$ netstat -an | grep db2c_db2inst1_hadr

These commands do not return any output. I am using them to check whether a port is free.

Please correct me if I am wrong.

Tomas Helgi Johannsson commented:
Hi!

Please check this out: https://www-304.ibm.com/support/docview.wss?uid=swg21460503
Post the HADR configuration for both the primary and the standby ( db2 get db cfg for yourdbname | grep HADR ).
Use the ping and telnet commands on both servers to find out whether the ports are open and whether the servers can talk to each other on the HADR ports.

Regards,
    Tomas Helgi
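
On the earlier netstat checks: netstat -an prints numeric ports, so grepping for the service name db2c_db2inst1_hadr will not match even when the port is in use. As far as I understand, an HADR port normally has a listener only while HADR is started on that database, so empty output on a stopped pair is not by itself proof of a firewall problem. The sketch below resolves a service name to its port and then looks for a listener in netstat output; service_port and listening_on are hypothetical helper names.

```shell
# Look up the port number for a service name in an /etc/services-style
# file (second argument, defaults to /etc/services).
service_port() {
    awk -v svc="$1" '$1 == svc { split($2, a, "/"); print a[1]; exit }' \
        "${2:-/etc/services}"
}

# Read "netstat -an" output on stdin; succeed if a socket is listening on
# the given port. AIX prints "*.50051", Linux prints "0.0.0.0:50051".
listening_on() {
    awk -v port="$1" '
        $NF == "LISTEN" { n = split($4, a, "[.:]"); if (a[n] == port) found = 1 }
        END { exit !found }'
}

# Demo on a captured AIX-style line; on a live system the check would be
#   netstat -an | listening_on "$(service_port db2c_db2inst1_hadr)"
printf 'tcp4       0      0  *.50051      *.*      LISTEN\n' |
    listening_on 50051 && echo "listener on 50051"
```

A listener on the standby plus a successful telnet from the primary to that port covers both the local and the network side of the check.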

Prardhan N (Author) commented:
The configuration parameters are good.
It was working previously; today it is not.

Is this command correct?

telnet Ipaddress 50050

Tomas Helgi Johannsson commented:
Hi!

Yes, this command is correct.

Regards,
     Tomas Helgi

Prardhan N (Author) commented:
The ports are good. How can I check the latest transaction log file processed or replayed on the standby DB?

Do I need to check the archive log path on the standby server?

Tomas Helgi Johannsson commented:
Hi!

The command
db2pd -db <database_name> -hadr
will give you the status of HADR and the current log on the standby/primary.
db2 get db cfg for yourdbname | grep "First active log file" will also show the current first active log file on the standby/primary.
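
To compare the log file names reported on the two servers, the sequence numbers can be subtracted. This is a rough sketch that assumes the usual SNNNNNNN.LOG naming; log_seq and log_gap are hypothetical helper names, and the result is only an indication of how far apart the two sides are.

```shell
# S0000123.LOG -> 123 (strip the S prefix, leading zeros and .LOG suffix).
log_seq() {
    n=$(printf '%s' "$1" | sed 's/^S0*//; s/\.LOG$//')
    echo "${n:-0}"
}

# Number of log files between the primary's and the standby's log file.
log_gap() {
    echo $(( $(log_seq "$1") - $(log_seq "$2") ))
}

log_gap S0000125.LOG S0000123.LOG   # prints 2
```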

Also, what Fix Pack level is your DB2 at?
Here is the Fix Pack list for 9.7. Take a look at the HADR fixes to see if any of them match your case.
http://www-01.ibm.com/support/docview.wss?uid=swg21412438

Have you gone through the list at
https://www-304.ibm.com/support/docview.wss?uid=swg21460503 ?

When you stopped HADR on the standby, did you issue
db2 deactivate db <yourdbname>
before
db2 STOP HADR ON db <yourdbname> ?

You said that the standby had "Rollforward pending = NO" and that HADR is stopped.
If you stopped HADR correctly on the standby, it should read "Rollforward pending = DATABASE".
Try issuing db2 deactivate db <yourdbname>. If that fails, then
I think the best way to solve this is to take a new backup on the primary and restore it on the standby without rolling forward.
Then you should be able to start HADR without problems.
I ran into a similar problem years ago: the standby database went into an active state for some reason, but luckily no logs were produced on the standby, so no harm was done.

Regards,
     Tomas Helgi
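
The backup-and-restore route suggested above can be sketched as a dry-run plan. Everything specific here is an assumption for illustration: SAMPLE, the /backup/hadr directory and the timestamp are placeholders, an offline backup is shown because the primary is down, and the helper only prints commands for review.

```shell
# Print (do not run) a plan for re-seeding the standby from a fresh
# backup of the primary. Arguments: database name, backup directory,
# backup image timestamp.
standby_reinit_plan() {
    db=$1; dir=$2; ts=$3
    echo "[primary] db2 backup db $db to $dir"
    # Copy the backup image to the same directory on the standby here.
    echo "[standby] db2 restore db $db from $dir taken at $ts"
    # No ROLLFORWARD after the restore: the standby has to stay in
    # rollforward pending state for HADR to start.
    echo "[standby] db2 start hadr on db $db as standby"
    echo "[primary] db2 start hadr on db $db as primary"
}

standby_reinit_plan SAMPLE /backup/hadr 20240101120000
```

After the restore it is worth rechecking the HADR settings on the standby (db2 get db cfg for yourdbname | grep HADR) before starting it.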

Prardhan N (Author) commented:
Thanks Tomas for your help.