How to rejoin DR backup machines to a cluster.

Article ID: KB0092452

Products: TIBCO Enterprise Message Service Appliance
Versions: Not Applicable

Description

Description:
This article explains how to rejoin disaster recovery (DR) backup machines to a cluster after they have been dropped from it.
Symptoms:
Error messages about communication errors, and unexpected DR states of AltSync, CommLost, Commwait, or DR_Down (generally shown in the output of the "info" command).
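
As a rough illustration of spotting these symptoms, the sketch below scans saved "info" output for the DR states listed above. It assumes the "DR Sync Service State - <state>" line format shown in the sample output later in this article; the function name is hypothetical, and state names are compared case-insensitively because their exact capitalization in real output is not confirmed here.

```python
from __future__ import annotations

import re

# DR states the article lists as symptoms. Compared case-insensitively,
# since the exact capitalization in real output is an assumption.
SYMPTOM_STATES = {"altsync", "commlost", "commwait", "dr_down"}

# Assumed line format, copied from the sample "info" output in this article.
STATE_LINE = re.compile(r"DR Sync Service State - (\S+)")


def find_symptom_states(info_text: str) -> list[str]:
    """Return each DR Sync Service state in the text that is a listed symptom."""
    states = STATE_LINE.findall(info_text)
    return [state for state in states if state.lower() in SYMPTOM_STATES]


if __name__ == "__main__":
    sample = (
        "DR Sync Service State - DR_Down\n"
        "DR Sync Service State - Waiting\n"
    )
    print(find_symptom_states(sample))  # ['DR_Down']
```

The script only reads text that has already been captured; it does not connect to or run anything on an appliance.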


Cause:
In some instances, backup machines can be dropped from a cluster. This can be triggered by configuration changes or by running commands such as dr-config or ems-config.

Resolution

Steps to rejoin DR machines to a cluster are listed below:

The main caveat for using the “rejoin” sequence below is that the Production-A machine must be the DR_Leader and DR must have “Active” status (not “Restore” or “Recover”). Both can be confirmed in the "info" command output:

********* HOST ROLE: Production-A,  DR_Leader: Active  (TIBCO-EMS-HI) *********
_____ INSTANCE 0 , ROLE = PRIMARY _____
EMS State - ACTIVE-REPLICATING : Accepting connections on tcp://10.160.5.21:7222
DR Sync Service State - DR_Down
ERROR: Sync Service Error, Sync Snapshots Disabled, Manual recovery required.
_____ INSTANCE 1 , ROLE = SECONDARY _____
EMS State - STANDBY-REPLICATING : Standing by for failover
DR Sync Service State - Waiting
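
The precondition above can be checked mechanically against saved "info" output. The following is a minimal sketch, assuming the banner line always has the format shown in the sample; the function name and regular expression are illustrative and not part of the appliance software.

```python
import re

# Assumed banner format, taken from the sample "info" output above:
# ********* HOST ROLE: Production-A,  DR_Leader: Active  (TIBCO-EMS-HI) *********
BANNER = re.compile(
    r"HOST ROLE:\s*(?P<role>[^,\s]+)\s*,\s*DR_Leader:\s*(?P<status>\w+)"
)


def rejoin_precondition_met(info_text):
    """Return True only if the host is Production-A and DR_Leader is Active."""
    match = BANNER.search(info_text)
    if match is None:
        # No recognizable banner: treat as "not safe to proceed".
        return False
    return (
        match.group("role") == "Production-A"
        and match.group("status") == "Active"
    )


if __name__ == "__main__":
    banner = (
        "********* HOST ROLE: Production-A,  DR_Leader: Active"
        "  (TIBCO-EMS-HI) *********"
    )
    print(rejoin_precondition_met(banner))  # True
```

A status of “Restore” or “Recover”, a different host role, or a missing banner all return False, so the check fails closed.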

STEPS:
1). Log on to Production-A EMSA.
Production-A CLI> key-accept

2). Log on to Production-B.
Production-B CLI> key-join

3). Log on to DR-A.
Backup-A CLI> key-join

4). Log on to DR-B.
Backup-B CLI> key-join

5). Log back on to Production-A.
Production-A CLI> key-accept off
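
The order of the steps above can also be written down as data, which is handy for generating a runbook checklist. This is only a sketch: it encodes the sequence exactly as listed in this article, the helper names are hypothetical, and it prints instructions rather than running anything on an appliance.

```python
# The rejoin sequence as (machine, command) pairs, in the order given above.
REJOIN_PLAN = [
    ("Production-A", "key-accept"),
    ("Production-B", "key-join"),
    ("DR-A", "key-join"),
    ("DR-B", "key-join"),
    ("Production-A", "key-accept off"),
]


def plan_is_well_formed(plan):
    """Check the shape of the sequence: key-accept first, key-accept off
    last on the same machine, and only key-join commands in between."""
    if len(plan) < 3:
        return False
    first_host, first_cmd = plan[0]
    last_host, last_cmd = plan[-1]
    opens = first_cmd == "key-accept"
    closes = last_cmd == "key-accept off" and last_host == first_host
    joins = all(cmd == "key-join" for _, cmd in plan[1:-1])
    return opens and closes and joins


def format_checklist(plan):
    """Render the plan as numbered, human-readable checklist lines."""
    return [
        f"{number}. On {host}, run: {command}"
        for number, (host, command) in enumerate(plan, start=1)
    ]


if __name__ == "__main__":
    if plan_is_well_formed(REJOIN_PLAN):
        print("\n".join(format_checklist(REJOIN_PLAN)))
```

Keeping the sequence as a list makes it easy to spot a missing final "key-accept off" before the checklist is handed to an operator.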

To verify the result, run the dr-test command on each machine and confirm that all machines are joined to the cluster.

NOTES:
There is no harm in running key-join again on an appliance that is already joined.
The dr-test command is available to support starting in version 2.2.1 to check cluster “joined” status. It will likely become a general “user command” and be documented in an upcoming release.