What could cause both instances to run as active (dual active instances) on Fault Tolerant EMSA appliances?
Article ID: KB0094423
Updated On:
Products: TIBCO Enterprise Message Service Appliance
Versions: Not Applicable
Description
Description: Both instances ran as active on Fault Tolerant EMSA appliances. What could cause this dual active situation?
Symptoms: N/A
Cause: The EMSA appliance uses the crosslink to send heartbeats and to replicate data between the EMS FT instances. However, the EMS server does not decide to go ACTIVE-STANDALONE based on the heartbeat alone. When it detects a replication failure, it contacts the monitoring agent. The monitoring agents on the two appliances exchange heartbeats every second over all three links: Mgmt-IP, EMS-IP and Crosslink-IP. Only when all three are lost is the peer appliance declared down, which causes a STANDBY instance to transition to STANDALONE.
We have seen a user's case in which nvOSd was restarted due to the EMS-IP and Crosslink-IP heartbeats being lost. This error does not affect the Mgmt-IP, which is independent of nvOSd. In this case, however, the Mgmt-IP was also disconnected at the same time, so all three physical links were missing at once, which led to dual active EMS instances.
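The decision rule described above can be illustrated with a small Python model. This is only a sketch of the logic as stated in this article, not the appliance's actual code: the names LinkStatus, peer_declared_down and standby_next_state are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LinkStatus:
    """Heartbeat state of the three links (True = heartbeats are received)."""
    mgmt_ip: bool
    ems_ip: bool
    crosslink_ip: bool


def peer_declared_down(links: LinkStatus) -> bool:
    # The peer is declared down only when heartbeats are lost on all three links.
    return not (links.mgmt_ip or links.ems_ip or links.crosslink_ip)


def standby_next_state(replication_failed: bool, links: LinkStatus) -> str:
    # The monitoring agent is consulted only after a replication failure;
    # a lost heartbeat on its own does not trigger the transition.
    if replication_failed and peer_declared_down(links):
        return "STANDALONE"
    return "STANDBY"
```

In the reported case, the EMS-IP and Crosslink-IP heartbeats were lost and the Mgmt-IP happened to be disconnected as well, which corresponds to LinkStatus(False, False, False). If the peer is in fact still running, this model moves the STANDBY to STANDALONE and both instances end up active.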
Resolution
In the support menu, call-peer <ip> can be used to test the local connectivity between appliances, independent of the EMS gateway setting. Run it from one appliance's CLI to test the connectivity with its peer.
Make sure the cable is plugged in and in good condition. Then log in to the backend of the appliance and run "ifconfig" to check that all the IPs are up.
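If the "ifconfig" output is saved to a file or pasted into a script, the check can be automated. The following Python sketch assumes the common net-tools output formats ("inet 192.0.2.10" or "inet addr:192.0.2.10"). The function names and the example addresses are hypothetical and must be replaced with the appliance's real Mgmt-IP, EMS-IP and Crosslink-IP.

```python
import re

# Matches IPv4 addresses in both net-tools layouts:
#   "inet 192.0.2.10  netmask ..."  and  "inet addr:192.0.2.10  Bcast:..."
INET_RE = re.compile(r"\binet (?:addr:)?(\d{1,3}(?:\.\d{1,3}){3})")


def configured_ips(ifconfig_text):
    """Return the set of IPv4 addresses that appear in ifconfig output."""
    return set(INET_RE.findall(ifconfig_text))


def missing_ips(ifconfig_text, expected):
    """Return the expected addresses that do not appear in the output."""
    present = configured_ips(ifconfig_text)
    return [ip for ip in expected if ip not in present]
```

Pass the captured "ifconfig" text together with the three expected addresses. An empty result means all three addresses are configured on interfaces that ifconfig lists. It does not prove that the peer is reachable, so the cable check and call-peer <ip> are still needed.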