TIBCO Streaming Cluster Disaster Recovery

Article ID: KB0079318

Products Versions
TIBCO Streaming -

Description

What are the considerations for provisioning a StreamBase 10 cluster for disaster recovery?

Issue/Introduction

Considerations for provisioning a StreamBase cluster for disaster recovery

Resolution

NOTE: For StreamBase 7, use disk-based query tables on a Storage Area Network (SAN) and manage replication in the SAN. No other state can be preserved within a StreamBase 7 system. A server may recover additional state from the message bus, the file system, databases, or other networked services, but this recovery is not automatic and requires some EventFlow development for the specific business use case.

StreamBase 10 preserves state within a cluster for any element configured to use Transactional Memory (TM), which is automatically replicated between networked nodes. Nodes in a cluster may be geographically remote from each other, provided they remain visible to each other across the internet. A node's local copy of the TM state is lost if the node is stopped abruptly, but as long as one node in the cluster remains functional, the state is preserved. When a node joins the cluster, it recovers state from the nodes that were not interrupted.
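
The replication behavior described above can be illustrated with a toy model. This is a minimal Python sketch, not StreamBase code: the Node and Cluster classes, their methods, and the node names are all hypothetical, and the sketch assumes every write is copied to every running node. It shows only the three properties stated above: an abrupt stop loses the local copy, the state survives while at least one node runs, and a joining node recovers state from a node that was not interrupted.

```python
class Node:
    """Hypothetical cluster node holding a local copy of replicated state."""

    def __init__(self, name):
        self.name = name
        self.running = False
        self.state = {}  # this node's local copy of the replicated state


class Cluster:
    """Toy model (an assumption): each write is replicated to all running nodes."""

    def __init__(self):
        self.nodes = {}

    def join(self, node):
        # A joining node recovers state from any other node that is still running.
        survivors = [n for n in self.nodes.values() if n.running and n is not node]
        node.state = dict(survivors[0].state) if survivors else {}
        node.running = True
        self.nodes[node.name] = node

    def put(self, key, value):
        # Replicate the write to every running node.
        for n in self.nodes.values():
            if n.running:
                n.state[key] = value

    def kill(self, name):
        # An abrupt stop loses the node's local copy of the state.
        node = self.nodes[name]
        node.running = False
        node.state = {}

    def get(self, key):
        # Read from the first running node; None if no node is running.
        for n in self.nodes.values():
            if n.running:
                return n.state.get(key)
        return None


# Two nodes in different sites (names are illustrative only).
cluster = Cluster()
site_a, site_b = Node("A.site1"), Node("B.site2")
cluster.join(site_a)
cluster.join(site_b)
cluster.put("order-42", "open")

cluster.kill("A.site1")                 # disaster at site 1
surviving_value = cluster.get("order-42")  # still served by site 2

cluster.join(site_a)                    # site 1 rejoins and recovers state
recovered = dict(site_a.state)
```

In the real product the replication and recovery are handled by the cluster itself; the sketch only makes the stated guarantees concrete.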

Configuration Considerations

 
  • The cluster needs to have at least two nodes located at a distance from each other so that a disaster affecting one will not affect the other. 
 
  • Put all short-term state that is to survive a disaster in Transactional Memory.
Preserved short-term state is typically the contents of Query Tables, but also includes the state of any other operator that remembers past tuples for future comparison or evaluation, such as the Aggregate, Gather, Join, and Pattern operators.
 
To put a Query Table in transactional memory, set the table's StreamBase Properties, Table Settings tab, Type = "In transactional memory".

To put the state of all operators in a given .sbapp in TM, open the .sbapp's Annotations tab and, under Memory Model Settings, set Storage Method = "In transactional memory". Within a single .sbapp, all operators that do not have their own TM setting must use the same Memory Model.
 
  • Put all long-term state into an external database with its own high-availability and disaster recovery features.
 
  • For LiveView tables that do not use TM, file-based, peer-based, and message-bus recovery are available within a cluster. The journal files used for file-based recovery may be on a SAN. The message bus may be distributed and have its own disaster recovery features.
In StreamBase Studio, see the sample projects "HA Tables - Fault-tolerant front end configuration" and "Recovery, EMS - Fault-tolerant recover using TIBCO EMS bus".
 
  • When no data distribution policy is defined, all nodes installed into a cluster share in the default dynamic data distribution policy as equal peers. The default policy places all nodes in a small set of partitions and replicates state to sets of three nodes, so that if any one node is terminated the entire state is still preserved. A node joining a cluster affects the replication of new state, and receives state from the other nodes so that the data distribution becomes fair.
The default dynamic data distribution policy is sufficient for most needs, but it may be customized extensively for performance-critical use cases. Typical customizations give certain nodes priority and keep more data local to improve local performance.
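
The idea of replicating each partition to a set of three nodes can be checked with a small Python sketch. This is not the product's placement algorithm: the functions assign_replicas and survives_loss, the round-robin placement, and the partition count of 8 are all assumptions chosen for illustration. The sketch only demonstrates why replica sets of three tolerate the loss of any single node.

```python
def assign_replicas(nodes, partition_count, replicas=3):
    """Map each partition to up to `replicas` consecutive nodes (round-robin).

    Hypothetical placement scheme used only to illustrate replica sets.
    """
    width = min(replicas, len(nodes))
    return {
        p: [nodes[(p + i) % len(nodes)] for i in range(width)]
        for p in range(partition_count)
    }


def survives_loss(assignment, lost_node):
    """Return True if every partition keeps at least one replica without lost_node."""
    return all(
        any(owner != lost_node for owner in owners)
        for owners in assignment.values()
    )


# Four equal peers, eight partitions (both numbers are illustrative).
nodes = ["A", "B", "C", "D"]
assignment = assign_replicas(nodes, partition_count=8)

# Losing any single node leaves every partition with a surviving replica.
all_single_losses_ok = all(survives_loss(assignment, n) for n in nodes)
```

A cluster with a single node has nowhere to replicate to, which is why the first consideration above calls for at least two nodes in separate locations.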

Additional Information

See the "Disaster recovery (DR) availability zone" diagram and description in the Product Documentation here:
  TIBCO Streaming > Concepts Guide > Concepts in Detail > High Availability in Depth