FOM-2.1 members fail to start.

Article ID: KB0093169

Updated On:

Products Versions
TIBCO Fulfillment Order Management -
Not Applicable -

Description

Description:
FOM members fail to start, even though some of them may still be running.
Symptoms:
A multi-instance FOM-2.1 environment does not start properly. ActiveSpaces does not report any member joining the manager's OMS metaspace. The OMS DOMAINMEMBERS table reports INIT status and the catalog is not synchronized. Member initialization appears to hang.

Log traces report: 
==============================
07 Aug 2014 23:22:19,275 ERROR [] [1-12773173679] [] [org.springframework.transaction.support.TransactionTemplate] [] - Application exception overridden by rollback exception
java.lang.ThreadDeath
at java.lang.Thread.stop(Thread.java:835)
at com.tibco.aff.oms.server.jms.orch.batch.impl.BatchProcessor.stop(BatchProcessor.java:758)
at com.tibco.aff.oms.server.jms.orch.listeners.InitNode.stop(InitNode.java:60)
==============================
Cause:
Two issues are involved: the first concerns checkpoint cleanup, the second concerns Orchestrator startup.

1). Checkpoints cleanup:
-----------------------
After processing, checkpoints related to finished orders should be cleaned out of the OMS DB. This is done by a periodic route called ORCHTimeBasedCleanup, which runs a delete query on the RESOURCEUPDATECK table. If this table has too many entries, the query performs poorly, which delays the cleanup or causes it to fail.

This issue is tracked in FOM as JIRA AF-4769, "Cleanup SQL not optimized".
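
To gauge whether the table has grown large, a row count is a reasonable first check. This is a minimal sketch, assuming direct SQL access to the OMS database schema. The alias checkpoint_rows is illustrative, and this article does not state how many entries are "too many".

```sql
-- Count the checkpoint entries currently held in the OMS DB.
-- The alias checkpoint_rows is illustrative only.
SELECT COUNT(*) AS checkpoint_rows
FROM RESOURCEUPDATECK;
```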

2). Orchestrator startup:
------------------------
The FOM-2.1 Orchestrator does not support uncleaned checkpoints at startup, so it stops automatically when it encounters them.

This issue is tracked in FOM as JIRA AF-4808, "Corrupted checkpoint and Null Body JMS Messages causes the Engine Not to restart".

Resolution

1). Workaround.
--------------
The RESOURCEUPDATECK table can be cleaned up manually to allow the Orchestrator to start. The following query can be used (shown here for the member whose node ID is 'Member1'):

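-- Remove checkpoints of cancelled or completed orders
-- whose router entry has nodeid 'Member1'.
DELETE FROM RESOURCEUPDATECK
WHERE orderid IN (SELECT A.orderid
    FROM ORDER_DETAILS A
    JOIN OMS_ROUTER_TBL B ON A.orderid = B.orderid
    WHERE A.status IN ('CANCELLED', 'COMPLETE')
    AND B.nodeid = 'Member1');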
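
Before running the delete, the same selection can be used to preview how many rows it would remove. This is a sketch that reuses only the tables, columns, and filter values from the query above. The alias rows_to_delete is illustrative, and 'Member1' stands for the node ID being cleaned.

```sql
-- Preview: count the checkpoint rows the cleanup query would delete.
-- Same tables and filters as the DELETE; nothing is modified.
SELECT COUNT(*) AS rows_to_delete
FROM RESOURCEUPDATECK
WHERE orderid IN (SELECT A.orderid
    FROM ORDER_DETAILS A
    JOIN OMS_ROUTER_TBL B ON A.orderid = B.orderid
    WHERE A.status IN ('CANCELLED', 'COMPLETE')
    AND B.nodeid = 'Member1');
```

Running the count again after the delete gives a quick confirmation that the matching rows are gone.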

The periodic checkpoint cleanup should also be disabled by setting autoStartup to false on the following route:

<route id="ORCHTimeBasedCleanup" autoStartup="false">

in the file webapps/WEB-INF/classes/spring/scorch-listeners-context.xml

2). Fixes.
---------
JIRA AF-4769, "Cleanup SQL not optimized", is fixed in FOM-2.1.1-HF1.
Checkpoints are now cleaned up asynchronously and more frequently (the period is configurable), which prevents uncleaned entries from accumulating.

JIRA AF-4808, "Corrupted checkpoint and Null Body JMS Messages causes the Engine Not to restart", is fixed in FOM-2.1.1-HF1. The Orchestrator now supports uncleaned checkpoints: it ignores them and logs them at startup.