TIBCO BusinessEvents Agent hangs and Cache servers report "java.lang.IllegalStateException: Service is not running" errors.

Article ID: KB0090592

Product: TIBCO BusinessEvents Enterprise Edition
Version: Not Applicable

Description:
===========

The TIBCO BusinessEvents Agent hangs and the Cache servers report "java.lang.IllegalStateException: Service is not running" errors.

Environment:
===========
TIBCO BusinessEvents 4.x
All Operating Systems

Symptoms:
=========
The Inference Agent hangs, and the CacheServer log includes the following errors and warnings:

AgentMgr.Thread.1] - [root] java.lang.IllegalStateException: Service is not running: ReplicatedCache{Name=ReplicatedCache, State=(SERVICE_STOPPED), Id=3, Version=3.0}

Warning [ReplicatedCache:EventDispatcher] - [runtime.util] Thread pool [CacheCluster] is overloaded and the job submissions are facing repeated failures. Job [com.tibco.cep.runtime.service.om.coherence.cluster.CacheCluster$MemberService@4e04e8] has already been submitted [75] times and is still being rejected.
2012 Sep 22 06:01:05:275 GMT +8 PH-CS-02-1-CacheServer Error [AgentMgr.Thread.1] - [root] java.lang.IllegalStateException: Timeout waiting for catalog update

Cause:
======
Coherence misconfiguration: the service guardian timeout is too low for the cluster.

Resolution:
===========
Add the property tangosol.coherence.guard.timeout at the cluster level of the CDD and set it to a high value in milliseconds (e.g. 360000).

According to the Oracle Coherence documentation (http://coherence.oracle.com/display/COH35UG/service-guardian), the service-guardian/timeout-milliseconds value should be equal to or greater than the packet-delivery/timeout-milliseconds value (default 60000).
See: http://docs.oracle.com/cd/E14447_01/coh.330/coh33ug/packetdelivery.htm

To increase the packet delivery timeout, add the following CDD property:

be.coherence.packet-publisher.packet-delivery.timeout=360000
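
The relationship between the two timeouts (guard timeout equal to or greater than the packet delivery timeout) can be checked before deployment. The following is a minimal sketch only, not part of TIBCO BusinessEvents or Coherence: the class GuardTimeoutCheck and its method are hypothetical, and it assumes the two CDD properties have already been loaded into a java.util.Properties object. Only the property names and the 60000 default come from this article.

```java
import java.util.Properties;

// Hypothetical helper (not a TIBCO or Oracle API): checks that the guard
// timeout is at least as large as the packet delivery timeout.
public class GuardTimeoutCheck {
    static final String GUARD_KEY = "tangosol.coherence.guard.timeout";
    static final String DELIVERY_KEY =
            "be.coherence.packet-publisher.packet-delivery.timeout";
    // Packet delivery default as quoted in this article.
    static final long DEFAULT_DELIVERY_MS = 60000L;

    // Returns true when the guard timeout is set and is >= the packet
    // delivery timeout. Assumption of this sketch: an unset guard timeout
    // is reported as inconsistent, because the article asks for it to be
    // set explicitly.
    static boolean isConsistent(Properties props) {
        String guard = props.getProperty(GUARD_KEY);
        if (guard == null) {
            return false;
        }
        String delivery = props.getProperty(DELIVERY_KEY);
        long deliveryMs = (delivery == null)
                ? DEFAULT_DELIVERY_MS
                : Long.parseLong(delivery.trim());
        return Long.parseLong(guard.trim()) >= deliveryMs;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Values recommended in this article.
        props.setProperty(GUARD_KEY, "360000");
        props.setProperty(DELIVERY_KEY, "360000");
        System.out.println("consistent: " + isConsistent(props));
    }
}
```

A guard timeout of 0 is reported as inconsistent by this check, which matches the advice to use a large value rather than disabling guarding.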

Note: In Coherence version 3.6.0.0, disabling guarding by setting tangosol.coherence.guard.timeout=0 does not work. Use a large timeout value instead to avoid the error.
