Domain Hawkagent OutOfMemory causes the Businessworks/Adapter processes to stop processing.

Article ID: KB0099959

Updated On:

Products Versions
TIBCO Runtime Agent (TRA) -
Not Applicable -

Description

Description:

When the Domain Hawkagent runs into an OutOfMemoryError, BW/Adapter applications that were started from the Admin GUI or AppManage after the Hawkagent was started go into a hung state.

Entry in the Domain Hawkagent hawk.log file:

==========================

YYYY MON DD hh:mm:ss:xxx GMT +1 HawkAgent.<DomainName> Error [Application] HWKXXX-000018 In LFMicroAgent exception from scan thread: java.lang.OutOfMemoryError: Java heap space

==========================



Entry in the Domain Hawkagent tsm.log file:
==========================
YYYY MON DD hh:mm:ss:xxx GMT +1 tsm Error [] [TRA-002042] MachineStatusHandler:timeElapsed() failed: java.lang.OutOfMemoryError: Java heap space.COM.TIBCO.hawk.talon.MicroAgentException: java.lang.OutOfMemoryError 
==========================


The BW application logs show no further jobs being processed. The engines are in a hung state.
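
To check whether a given hawk.log or tsm.log contains entries like the ones above, the file can be scanned for the `java.lang.OutOfMemoryError` string. The following is a minimal sketch, not a TIBCO utility: the class name `OomLogScan` and the method `findOomLines` are hypothetical, and the log path is whatever applies in your environment.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists the lines of a log file that mention an OutOfMemoryError.
public class OomLogScan {

    // The string that appears in both the hawk.log and tsm.log entries shown above.
    static final String OOM_MARKER = "java.lang.OutOfMemoryError";

    static List<String> findOomLines(Path logFile) throws IOException {
        // ISO-8859-1 maps every byte to a character, so an unexpected
        // encoding in the log cannot abort the scan with a decoding error.
        try (Stream<String> lines = Files.lines(logFile, StandardCharsets.ISO_8859_1)) {
            return lines.filter(line -> line.contains(OOM_MARKER))
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java OomLogScan <path-to-log-file>");
            return;
        }
        findOomLines(Paths.get(args[0])).forEach(System.out::println);
    }
}
```

Any matching line indicates that the Hawkagent hit an OOM at the logged time, and that engines started through it may have stopped processing since then.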



Symptoms:
Some Businessworks/Adapter processes that were started using the Admin GUI or AppManage stop processing jobs or are blocked.
Cause:
The primary cause of the OutOfMemoryError in the Domain Hawkagent is a low memory setting in hawkagent_<DomainName>.tra. The BW/Adapter applications hang because of the way Java handles a subprocess that is spawned with Runtime.exec(). The following description is taken from the Java documentation at http://docs.oracle.com/javase/7/docs/api/java/lang/Process.html, with the corresponding TIBCO components added in parentheses:


============================

By default, the created subprocess (BW/Adapter) does not have its own terminal or console. All its standard I/O (i.e. stdin, stdout, stderr) operations will be redirected to the parent (Domain Hawkagent) process, where they can be accessed via the streams obtained using the methods getOutputStream(), getInputStream(), and getErrorStream(). The parent (Domain Hawkagent) process uses these streams to feed input to and get output from the subprocess (BW/Adapter). Because some native platforms only provide limited buffer size for standard input and output streams, failure to promptly write the input stream or read the output stream of the subprocess (BW/Adapter) may cause the subprocess (BW/Adapter) to block, or even deadlock.
============================

Here is the life cycle explained from the TRA Hawkagent perspective.

1). When Hawkagent spawns a BW/Adapter engine process using Java Runtime.exec() and nothing reads the engine's stdout/stderr, the engine eventually fills the limited output buffer and hangs.

2). When a BW/Adapter engine is started from the Admin GUI, its stdout/stderr is redirected to the Hawkagent's tsm.log.

3). When the TRA Hawkagent goes OOM, it stops writing the stdout/stderr of the engines started from the Admin GUI on that agent to tsm.log, so the BW/Adapter engines block or hang.

4). When the TRA Hawkagent is killed, the spawned engines become orphans, so the buffers are cleared and the BW/Adapter engines resume.
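
The blocking behavior in the steps above applies to any Java parent process, so it can be illustrated outside of TIBCO. The sketch below shows the usual way a parent avoids it: a dedicated thread keeps reading the child's output so the pipe buffer never fills. It is an illustration under stated assumptions, not the Hawkagent implementation. The class `DrainChildOutput` and the method `runAndDrain` are hypothetical names, and `java -version` only stands in for an engine process.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical example: a parent that keeps draining its child's output.
public class DrainChildOutput {

    static List<String> runAndDrain(List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so a single reader thread is enough.
        builder.redirectErrorStream(true);
        Process child = builder.start();

        List<String> lines = Collections.synchronizedList(new ArrayList<String>());
        // If this thread stopped reading (for example after an
        // OutOfMemoryError in the parent), a child that keeps writing
        // would block once the operating system pipe buffer is full.
        Thread drainer = new Thread(() -> {
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(child.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            } catch (IOException e) {
                lines.add("drain failed: " + e.getMessage());
            }
        });
        drainer.start();

        child.waitFor();  // wait for the child to exit
        drainer.join();   // then wait until all of its output has been read
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in child process: the JVM that is running this example.
        String javaBinary =
                Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        runAndDrain(Arrays.asList(javaBinary, "-version")).forEach(System.out::println);
    }
}
```

The sketch collects the output in memory only to keep the example short. A long-running parent would write each line to a log file instead, since accumulating child output in the heap is itself a way to run out of memory.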

See the Resolution section for recommendations on avoiding a BW/Adapter block or hang.

Resolution

If BW/Adapter applications are blocked or hung because of a Hawkagent OOM, the following is recommended.


For a permanent resolution:


1). Restart Hawkagent once a day.


2). Tune the initial heap and maximum heap values in hawkagent_<DomainName>.tra according to your environment.


============================

java.heap.size.initial=16m
java.heap.size.max=256M
============================


For a temporary workaround:


1). Whenever Hawkagent goes OOM, restart it manually right away.


Or 


2). Start the BW/Adapter application from the command line so that stderr and stdout go to the console and not to tsm.log.


Or


3). Try the suggestion from KB article 26809 and add the property (-XX:OnOutOfMemoryError) to hawkagent_<DomainName>.tra. This makes sure the Hawkagent is killed when it goes OOM, so critical BW applications continue to process jobs.


Or 


4). Create a Hawk rulebase to restart Hawkagent on OOM.
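
Workaround 2 helps because the engine's output no longer passes through a pipe that the Hawkagent has to read. The same idea can be sketched in plain Java: when the child's output is redirected to a file, the child writes to it directly and does not depend on the parent staying healthy. This is a generic illustration, not how TRA starts engines. The class `RedirectChildOutput`, the method `runWithLogFile`, and the temporary log file are hypothetical, and `java -version` again stands in for an engine.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical example: the child writes to a file, so the parent
// never has to read from a pipe.
public class RedirectChildOutput {

    static int runWithLogFile(List<String> command, File logFile)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true);  // send stderr to the same target as stdout
        builder.redirectOutput(ProcessBuilder.Redirect.appendTo(logFile));
        Process child = builder.start();
        return child.waitFor();             // exit code of the child
    }

    public static void main(String[] args) throws Exception {
        // Stand-in child process: the JVM that is running this example.
        String javaBinary =
                Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        File logFile = Files.createTempFile("engine-output", ".log").toFile();
        int exitCode = runWithLogFile(Arrays.asList(javaBinary, "-version"), logFile);
        System.out.println("exit code " + exitCode + ", output written to " + logFile);
    }
}
```

`ProcessBuilder.inheritIO()` is the closest equivalent of starting the application from a console: the child then shares the parent's own stdin, stdout and stderr.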

Issue/Introduction

Domain Hawkagent OutOfMemory causes the Businessworks/Adapter processes to stop processing.

Additional Information

http://docs.oracle.com/javase/7/docs/api/java/lang/Process.html