Streaming node start fails with error DSEBuiltin::ExceptionResourceUnavailable


Article ID: KB0072486

Products Versions
TIBCO Streaming 10.x and later

Description

The 'epadmin install node' command succeeds, but the 'epadmin start node' command fails with error:
Unable to retrieve isEngineRunning from the node: Cannot send request to the administration engine of node node.cluster because the administration engine 
is not running. If the administration engine shutdown unexpectedly then the node needs to be re-installed. The node's administrator can refer to the node's 
log files for additional information.
DSEException::updateState: FATAL: recursive exception found, exception previously thrown 'DSEBuiltin::ExceptionResourceUnavailable', exception currently 
thrown 'AdminFailed'
utl_dump_stack: Start of stack trace for pid 63788, tid 63788
#0 epadmin:utl_dump_stack
#1 epadmin:DSEException::updateState(DSEException::ExceptionState)
#2 epadmin:NodePlugin::executeSwitchadminCommand(SWString const&, SWString const&, SWHeapStrList*, SwitchadminNet::ResultType*, SWString*, 
  switchadminprotocol::TabularData*)
#3 epadmin:NodePlugin::checkNodeCommandAccess(SWString const&)
#4 epadmin:NodePlugin::doStop()
#5 epadmin:NodePlugin::start()
#6 epadmin:NodeCommand::execute()
#7 epadmin:Command::start()
#8 epadmin:CommandExecutor::startSynchronous()
#9 epadmin:CommandExecutionManager::start()
#10 epadmin:SwitchadminClient::executeNodePluginCommand()
#11 epadmin:SwitchadminClient::doIt(int, char**)
#12 epadmin:main
#13 /lib64/libc.so.6:__libc_start_main
#14 epadmin() [0x481e29]

Issue/Introduction

Identify the underlying error and the steps to resolve TCP port conflicts.

Resolution

The "DSEBuiltin::ExceptionResourceUnavailable" exception is a generic startup error reported when a port conflict, lack of memory, or permissions problem occurs at the JVM or platform level. To find the specific error, open the file:
  {node}/logs/bootstrap/*.err
and the other log files in the bootstrap/ folder in a text editor. Here '{node}' is the node directory created by 'epadmin install node'. The most common cause is a TCP port conflict, but other causes are possible. This article discusses only how to resolve port conflicts; consult other references for the other causes.
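As a minimal sketch, the following shell function prints every non-empty .err file under a node's bootstrap log folder so the specific startup error can be read in one place. The function name is hypothetical and not part of the product; pass it the node directory that 'epadmin install node' created.

```shell
# Hypothetical helper (not part of TIBCO Streaming): print the contents of
# each non-empty *.err file under {node}/logs/bootstrap.
show_bootstrap_errors() {
  node_dir="$1"
  for f in "$node_dir"/logs/bootstrap/*.err; do
    # Skip empty files, and the unexpanded pattern when nothing matches.
    [ -s "$f" ] || continue
    echo "== $f"
    cat "$f"
  done
}

# Usage: show_bootstrap_errors /path/to/node-directory
```

Empty files are skipped because only the files that actually captured output are of interest here.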

Port Conflicts

If the node cannot acquire a required port, the usual cause is that multiple nodes were started on the same system with similar configuration files, and an earlier node has already acquired a TCP port that a later node also needs. In the case that produced the error above, both nodes explicitly specified the same TCP port, and the first node started owned it.

Conflicts on ports defined in the StreamBaseEngine HOCON configuration result in this error. For example, you may see this configuration for JMX:
jvmArgs =
  [
    "-Dcom.sun.management.jmxremote.port=9999"
    "-Dcom.sun.management.jmxremote.ssl=false"
    "-Dcom.sun.management.jmxremote.authenticate=false"
  ]
In this case, when port 9999 is already in use, node startup fails with this error.
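Before starting the node, it can help to confirm whether the configured port already has a listener. The sketch below assumes a Linux host with the 'ss' utility and filters its listing for a given port. The function name 'port_listed' is hypothetical, and the column position assumes the default 'ss -ltn' layout, where the fourth column is the local address and port.

```shell
# Hypothetical helper: read an 'ss -ltn' listing on stdin and succeed
# (exit status 0) if any LISTEN socket is bound to the given local port.
# Assumes column 4 holds "Local Address:Port", as in default 'ss -ltn' output.
port_listed() {
  awk -v port="$1" '
    $1 == "LISTEN" {
      n = split($4, parts, ":")      # the port is the last colon-separated piece
      if (parts[n] == port) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}

# Usage (Linux): ss -ltn | port_listed 9999 && echo "port 9999 is already in use"
```

Adding the '-p' option to 'ss' (usually with elevated privileges) also shows the owning process, which helps identify the node that holds the port.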

Conflicts on ports defined in the NodeDeploy HOCON configuration result in a "bind" error during installation, not this exception at startup. Conflicts on ports defined in adapter settings or configuration result in adapter errors, not this exception.

To avoid a conflict, you can parameterize port values by using a substitution variable, as follows:
"-Dcom.sun.management.jmxremote.port=${JMX_PORT:-9999}"
Then install each application instance with its own value for the substitution, for example:
  epadmin --hostname={host} --adminport={port} install node --substitutions="JMX_PORT=9991"
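When several instances share one host, each needs a distinct JMX_PORT value. The following dry-run sketch only prints one install command per instance, reusing the options shown above, so the commands can be reviewed before they are run. The function name, the host 'localhost', the administration ports 5556 and 5557, and the base port 9990 are illustrative assumptions, not values from this article.

```shell
# Hypothetical dry-run helper: print one 'epadmin install node' command per
# administration port, giving each instance a different JMX_PORT substitution.
# Arguments: host, base JMX port, then one administration port per instance.
print_install_commands() {
  host="$1"
  base_jmx="$2"
  shift 2
  i=0
  for adminport in "$@"; do
    i=$((i + 1))
    printf '%s\n' "epadmin --hostname=$host --adminport=$adminport install node --substitutions=\"JMX_PORT=$((base_jmx + i))\""
  done
}

# Illustrative values only: two instances that receive JMX ports 9991 and 9992.
print_install_commands localhost 9990 5556 5557
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; remove the dry run only after confirming the options match your environment.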