Why TIBCO Spotfire Node Manager goes offline with communication errors

Article ID: KB0077065

Products: Spotfire Server
Versions: All Versions

Description

By default, TIBCO Spotfire Server waits 12,000 milliseconds (12 seconds) for a Node Manager to signal its presence.
If the Node Manager does not send a signal within the configured time period, TIBCO Spotfire Server assumes that the node is offline.

If network issues or heavy load cause frequent, brief (millisecond-scale) communication interruptions between TIBCO Spotfire Server and Node Manager, you can raise this value to avoid unnecessarily restarting a node manager.
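The timeout rule described above can be illustrated with a minimal sketch. This is only a model of the behavior stated in this article, not TIBCO Spotfire Server's actual implementation: the function name is hypothetical, the 30,000 ms value is an arbitrary example of a raised threshold, and the separate "missing count" that appears in the server log is not modeled.

```python
DEFAULT_THRESHOLD_MS = 12_000  # default wait stated in this article (12 seconds)


def is_considered_offline(last_signal_ms: int, now_ms: int,
                          threshold_ms: int = DEFAULT_THRESHOLD_MS) -> bool:
    """Return True when the time since the last signal exceeds the threshold.

    Hypothetical helper for illustration only.
    """
    return (now_ms - last_signal_ms) > threshold_ms


# A node whose last signal arrived 12.5 seconds ago is flagged with the
# default threshold, but not with an assumed raised threshold of 30 seconds.
print(is_considered_offline(last_signal_ms=0, now_ms=12_500))
print(is_considered_offline(last_signal_ms=0, now_ms=12_500, threshold_ms=30_000))
```

With the default value, a delay only slightly longer than 12 seconds is enough to flag the node, which is why short network interruptions or load spikes can take a node offline.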

The following log snippet shows the communication problems recorded when the time since the node's last signal exceeds the 12,000-millisecond threshold.
WARN 2019-09-30T09:45:07,659-0400 [nodemanager@SPOTFIRESYSTEM, #3, #90438] server.http.HttpCallable: Failed to send request to 'https://WIN-MGrtyV45HGDI:9446/spotfire/remote/message' : I/O error on POST request for "https://WIN-MGrtyV45HGDI:9446/spotfire/remote/message": Connect to WIN-MGrtyV45HGDI:9446 [WIN-MGrtyV45HGDI/10.97.37.148, WIN-MGrtyV45HGDI/fe80:0:0:0:850:111c:da5f:cb26%12] failed: Connection refused: connect; nested exception is org.apache.http.conn.HttpHostConnectException: Connect to WIN-MGrtyV45HGDI:9446 [WIN-MGrtyV45HGDI/10.97.37.148, WIN-MGrtyV45HGDI/fe80:0:0:0:850:111c:da5f:cb26%12] failed: Connection refused: connect
..................................
..................................
DEBUG 2019-09-30T09:48:21,841-0400 [service-inspector-33391] nodemanager.control.ServiceInspector: Node Node [serverId=340e0ec9-9c06-4ecc-a2b3-ae939f45321d, serverName=WIN-MGrtyV45HGDI, port=9446, siteID=5c2c7b84-e1f4-4187-9799-85a2a48f0ebc] missing for 9 intervals. Last seen 179 seconds ago.
INFO 2019-09-30T09:48:21,841-0400 [service-inspector-33391] nodemanager.control.ServiceInspector: 340e0ec9-9c06-4ecc-a2b3-ae939f45321d - WIN-MGrtyV45HGDI set as offline after missing 9/2.
INFO 2019-09-30T09:48:21,841-0400 [service-inspector-33391] nodemanager.control.ServiceInspector: Threshold 12000 or missing count 9/1000 exceeded for node 340e0ec9-9c06-4ecc-a2b3-ae939f45321d
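To check how often nodes are flagged, the ServiceInspector messages can be extracted from the server log. The sketch below is an assumption-laden example: the regular expressions are derived only from the lines shown above, other versions may word these messages differently, and the function name is hypothetical. The sample lines are shortened copies of the entries in the snippet.

```python
import re

# Patterns based solely on the ServiceInspector lines shown in this article.
MISSING_RE = re.compile(
    r"serverName=(?P<name>[^,\]]+).*?missing for (?P<intervals>\d+) intervals\. "
    r"Last seen (?P<seconds>\d+) seconds ago"
)
OFFLINE_RE = re.compile(
    r"ServiceInspector: (?P<id>[0-9a-f-]{36}) - (?P<name>\S+) set as offline"
)


def summarize(lines):
    """Collect 'missing' and 'offline' events from server log lines."""
    events = []
    for line in lines:
        m = MISSING_RE.search(line)
        if m:
            events.append(("missing", m["name"], int(m["intervals"]), int(m["seconds"])))
            continue
        m = OFFLINE_RE.search(line)
        if m:
            events.append(("offline", m["name"], m["id"]))
    return events


sample = [
    "DEBUG 2019-09-30T09:48:21,841-0400 [service-inspector-33391] "
    "nodemanager.control.ServiceInspector: Node Node "
    "[serverId=340e0ec9-9c06-4ecc-a2b3-ae939f45321d, serverName=WIN-MGrtyV45HGDI, "
    "port=9446, siteID=5c2c7b84-e1f4-4187-9799-85a2a48f0ebc] "
    "missing for 9 intervals. Last seen 179 seconds ago.",
    "INFO 2019-09-30T09:48:21,841-0400 [service-inspector-33391] "
    "nodemanager.control.ServiceInspector: "
    "340e0ec9-9c06-4ecc-a2b3-ae939f45321d - WIN-MGrtyV45HGDI set as offline after missing 9/2.",
]

for event in summarize(sample):
    print(event)
```

Counting these events per node and per hour can help show whether the problem is tied to specific hosts or to periods of heavy load.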

 

Issue/Introduction

Node Manager is marked offline after brief (millisecond-scale) communication interruptions with TIBCO Spotfire Server.

Resolution

Increase the wait timeout value by following the procedure in the document "Configure amount of time TIBCO Spotfire server waits for Node Manager to signal its presence".

Note: Applying this change is generally not recommended. Work with your IT team to resolve the underlying communication issues and improve connectivity. If the issue persists and you cannot determine a root cause, please contact TIBCO Spotfire support at https://support.tibco.com for further assistance.
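When investigating connectivity with your IT team, a basic TCP probe of the port named in the failing request can help separate network problems from load problems. The following is a minimal sketch and not a TIBCO tool: the function name is hypothetical, and the host and port to test are whatever appears in your own log (WIN-MGrtyV45HGDI and 9446 in the example above).

```python
import socket
import time


def probe(host: str, port: int, timeout_s: float = 3.0):
    """Attempt one TCP connection and return (reachable, elapsed_ms).

    Hypothetical helper; it checks only that the port accepts connections,
    not that the Spotfire service behind it is healthy.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            reachable = True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        reachable = False
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return reachable, elapsed_ms


# Example call, using the host and port from the log snippet in this article:
# reachable, elapsed_ms = probe("WIN-MGrtyV45HGDI", 9446)
```

Running the probe repeatedly from the machine that reported the failed request, and comparing the results with the timestamps of the "Connection refused" warnings, may indicate whether the port was unreachable or merely slow to respond.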

Additional Information

Doc: Changing how long the server waits before assuming that a node manager is offline.