TIBCO Streaming application embedded performance monitoring

Products	Versions
TIBCO Streaming	10.x and later

Description

I have concerns there are performance problems with my application, but I cannot get approval to run 'sbprofile' in my production environment.

How can I monitor my application?

Resolution

The following steps add basic monitoring to your application. The events recorded are logged to the node engine log along with the normal logging from your application. This is a lightweight approach to adding performance monitoring to your application.

The steps are:
1. Copy the ready-made SBMonitorServer EventFlow module into your project.
2. Add configuration to run this module when your application runs.
3. Rebuild your Application project.
4. Deploy and run your application normally.
5. Review the engine log at any time to see if performance events are reported.

Detailed Steps

1. In SB Studio, use menu: File > Import Samples and Community Content..., browse for "StreamBase Monitor Adapter Sample" and load it into your workspace.

In your project's 'src/main/eventflow' folder, create the new package "com.sample.adapter.sbmonitor".

Copy the 'SBMonitorServer.sbapp' module into this new package.

2. The configuration depends on whether your project is an EventFlow fragment or a LiveView fragment:

For EventFlow fragment projects:

In your EventFlowDeployment HOCON configuration in 'src/main/configurations', add the container "monitor" (example):

EventFlowDeployment = {
  containers = {
    monitor = {
      moduleName = "com.sample.adapter.sbmonitor.SBMonitorServer"
    }
  }
}

For LiveView fragment projects:

In the 'src/main/liveview' folder, create a file "SBMonitorServer.lvconf" with the following contents:

<?xml version="1.0" encoding="UTF-8"?>
<liveview-configuration xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
  xsi:noNamespaceSchemaLocation="http://www.streambase.com/schemas/lvconf/">
  <application filename="com.sample.adapter.sbmonitor.SBMonitorServer" id="SBMonitorServer"/>
</liveview-configuration>

3. Rebuild your Application project (example: mvn clean package -DskipTests).
4. Deploy and run your application normally (example: epadmin install node nodename=A.X application=App-0.0.1-ep-application.zip).

5. Review the engine log.

While the node is running, use the command:
epadmin --servicename=A.X tail logging

After the application is stopped, obtain the engine logs using:
epadmin --servicename=A.X create snapshot
Extract the snapshot ZIP file and review the log files in folder 'logs/'.
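If you prefer to check which log files a snapshot contains before extracting it, the following is a minimal Python sketch. It assumes only that the log files sit somewhere under a folder named 'logs' inside the archive; the function name list_snapshot_logs and any snapshot file name you pass to it are illustrative, not part of the product.

```python
import zipfile
from pathlib import PurePosixPath


def list_snapshot_logs(zip_path):
    """Return the archive members that live under a folder named 'logs'.

    Assumption: the snapshot may wrap its content in a top-level folder,
    so 'logs' is accepted at any depth of the member path.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            info.filename
            for info in archive.infolist()
            if not info.is_dir()
            and "logs" in PurePosixPath(info.filename).parts[:-1]
        )
```

Call list_snapshot_logs with the path of the ZIP file produced by the snapshot command, then open the listed files with any text viewer.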

In the engine log files look for logged events from loggers:

GapDetectLog - reports if no update is received from the server for GapThreshold seconds (default: 2 seconds).
GCPauseLog - reports if JVM garbage collection takes more than GCThresdholdMSPerSec milliseconds (default: 10) or occurs more than GCThresdholdCountPerSec times (default: 200) in a one-second sampling period.
OperatorBusyLog - reports if an operator takes more than OperatorCPUThresholdMS milliseconds (default: 300) of CPU time in a single one-second sampling period.
QueueBusyLog - reports if a queue holds more than QueueSizeThreshold tuples (default: 1000).
StateChangeLog - reports Monitor INITIALIZING, CONNECTED, and DISCONNECTED events.
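To get a quick summary of a long engine log, you can count how many lines mention each of the loggers listed above. This is a rough Python sketch that assumes the logger name appears verbatim somewhere in each logged line; it makes no other assumption about the log line layout. The function name count_monitor_events is illustrative.

```python
from collections import Counter

# Logger names taken from the list above.
LOGGER_NAMES = (
    "GapDetectLog",
    "GCPauseLog",
    "OperatorBusyLog",
    "QueueBusyLog",
    "StateChangeLog",
)


def count_monitor_events(lines):
    """Count the lines that mention each monitor logger name.

    Each line is attributed to the first logger name found in it.
    """
    counts = Counter()
    for line in lines:
        for name in LOGGER_NAMES:
            if name in line:
                counts[name] += 1
                break
    return counts
```

Pass it an open log file, for example count_monitor_events(open(path, errors="replace")), where path points to one of the engine log files. A steadily rising OperatorBusyLog or QueueBusyLog count is the cue to read those entries in detail.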

These threshold values are stored as Module Parameters in "SBMonitorServer.sbapp" and may be raised or lowered to suit your application.

As implemented, the ThreadBusy and LowMemory output streams do not have an associated logger. If these are of interest, add loggers in the same way as for the other performance monitor flows. These events are:

ThreadBusy - reports when CPU time exceeds ThreadCPUThresholdMS milliseconds (default: 800).
LowMemory - reports when heap memory use exceeds PercentUsedMemoryThreshold (default: 90%).

NOTE: These thresholds assume the regular reporting period of one report each second. If this period is changed through the engineMonitor statsFrequencyMilliseconds value in the StreamBaseEngine HOCON configuration, the SBMonitorServer will report incorrectly.
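The arithmetic behind this note can be sketched in a few lines of Python. A millisecond threshold only has a fixed meaning relative to the length of the sampling period, so the same number represents a different share of CPU when the period changes. The function names below are illustrative, and rescaling a threshold in proportion to the period is an assumption about how one might compensate, not documented product behavior.

```python
def busy_fraction(threshold_ms, period_ms=1000):
    """Share of the sampling period that the threshold represents."""
    return threshold_ms / period_ms


def rescaled_threshold(threshold_ms, new_period_ms, base_period_ms=1000):
    """Threshold that keeps the same share for a different period (assumed remedy)."""
    return threshold_ms * new_period_ms / base_period_ms


# OperatorCPUThresholdMS default (300) with the one-second period: 30% busy.
print(busy_fraction(300))             # 0.3
# The same 300 ms over a two-second period would mean only 15% busy.
print(busy_fraction(300, 2000))       # 0.15
# Keeping the 30% meaning over two seconds would need a 600 ms threshold.
print(rescaled_threshold(300, 2000))  # 600.0
```

The simplest course is to leave the reporting period at one second so that the default thresholds keep their intended meaning.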

Interpreting Log Messages

Performance problems first appear when the OperatorBusyLog reports operators using 1000 milliseconds of CPU time per second (constantly busy), or when the QueueBusyLog repeatedly reports the same queues with tuple counts that are high and keep growing instead of draining. Both conditions indicate that a portion of the application has too few machine resources to handle the required input data rate. The resources that may be exhausted or slow are CPU, memory, disk, network, and external services. Other errors tend to show up as side effects of operator and queue problems. The right change to improve performance is specific to each situation.
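The two warning signs described above can be expressed as simple checks over a series of per-second readings that you have collected from the log by hand or by script. This Python sketch is illustrative only: the function names, the five-sample window, and the 50 ms tolerance are assumptions chosen for the example, and extracting the numbers from log lines is left out because it depends on your log layout.

```python
def is_sustained_growth(queue_depths, min_samples=5):
    """True when each of the last min_samples readings exceeds the one before it."""
    if len(queue_depths) < min_samples:
        return False
    recent = queue_depths[-min_samples:]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


def is_constantly_busy(cpu_ms_readings, period_ms=1000, tolerance_ms=50, min_samples=5):
    """True when the last min_samples readings all sit near the full period."""
    if len(cpu_ms_readings) < min_samples:
        return False
    recent = cpu_ms_readings[-min_samples:]
    return all(reading >= period_ms - tolerance_ms for reading in recent)


# A queue that keeps filling, and one that rises and drains again.
print(is_sustained_growth([1200, 1500, 1900, 2400, 3100]))  # True
print(is_sustained_growth([1200, 1100, 1300, 1250, 1400]))  # False
# An operator pinned near 1000 ms of CPU time in every period.
print(is_constantly_busy([990, 1000, 985, 1000, 998]))      # True
```

A queue that spikes and then drains is usually a burst the application absorbed; a queue that only grows, or an operator that never drops below the full period, points to the shortage of resources described above.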

