Tracking EMS performance issue on Red Hat Enterprise Linux using SystemTap.

Article ID: KB0079118

Product: TIBCO Enterprise Message Service
Versions: Not Applicable

Description

KB 43188 introduced the use of DTrace to track EMS performance issues related to I/O system calls on the Solaris platform. This article covers the same topic for Linux, specifically Red Hat Enterprise Linux (RHEL).

SystemTap plays a role on Linux similar to that of DTrace on Solaris. In certain cases, users have reported sporadic latency in EMS whose root cause is hard to determine, mainly because it occurs unpredictably. strace can be used to determine the specific cause. Because SystemTap is lightweight, a SystemTap script can capture the system calls made by EMS without producing a large log file or compromising EMS performance.

Issue/Introduction

Tracking EMS performance issue on Red Hat Enterprise Linux using SystemTap.

Environment

Product: TIBCO Enterprise Message Service
Version:
OS: RHEL 6.x, 7.x

Resolution

Two SystemTap scripts are attached.


1). latency.stp


This script captures the read, write, fdatasync, ftruncate, and stat system calls made by EMS and calculates the time elapsed between entry to and return from each call. For every call that takes longer than the given threshold, the script prints the current time (in epoch seconds), the syscall name, and the number of milliseconds the call took.
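
The attached script itself is not reproduced in this article. As an illustration only, the entry/return timing logic described above can be modelled in a few lines of Python. The class and method names here are hypothetical, and keying the entry timestamps by thread ID is an assumption about how concurrent calls would be told apart, not a statement about the attached script.

```python
class SyscallTimer:
    """Pairs syscall entry and return events per thread and flags slow calls."""

    def __init__(self, threshold_ms):
        self.threshold_ms = threshold_ms
        self.entry_ms = {}  # thread id -> entry timestamp in milliseconds

    def on_entry(self, tid, now_ms):
        # Remember when this thread entered the syscall.
        self.entry_ms[tid] = now_ms

    def on_return(self, tid, name, now_ms):
        # Look up (and forget) the matching entry timestamp for this thread.
        start = self.entry_ms.pop(tid, None)
        if start is None:
            return None  # return seen without a recorded entry
        elapsed = now_ms - start
        if elapsed > self.threshold_ms:
            # Same layout as the log line shown in this article.
            return "[%d] %s blocked for %dms" % (now_ms // 1000, name, elapsed)
        return None


timer = SyscallTimer(threshold_ms=2000)
timer.on_entry(tid=1, now_ms=1459538674332)
print(timer.on_return(tid=1, name="write", now_ms=1459538687000))
```

Calls that finish within the threshold produce no output, which is why the log stays small even when every matching syscall is timed.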

Usage: ./latency.stp -x <pid_of_ems> <blocking_threshold_in_milliseconds> <duration_in_seconds>

For example, "./latency.stp -x 25984 2000 3600" prints every syscall that takes longer than 2 seconds during the next hour (3600 seconds). When such an event is captured, the log entry looks like the following:

[1459538687] write blocked for 12668ms
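
Because the timestamp is in epoch seconds, it usually has to be converted before it can be matched against EMS server logs. The following is a minimal sketch of such a conversion, assuming every log line follows exactly the layout shown above; the function name parse_latency_line is hypothetical.

```python
import re
from datetime import datetime, timezone

# Matches lines such as "[1459538687] write blocked for 12668ms".
LINE_RE = re.compile(r"^\[(\d+)\] (\w+) blocked for (\d+)ms$")


def parse_latency_line(line):
    """Return (UTC datetime, syscall name, duration in ms), or None if no match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    epoch_s, syscall, duration_ms = match.groups()
    when = datetime.fromtimestamp(int(epoch_s), tz=timezone.utc)
    return when, syscall, int(duration_ms)


when, syscall, duration_ms = parse_latency_line("[1459538687] write blocked for 12668ms")
print(when.isoformat(), syscall, duration_ms)
```

The conversion uses UTC; adjust the time zone if the EMS logs are written in local time.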


2). writedist.stp


This script collects the running time of each write syscall and prints out a logarithmic histogram.


Usage: ./writedist.stp -x <pid_of_ems> <duration_in_seconds>


For example, "./writedist.stp -x 20256 10" collects the write syscall response times for pid 20256 over the next 10 seconds. When it finishes, the output looks like the following:


Distribution of write latencies (in milliseconds)
 
value |-------------------------------------------------- count
    0 |                                                      2
    1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  1587
    2 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@           1292
    4 |@@@@@@@@                                            280
    8 |                                                     19
   16 |                                                      3
   32 |                                                      0
   64 |                                                      2
  128 |                                                      0
  256 |                                                      0
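
To read the histogram, it helps to know how values map to rows. The sketch below assumes the rows follow the power-of-two convention of SystemTap's @hist_log, where the row labelled v counts latencies from v up to 2v - 1 milliseconds and row 0 counts calls that took less than 1 ms. Whether the attached script uses @hist_log is an assumption based on the output format, and the helper name log2_bucket is hypothetical.

```python
def log2_bucket(value_ms):
    """Return the histogram row label for a non-negative integer latency."""
    if value_ms <= 0:
        return 0
    # Largest power of two that is less than or equal to value_ms.
    return 1 << (value_ms.bit_length() - 1)


# Counts copied from the sample output above, keyed by row label.
sample = {0: 2, 1: 1587, 2: 1292, 4: 280, 8: 19, 16: 3, 32: 0, 64: 2, 128: 0, 256: 0}

total = sum(sample.values())
under_8ms = sum(count for label, count in sample.items() if label < 8)

print("row for a 5 ms write:", log2_bucket(5))
print("writes observed:", total)
print("share finishing in under 8 ms: %.1f%%" % (100.0 * under_8ms / total))
```

Under that assumption, the two writes in row 64 each took between 64 and 127 ms, which are the outliers worth correlating with the latency.stp log.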
 



On Red Hat Enterprise Linux (RHEL), you can install SystemTap with the following command:
 
$ yum install -y systemtap systemtap-runtime

Give the SystemTap scripts executable permission. As the root user, run:
chmod +x ./writedist.stp
chmod +x ./latency.stp

The SystemTap scripts must be run with root privileges.
