A newly created alert becomes disabled a few minutes after creating it in TIBCO LogLogic LMI


Article ID: KB0077085


Products: TIBCO LogLogic Log Management Intelligence
Versions: all versions up to 5.4.2

Description

Sometimes, when adding a new alert that tracks many devices (hundreds or thousands) individually, you may see an error in sys.log (example below) indicating that the alert failed to load because the system ran out of memory.
Apr 10 00:41:49 localhost engine_lx_scheduler: %LOGLOGIC-3 module:engine_lx_scheduler(31450); file:BuildAlertTable.c(BuildAlertTable,1592); action:ERROR: failed to load Search-filter alert: out of memory;

This error is triggered a few minutes after saving the new alert if shared memory is exhausted. All other alerts continue to work; only the new one is affected. Adding a handful of devices works fine, but once the number of devices in the alert exceeds a certain threshold, the new alert becomes disabled. The exact number of devices that can be added before memory runs out varies on a case-by-case basis, because it depends on how much shared memory is available, which in turn depends on the alert configuration that already exists.

Environment

This behavior does not affect LMI 5.5.0 and higher.

Resolution

The workaround is to reduce the number of devices monitored by the alert, but this isn't ideal. The long-term solution is to increase the amount of memory available to LMI. Note that this is not standard RAM as seen by the OS, but rather what is called shared memory, used solely by LMI for its various components.

1. Create the following file:

$ touch /loglogic/conf/mem.conf

2. Set the shared memory size by entering the following line into mem.conf:
GlobalShmemSize=97000000 

3. Restart mtask:
$ mtask stop
$ mtask start

In HA configurations, execute steps 1 and 2 on the standby appliance first, then reboot it. While the standby is rebooting, execute steps 1 and 2 on the master appliance and then reboot it. This ensures stale shared memory information is cleared from both appliances.
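
Steps 1-3 can be collected into a small script. This is a sketch only: the `apply_mem_conf` function name and its optional root-directory argument are illustrative additions (the argument exists only so the script can be exercised off-appliance); on the appliance the file is always /loglogic/conf/mem.conf.

```shell
# Sketch of the workaround steps; apply_mem_conf and its optional root
# argument are illustrative, not part of the LMI tooling.
apply_mem_conf() {
    root="${1:-/loglogic}"

    # Steps 1 and 2: create mem.conf with the desired shared memory size.
    printf 'GlobalShmemSize=97000000\n' > "$root/conf/mem.conf"

    # Step 3: restart mtask so LMI picks up the new size (skipped when
    # the command is unavailable, e.g. when testing off-appliance).
    if command -v mtask >/dev/null 2>&1; then
        mtask stop
        mtask start
    fi
}
```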

Note:
The default shared memory size for alerting is 2 MB. A pre-defined search filter alert starts at 25 kB of usage per alert and can grow depending on the search string used. All other alert types use 370 bytes of memory when loaded. When using the option to track devices individually, multiply the per-alert figure by the number of devices being tracked to determine the actual memory footprint.


Example:
5 alerts defined: 3 system alerts, 1 search filter alert, and 1 search filter alert set to track devices individually. 1200 devices are in the group being tracked individually.

(3 * 370) + 25600 + (1200 * 25600) = 30,746,710 bytes (about 29.3 MB) of shared memory used by these 5 alerts
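
The arithmetic above can be checked with plain shell arithmetic; the per-alert sizes come from the note (370 bytes per system alert, 25 kB = 25600 bytes starting footprint per search filter alert):

```shell
# Shared memory estimate for the example: 3 system alerts, 1 search
# filter alert, and 1 search filter alert tracking 1200 devices.
system_alerts=3
system_bytes=370
filter_bytes=25600     # 25 kB starting footprint per search filter alert
devices=1200

total=$(( system_alerts * system_bytes + filter_bytes + devices * filter_bytes ))
echo "$total bytes"    # 30746710 bytes, roughly 29.3 MB
```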


Note:
Maximum tested size is 165 MB of shared memory used.


Note:
To confirm that mem.conf is being read, open watchqueue and choose page 6. At the top, look for information like the following:
shmStart 0x40193000, S_Global size 0x05c81a40(97000000), GBL_SHM_SIZE 0x04100000(68157440), BulkInsert Debug=0 BIT=1251922727(Wed Sep 2 13:18:47)


Look at the S_Global size:
S_Global size 0x05c81a40(97000000) << Shows 97,000,000 bytes used.
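
To check the configured size programmatically, the decimal number in parentheses after `S_Global size` can be pulled out with `sed`. The extraction itself is an illustrative addition; the sample line is the watchqueue output shown above.

```shell
# Pull the decimal S_Global size (bytes) out of a watchqueue status line.
line='shmStart 0x40193000, S_Global size 0x05c81a40(97000000), GBL_SHM_SIZE 0x04100000(68157440), BulkInsert Debug=0 BIT=1251922727(Wed Sep 2 13:18:47)'
size=$(printf '%s\n' "$line" | sed -n 's/.*S_Global size 0x[0-9a-f]*(\([0-9]*\)).*/\1/p')
echo "$size"   # 97000000
```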
 

Issue/Introduction

This article explains why a new alert can become disabled soon after being added in LogLogic LMI, and provides a solution.