Conditions under which collected data is purged from filecollector staging directories in TIBCO LogLogic LMI

Products	Versions
TIBCO LogLogic Enterprise Virtual Appliance	all versions

Description

When collecting file-based data, a best practice is to monitor LMI's /var/log/sys.log for activity suggesting that the system is collecting more data than it can process in a timely manner. Alerts can be configured in the LMI GUI using the sample log below. Here "timely" means within a 2-day window: a built-in cleanup task in LMI prunes collected files older than a rolling 2 days (86400 seconds per day * 2 days = 172800 seconds, the threshold referenced in the event below) so that filesystem usage does not reach 100%. This is similar to, but not the same as, the emergency purging mechanism. Emergency purging operates only on the BFQ (/loglogic/data/vol1/) and is managed by engine_archive, whereas this 2-day cleanup runs on /loglogic/data/filecollector and is managed by engine_filecollector.

If messages like the following appear in /var/log/sys.log, investigate and, if necessary, adjust LMI's settings so that collected logs are not deleted before they can be processed from the staging directory into the BFQ. Contact TIBCO LogLogic Technical Support if you need assistance or have questions.

<14>May 20 21:30:25 logapp COLLECTOR_FC: %LOGLOGIC-6 module:engine_filecollector(21577); file:engine_filecollector.c(flush_space,7767); action:--file /loglogic/data/filecollector/archiver/38_20_192.168.179.89_29_1526689766_1.txt.xml deleted, reason: its mtime (1526689768) is beyond the threshold (172800). ;
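
As an aid for building an alert or a quick audit, the following is a minimal sketch that recognizes the deletion event above and extracts the deleted file path, its mtime, and the threshold. The regular expression is inferred from the single sample line in this article, so the exact field layout is an assumption, and the function names are hypothetical.

```python
import re

# Pattern inferred from the one sample event in this article (an assumption,
# not a documented log format).
DELETED_RE = re.compile(
    r"module:engine_filecollector\(\d+\);.*?"
    r"action:--file (?P<path>\S+) deleted, "
    r"reason: its mtime \((?P<mtime>\d+)\) is beyond the threshold \((?P<threshold>\d+)\)"
)

SAMPLE = (
    "<14>May 20 21:30:25 logapp COLLECTOR_FC: %LOGLOGIC-6 "
    "module:engine_filecollector(21577); "
    "file:engine_filecollector.c(flush_space,7767); "
    "action:--file /loglogic/data/filecollector/archiver/"
    "38_20_192.168.179.89_29_1526689766_1.txt.xml deleted, "
    "reason: its mtime (1526689768) is beyond the threshold (172800). ;"
)


def parse_deletion(line):
    """Return (path, mtime, threshold_seconds) for a deletion event, else None."""
    match = DELETED_RE.search(line)
    if match is None:
        return None
    return match.group("path"), int(match.group("mtime")), int(match.group("threshold"))


def count_deletions(lines):
    """Count age-based deletion events in an iterable of sys.log lines."""
    return sum(1 for line in lines if parse_deletion(line) is not None)
```

Passing an open handle on /var/log/sys.log to count_deletions gives the number of files purged by the 2-day cleanup. Any non-zero count suggests the staging directories are backing up.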

Data is deleted from the /loglogic/data/filecollector/ sub-directories under 3 conditions:
1. If all forwarding rules are removed, the remaining files in /loglogic/data/filecollector/forwarder are deleted.
2. If a file is older than the 2-day time threshold, it is removed. This is the focus of this article.
3. If disk usage exceeds the disk usage threshold, the oldest file is removed from both archiver/ and forwarder/. The threshold differs for each directory:
For /loglogic/data/filecollector/archiver the threshold is 20 GB * the flush_old_files_size value in /loglogic/conf/capability.xml. For a value of 60, that is 20 GB * 60 = 1.2 TB.
For /loglogic/data/filecollector/forwarder the threshold is 20 GB * 2 * the same value in capability.xml. For a value of 60, that is 2.4 TB.
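
The disk usage thresholds can be checked with a short calculation. This is only a worked example of the arithmetic stated above, assuming decimal units (1 TB = 1000 GB) and a flush_old_files_size of 60. The function name is hypothetical, and the value should be read from the appliance's own capability.xml.

```python
BASE_GB = 20  # fixed multiplier quoted in this article


def purge_thresholds_gb(flush_old_files_size):
    """Return the disk usage thresholds, in GB, for each staging directory."""
    return {
        "archiver": BASE_GB * flush_old_files_size,
        "forwarder": BASE_GB * 2 * flush_old_files_size,
    }


thresholds = purge_thresholds_gb(60)
for directory, gb in thresholds.items():
    # 1 TB is treated as 1000 GB, matching the 1.2 TB / 2.4 TB figures above.
    print(f"{directory}: {gb} GB = {gb / 1000} TB")
# archiver: 1200 GB = 1.2 TB
# forwarder: 2400 GB = 2.4 TB
```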

The 1st and 3rd conditions usually aren't triggered but the 2nd condition can easily be triggered if the system is overloaded.

LogLogic LMI supports parallel file collection, parallel file processing, and parallel file forwarding, but none of these are enabled by default. If the appliance is logging messages about files being deleted from the filecollector sub-directories, the cause may simply be that parallel processing needs to be enabled so that downloaded files are processed more quickly. If parallel processing is already enabled, the settings may need to be increased. If that has already been attempted, the system may simply be unable to keep up with the volume of collected data, regardless of the custom settings applied.

Environment

This issue and the referenced error can be seen on any LMI version but in order to use file collection parallelism settings the LMI system must be running LMI 5.6.2 HF3 or higher for 5.6.2, 5.6.3 HF2 or higher for 5.6.3, and 5.7.0 or higher for all other LMI versions. Parallelism is not available in LMI prior to LMI 5.6.2.
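
The version requirement can be expressed as a small check. The sketch below follows a literal reading of the paragraph above: 5.6.2 needs HF3 or higher, 5.6.3 needs HF2 or higher, and every other release line needs 5.7.0 or higher. The version string format ("5.6.2 HF3") and the function name are assumptions made for illustration. LMI may report its version differently.

```python
import re

# Minimum hotfix level per release line, taken from this article.
MIN_HOTFIX = {(5, 6, 2): 3, (5, 6, 3): 2}
FIRST_UNRESTRICTED = (5, 7, 0)


def supports_parallelism(version):
    """Return True if a version string such as "5.6.2 HF3" or "5.7.0" qualifies."""
    match = re.fullmatch(r"\s*(\d+)\.(\d+)\.(\d+)(?:\s+HF(\d+))?\s*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    release = tuple(int(match.group(i)) for i in (1, 2, 3))
    hotfix = int(match.group(4) or 0)  # no HF suffix is treated as hotfix 0
    if release >= FIRST_UNRESTRICTED:
        return True
    # Below 5.7.0 only the two listed release lines qualify, at their hotfix level.
    return release in MIN_HOTFIX and hotfix >= MIN_HOTFIX[release]
```

For example, supports_parallelism("5.6.2 HF2") returns False, while supports_parallelism("5.6.3 HF2") and supports_parallelism("5.7.0") return True.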

Resolution

To configure file-based parallelism, create the /loglogic/conf/fc.conf file and place the following 2 settings in it. Each setting defaults to 1 unless explicitly overridden. The maximum recommended value is the number of CPU cores in the appliance minus 1. Values higher than that will not yield a linear performance gain.
NumParallelFwdProcs=
NumParallelParserProcs=

Save the file after making the changes, then restart the file collector so that fc.conf is re-read:
$ mtask -s engine_filecollector restart
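
To illustrate how the upper bound for the two fc.conf settings might be derived, the following sketch computes "CPU cores minus 1" and renders the two lines. The helper names are hypothetical, the KEY=value layout is assumed from the setting names shown above, and the floor of 1 simply mirrors the stated default. The result is an upper bound, not a recommended value for every system.

```python
import os


def max_recommended_procs(cpu_cores):
    """Upper bound from this article: CPU cores minus 1, floored at the default of 1."""
    return max(1, cpu_cores - 1)


def render_fc_conf(fwd_procs, parser_procs):
    """Render the two settings in the KEY=value form assumed for fc.conf."""
    return (
        f"NumParallelFwdProcs={fwd_procs}\n"
        f"NumParallelParserProcs={parser_procs}\n"
    )


# os.cpu_count() can return None, so fall back to a single core.
cores = os.cpu_count() or 1
limit = max_recommended_procs(cores)
print(render_fc_conf(limit, limit), end="")
```

On an 8-core appliance this yields an upper bound of 7 for each setting. Starting lower and raising the values only while the deletion messages persist is a more cautious approach.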

See the Configuring Parallel File Processing and Parallel File-Forwarding section of the LMI Admin Guide for more information about these settings.

Issue/Introduction

This article describes the conditions that can trigger engine_filecollector to delete unprocessed data from the filecollector staging directories.
