How is the value of 'EST SIZE' statistic in the AS Node status output calculated?

Article ID: KB0071222

Products Versions
TIBCO ActiveSpaces 4.8.x

Description

The EST SIZE statistic is calculated by our storage engine. It is an estimate of the on-disk size (that is, after compression and so on) of the collection of immutable sorted-string table files (SST files) in the node's live data directory. Checkpoints are not included in the statistic. The estimate starts at the bottom level of the LSM tree and works upward, adding up the sizes of all SST files whose key ranges do not overlap. As a result, a data directory that is less compacted (more levels, with more data overlapping between levels) has more SST files excluded from the estimate, and therefore a smaller estimated size, than a data directory that contains the same data in a more compacted form.
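
The following Python sketch illustrates the idea described above. It is not the storage engine's actual code: the SstFile type, its fields, and the estimate_size function are hypothetical names introduced for illustration, and it assumes that "overlap" means a file's key range intersects the range of a file already counted at the same or a lower level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SstFile:
    """Hypothetical description of one SST file (illustration only)."""
    level: int          # LSM level; a larger number is assumed to be deeper
    smallest_key: str   # first key stored in the file
    largest_key: str    # last key stored in the file
    size_bytes: int     # size of the file on disk


def overlaps(a: SstFile, b: SstFile) -> bool:
    """True when the closed key ranges of the two files intersect."""
    return a.smallest_key <= b.largest_key and b.smallest_key <= a.largest_key


def estimate_size(files: list[SstFile]) -> int:
    """Sum file sizes from the bottom level upward, skipping any file
    whose key range overlaps a file that has already been counted."""
    counted: list[SstFile] = []
    # Visit the deepest level first, then move up toward level 0.
    for f in sorted(files, key=lambda f: f.level, reverse=True):
        if not any(overlaps(f, c) for c in counted):
            counted.append(f)
    return sum(f.size_bytes for f in counted)


# Fully compacted layout: all data sits in one level, nothing overlaps.
compacted = [
    SstFile(level=2, smallest_key="a", largest_key="m", size_bytes=100),
    SstFile(level=2, smallest_key="n", largest_key="z", size_bytes=100),
]

# Less compacted layout: an upper-level file overlaps the bottom level
# and is therefore left out of the estimate.
layered = [
    SstFile(level=2, smallest_key="a", largest_key="m", size_bytes=100),
    SstFile(level=2, smallest_key="n", largest_key="z", size_bytes=60),
    SstFile(level=1, smallest_key="k", largest_key="p", size_bytes=40),
]

print(estimate_size(compacted))  # 200
print(estimate_size(layered))    # 160
```

In this toy example both layouts hold 200 bytes of SST files, but the less compacted layout reports 160 because the overlapping upper-level file is excluded.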

Note that the value of 'EST SIZE' may differ between primary and mirror grids, for several reasons. A primary grid's live data directory may contain writes made after the last checkpoint. It may also contain keys that were written and then deleted before being replicated to a mirror grid. A mirror grid node's live data directory (in the case of incremental mirroring) contains a computed set of writes that is the delta between two checkpoints. Thus the shape of the mirror grid's live data LSM tree may differ from that of the primary grid.
 

Issue/Introduction

How is the value of 'EST SIZE' statistic in the AS Node status output calculated?

Environment

All supported platforms

Additional Information

AS Node Status, Estimated Data Size, EST SIZE