It is possible to overload the Aggregate operator with too many tuples per second (tps), depending on how much work has to be done in the Aggregate operator thread. To handle a higher tps rate, the work needs to be split across multiple Aggregate operators, each running in its own thread.
Splitting the data across Aggregate operator instances is only available when using Group Options, because each individual group can be processed in a separate operator instance. If there is only one windowed group, then all input must be processed by the same Aggregate operator instance for the calculation result to be valid. In the case shown here, the field "n1" is used for grouping and has 20 or more unique values, so the hash result distinguishes the groups: every tuple of a given group is directed to the same Aggregate instance, and all Aggregate instances are used nearly uniformly. This is the ideal use case.
The key parts of this configuration are:
1. The Aggregate is using Group Options to split windows by a data value.
2. The Aggregate is set to many instances, but less than or equal to the number of unique hashes produced by the 'hash(string-field)' function.
3. The Aggregate is made Concurrent by enabling "Run each instance of this component in a parallel region".
If the grouping field is a number, then the field value can be used directly without using the 'hash()' function.
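The routing idea behind this configuration can be sketched in a few lines of Python. This is only an illustration, not StreamBase code: the function name `instance_for`, the use of CRC32 as the hash, and the modulo dispatch are all assumptions made for the example, and the actual 'hash()' function and dispatch logic in the product may differ. It shows that every tuple of a group lands on the same instance, and that a numeric key can be used directly without hashing.

```python
import zlib
from collections import Counter

def instance_for(key, instances):
    """Pick an instance index for a group key.

    Hypothetical helper: CRC32 stands in for the product's hash() function.
    """
    if isinstance(key, int):
        # A numeric grouping field can be used directly, without hashing.
        return key % instances
    return zlib.crc32(str(key).encode("utf-8")) % instances

INSTANCES = 4                                  # assumed instance count
keys = [f"group-{i}" for i in range(20)]       # 20 unique grouping values
tuples = [(k, v) for v in range(100) for k in keys]

routing = {}      # group key -> instance index it was first sent to
load = Counter()  # instance index -> number of tuples received

for key, _value in tuples:
    idx = instance_for(key, INSTANCES)
    # Every tuple of a group must go to the same instance,
    # otherwise the per-group window calculation would be invalid.
    assert routing.setdefault(key, idx) == idx
    load[idx] += 1

print(dict(load))  # per-instance tuple counts
```

With few unique key values relative to the instance count, some instances may receive little or no data, which is why the instance count should not exceed the number of unique hashes.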
StreamBase 7.x: To determine the number of Aggregate operators needed, drive data in at various rates and monitor the application with Studio SB Manager, sbmonitor, or sbprofile to identify the rate at which persistent queuing begins. Note that brief queuing lasting several seconds may occur at rates greater than 10,000 tps, but with enough instances the queue is frequently drawn back down to single digits.
TIBCO Streaming 10.x: To determine the number of Aggregate operators needed, drive data in at various rates and monitor the application with Studio SB Manager, SB Cluster Monitor, or sbprofile to identify the rate at which persistent queuing begins. Note that brief queuing lasting several seconds may occur at rates greater than 10,000 tps, but with enough instances the queue is frequently drawn back down to single digits.
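One way to separate brief queuing from persistent queuing is sketched below in Python. It assumes that queue depths were noted once per second at each test rate from one of the monitoring tools above. The function name `first_persistent_rate`, the threshold of 10 queued tuples (anything above single digits), the 5-second allowance for brief queuing, and the sample numbers are all hypothetical choices for illustration, not values taken from the product.

```python
def first_persistent_rate(samples_by_rate, depth_threshold=10, max_brief_seconds=5):
    """Return the lowest tested rate at which queuing looks persistent.

    Queuing counts as persistent when the depth stays at or above
    depth_threshold for more than max_brief_seconds consecutive samples
    (one sample per second is assumed). Returns None if no rate qualifies.
    """
    for rate in sorted(samples_by_rate):
        run = 0  # length of the current run of above-threshold samples
        for depth in samples_by_rate[rate]:
            run = run + 1 if depth >= depth_threshold else 0
            if run > max_brief_seconds:
                return rate
    return None

# Made-up queue depths, keyed by input rate in tps.
observed = {
    5000:  [0, 1, 0, 2, 0, 1, 0, 0],                # no queuing
    10000: [0, 40, 55, 30, 8, 2, 0, 1],             # brief spike, drains again
    20000: [5, 60, 120, 200, 310, 450, 600, 800],   # keeps growing
}

print(first_persistent_rate(observed))
```

If the rate returned is below the target rate, add Aggregate instances and repeat the test.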
Will I need as many individual CPU cores available as Aggregate operator instances? No. In practice, when an Aggregate operator instance is fully utilized (moving tuples as fast as possible), it does not use its entire CPU time-slice because of internal blocking on shared resources. This allows another thread onto the core to do its work. You can therefore run more concurrent Aggregate operator instances than you have physical CPU cores and still see improved throughput.