Custom expression for a BinByEvenDistribution that is based on row count, not unique values.

Custom expression for a BinByEvenDistribution that is based on row count, not unique values.

book

Article ID: KB0079914

calendar_today

Updated On:

Products Versions
Spotfire Analyst All Versions

Description

The built in BinByEvenDistribution() function creates a binned column where each bin has the same number of unique values as the others. At times, it is needed to have an even distribution based on row count and not unique data values. This can be accomplished by manually creating a function using the Percentile() function.

Note: For some data sets each bin might have a slightly different number of rows, if there are data points on the boundaries between bins.

Issue/Introduction

Custom expression for a BinByEvenDistribution that is based on row count, not unique values.

Resolution

Here is an example custom expression to calculate four evenly distributed bins.
case  
when [myData]<Percentile([myData],25) then "0 - " & Max([myData])
when ([myData]>=Percentile([myData],25)) and ([myData]<Percentile([myData],50)) then MIN([myData]) & " - " & Max([myData])  
when ([myData]>=Percentile([myData],50)) and ([myData]<Percentile([myData],75)) then MIN([myData]) & " - " & Max([myData])  
when [myData]>=Percentile([myData],75) then MIN([myData]) & " - " & Max([myData])
end
 

Note: Since this example uses 4 bins the 0, 25, 50, 75, and 100th percentiles are used in the function. If you want different bins, you would modify the percentiles in the Percentile() functions and introduce additional/fewer "when" clauses as needed to match the number of bins.

Additional Information

Doc: Statistical Functions > Percentile