Algorithm used for Statistica Feature Selection Module

Article ID: KB0080995

Updated On:

Product: Spotfire Statistica
Versions: 13.1 and higher

Description

This article summarizes the algorithm used by the Feature Selection module in Statistica.

Issue/Introduction

Algorithm used for Statistica Feature Selection Module

Environment

Windows

Resolution

The following documentation topics cover the algorithm and computational details of the Feature Selection module:
Overview of Feature Selection Module
Computational Details of Feature Selection Module

Here is a summary of the algorithm used in the Feature Selection module:

There are no specific academic citations or references for the Feature Selection algorithm used in Statistica. The algorithm is assembled from standard statistical methods for assessing departures from independence between two variables. Continuous predictors are first binned into a user-specified number of bins, based on the range and variability of the data. Once binned, each predictor is assessed individually, with the test chosen according to the type of the response variable. For a categorical response, dependence is tested with the Pearson chi-square test of independence between the response and each predictor. For a continuous response, each predictor is assessed with a one-way ANOVA. Both tests are general in the sense that they detect any departure from independence rather than one specific type, such as the linear trend measured by Pearson's correlation or the monotonic trend measured by Spearman's correlation.
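
The steps in this summary can be illustrated with a minimal Python sketch. It is not Statistica's implementation: the function names are hypothetical, equal-width binning is assumed as a stand-in for Statistica's actual binning rule (which also takes the variability of the data into account), and only the test statistics are computed, not p-values.

```python
from collections import Counter, defaultdict


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width intervals.

    Assumption: equal-width binning is a simplification used for
    illustration only; it is not Statistica's documented binning rule.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def chi_square_statistic(groups, classes):
    """Pearson chi-square statistic and degrees of freedom for a
    binned (or categorical) predictor against a categorical response."""
    n = len(groups)
    observed = Counter(zip(groups, classes))
    row_totals = Counter(groups)
    col_totals = Counter(classes)
    stat = 0.0
    for g, row_n in row_totals.items():
        for c, col_n in col_totals.items():
            expected = row_n * col_n / n
            stat += (observed[(g, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def anova_f_statistic(groups, response):
    """One-way ANOVA F statistic for a continuous response, using the
    predictor's bins (or categories) as the groups."""
    by_group = defaultdict(list)
    for g, y in zip(groups, response):
        by_group[g].append(y)
    n, k = len(response), len(by_group)
    if k < 2 or n <= k:
        return 0.0  # not enough groups or observations to compare
    grand_mean = sum(response) / n
    means = {g: sum(ys) / len(ys) for g, ys in by_group.items()}
    ss_between = sum(len(ys) * (means[g] - grand_mean) ** 2
                     for g, ys in by_group.items())
    ss_within = sum((y - means[g]) ** 2
                    for g, ys in by_group.items() for y in ys)
    if ss_within == 0:
        return float("inf")  # groups are perfectly separated
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

As a usage example, take a predictor x running from -10 to 10 and a response y = x*x. The Pearson correlation between x and y is zero because the relationship is symmetric rather than linear, yet anova_f_statistic(equal_width_bins(x, 3), y) is clearly non-zero because the bin means differ. Predictors can then be ranked by the resulting statistic (or by a p-value derived from it).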