How does Statistica select the reference level or reference group for categorical variables?


Article ID: KB0074216

Products: Spotfire Statistica
Versions: 13.0 and higher

Description

This article explains how Statistica selects the reference level (reference group) for each categorical variable in an analysis.

Issue/Introduction

How does Statistica select the reference level or reference group for categorical variables?

Environment

Windows

Resolution

1. If the categorical variable/predictor is of type "Text", the codes used in the analysis are ordered by their appearance in the data by default, and the last level to appear becomes the reference level for the categorical variable.

2. If the categorical variable/predictor is of type "Double" or "Integer" with text labels, the text label attached to the largest numeric value becomes the reference level by default.

3. The user can also set the order of the levels in the analysis design with the code selection button; the last level specified by the user becomes the reference level.
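The three rules above can be sketched in Python. This is only an illustration of the described behavior, not Statistica code: the function names are hypothetical, the sample values are made up, and the sketch assumes that "order of appearance" means the order in which each distinct level first occurs in the data.

```python
def reference_for_text(values):
    """Rule 1 (assumed reading): order levels by first appearance; the last is the reference."""
    levels = list(dict.fromkeys(values))  # distinct values, in first-appearance order
    return levels[-1]


def reference_for_labeled_numeric(labels_by_value):
    """Rule 2: the text label attached to the largest numeric value is the reference."""
    return labels_by_value[max(labels_by_value)]


def reference_for_user_order(codes):
    """Rule 3: the last code in a user-specified order is the reference."""
    return codes[-1]


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    print(reference_for_text(["low", "high", "low", "medium", "high"]))
    print(reference_for_labeled_numeric({1: "low", 3: "high", 2: "medium"}))
    print(reference_for_user_order(["medium", "high", "low"]))
```

Under these assumptions, the same three labels give a different reference level depending on the variable type and on any user-specified order, which is why it is worth checking the code order before interpreting coefficients.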

For example, in the GLM analysis module, the user can manually specify the order of the codes for categorical variables by clicking "Factor Codes" during model specification.

The last code specified is used as the reference level for that categorical variable in the analysis. In the screenshot example, "g3" is the reference level for the variable "gear", and "c6" is the reference level for the variable "carb".
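To show what "reference level" means in practice, here is a minimal Python sketch that builds simple 0/1 indicator columns for every code except the last one in a user-specified order. It is an assumption-laden illustration: the helper name `indicator_columns` is hypothetical, the order of the non-reference codes and the data values are invented, and the coding Statistica actually applies may differ depending on the parameterization chosen in the module. Only the fact that "g3" comes last is taken from the example.

```python
def indicator_columns(values, code_order):
    """Build 0/1 indicator columns for every code except the last (the reference)."""
    reference = code_order[-1]
    non_reference = code_order[:-1]
    # Each row gets one column per non-reference code; the reference level is all zeros.
    rows = [[1 if v == code else 0 for code in non_reference] for v in values]
    return reference, non_reference, rows


if __name__ == "__main__":
    # Hypothetical code order and data for "gear"; only "g3" being last is from the article.
    reference, columns, rows = indicator_columns(
        ["g4", "g3", "g5", "g3"], ["g4", "g5", "g3"]
    )
    print("reference:", reference)
    print("columns:", columns)
    for row in rows:
        print(row)
```

In this sketch, observations at the reference level have zeros in every column, so the estimated effects of the other levels are read relative to "g3".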