How to create a customized high-low close plot for prediction results from linear regression?

Article ID: KB0074383

Products Versions
Spotfire Statistica 12.7 and higher

Description

This article gives an example of how to create a customized high-low close plot. In particular, the plot shows the estimated means and confidence intervals from linear regressions, evaluated at a specific value of the independent variable.

This example uses the Poverty.sta data from the Statistica example datasets folder and adds three new categorical variables for illustration purposes:

  1. "regression_set": contains the values regress_1 to regress_4 and defines four subsets of the data for linear regression analysis
  2. "Vol": contains the value 0 or 5
  3. "type": contains the value 15 or 3

Within each Vol value there are two regression sets, each belonging to a specific type.

PT_POOR is used as the dependent variable and POP_CHNG as the independent variable.

Here is a screenshot of the data:

Issue/Introduction

How to create a customized high-low close plot for prediction results from linear regression?

Environment

Windows

Resolution

1. Open Poverty.sta and add the three new categorical variables described above.

2. On the "Statistics" tab, click "Multiple Regression".

3. On the "Quick" tab of the Multiple Linear Regression analysis dialog, click "Variables", select PT_POOR as the dependent variable and POP_CHNG as the independent variable, and click "OK" in the variable selection dialog.

4. Click "OK" in the regression analysis dialog.

5. When the results dialog appears, click "By Group" on the right, then click "Grouping variable(s)..." and select regression_set as the grouping variable, so that a separate linear regression is run for each regression set.

6. In the "By Group" dialog, select "Enabled" and "Accumulate tabular results in a single spreadsheet", deselect the other options, and click "OK".

7. On the "Quick" tab of the results dialog, choose confidence limits or prediction limits and, if needed, change the default alpha value. Click "Predict dependent variable", enter the value of the independent variable at which the prediction is to be made (e.g. "0" as the POP_CHNG value), and click "OK".

8. A "Predicting Values" spreadsheet is generated. Its fourth variable, "b-Weight * Value", holds the predicted mean and the 95% confidence limits from the regression for each regression_set.
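For readers who want to check the numbers in the "Predicting Values" spreadsheet, the calculation behind steps 7 and 8 can be sketched in plain Python. This is a minimal sketch of the textbook simple-regression formulas, not Statistica's own code. The function name, the sample data, and the critical value 3.182 (two-sided 95% Student-t value for 3 degrees of freedom) are assumptions made for illustration only.

```python
import math


def predict_mean_with_ci(x, y, x0, t_crit, prediction=False):
    """Predicted y at x0 from a simple linear regression, with interval limits.

    t_crit is the two-sided Student-t critical value for len(x) - 2 degrees
    of freedom. If SciPy is available, scipy.stats.t.ppf(1 - alpha / 2, n - 2)
    returns it. With prediction=True the wider prediction limits are
    returned instead of the confidence limits for the mean.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual standard error with n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    # Standard error at x0; the extra 1 applies to prediction limits only.
    extra = 1.0 if prediction else 0.0
    se = s * math.sqrt(extra + 1.0 / n + (x0 - x_bar) ** 2 / sxx)

    predicted = intercept + slope * x0
    return predicted, predicted - t_crit * se, predicted + t_crit * se


# Invented data for two regression sets (NOT taken from Poverty.sta).
groups = {
    "regress_1": ([-2.0, 0.0, 1.0, 3.0, 5.0], [30.0, 26.0, 25.0, 20.0, 17.0]),
    "regress_2": ([-1.0, 0.0, 2.0, 4.0, 6.0], [22.0, 21.0, 18.0, 16.0, 12.0]),
}
T_CRIT = 3.182  # t(0.975, df = 5 - 2 = 3)

# One (predicted, lower, upper) triple per regression set at POP_CHNG = 0.
results = {
    name: predict_mean_with_ci(x, y, x0=0.0, t_crit=T_CRIT)
    for name, (x, y) in groups.items()
}
for name, (pred, lower, upper) in results.items():
    print(f"{name}: predicted={pred:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

Each group yields three numbers (predicted mean, lower limit, upper limit), which is the same per-group structure that the accumulated spreadsheet provides for the later plotting steps.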

The following steps use a workspace rather than the interactive module, because the remaining tasks are easier to accomplish with workspace nodes.

9. Right-click the generated spreadsheet and select "Extract as stand-alone window | Copy" to make it a stand-alone spreadsheet.

10. With the predicting spreadsheet active, click "Add to Workspace" on the "Home" tab, and also add the original input data to the same workspace.

The workspace now contains the two added datasets:

11. Insert, configure, and run the "Merge Variables" node from the "Data" tab to merge the predicting spreadsheet with the two added categorical variables from the original input data.

12. Insert, configure, and run the "Subset" node from the "Data" tab to keep only the cases that hold the predicted results and confidence intervals.

13. Insert, configure, and run the "Transforms" node from the "Data" tab to create a new variable, "predictions", that copies the values of V4.

14. Under "Graphs | Categorize", insert, configure, and run the "Categorized Means with Error Bars Plot" node.

15. To configure the graph node, go to its "Advanced" tab, click "Variables" to select the corresponding variables, and select "High-Low Close" as the Graph type, "Overlaid" as the Layout, "Median" as the Middle point, and "Min-Max" as the Whisker value. Run the node after configuration.
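The data preparation in steps 11 to 13 (merge, subset, transform) can be pictured with a small Python sketch. It only mirrors the logic of the three workspace nodes and is not how Statistica implements them. The row labels ("Intercept", "Predicted", "-95.0%CL", "+95.0%CL"), the assignment of Vol and type to each regression set, and all numbers are assumptions made for illustration.

```python
# Hypothetical stand-in for the accumulated "Predicting Values" spreadsheet:
# one row per (regression_set, row label); V4 plays the role of the
# "b-Weight * Value" variable. All values are invented.
made_up = {
    "regress_1": (25.1, 22.4, 27.8),
    "regress_2": (19.6, 17.0, 22.2),
    "regress_3": (23.3, 20.9, 25.7),
    "regress_4": (18.2, 15.5, 20.9),
}
predicting = []
for name, (pred, lower, upper) in made_up.items():
    predicting += [
        {"regression_set": name, "row": "Intercept", "V4": None},
        {"regression_set": name, "row": "Predicted", "V4": pred},
        {"regression_set": name, "row": "-95.0%CL", "V4": lower},
        {"regression_set": name, "row": "+95.0%CL", "V4": upper},
    ]

# Assumed mapping of each regression set to its Vol and type; in the
# article these two variables come from the original input data.
categories = {
    "regress_1": {"Vol": 0, "type": 15},
    "regress_2": {"Vol": 0, "type": 3},
    "regress_3": {"Vol": 5, "type": 15},
    "regress_4": {"Vol": 5, "type": 3},
}

# Step 11 (Merge Variables): attach Vol and type to every spreadsheet row.
merged = [{**row, **categories[row["regression_set"]]} for row in predicting]

# Step 12 (Subset): keep only the predicted value and the two limits.
KEEP = {"Predicted", "-95.0%CL", "+95.0%CL"}
subset = [row for row in merged if row["row"] in KEEP]

# Step 13 (Transforms): new variable "predictions" copied from V4.
prepared = [{**row, "predictions": row["V4"]} for row in subset]

for row in prepared[:3]:
    print(row)
```

After these steps every regression set contributes exactly three cases, each tagged with its Vol and type, which is the shape the graph node expects.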

The final workspace and graph are shown below:
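The choice of "Median" and "Min-Max" in step 15 can be checked with a short Python sketch. After the subset step each regression set has only three values (predicted mean, lower limit, upper limit), so as long as the predicted mean lies between its limits, the median is the predicted mean and the minimum and maximum are the two limits. The numbers below are invented for illustration and the sketch computes only the plotted statistics, not the graph itself.

```python
from statistics import median

# Invented (regression_set, predictions) cases, three per group:
# predicted mean, lower limit, upper limit, in no particular order.
cases = [
    ("regress_1", 25.1), ("regress_1", 22.4), ("regress_1", 27.8),
    ("regress_2", 17.0), ("regress_2", 22.2), ("regress_2", 19.6),
]

# Collect the values of each group.
by_group = {}
for group, value in cases:
    by_group.setdefault(group, []).append(value)

# Middle point = median, whiskers = min and max of the three values.
summary = {
    group: {"low": min(vals), "close": median(vals), "high": max(vals)}
    for group, vals in by_group.items()
}

for group, stats in summary.items():
    print(group, stats)
```

With three values per group, "low" and "high" recover the confidence limits and "close" recovers the predicted mean, regardless of the order of the cases.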