How to do a Model II Regression Straight Line Fit, using a data function in TIBCO Spotfire

Article ID: KB0072300

Products                             Versions
Spotfire Analyst                     All
Spotfire Enterprise Runtime for R    All

Description

There are several methods for calculating a straight line fit through data. The default Straight Line Fit available in Lines and Curves chooses the line that minimizes the sum of the squared vertical distances between the data points and the line. This least squares regression is appropriate when only the y-variable is subject to uncertainty. If both variables are subject to sampling and measurement error, major axis or reduced major axis regression is recommended instead.

[Figures, in order: Least Squares, Major Axis, Reduced Major Axis (images not available)]

In the first two cases (Least Squares and Major Axis), the sum of the squared distances indicated by the green lines is minimized. In the final case (Reduced Major Axis), it is the sum of the areas of the triangles bounded by the horizontal and vertical green lines that is minimized.

The model II regression methods of ordinary least squares (OLS), major axis (MA), standard major axis (SMA, also known as reduced major axis) and ranged major axis (RMA) are described in Legendre and Legendre (1998, Section 10.3.2). OLS, MA and SMA are also described in Sokal and Rohlf (1995). Note that in this terminology the abbreviation RMA stands for ranged major axis, not reduced major axis.
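
As a rough illustration of how the three criteria lead to different lines, the slopes can be written in closed form from the sample variances and covariance. The sketch below uses base R only and invented data; the helper name `fit_lines` and the sample values are assumptions for illustration and are not part of the Spotfire data function. It also assumes the covariance of x and y is non-zero, since the MA formula divides by it.

```r
# Closed-form slopes and intercepts for OLS, MA and SMA (base R only).
# fit_lines() is a hypothetical helper; the data are invented for illustration.
fit_lines <- function(x, y) {
  sx2 <- var(x)      # sample variance of x
  sy2 <- var(y)      # sample variance of y
  sxy <- cov(x, y)   # sample covariance (assumed non-zero)

  slope <- c(
    OLS = sxy / sx2,
    MA  = (sy2 - sx2 + sqrt((sy2 - sx2)^2 + 4 * sxy^2)) / (2 * sxy),
    SMA = sign(sxy) * sqrt(sy2 / sx2)
  )
  # All three lines pass through the centroid (mean(x), mean(y))
  intercept <- mean(y) - slope * mean(x)

  data.frame(method = names(slope),
             intercept = unname(intercept),
             slope = unname(slope))
}

x <- c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8)
y <- c(1.2, 1.9, 3.4, 3.9, 5.3, 5.6)
fits <- fit_lines(x, y)
print(fits)
```

The OLS slope equals the SMA slope multiplied by the correlation coefficient, so the OLS line is never steeper in absolute terms than the SMA line for the same data.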

Resolution

Step 1. Create the data function:

Register a data function that takes the data table as input and performs the regression analysis:

  1. Edit > Data Function Properties > Register New
  2. Enter a name: Model II Regression Straight Line Fit.
  3. In the 'Script' tab, enter the script:
     #---------------------------------------
     #Install and load the user-contributed "lmodel2" package from the open-source R community's CRAN repository:
     install.packages( "lmodel2", repos = "http://cran.us.r-project.org" )
     library( lmodel2 )
     #Create regression model, from input data table "inputdf"
     suppressMessages( RMA_model <- lmodel2( Y~X, data = inputdf ) )
     #Extract the SMA slope and intercept
     #Available methods:
     # OLS = 1  #Matches "Lines & Curves > Straight Line Fit"
     # MA = 2
     # SMA = 3
     method <- 3
     intercept <- data.frame( RMA_model[3] )[method, 2]
     slope <- data.frame( RMA_model[3] )[method, 3]
     #Output as data frame, for returning to Spotfire as a data table
     output <- data.frame( intercept, slope )
     #---------------------------------------
  4. In the 'Input Parameters' tab, click "Add..." and give it the name "inputdf". This matches the variable name in the data function script. Set the Type to "Type: Table" and in "Allowed data types" click the "Numeric" button.
  5. In the 'Output Parameters' tab, click "Add..." and give it the name "output". This matches the variable name in the data function script. Set the Type to "Type: Table".
  6. Click the "Run" button at the top of the dialog. This opens the "Edit Parameters" dialog.
  7. In the "Input" tab, select the "inputdf" parameter.
  8. In the "Input handler" options, select "Columns".
  9. Select your data table and then click "Select Columns" to select the columns used on the x and y axes of your visualization. Note: These column names are referenced in the data function script ("X" and "Y" in this example), and R treats column names as case-sensitive.
  10. In the "Output" tab, select the "output" parameter.
  11. In the "Output handler" options, select "Data table".
  12. Select "Create new data table" and give it the name "Model II Regression".
  13. Click OK.
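
Because the script's formula `Y~X` refers to the input columns by name, a mismatch in column names or too few usable rows is a plausible source of errors. The following is a minimal sketch of a check that could be placed before the `lmodel2()` call. The helper `check_inputdf`, its arguments, and the three-row minimum are assumptions introduced for illustration and are not part of the original script.

```r
# Hypothetical helper: validate the input table before fitting.
check_inputdf <- function(inputdf, required = c("X", "Y"), min_rows = 3) {
  # Column names are case-sensitive in R, so "y" does not satisfy "Y"
  missing_cols <- setdiff(required, names(inputdf))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  not_numeric <- required[!vapply(inputdf[required], is.numeric, logical(1))]
  if (length(not_numeric) > 0) {
    stop("Non-numeric column(s): ", paste(not_numeric, collapse = ", "))
  }
  # Keep only rows where both required columns are present
  complete <- inputdf[complete.cases(inputdf[required]), required, drop = FALSE]
  if (nrow(complete) < min_rows) {
    stop("Need at least ", min_rows, " complete rows, found ", nrow(complete))
  }
  complete
}

# Invented sample data with one missing X value
sample_df <- data.frame(X = c(1, 2, 3, NA, 5), Y = c(2.1, 3.9, 6.2, 8.0, 9.8))
clean_df <- check_inputdf(sample_df)
```

In the data function itself, the returned table could then be passed to `lmodel2()` in place of `inputdf`.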

SUMMARY:
Name: Model II Regression Straight Line Fit
Script:
  #---------------------------------------
  #Install and load the user-contributed "lmodel2" package from the open-source R community's CRAN repository:
  install.packages( "lmodel2", repos = "http://cran.us.r-project.org" )
  library( lmodel2 )
  #Create regression model, from input data table "inputdf"
  suppressMessages( RMA_model <- lmodel2( Y~X, data = inputdf ) )
  #Extract the SMA slope and intercept
  #Available methods:
  # OLS = 1  #Matches "Lines & Curves > Straight Line Fit"
  # MA = 2
  # SMA = 3
  method <- 3
  intercept <- data.frame( RMA_model[3] )[method, 2]
  slope <- data.frame( RMA_model[3] )[method, 3]
  #Output as data frame, for returning to Spotfire as a data table
  output <- data.frame( intercept, slope )
  #---------------------------------------

Input parameter:
  - Name: inputdf
  - Type: Table
  - Input: [Sample Data Set].[X],[Sample Data Set].[Y]

Output parameter:
  - Name: output
  - Type: Table
  - Output: Data table: Model II Regression
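
The script calls `install.packages()` on every run, which requires network access each time the data function executes. One possible variation, sketched below, installs the package only when it is not already available. The helper `ensure_package` and its `installer` argument are hypothetical names introduced here so that the logic can be tried without touching the network; `requireNamespace()` and `install.packages()` are standard R functions, and it is assumed that the statistics engine in use supports them in the same way.

```r
# Hypothetical helper: install a package only if it cannot already be loaded.
ensure_package <- function(pkg,
                           repos = "http://cran.us.r-project.org",
                           installer = utils::install.packages) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))   # already installed, nothing to do
  }
  installer(pkg, repos = repos)
  invisible(TRUE)              # an install was attempted
}

# Example with a package that ships with R, so no download is triggered
did_install <- ensure_package("stats")
```

In the data function, a call such as `ensure_package("lmodel2")` followed by `library(lmodel2)` could replace the unconditional install.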

Step 2. Add the line to your visualization:
  1. Right-click your visualization (a Scatter Plot in this example).
  2. Select "Lines & Curves".
  3. Click "Add" > "Curve from Data Table".
  4. In the "Curve from Data Table" dialog, enter:
       - Data table with curve parameters: Model II Regression
       - Expression: x*[slope]+[intercept]
  5. Click OK.
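
The curve expression evaluates the fitted line for each x value using the single row of the "Model II Regression" table. As a sanity check outside Spotfire, the same arithmetic can be sketched in R. The intercept and slope values below are invented placeholders, not results from any real data set, and `curve_from_table` is a hypothetical helper name.

```r
# Placeholder table with the same column names as the data function's "output"
output <- data.frame(intercept = 0.5, slope = 2)

# Same arithmetic as the Spotfire expression x*[slope]+[intercept]
curve_from_table <- function(x, params) {
  x * params$slope + params$intercept
}

x_values <- c(0, 1, 2.5)
fitted_y <- curve_from_table(x_values, output)
print(fitted_y)
```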

 

Requirements:

Because this solution uses a data function, it requires a statistics engine. TIBCO Enterprise Runtime for R (TERR) is included in the Professional/Analyst/Desktop client starting with version 5.0, so no additional setup is needed in those clients. To use the data function in Web Player, configure Spotfire to point to a TIBCO Spotfire Statistics Services (TSSS) server instance. The TSSS URL must be configured as described in the "TIBCO Spotfire Statistics Services Installation and Administration Manual".

Note: The user-contributed "lmodel2" package from the open-source R community's CRAN repository is not a TIBCO product.

 

Issue/Introduction

How to do a Model II Regression Straight Line Fit, using a data function.

Additional Information

TIBCO Spotfire Curve Fit Models:
User-contributed "lmodel2" package in the open-source R community's CRAN repository: