Statistica

TIBCO Statistica Modeler is sold with Named User or Concurrent User licensing. Customers can purchase add-ons to TIBCO Statistica® Modeler for a metadata store, job server, versioning/approval, monitoring & alerting, live scoring, manual data entry & analytics, and interactive dashboards. If an add-on is purchased, TIBCO Statistica™ Modeler includes a license server. The license server is also included if concurrent user licensing is purchased.

This article lists the features included in TIBCO Statistica Modeler.

TIBCO Statistica Modeler runs on Windows operating systems only.

TIBCO Statistica™ Modeler contains the following features:

- automation for data cleaning; dirty data is the most common analytics problem
- business rules builder
- exploratory analysis & visualizations; learn about the problem space
- descriptive statistics, nonparametrics; learn and share factoids about the problem to build situational awareness
- linear regression models, nonlinear regression models; estimate the relationships among your variables and create predictive models (machine learning); also use simulated data to create linear regression models and learn something new
- multivariate exploratory techniques; organize data into meaningful clusters, classify variables (reduce/relate variables), principal components & classification analysis
- process analysis, quality control, multivariate statistical process control; understand critical process parameters which impact critical quality attributes
- design of experiments, power analysis and interval estimation; experiment and discover; also use simulated data to execute virtual experiments
- tabulation options; everyone needs a summary table for their presentation to management

There are two modes of interaction with the analytics: spreadsheet and workspace. For ad-hoc analysis that does not need to be repeated, users can import data into a spreadsheet and interact with menus, variables, and rows of data. The workspace is a visual analytic workflow management tool and is the recommended mode because it allows work to be saved and reused. No coding is needed to complete a workspace. For users who need to manage their own code, the workspace has a "code node" which can execute C#, Python, or R code.

The Data Health Check node (data profiling) explores values, value ranges, discrete text labels, missing data, outliers, etc. on every variable. The result of this analysis is a diagnostic report. This node can also be configured to automatically fix the data problems uncovered by the analysis.
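
The kind of per-variable profiling described above can be pictured with a short Python sketch. This is not the Data Health Check node's implementation; the function names, the sample table, and the choice of the 1.5 × IQR rule for flagging outliers are all assumptions made for illustration.

```python
import statistics

def profile_column(values):
    """Summarize one variable: missing count, range, outliers or text labels."""
    present = [v for v in values if v is not None]
    report = {"count": len(values), "missing": len(values) - len(present)}
    is_numeric = bool(present) and all(isinstance(v, (int, float)) for v in present)
    if is_numeric:
        report["min"], report["max"] = min(present), max(present)
        report["outliers"] = []
        if len(present) >= 2:
            # Assumed rule: flag values more than 1.5 * IQR outside the quartiles.
            q1, _, q3 = statistics.quantiles(present, n=4)
            fence = 1.5 * (q3 - q1)
            report["outliers"] = [v for v in present if v < q1 - fence or v > q3 + fence]
    else:
        # Non-numeric variable: list the distinct text labels that occur.
        report["labels"] = sorted({str(v) for v in present})
    return report

def profile_table(table):
    """Profile every variable in a {name: list_of_values} table."""
    return {name: profile_column(values) for name, values in table.items()}

# Hypothetical sample data; None marks a missing value.
sample = {
    "temp": [20.1, 19.8, None, 20.4, 95.0, 20.0, 19.9, 20.2, 20.3],
    "shift": ["day", "night", "day", None, "Day", "night", "day", "day", "night"],
}
report = profile_table(sample)
```

In this sample the report flags one missing value per variable, the reading of 95.0 as an outlier, and the inconsistent labels "Day" and "day", which is the sort of finding a diagnostic report is meant to surface.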

Additional options to transform and clean data are available: remove duplicates, recode, rank, merge, process invariant variables, recode outliers, missing data imputation, recode missing data, subset, sample, etc.
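
Two of the cleaning options listed above, removing duplicates and missing data imputation, can be sketched in a few lines of Python. This is a minimal illustration and not how the product implements them; the helper names are hypothetical, and median imputation is only one of several possible imputation strategies.

```python
import statistics

def remove_duplicates(rows):
    """Keep the first occurrence of each row, preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def impute_median(values):
    """Replace missing values (None) with the median of the observed values."""
    present = [v for v in values if v is not None]
    fill = statistics.median(present)
    return [fill if v is None else v for v in values]

rows = remove_duplicates([(1, "a"), (2, "b"), (1, "a")])
filled = impute_median([1.0, None, 3.0, 10.0])
```

The median is used here because it is less sensitive to outliers than the mean; a different fill rule could be swapped in without changing the structure.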

The Box-Cox transformation is available to transform variables so that their distribution is as close to normal as possible (Box and Cox, 1964). This supports methods, such as regression analysis, that assume approximately normally distributed data or residuals.
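
For readers unfamiliar with the transformation, the following Python sketch shows the standard Box-Cox formula: (x^λ - 1)/λ for λ ≠ 0 and ln(x) for λ = 0, applied to strictly positive values. The grid search used to pick λ, the function names, and the sample data are assumptions for illustration; the source does not say how the product estimates λ.

```python
import math

def box_cox(values, lam):
    """Apply the Box-Cox power transform to strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def log_likelihood(values, lam):
    """Profile log-likelihood of lambda, assuming the transformed data are normal."""
    n = len(values)
    transformed = box_cox(values, lam)
    mean = sum(transformed) / n
    variance = sum((t - mean) ** 2 for t in transformed) / n
    return -0.5 * n * math.log(variance) + (lam - 1.0) * sum(math.log(v) for v in values)

def best_lambda(values, low=-2.0, high=2.0, steps=401):
    """Choose lambda by a simple grid search (an illustrative choice)."""
    grid = [low + i * (high - low) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda lam: log_likelihood(values, lam))

# Hypothetical right-skewed sample.
skewed = [1.2, 1.5, 1.9, 2.4, 3.1, 4.8, 7.5, 12.0, 21.0, 40.0]
lam = best_lambda(skewed)
transformed = box_cox(skewed, lam)
```

A λ of 1 leaves the shape of the data unchanged apart from a shift, so a selected λ below 1 on right-skewed data indicates that the transform is compressing the long upper tail.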

A workspace is a no-code tool that:

- documents the analytic steps
- imports Excel, CSV, and fixed-width (mainframe) data
- embeds data within the workspace as a lookup table, for example to transform "m" to Monday for readability
- imports a Spotfire SBDF data file and configures analytics (see options below)
- retrieves data from a database through an ODBC driver and configures analytics (see options below)
- creates data mashups
- creates visualizations
- formats output for reporting
- exports results to Excel, CSV, Spotfire SBDF, etc.
- writes results into a database; SQL Server, Oracle, Teradata, SQL Server PDW, PostgreSQL, DB2
- calls another workspace
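
The lookup-table item in the list above amounts to a simple mapping from codes to readable labels. In a workspace this is configured without code; the Python sketch below only illustrates the idea. Apart from "m" for Monday, the codes and the function name are hypothetical.

```python
# Hypothetical lookup table; only "m" -> "Monday" comes from the text above.
DAY_LOOKUP = {
    "m": "Monday",
    "tu": "Tuesday",
    "w": "Wednesday",
    "th": "Thursday",
    "f": "Friday",
}

def apply_lookup(codes, lookup):
    """Replace each code with its label; unknown codes pass through unchanged."""
    return [lookup.get(code.strip().lower(), code) for code in codes]

readable = apply_lookup(["m", "W", "f ", "x"], DAY_LOOKUP)
print(readable)  # ['Monday', 'Wednesday', 'Friday', 'x']
```

Passing unknown codes through unchanged, rather than dropping them, keeps unexpected values visible for later cleaning.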

The workspace can also be extended with R, C# or Python coding.

2D and 3D visualizations are available with the product: histogram, line, scatterplot, means with error, bag plots, quantile-quantile (beta, exponential, extreme, gamma, lognormal, normal, Rayleigh, Weibull), variability, contour, wafer, normal probability, etc. Interactive dashboards are available for the analytic user.

- ANOVA/MANOVA
- Association Rules
- Automated Neural Networks
- Boosted Tree
- Calculators; Distributions, Pearson Product Moment Correlation Coefficient, Six Sigma
- Canonical Analysis
- Classification Trees
- Cluster Analysis
- Correlation
- Correspondence Analysis
- Cox Proportional Hazards Models
- Data Miner Recipes
- Descriptive Statistics
- Design of Experiments (DOE)
- Discriminant Function Analysis
- Distribution Fitting
- Distributions & Simulation
- Dynamic Time Warping
- Extract, Transform, and Load (analytics are used to align time based data)
- Factor Analysis
- Faster Independent Component Analysis
- Feature Selection
- Fixed Nonlinear Regression
- General CHAID Models
- General Classification and Regression Trees (C&RT)
- General Discriminant Analysis (GDA)
- General Linear Models (GLM)
- General Partial Least Squares Models (PLS)
- General Regression Models (GRM)
- Generalized Additive Models (GAM)
- Generalized Linear/Nonlinear Models (GLZ)
- Goodness of Fit, Classification, Prediction
- Independent Component Analysis
- Interactive Tree (C&RT, CHAID)
- Lasso Regression
- Link Analysis
- Log-Linear Analysis of Frequency Tables
- Machine Learning (Bayesian, Support Vectors, K-Nearest)
- Multidimensional Scaling (MDS)
- Multivariate Adaptive Regression Splines (MARSplines)
- Multiple Regression
- Nonlinear Estimation
- Nonparametric Statistics
- Power Analysis and Interval Estimation
- Multivariate Statistical Process Control (MSPC - PCA / PLS)
- Optimal Binning
- Predictor Screening
- Principal Components & Classification Analysis (PCCA)
- Process Analysis
- Quality Control Charts
- Random Forests
- Rapid Deployment of Predictive Models (PMML)
- Reliability and Item Analysis
- Sequence and Link Analysis
- Stepwise Model Builder (what-if)
- Structural Equation Modeling and Path Analysis (SEPATH)
- Survival & Failure Time Analysis
- Time series / forecasting
- t-tests and other tests of group differences
- Tabulate
- Variance Components & Mixed Model ANOVA/ANCOVA
- Weight of Evidence
