Operator HDFS Processing Tool Support

Article ID: KB0082646

Product: Spotfire Data Science
Version: 6.x

Description

Operator HDFS Processing Tool Support

Resolution

The tables below show which HDFS data processing tool (Pig, MapReduce, Spark, or Sqoop) each HDFS-supported operator uses.

Note: Operators marked with "n/a" do not use any of the HDFS processing tools. 

Data Load Operators (for HDFS):

Operator                          Pig   MapReduce   Spark   Sqoop
Copy to Database                                            (parallel mode)
Copy to Hadoop                                              X
Hadoop File                             X


Explore Operators (for HDFS):

Operator                          Pig   MapReduce   Spark   Sqoop
Bar Chart                         X
Box Plot                          X
Correlation                             X
Frequency                         X
Histogram                         X
Scatter Plot Matrix               X
Summary Statistics                X                 X
T-Tests                                             X
Variable Selection                      X


Transform Operators (for HDFS):  

Operator                          Pig   MapReduce   Spark   Sqoop
Aggregation                       X
Collapse                                X
Column Filter                     X
Distinct                                X
Join                              X     X
Normalization                     X
Null Value Replacement            X
Pivot                                   X
Row Filter                        X
Set Operations                          X
Variable                          X

Sample Operators (for HDFS):  

Operator                          Pig   MapReduce   Spark   Sqoop
Random Sampling                         X
Sample Selector                   n/a   n/a         n/a     n/a

Model Operators (for HDFS):   

Operator                          Pig   MapReduce   Spark   Sqoop
Alpine Forest                           X           X
Alpine Forest Regression                X           X
Decision Tree                           X
Gradient Boosting Classification                    X
Gradient Boosting Regression                        X
K-Means                                 X           X
Linear Regression                       X           X
Logistic Regression                     X           X
Naive Bayes                             X
PCA                                     X
SVM Classification                      X
Time Series                             X


Predict Operators (for HDFS):

Operator                          Pig   MapReduce   Spark   Sqoop
Classifier                              X
Predictor                               X
PCA Apply                               X
Time Series Predictor             n/a   n/a         n/a     n/a


Model Validate Operators (for HDFS): 

Operator                          Pig   MapReduce   Spark   Sqoop
Alpine Forest Evaluator                 X
Confusion Matrix                        X
Goodness of Fit                         X
Lift                                    X
Regression Evaluator                                X
ROC                                     X

Tool Operators (for HDFS): 

Operator                          Pig   MapReduce   Spark   Sqoop
Export Operator                   n/a   n/a         n/a     n/a
Flow Control                      X
Note                              n/a   n/a         n/a     n/a
Pig Execute                       X
R Execute                         n/a   n/a         n/a     n/a
Sub-Flow*

*Note: For the Flow Control and Sub-Flow Operators, the HDFS data processing tool used depends on the customer-specific implementation.
