TERR processes accumulate from each use of the TERR operator

Article ID: KB0078124

Updated On:

Products Versions
TIBCO Streaming -

Description

Repeatedly running the TERR model in the TIBCO Streaming TERR operator results in extra TERR processes. Why do these appear and accumulate? They do not appear to be doing anything.
 

Issue/Introduction

Alternative to using makeCluster() within the TERR script

Resolution

A TERR script that calls 'parallel::makeCluster()' creates additional TERR processes to perform calculations in parallel. The adapter does not stop these processes automatically, so the script must call 'stopCluster()' itself before it exits. The script must therefore also handle any error that would otherwise end it early, so that 'stopCluster()' is still called when an error occurs.

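The script can create and shut down the additional processes within a single function, for example:
  # model.all, multiRF, multiGLM, features and in.df are assumed to be
  # defined earlier in the script.
  makeModelsInParallel <- function() {
    suppressWarnings(suppressPackageStartupMessages(library(parallel)))
    # Leave one core free, but always request at least one worker.
    nw <- max(1, detectCores() - 1)
    if (identical(parallel:::getClusterOption("type"), "TERR")) {
      # TERR cluster: cap each worker's Java heap at 1024 MB.
      cl <- makeCluster(nw, JAVA_OPTIONS = rep("-Xmx1024m", nw))
    } else {
      cl <- makeCluster(nw)
    }
    # Stop the worker processes when this function exits, even on error.
    on.exit(stopCluster(cl))
    # Copy the input objects to every worker.
    clusterExport(cl, c("features", "in.df"))
    # Load the modelling packages on every worker.
    clusterEvalQ(cl, {library(glmnet); library("randomForest"); library(data.table)})
    # For Lap reply data, LAxPChannelFiles
    outListRF <- parLapply(cl, model.all, fun = multiRF)
    outListGLM <- parLapply(cl, model.all, fun = multiGLM)
    list(outListRF, outListGLM)
  }

  models <- makeModelsInParallel()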
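
In the function above, 'on.exit()' stops the cluster, but an error still propagates and ends the script. If the script should continue after a failure, the cleanup can be combined with 'tryCatch()'. The following is a minimal sketch only: 'runWithCluster', its 'workers' argument, and the simulated failure are hypothetical names introduced for illustration and are not part of the TERR operator or the original script.

```r
library(parallel)

# Hypothetical helper: runs `work` on a temporary cluster and always
# stops the cluster afterwards, whether or not `work` fails.
runWithCluster <- function(work, workers = 2) {
  cl <- makeCluster(workers)
  tryCatch(
    work(cl),
    error = function(e) {
      # Report the failure instead of letting it end the script.
      message("Parallel step failed: ", conditionMessage(e))
      NULL
    },
    # The finally clause runs on success and on error.
    finally = stopCluster(cl)
  )
}

# Normal run: the worker processes are stopped after the result is returned.
ok <- runWithCluster(function(cl) parLapply(cl, 1:4, function(x) x^2))

# Failing run: the error is caught and the worker processes are still stopped.
failed <- runWithCluster(function(cl) stop("simulated model failure"))
```

With this pattern the caller checks for a NULL result rather than relying on the error to end the script, and no worker processes are left behind in either case.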

Alternatively, the TERR operator supports Multiplicity as a StreamBase property (in the operator's StreamBase Properties view, Concurrency tab: Multiplicity = multiple, Number of instances = N). With this setting the TERR script does not have to create its own parallel instances in order to run multiple TERR processes. This is preferable to making the TERR script more complex.
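
When parallelism is left to the operator's Multiplicity setting, the script itself can stay serial and needs neither 'makeCluster()' nor 'stopCluster()'. The following is a rough sketch under assumptions: 'in.df', 'model.all', and 'fitOne' are hypothetical stand-ins for the script's real input data and model-fitting code.

```r
# Hypothetical input data standing in for the tuple data passed to the script.
in.df <- data.frame(
  y  = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2),
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(0.5, 0.1, 0.9, 0.3, 0.7, 0.2)
)

# Hypothetical list of model specifications to fit.
model.all <- list(simple = y ~ x1, full = y ~ x1 + x2)

# Hypothetical model-fitting function; a real script would call its own
# fitting code (for example randomForest or glmnet) here.
fitOne <- function(f) lm(f, data = in.df)

# Plain serial loop: no cluster is created, so no extra processes remain.
models <- lapply(model.all, fitOne)
```

Because no cluster is created inside the script, there is nothing to clean up when it finishes or fails.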