Change Hadoop /tmp Directory

Article ID: KB0082613

Products Versions
Spotfire Data Science 6.x

Resolution

To change the temporary directory for Hadoop jobs, follow these steps:

1. Configure these parameters in alpine.conf:

alpine.hadoop.lib.dir=/spark_temp
alpine.workflow.hadoop_tmp_path=/new_temp/ # this one should be configured in the UI under Actions -> Preferences -> Workflow
alpine.hdfs.baseTempDir=/new_temp/

2. Restart Spotfire Data Science, then make sure that Actions -> Preferences -> Workflow shows the new /new_temp/ value.

3. Configure the workflow temp directory to point to the new /new_temp/ location.

4. Add pig.temp.dir with the value /new_temp/ to the Hadoop data source parameters.
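As a quick sanity check after step 1, the sketch below reports which of the three keys are absent from a config file. The function name check_alpine_conf and the path passed to it are illustrative assumptions, not part of the product; the article does not state where alpine.conf lives, so the caller supplies the path.

```shell
#!/bin/sh
# Hypothetical helper: list any of the three temp-directory keys from
# step 1 that do not appear in the given config file.
# Returns 0 when all keys are present, 1 otherwise.
check_alpine_conf() {
    conf_file=$1
    missing=0
    for key in alpine.hadoop.lib.dir \
               alpine.workflow.hadoop_tmp_path \
               alpine.hdfs.baseTempDir; do
        # -F treats the key as a fixed string, so the dots are literal.
        # Note: this matches anywhere on a line, including commented-out lines.
        if ! grep -F -q "${key}=" "$conf_file"; then
            echo "missing: ${key}"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (path is a placeholder):
#   check_alpine_conf /path/to/alpine.conf
```

The check only confirms that the keys exist in the file; it does not validate their values or the settings made in the UI.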
 

How to migrate /tmp/ directory in 5.9.1 and later

For new installations of 5.9.1, we've moved the location of temporary files in TIBCO Spotfire Data Science to $CHORUS_DATA/tmp/. This gives the application a more stable location for expanding its JAR files at startup.

However, upgrading from a previous version of TIBCO Spotfire Data Science to 5.9.1 requires one small change.

Open deploy.properties and add this line:

alpine.java.io.tmpdir=$CHORUS_DATA/tmp

Save the file and restart Alpine for the change to take effect.
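The edit can also be scripted. The following is a minimal sketch, assuming the path to deploy.properties is passed in by the caller; the function name add_tmpdir_property is hypothetical. It appends the line only if the property is not already set, so it is safe to run more than once.

```shell
#!/bin/sh
# Hypothetical helper: append the tmpdir property to a properties file
# unless a line for it already exists.
add_tmpdir_property() {
    props_file=$1
    # Single quotes keep $CHORUS_DATA literal, matching the line shown above.
    line='alpine.java.io.tmpdir=$CHORUS_DATA/tmp'
    if grep -q '^alpine\.java\.io\.tmpdir=' "$props_file"; then
        echo "already set in $props_file"
    else
        # If the file does not end with a newline, add one first so the
        # new property starts on its own line.
        if [ -n "$(tail -c 1 "$props_file")" ]; then
            echo >> "$props_file"
        fi
        printf '%s\n' "$line" >> "$props_file"
        echo "added to $props_file"
    fi
}

# Example call (path is a placeholder):
#   add_tmpdir_property /path/to/deploy.properties
```

The restart is still a separate manual step after the file has been updated.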


Change setting from using /tmp to $CHORUS_DATA/tmp

These steps are not needed for fresh installs of 5.9.1 or later. From 5.9.1 onward, $CHORUS_DATA/tmp is used instead of /tmp to store the startup agent and Jetty files; the location is set in the build and configured automatically during a fresh install.

For versions 5.8.x and 5.9.0 (5.7 has not been verified), three files have to be changed.

Before editing them, create the new $CHORUS_DATA/tmp directory with mkdir. Each file then needs -Djava.io.tmpdir=<new path location> set:

  1. $CHORUS_HOME/shared/chorus.properties
    [chorus@ip-10-10-0-247 shared]$ grep tmp chorus.properties
    
    java_options = -Djava.security.egd=file:/dev/./urandom -server -Xmx4096m -Xms2048m -Xmn1365m -XX:MaxPermSize=256m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ParallelGCThreads=3 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./ -XX:+CMSClassUnloadingEnabled -Djava.io.tmpdir=/data/chorus/tmp
    [chorus@ip-10-10-0-247 shared]$
  2. For the two files below, add the -Djava.io.tmpdir setting to the end of the file.
    • $CHORUS_HOME/alpine-current/agent/templates/jetty.ini
      [chorus@ip-10-10-0-247 ~]$ grep tmp $CHORUS_HOME/alpine-current/agent/templates/jetty.ini
      -Djava.io.tmpdir=/data/chorus/tmp
    • $CHORUS_HOME/alpine-current/jetty-distribution-9.2.12.v20150709/start.ini
      [chorus@ip-10-10-0-247 ~]$ grep tmp $CHORUS_HOME/alpine-current/jetty-distribution-9.2.12.v20150709/start.ini
      -Djava.io.tmpdir=/data/chorus/tmp
  3. After changing all three files, restart Chorus.
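For the two .ini files in step 2, the directory creation and the append can be sketched as one helper. This is an illustration only: the function name append_tmpdir_flag is hypothetical, and it deliberately leaves chorus.properties alone, because there the flag belongs inside the existing java_options line rather than at the end of the file.

```shell
#!/bin/sh
# Hypothetical helper: create the new temp directory, then append the JVM
# flag to each file given, skipping files that already set java.io.tmpdir.
# Usage: append_tmpdir_flag <new tmp dir> <file> [<file> ...]
append_tmpdir_flag() {
    new_tmp=$1
    shift
    mkdir -p "$new_tmp" || return 1
    for ini_file in "$@"; do
        if grep -q -e '-Djava\.io\.tmpdir=' "$ini_file"; then
            echo "skipped (already set): $ini_file"
        else
            # Make sure the flag lands on its own line.
            if [ -n "$(tail -c 1 "$ini_file")" ]; then
                echo >> "$ini_file"
            fi
            printf '%s\n' "-Djava.io.tmpdir=$new_tmp" >> "$ini_file"
            echo "updated: $ini_file"
        fi
    done
}

# Example call, using the paths from the steps above:
#   append_tmpdir_flag /data/chorus/tmp \
#       "$CHORUS_HOME/alpine-current/agent/templates/jetty.ini" \
#       "$CHORUS_HOME/alpine-current/jetty-distribution-9.2.12.v20150709/start.ini"
```

Skipping files that already contain the flag avoids duplicate entries if the script is rerun; a file pointing at an old location would still need to be corrected by hand.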

If you are upgrading to 5.9.1 from an older build, the only change you need is to add the following line to deploy.properties, making sure that the directory already exists:

alpine.java.io.tmpdir=$CHORUS_DATA/tmp

Note: Some /tmp disk space is still required for upgrades. As of Alpine version 5.9, reserve about 6 GB in /tmp for running the upgrade script.
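
One way to check the free space before starting an upgrade is sketched below. The function names available_kb and has_free_gb are hypothetical, and the sketch reads the figure above as 6 gigabytes.

```shell
#!/bin/sh
# Hypothetical helper: print the space available on the filesystem that
# holds a directory, in kilobytes.
# df -P selects the portable output format; -k reports 1024-byte units.
available_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: succeed only if the directory has at least the
# requested number of gigabytes free.
# Usage: has_free_gb <directory> <gigabytes>
has_free_gb() {
    dir=$1
    needed_kb=$(( $2 * 1024 * 1024 ))
    [ "$(available_kb "$dir")" -ge "$needed_kb" ]
}

# Example call:
#   has_free_gb /tmp 6 || echo "Less than 6 GB free in /tmp"
```

The check reflects free space at the moment it runs, so it is worth repeating immediately before launching the upgrade script.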