How to install additional R packages for a Spotfire Server TERR service deployed in a Kubernetes cluster

Article ID: KB0070421

Products: Spotfire Server
Versions: 12.0 and later

Description

In most cases, you need to install additional R packages on the machine that runs your Spotfire Server TERR service. This presents a challenge when deploying in a Kubernetes cluster (using the Spotfire Cloud Deployment Kit), because the changes must be applied to the running spotfire-terrservice pod. Follow the steps below to implement a Helm post-install hook for the Spotfire TERR service that installs additional R packages after the TERR service pod has started.
 

Issue/Introduction

Outlines the steps needed to install additional R packages for a Spotfire Server TERR service deployed in a Kubernetes cluster. This article demonstrates installing the 'rinclude' package using a Helm post-install hook.

Resolution

1. Create a persistent volume to hold the R packages that you will eventually install on the TERR service pod. A persistent volume claim is also needed.
 
$ kubectl apply -f - <<!
# Create the PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: terr-pv-volume
  labels:
    type: local
spec:
  storageClassName: standard
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
---
# Create the PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: terr-pv-claim
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
!
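
Before continuing, you can confirm that the claim has bound to the volume; the STATUS column should read Bound for both:

$ kubectl get pv terr-pv-volume
$ kubectl get pvc terr-pv-claim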

2. Create a cluster role and cluster role binding. These grant the default service account, which the post-install Job created later runs under, permission to list pods and to run kubectl exec against the TERR service pod.
 
$ kubectl apply -f - <<!
# Create the cluster role
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-viewer
rules:
- apiGroups: [""]
  resources: ["pods", "pods/exec"]
  verbs: ["get", "list", "create"]
---
# Create the cluster role binding
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: default-pod-viewer-binding
subjects:
- kind: ServiceAccount
  name: default
  namespace: default
roleRef:
  kind: ClusterRole
  name: pod-viewer
  apiGroup: rbac.authorization.k8s.io
!
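
To verify that the new permissions are in place, you can ask the API server what the default service account may do; both commands should print 'yes':

$ kubectl auth can-i list pods --as=system:serviceaccount:default:default
$ kubectl auth can-i create pods --subresource=exec --as=system:serviceaccount:default:default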

3. Modify the provided Dockerfile for the TERR service under spotfire-cloud-deployment-kit/containers/images/spotfire-terrservice/Dockerfile. First, find the section where the initial apt-get update is performed:
 
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update \
  && apt-get clean all

Then add additional lines to install additional Linux packages that are required to install kubectl and install R packages inside the spotfire-terrservice pod:
 
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update \
  && apt-get clean all
# additional packages for TERR & kubectl
RUN apt-get install -y libtinfo5 make r-base-core curl ca-certificates apt-transport-https
RUN curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
RUN install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

Then set PATH and JAVA_HOME just prior to the final line, where the entrypoint-terr.sh script is run.
 
# Configure logging
ENV LOGGING_SERVICELOG_SIZE=10MB
ENV LOGGING_SERVICELOG_MAX=2

# add PATH and JAVA_HOME
ENV JAVA_HOME="/usr/lib/jvm/java-17-openjdk-amd64"
ENV PATH="/usr/lib/jvm/java-17-openjdk-amd64/bin:${PATH}"

CMD ["./entrypoint-terr.sh"]

Then build the image according to the Spotfire CDK documentation.
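
As a sketch, a manual build and push of just this image could look like the following; the registry address and tag are assumptions chosen to match the image reference used by the post-install Job in step 5, and the CDK's documented build procedure remains the authoritative path:

$ cd spotfire-cloud-deployment-kit/containers
$ docker build -t 127.0.0.1:5000/spotfire/spotfire-terrservice:1.17.0-2.0.0 images/spotfire-terrservice
$ docker push 127.0.0.1:5000/spotfire/spotfire-terrservice:1.17.0-2.0.0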

4. Obtain your additional R packages from a known repository such as https://tran.tibco.com/terr61/ or https://cran.r-project.org/, and place them in the local directory that maps to the persistent volume you created earlier. In this example, the rinclude package has been downloaded and extracted to the local directory /mnt/data (a download sketch follows the listing):
 
$ ls -al /mnt/data/rinclude/
total 32
drwxr-xr-x 5 user group   100 Sep 15  2020 .
drwxr-xr-x 3 root root     22 Dec 20 13:19 ..
-rw-r--r-- 1 user group   293 Sep 15  2020 DESCRIPTION
drwxr-xr-x 3 user group    21 Dec 15 20:28 inst
-rw-r--r-- 1 user group 18011 Sep 15  2020 LICENSE
drwxr-xr-x 2 user group    33 Dec 15 20:28 man
-rw-r--r-- 1 user group  2986 Sep 15  2020 MD5
-rw-r--r-- 1 user group     1 Sep 15  2020 NAMESPACE
drwxr-xr-x 2 user group    19 Dec 15 20:28 R
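
If you are starting from a source tarball, the download-and-extract step might look like this sketch; the URL path and version number are placeholders rather than a verified download location:

$ cd /mnt/data
$ curl -LO "https://tran.tibco.com/terr61/src/contrib/rinclude_1.0.0.tar.gz"   # placeholder version
$ tar -xzf rinclude_1.0.0.tar.gz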

5. Create a new file named 'post-install-job.yaml', and place it in spotfire-cloud-deployment-kit/helm/charts/spotfire-terrservice/templates/. The contents should look like the following:
 
# The Job starts a TERR container and runs 'install.packages()', which
# installs the packages onto the PersistentVolume pointed
# to by the PersistentVolumeClaim 'terr-pv-claim'.
apiVersion: batch/v1
kind: Job
metadata:
  name: "post-install-job"
  labels:
    app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
    app.kubernetes.io/instance: {{ .Release.Name | quote }}
    app.kubernetes.io/version: {{ .Chart.AppVersion }}
    helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
  annotations:
    # This is what defines this resource as a hook. Without this line, the
    # job is considered part of the release.
    "helm.sh/hook": post-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    metadata:
      name: "post-install-job"
      labels:
        app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
        app.kubernetes.io/instance: {{ .Release.Name | quote }}
        helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    spec:
      volumes:
        - name: terr-pv-storage
          persistentVolumeClaim:
            claimName: terr-pv-claim
      restartPolicy: Never
      containers:
        - name: post-install-job
          image: 127.0.0.1:5000/spotfire/spotfire-terrservice:1.17.0-2.0.0
          imagePullPolicy: Always
          command: ["/bin/sh"]
          args: ['-c', 'TERR_SERVICE_POD_NAME=$(kubectl get pods -o name | grep terrservice) ;
                  kubectl exec $TERR_SERVICE_POD_NAME --
                  /opt/spotfire/nodemanager/nm/services/TERR/terr/bin/TERR -e
                  ''install.packages( file.path("/opt/packages",
                  list.files("/opt/packages")),
                  repos = NULL, type = "source" )''']
          volumeMounts:
          - mountPath: "/opt/packages"
            name: terr-pv-storage

Here, the Job looks up the exact name of the running spotfire-terrservice pod, then runs the TERR install.packages() command inside that pod. The pod's /opt/packages directory is mapped (via the persistent volume) to the local /mnt/data directory, where we placed the rinclude package files.
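
Before installing the chart, you can render the hook template locally to confirm it produces valid YAML; this is a sketch, with any other required chart values omitted for brevity:

$ helm template tss1400-terrservice . \
  --set acceptEUA=true \
  --show-only templates/post-install-job.yaml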

6. Deploy the spotfire-terrservice pod using the PV/PVC-related options, as noted in the CDK documentation.
 
$ helm install tss1400-terrservice . \
  --set acceptEUA=true \
  --set global.spotfire.image.registry="127.0.0.1:5000" \
  --set global.spotfire.image.pullPolicy="Always" \
  --set nodemanagerConfig.serverBackendAddress="$SPOTFIRE_SERVER" \
  --set logging.logForwarderAddress="$LOG_FORWARDER" \
  --set persistentVolumeClaim.create="false" \
  --set volumes.packages.existingClaim="terr-pv-claim" \
  --set volumes.packages.persistentVolumeClaim.volumeName="terr-pv-volume"

Here, we are directing Helm to use the existing 'terr-pv-claim' and 'terr-pv-volume' that we created in an earlier step.
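
Note that because the hook's delete policy is hook-succeeded, the Job is cleaned up automatically once it completes; while it is still running (or if it fails), you can inspect it:

$ kubectl get job post-install-job
$ kubectl logs job/post-install-job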

7. Run installed.packages() in the spotfire-terrservice pod and confirm that 'rinclude' appears in the list:
 
$ kubectl exec tss1400-terrservice-spotfire-terrservice-6d676b596f-dt5l7 \
-- /opt/spotfire/nodemanager/nm/services/TERR/terr/bin/TERR -e 'installed.packages()'
...
methods           NA      "file LICENSE"        NA    "6.1.0"
parallel          NA      "file LICENSE"        NA    "6.1.0"
rinclude          NA      "GPL-2 (see LICENSE)" NA    "4.0.2"
sjdbc             NA      "file LICENSE"        NA    "6.1.0"
splines           NA      "file LICENSE"        NA    "6.1.0"
...
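
As a final check, loading the package in TERR should succeed without errors (substitute the name of your running pod):

$ kubectl exec tss1400-terrservice-spotfire-terrservice-6d676b596f-dt5l7 \
-- /opt/spotfire/nodemanager/nm/services/TERR/terr/bin/TERR -e 'library(rinclude)'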