Caching data through Information Links in Spotfire

Article ID: KB0070590

Products Versions
Spotfire Analyst 7.5 and higher
Spotfire Automation Services 7.5 and higher

Description

Spotfire offers several options for caching data. This article covers situations where the load time needs to improve in both Spotfire Analyst and the Spotfire Web Player.

For large data sets, it can take a long time to open a dashboard in Spotfire Analyst or the Spotfire Web Player because the data must be read from the Information Link's source database. The (TIBCO) Spotfire Server (TSS) queries the database for every user who opens one of those dashboards from either client. The usual approaches each have limitations in this situation:
  • Caching the analysis with Scheduled Updates improves the load time in the Web Player, but not in the Spotfire Analyst client.
  • Saving the analysis with embedded data (DXP) or saving the data as SBDF in the library may not be an option if the data is larger than 2GB and the Spotfire application database is on Microsoft SQL Server (see KB 000028303 Size limit of items stored in the TIBCO Spotfire Library for more details).
  • If multiple analysis files contain data imported from the same Information Link, each analysis executes the same query, which may cause unnecessary load on the TSS or the database.
This article outlines how Information Link caching can be used to open the dashboard faster in both Spotfire Analyst and the Spotfire Web Player. It also avoids querying the database every time the report is opened, which decreases load on the database.

Issue/Introduction

This article explains ways to cache data through Information Links in Information Designer.

Resolution

Information Link caching is done on the (TIBCO) Spotfire Server (TSS) to return data to clients faster and to reduce load on the database. The TSS does not have to retrieve the data from the database server as long as the cache is not stale and, if a "Validation SQL query" is configured, its result is unchanged.

To enable caching on an Information Link, open the Spotfire Analyst client > Tools > Information Designer > select your Information Link > Caching, check the "Cacheable" checkbox, and adjust any other settings such as "Timeout (seconds)". When that Information Link is next executed, the corresponding cache entry is created in the Attachment Manager folder (<Spotfire Server Installation Directory>\7.5.X\tomcat\temp\AttachmentManager). The settings in the Spotfire Configuration Tool > Configuration tab > Attachment Manager apply here. The cache expires after the Timeout on the Information Link or the cache expiration time in Attachment Manager, whichever is shorter. Until the cache expires or the "Validation SQL query" returns a different result, new requests for the Information Link are served from the cached results.
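
The expiry rule above can be illustrated with a minimal Python sketch. It is not Spotfire code: the function names and parameters are hypothetical and only model the "whichever is shorter" rule and the optional "Validation SQL query" comparison described in this article.

```python
from datetime import datetime, timedelta

def effective_cache_lifetime(link_timeout_s, attachment_expiration_s):
    """Return the lifetime in seconds that applies: the shorter of the two settings."""
    return min(link_timeout_s, attachment_expiration_s)

def cache_is_usable(created_at, now, link_timeout_s, attachment_expiration_s,
                    cached_validation_result=None, current_validation_result=None):
    """Model of the rule: the cache is used only while it has not expired and
    the validation query result (if one is configured) has not changed."""
    lifetime = timedelta(seconds=effective_cache_lifetime(
        link_timeout_s, attachment_expiration_s))
    if now - created_at >= lifetime:
        return False  # cache is stale
    # With no validation query configured, both values are None and compare equal.
    return cached_validation_result == current_validation_result

# Example: a 2-day Timeout on the link is capped by a 1-day Attachment Manager expiration.
print(effective_cache_lifetime(2 * 86400, 86400))
```

In this model, raising the Timeout on the Information Link alone has no effect once it exceeds the Attachment Manager expiration time.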

Below are the ways to trigger Information Link caching with a single Spotfire Server:

a) Open the dashboard containing Information Links to be cached using either the Spotfire Web Player or Spotfire Analyst. This will trigger the Information Link to be cached on Spotfire Server.

b) Open the dashboard containing Information Links to be cached using Spotfire Automation Services or Scheduled Updates. This will trigger the Information Link to be cached on Spotfire Server. This has the added benefit that the Scheduled Updates job or Automation Services job can be scheduled to trigger the cache at a specific time (for example, at 7am daily before business hours).

Creating scheduled updates by using Spotfire Server
https://docs.tibco.com/pub/spotfire_server/latest/doc/html/TIB_sfire_server_tsas_admin_help/server/topics/creating_a_scheduled_update_by_using_spotfire_server.html

Information link caching through Spotfire Automation Services:
  1. Create an Automation Services job with an Open Analysis from Library task that opens the analysis linked to the Information Link.
  2. Schedule this Automation Services job in Windows Task Scheduler at the required interval (for example, at 7am daily before business hours) to fetch new data from the database and cache the Information Link. See KB 000021541 How to schedule an Automation Services job using the Microsoft Windows Task Scheduler for more details.
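
As a rough illustration of step 2, the Python sketch below builds a Windows `schtasks` command line for a daily 7am task. The helper name, the task name, and all paths are hypothetical. It also assumes that the job is submitted by a command-line job sender executable that takes the server URL and the job file as arguments; check KB 000021541 for the exact command that applies to your installation.

```python
def build_schtasks_command(task_name, job_sender_exe, server_url, job_file,
                           start_time="07:00"):
    """Build (but do not run) a schtasks command that creates a daily task.

    job_sender_exe, server_url and job_file are placeholders for your own
    environment; the argument order passed to the job sender is an assumption.
    """
    # Command the scheduled task will run, quoted for paths containing spaces.
    task_run = f'"{job_sender_exe}" "{server_url}" "{job_file}"'
    return [
        "schtasks", "/Create",
        "/TN", task_name,    # task name shown in Task Scheduler
        "/TR", task_run,     # program and arguments to run
        "/SC", "DAILY",      # run once per day
        "/ST", start_time,   # 24-hour start time, HH:mm
    ]

cmd = build_schtasks_command(
    "Warm Information Link cache",                   # hypothetical task name
    r"C:\AutomationServices\ClientJobSender.exe",    # hypothetical path
    "http://spotfireserver:80",                      # hypothetical server URL
    r"C:\Jobs\open_analysis_job.xml",                # hypothetical job file
)
print(cmd)
```

On a Windows machine the resulting list could be passed to `subprocess.run(cmd, check=True)` to register the task; creating the same task in the Task Scheduler user interface has the same effect.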

Considerations:

  • Whenever Spotfire Server is restarted, the Information Link cache expires and must be rebuilt.
  • Information Link caching is not usable as-is with clustered Spotfire Servers, because the cache is not shared across servers and must be built on each Spotfire Server separately. Opening an analysis through the Spotfire Web Player or Spotfire Automation Services against a specific Spotfire Server does not guarantee that the Information Link is cached on that server, because Node Manager services can cache the Information Link data on any of the Spotfire Servers, or sometimes on all of them. This can leave the cached data out of sync between servers. However, opening the dashboard manually in Spotfire Analyst while connected to each Spotfire Server in turn creates the Information Link cache on each server.
  • The default cache expiration time in Spotfire Attachment Manager is 1 day, so this value must be increased if the cache needs to be retained for longer.
  • Spotfire Attachment Manager caches library content and the results of Information Link executions when large amounts of data are downloaded or saved. If the cache expiration time in Attachment Manager is extended to a week, the other cached content is also kept for a week, and the temp folder will occupy more disk space.
  • The default cache size in Spotfire Attachment Manager is 10GB, so this value must be increased when caching large data sets, provided the disk space is available.
  • Multiple dashboards that use the same Information Link should not be run at the same time; otherwise each dashboard will try to cache the same Information Link. If the data volume retrieved by that Information Link is large, disk space can be exhausted, and database performance suffers because the same query runs several times simultaneously. Let one dashboard cache the Information Link completely, and then trigger the rest of the dashboards so that they use the Information Link cache.
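
The last consideration can be sketched as a simple ordering step. The Python function below is a hypothetical planning helper, not part of Spotfire: it takes a mapping from dashboard names to the Information Links they use and splits the dashboards into a first wave, to be opened one at a time so that each Information Link is cached exactly once, and a second wave that can be triggered afterwards.

```python
from typing import Dict, List, Tuple

def plan_warmup(dashboards: Dict[str, List[str]]) -> Tuple[List[str], List[str]]:
    """Split dashboards into (first_wave, second_wave).

    first_wave: run sequentially; each one caches at least one new Information Link.
    second_wave: every Information Link they use is already cached by the first wave.
    """
    cached_links = set()
    first_wave, second_wave = [], []
    for name, links in dashboards.items():
        if any(link not in cached_links for link in links):
            first_wave.append(name)      # this dashboard will build a new cache entry
            cached_links.update(links)
        else:
            second_wave.append(name)     # all of its links are already covered
    return first_wave, second_wave

# Hypothetical dashboards and Information Link names.
first, rest = plan_warmup({
    "Sales overview": ["IL_Sales"],
    "Sales by region": ["IL_Sales"],
    "Inventory": ["IL_Sales", "IL_Stock"],
})
print(first, rest)
```

The first wave must finish caching before the second wave starts; the sketch only decides the order and does not open any analysis itself.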

Additional Information

  • Doc: Information Link Tab, caching section
  • Doc: Setting max Information link cache expiration time in Attachment Manager
  • Doc: Creating scheduled updates by using Spotfire Server
  • Doc: Creating Automation Jobs
  • KB: 000021541 How to schedule an Automation Services job using the Microsoft Windows Task Scheduler
  • KB: 000035482 Various Caching options in Spotfire