Products | Versions |
---|---|
Spotfire Enterprise Runtime for R | All |
data.frame
When using data.frame objects:
matrix
When using matrix objects:
matrix was the fastest method overall for computing mutual information, for both TERR and R.
data.table
When using data.table objects:
data.table and OpenMP Interaction in TERR
The data.table package includes its own parallel processing capabilities leveraging OpenMP. Since TERR is also configured to use OpenMP, it is suspected that a conflict or interaction between TERR's OpenMP implementation and data.table's internal OpenMP usage might be a contributing factor to TERR's slower performance when specifically using data.table objects.
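
One way to probe the suspected interaction is to restrict data.table to a single internal thread and repeat the timing. The sketch below is only an illustration: it assumes the installed data.table version provides `setDTthreads()` and `getDTthreads()`, and whether this changes TERR's timings was not part of the comparison described here. The small table and the `table()` call are placeholders for the real workload.

```r
# Sketch: run a timed section with data.table limited to one OpenMP thread.
# Guarded so the script still runs where data.table is not installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  # setDTthreads() returns the previous setting, so it can be restored later.
  old_threads <- data.table::setDTthreads(1)
  threads_during <- data.table::getDTthreads()

  # Placeholder workload (assumption): substitute the actual computation.
  dt <- data.table::data.table(x = rep(1:3, 100), y = rep(1:2, 150))
  elapsed <- system.time(counts <- table(dt$x, dt$y))[["elapsed"]]

  # Restore the previous thread setting.
  data.table::setDTthreads(old_threads)
}
```

If the gap between TERR and R narrows with a single data.table thread, that would be consistent with the OpenMP explanation, although it would not prove it.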
parallel_TERR_vs_R.html (Contains the detailed analysis of the results)
time_all.rds (R object containing the raw timing data)
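
The raw timing data can be loaded with `readRDS()`. The internal layout of the object is not described in this article, so the sketch below only inspects it generically. The file path is an assumption that the attachment has been saved to the working directory.

```r
# Sketch: load the attached timing data and look at its structure.
timing_file <- "time_all.rds"  # assumed location: current working directory

if (file.exists(timing_file)) {
  time_all <- readRDS(timing_file)
  str(time_all, max.level = 2)  # show only the top levels of the object
} else {
  message("File not found: ", timing_file)
}
```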
The findings from this comparison suggest the following strategies for achieving optimal performance in parallel computations:
When working with tabular data where data.frame semantics are sufficient, TERR may offer a speed advantage.
For matrix-based computations, R generally performs better, and using a matrix data structure is the most efficient approach overall in both TERR and R.
When using data.table objects, R currently demonstrates better performance. Consider this if data.table-specific features and performance are critical.
Regardless of the environment (TERR or R) or data structure, prefer a single parallel loop with a standard nested loop over nested parallel loops for the type of computations tested, as it yields better performance.
To reduce computation time, leverage multiple CPUs for parallel tasks whenever possible.
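
The "single parallel loop with a standard nested loop" pattern can be sketched as follows. This is a minimal illustration, not the benchmark code: it assumes the `parallel` package with a socket cluster, since the article does not state which parallel backend was used. The function name `pairwise_single_parallel` and the stand-in statistic `dot` are hypothetical.

```r
library(parallel)

# Compute a pairwise statistic over all column pairs of a matrix.
# Only the outer loop is distributed across workers; the inner loop
# is an ordinary serial for loop running inside each worker.
pairwise_single_parallel <- function(m, stat, workers = 2) {
  cl <- makeCluster(workers)
  on.exit(stopCluster(cl), add = TRUE)
  rows <- parLapply(cl, seq_len(ncol(m)), function(i, m, stat) {
    out <- numeric(ncol(m))
    for (j in seq_len(ncol(m))) {  # standard nested loop
      out[j] <- stat(m[, i], m[, j])
    }
    out
  }, m = m, stat = stat)
  do.call(rbind, rows)
}

# Stand-in statistic (hypothetical); replace with the real pairwise measure.
dot <- function(a, b) sum(a * b)

set.seed(1)
m <- matrix(rnorm(200), nrow = 20, ncol = 10)
res <- pairwise_single_parallel(m, dot, workers = 2)
```

Raising `workers` toward the number of available CPUs is how this pattern makes use of additional cores. Nesting a second parallel loop inside the worker function would correspond to the slower variant described above.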
This article summarizes a performance comparison between Spotfire Enterprise Runtime for R (TERR) and open-source R, focusing on parallel computations with varying numbers of CPUs and different data object types. The analysis involved timing computations for mutual information using both nested parallel loops and single parallel loops with a standard nested loop.
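
For reference, the mutual information of two discrete variables is I(X;Y) = sum over x, y of p(x,y) * log(p(x,y) / (p(x) * p(y))). The article does not describe the exact estimator used in the benchmark, so the function below is only an assumed plug-in estimate from observed frequencies. The name `mutual_information` is hypothetical.

```r
# Mutual information (in nats) between two discrete vectors,
# estimated from their empirical joint distribution.
mutual_information <- function(x, y) {
  joint <- table(x, y)
  joint <- joint / sum(joint)      # empirical joint probabilities p(x, y)
  px <- rowSums(joint)             # marginal p(x)
  py <- colSums(joint)             # marginal p(y)
  expected <- outer(px, py)        # p(x) * p(y) under independence
  nz <- joint > 0                  # empty cells contribute 0 to the sum
  sum(joint[nz] * log(joint[nz] / expected[nz]))
}

mi_same  <- mutual_information(c(0, 0, 1, 1), c(0, 0, 1, 1))  # identical variables
mi_indep <- mutual_information(c(0, 0, 1, 1), c(0, 1, 0, 1))  # independent variables
```

Identical binary variables with equal class frequencies give log(2), and independent variables give 0, which makes for a quick sanity check before timing anything.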