Vectorization of Apply Operations for the Exploitation of the Efficient Interpretation of R
In this presentation, we discuss an approach that reduces the interpretation overhead of R by vectorizing the Apply class of operations. The standard implementation of Apply incurs significant overhead because it applies the input function to each element of the input data, one element at a time. Our approach combines data transformation and function vectorization to convert this looping-over-data execution into vector operations. Because R has built-in support for vector data types and vector operations, the transformed code incurs much less interpretation overhead. We implemented the vector transformations as an R package that can be invoked from the standard GNU R interpreter. We evaluated the approach on a suite of data analysis algorithm benchmarks, and the results show that the transformed code achieves an average speedup of 15x for iterative algorithms and 5x for direct (single-pass) algorithms.
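As a rough illustration of the idea, the sketch below performs the transformation by hand for one simple case. The names `data_list` and `scale_shift` and the sample values are hypothetical, chosen only for this example; they are not the API of the package described above, which is meant to do this kind of rewriting automatically. The sketch also assumes the applied function is already element-wise, so that it can be called once on a whole vector.

```r
# Hypothetical input: a list of scalar values (not taken from the talk).
data_list <- list(1.5, 2.0, 3.5, 4.0)

# Hypothetical scalar function; it uses only element-wise arithmetic,
# so it also works unchanged on a whole vector.
scale_shift <- function(v) 2 * v + 1

# Looping-over-data form: the interpreter calls scale_shift once per element.
looped <- lapply(data_list, scale_shift)

# Data transformation: pack the list elements into a single numeric vector.
data_vec <- unlist(data_list)

# Vector form: one call to scale_shift on the whole vector.
result_vec <- scale_shift(data_vec)

# Convert back to a list so the result has the same shape as lapply's output.
vectorized <- as.list(result_vec)

# Both forms produce the same result.
stopifnot(identical(looped, vectorized))
```

The looped form pays the function-call cost once per element, while the vector form pays it once overall, which is where the reduction in interpretation overhead would come from in this simplified setting.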
Haichuan Wang is a Ph.D. student in the Department of Computer Science at the University of Illinois at Urbana-Champaign, where he works with Professor David Padua. His research areas include compilers, runtime systems, and parallel computing. He is currently working on compiler and runtime optimization for dynamic scripting languages. Before that, Wang was a Research Staff Member at IBM Research - China, where he worked on parallel programming models and performance tooling for the Java language.