Distributed Performance Analysis for R (RIOT 2015 - R Implementation, Optimization and Tooling Workshop)

Sun 5 - Fri 10 July 2015 Prague, Czech Republic

Track

RIOT 2015

When

Sun 5 Jul 2015 11:30 - 12:00 at FIT-111 - Analysis

Abstract

R is the most widely used programming language among statisticians in research, especially in biostatistics and bioinformatics, where data sets are high-dimensional and analyses demand very large amounts of computing resources. Our existing tool traceR lets the user profile the resource usage of an application to locate bottlenecks and develop new optimizations. Previously, traceR was limited to non-parallelized R applications. Parallel computing, however, is becoming an increasingly popular way to reduce the effective runtime of compute-bound R applications. We have therefore extended traceR to profile such applications.

Compared to existing profiling tools such as Rprof, traceR is directly integrated with the R interpreter. This enables it to generate more detailed and accurate data about the memory behavior and runtime of an R application. For example, it reports the size and number of memory allocations made during execution. Because the gain from parallel execution can be negated if the combined memory requirements of all parallel processes exceed the capacity of the system, this data can serve as a constraint for determining the maximum degree of parallelism. The information gathered with traceR can also guide scheduling decisions so that resources are used efficiently. Such decisions are especially important if the hardware system is heterogeneous or if the jobs have varying resource requirements depending on the input data.

In this talk we will present our profiling tool traceR and how to apply it to analyze parallel R programs.
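
The memory constraint mentioned in the abstract can be illustrated with a little arithmetic. The sketch below is not traceR's interface or output format; the function name, its parameters, and the numbers are hypothetical, and Python is used only for illustration. It assumes a profiler has already reported the peak memory of one worker process and derives the largest worker count whose combined peak memory still fits into the available memory.

```python
def max_parallel_workers(cores, available_bytes, peak_bytes_per_worker):
    """Largest worker count whose combined peak memory fits in memory.

    All three inputs are assumed to be known in advance, e.g. the
    per-worker peak from a profiling run (hypothetical input).
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if peak_bytes_per_worker <= 0:
        raise ValueError("peak_bytes_per_worker must be positive")
    # How many workers fit into memory, ignoring the CPU count.
    by_memory = available_bytes // peak_bytes_per_worker
    # Never use more workers than cores; 0 means not even one worker fits.
    return max(0, min(cores, by_memory))


GIB = 1024 ** 3

# Hypothetical system: 16 cores, 32 GiB free, 5 GiB peak per worker.
workers = max_parallel_workers(16, 32 * GIB, 5 * GIB)
print(workers)  # prints 6: memory, not the core count, is the limit
```

With these made-up numbers the job would be started with 6 workers instead of 16, since 7 workers would already need 35 GiB. On a heterogeneous system the same calculation would be repeated per machine.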

Session Program

Sun 5 Jul
Displayed time zone: (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

11:00 - 12:30	Analysis (RIOT) at FIT-111

11:00 30m Talk	Detecting Memory Protection Errors in GNU-R using Static Checking (RIOT). Tomas Kalibera, Northeastern University
11:30 30m Talk	Distributed Performance Analysis for R (RIOT). Helena Kotthaus, TU Dortmund
12:00 30m Talk	Feature Specific Profiling in the R Language (RIOT). Leif Andersen, PLT @ Northeastern University

Helena Kotthaus

TU Dortmund

Tracks

Workshops
