
Benchmarking in R

30 Mar 2020

Reading time ~6 minutes

The process of measuring and comparing performance metrics is often crucial for accurately evaluating an algorithm’s efficiency, or simply for obtaining the runtime of a chunk of code. Benchmarking, as this is commonly called, is the measurement of how much time (or memory) a program takes to execute, and the results can vary between programming languages. (I recall that I first encountered the term while using the <chrono> library’s high-resolution clock to time C++ code snippets!)

In R, benchmarking is especially worth paying attention to, since code in an interpreted language can behave quite differently from its counterpart in strictly compiled languages. For example, in most compiled languages such as C or C++, multiplying or dividing by powers of two via the bitwise shift operators (<< and >>) tends to be faster than using the * and / operators, but that intuition does not necessarily carry over to R (as the microbenchmark example further below shows, bitwShiftL() is actually slower than plain multiplication). Such differences make it worthwhile to measure performance directly, in order to distinguish between alternative methods and choose the one that is better in terms of resource usage.

Given that my GSoC’20 project for R involves benchmarking, I thought I’d write a post about benchmarking methods in R, and so here it is!
There are six benchmarking functions/libraries in R that I am aware of or have encountered so far:

Sys.time

As the name indicates, it gets the system’s time at the moment when invoked. To use it for benchmarking purposes, simply call it before and after the code snippet you want to benchmark and the difference between the start time and end time is your desired result. To make it easier, you can enclose the two Sys.time() calls with the code in between within a function as well.
Example:

start <- Sys.time()
Sys.sleep(10) # Sleep for 10 seconds
stop <- Sys.time()
stop - start
# Output: 
Time difference of 10.00227 secs
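As suggested above, the two Sys.time() calls can be wrapped into a small reusable helper. A minimal sketch (time_it is an illustrative name I made up, not a base R function):

```r
# A tiny timing helper built on Sys.time():
time_it <- function(expr) {
  start <- Sys.time()
  force(expr)  # evaluate the expression passed in
  stop <- Sys.time()
  stop - start # returns a 'difftime' object
}

time_it(Sys.sleep(1))
# prints something like: Time difference of 1.002 secs
```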

system.time

Another readily available alternative that is used to evaluate the time elapsed for an R expression. Just enclose your expression within the system.time() block.
Example:

system.time(Sys.sleep(10))
# Output:
   user  system elapsed 
   0.00    0.00   10.06 
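Since system.time() accepts any single R expression, several statements can be timed together by wrapping them in braces. A minimal sketch:

```r
# Timing a multi-statement block: the braces make it one expression.
timings <- system.time({
  x <- rnorm(1e6)
  s <- sum(x)
})
timings["elapsed"]  # elapsed wall-clock time in seconds
```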

tictoc

This library consists of two functions, tic() and toc(), which act much like the start and stop variables I declared above in the Sys.time example, except that they can also take arguments such as a message to print. Fairly simple and straightforward.
Example:

# Importing the required library: (not a standalone function like the ones above!)
library(tictoc)
tic("sleeping for 10 seconds") # start
Sys.sleep(10)
toc() # stop
# Output:
sleeping for 10 seconds: 10.021 sec elapsed

One can nest multiple timers and overlap benchmarking sequences as well, by enclosing tic() ... toc() blocks inside other tic() and toc() blocks.
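A minimal sketch of such nesting (the timer messages here are illustrative):

```r
library(tictoc)

tic("outer")  # outer timer starts
tic("inner")  # inner timer starts
Sys.sleep(1)
toc()         # stops the inner timer, printing its elapsed time
Sys.sleep(1)
toc()         # stops the outer timer (covers both sleeps)
```

Each toc() call closes the most recently opened tic(), so the outer timer reports roughly twice the elapsed time of the inner one.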

rbenchmark

rbenchmark::benchmark() is a wrapper around system.time() which is almost its equivalent, except for added convenience (for instance, it requires just one benchmark call to time multiple replications of multiple expressions, and the returned results are conveniently organized in a data frame).

Let’s consider an example from the rbenchmark documentation which benchmarks two named expressions ‘expr1’ and ‘expr2’ with three different replication counts, with the output sorted by test name and replication count:

# Importing the required library: 
library(rbenchmark)
# Creating simple test functions to be used in the example below:
random.array <- function(rows, cols, dist = rnorm)
  array(dist(rows * cols), c(rows, cols))
random.replicate <- function(rows, cols, dist = rnorm)
  replicate(cols, dist(rows))
# Benchmarking the two expressions:
within(benchmark("expr1" = random.replicate(100, 100),
                 "expr2" = random.array(100, 100),
                 replications = 10 ^ (1:3),
                 columns = c('test', 'replications', 'elapsed'),
                 order = c('test', 'replications')),
       { average = elapsed / replications })
# Output:
   test replications elapsed average
1 expr1           10    0.02 0.00200
3 expr1          100    0.11 0.00110
5 expr1         1000    0.98 0.00098
2 expr2           10    0.02 0.00200
4 expr2          100    0.09 0.00090
6 expr2         1000    0.65 0.00065

microbenchmark

Similar to the above, microbenchmark::microbenchmark() is usually used to compare the running times of multiple R expressions. When printed, the result shows the columns min, lq, mean, median, uq, max and neval (apart from expr), but that is actually a summary of the benchmark. The call itself returns all of the individual timings for each expression, and as.data.frame() can be used to obtain them explicitly as a data frame.

Here is an example with a microbenchmark summary for computing 15 squared via different methods, along with a bitwise left shift (a multiply-by-two, included to highlight the point I made at the beginning of this post). Note that the shift and the square-root calls take more time than plain multiplication or exponentiation. To convince someone of such conclusions, reproducible benchmarks like the one below (I would recommend using reprex) can serve as evidence; note that the timings are in nanoseconds.

# Importing the required library: 
library(microbenchmark)
a = 15
microbenchmark(a ^ 2,
               a * a,
               bitwShiftL(a,1), 
               sqrt(a) * sqrt(a) * a,
               sqrt(a) * sqrt(a) * sqrt(a) * sqrt(a))
# Output:
Unit: nanoseconds
                                  expr min  lq mean median   uq  max neval
                                   a^2 100 200  214    200  200 1500   100
                                 a * a 100 100  184    200  200  500   100
                      bitwShiftL(a, 1) 500 500  653    600  600 4600   100
                 sqrt(a) * sqrt(a) * a 500 500  606    600  600 3900   100
 sqrt(a) * sqrt(a) * sqrt(a) * sqrt(a) 800 900 1030   1000 1100 3300   100
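To work with the raw per-iteration timings instead of the printed summary, the result can be coerced with as.data.frame(), as mentioned earlier. A minimal sketch:

```r
library(microbenchmark)

a <- 15
mb <- microbenchmark(a ^ 2, a * a, times = 50)
raw <- as.data.frame(mb)  # one row per individual timing
head(raw)                 # columns: expr, time (in nanoseconds)
nrow(raw)                 # 100 rows: 50 timings for each expression
```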

bench

The newest addition to CRAN in terms of a benchmarking framework/package, bench benchmarks both time and memory via bench::mark(). bench::bench_time() and bench::bench_memory() can be used separately to extract just the timings or just the memory allocations, and bench::press() can be used to run benchmarks against a grid of parameters.

Here’s an example from the official documentation:

# Importing the required library: 
library(bench)
# Setting the seed for reproducibility:
set.seed(10)
dat <- data.frame(x = runif(10000, 1, 1000), y = runif(10000, 1, 1000))
bench::mark(dat[dat$x > 500, ],
            dat[which(dat$x > 500), ],
            subset(dat, x > 500))
# Output:            
  expression                    min  median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result        memory     time   gc         
  <bch:expr>                <bch:t> <bch:t>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>        <list>     <list> <list>     
1 dat[dat$x > 500, ]          319us   409us     2270.     375KB     33.1   892    13      393ms <df[,2] [4,9~ <df[,3] [~ <bch:~ <tibble>
2 dat[which(dat$x > 500), ]   259us   311us     3089.     258KB     18.2  1357     8      439ms <df[,2] [4,9~ <df[,3] [~ <bch:~ <tibble>
3 subset(dat, x > 500)        415us   455us     2129.     507KB     26.0   900    11      423ms <df[,2] [4,9~ <df[,3] [~ <bch:~ <tibble>
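The bench::press() function mentioned above combines bench::mark() with a parameter grid. A minimal sketch, reusing the subsetting comparison at two hypothetical data sizes:

```r
library(bench)

set.seed(10)
results <- bench::press(
  n = c(1000, 10000),  # parameter grid: two data sizes
  {
    dat <- data.frame(x = runif(n, 1, 1000), y = runif(n, 1, 1000))
    bench::mark(dat[dat$x > 500, ],
                subset(dat, x > 500))
  }
)
results  # one row per expression/parameter combination (4 rows here)
```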

Note that bench also includes system_time(), a higher-precision alternative to system.time() that reports the process (CPU) time and the real (wall-clock) time used, down to microsecond resolution.

bench::system_time(Sys.sleep(1))
# Output:
process    real 
  116µs  1005ms

My pick

Out of all the choices, I prefer the microbenchmark and bench libraries for benchmarking purposes in R, for the added convenience of having the benchmarked results as a data frame, plus for the precision of the results, which goes down to nanoseconds for time. (Again, note that bench is the only option among the ones mentioned here that measures memory allocations!)

Installations

Snippets for installing the mentioned libraries via their GitHub repositories:

# Command:                                          # Library:
#--------------------------------------------------------------------
devtools::install_github("r-lib/bench")             # bench
devtools::install_github("jabiru/tictoc")           # tictoc
devtools::install_github("eddelbuettel/rbenchmark") # rbenchmark
devtools::install_github("joshuaulrich/microbenchmark") # microbenchmark
#--------------------------------------------------------------------
# In case devtools doesn't exist among your collection of packages:
if(!require(devtools)) install.packages("devtools")
Anirban 03/30/2020


Tags: R, Benchmarking, Code Snippets