Closures in R

Put briefly, closures are functions that make other functions. Are you repeating a lot of code, but there’s no simple way to use the apply family or Purrr to streamline the process? Maybe you could write your own closure. Closures enclose access to the environment in which they were created – so you can nest functions within other functions.

What does that mean exactly? Put simply, it can see all the variables enclosed in the function that created it. That’s a useful property!

What’s the use case for a closure?

Are you repeating code, but instead of variables or data that’s changing – it’s the function instead? That’s a good case for a closure. Especially as your code gets more complex, closures are a good way of modularising and containing concepts.

What are the steps for creating a closure?

Building a closure happens in several steps:

  1. Create the output function you’re aiming for. This function’s input is the same as what you’ll give it when you call it. It will return the final output. Let’s call this the enclosed function.
  2. Enclose that function within a constructor function. This constructor’s input will be the parameters by which you’re varying the enclosed function in Step 1. It will output the enclosed function. Let’s call this the enclosing function.
  3. Realise you’ve got it all wrong, go back to step 1. Repeat this multiple times. (Ask me how I know..)
  4. Next you need to create the enclosed function(s) (the ones from Step 1) by calling the enclosing function (the one from Step 2).
  5. Lastly, call and use your enclosed functions.

An example

Say I want to calculate the mean, SD and median of a data set. I could write:
x <- c(1, 2, 3)
mean(x)
median(x)
sd(x)

That would definitely be the most efficient way of going about it. But imagine that your real use case is hundreds of statistics or calculations on many, many variables. This will get old, fast.

I’m calling those three functions each in the same way, but the functions are changing rather than the data I’m using. I could write a closure instead:

stat <- function(stat_name){

function(x){
stat_name(x)
}

}

This is made up of two parts: function(x){} which is the enclosed function and stat() which is the enclosing function.

Then I can call my closure to build my enclosed functions:

mean_of_x <- stat(mean)
sd_of_x <- stat(sd)
median_of_x <- stat(median)

Lastly I can call the created functions (probably many times in practice):

mean_of_x(x)
sd_of_x(x)
median_of_x(x)

I can repeat this for all the statistics/outcomes I care about. This example is too trivial to be realistic – it takes about double the lines of code and is grossly ineffficient! But it’s a simple example of how closures work.

More on closures

If you’re producing more complex structures, closures are very useful. See Jason’s post from Left Censored for a realistic bootstrap example – closures can streamline complex pieces of code reducing mistakes and improving the process you’re trying to build. It takes modularity to the next step.

For more information in R see Hadley Wickham’s section on closures in Advanced R.