Ekinoderm
 

Code With No Name (Part 2): Closures

Last time, we talked about how anonymous functions can be created in almost any language that supports first-class functions.  In brief, we discussed how a function definition can be used as an expression in these languages, stored in a local variable, or passed back as a return value from another function.  The one key piece we didn’t discuss is the thing that makes anonymous functions so cool: they have access to the lexical scope in which they are defined.

First, what do we mean by “lexical scope”?  Nearly all languages are lexically scoped, meaning that an identifier’s scope is determined by where it is declared (or defined) in the source code.  For example, the following should cause a compiler error in C#:

for (int i = 0; i < 5; i++)
{
     int i_squared = i * i;
     Console.WriteLine(i_squared);
}
Console.WriteLine(i_squared);

This should cause an error because i_squared is declared inside of the for loop, and it is only a valid identifier inside of the for loop.  If we want access to the value of the variable outside of the loop, we need to assign it to a variable that is valid outside of the loop (and, hence, in a different lexical scope).  So let’s say we do this:

int final_value = 0; // initialized up front; otherwise the compiler reports a use of an unassigned local variable
for (int i = 0; i < 5; i++)
{
     int i_squared = i * i;
     Console.WriteLine(i_squared);
     final_value = i_squared;
}
Console.WriteLine(final_value);

This should compile and give us the value of i_squared from the last time through the loop in final_value. A key thing to note is that we’re getting the value of the variable, and not the variable itself. This might sound a little pedantic, but you’ll see in a minute why this is relevant.

Now, we’ll take the jump and make our first closure:

Action print_final_value = null;
for (int i = 0; i < 5; i++)
{
     int i_squared = i * i;
     Console.WriteLine(i_squared);
     print_final_value = () => Console.WriteLine(i_squared);
}
print_final_value();

As you might expect, this will print out: 0 1 4 9 16 16.

The most important point here is that we’ve captured the local variable i_squared inside of our function called print_final_value. We’ve taken a variable from one lexical scope and kept it alive for use outside that scope. When we do this, we’re creating a separate environment that rides along with the function definition wherever it is passed. This combination of a function definition and an associated environment is called a “closure.” Note the distinction: we’ve captured the actual variable and not just the value of the variable. You can see the significance of this by making the following modification:

Action print_final_value = null;
for (int i = 0; i < 5; i++)
{
     int i_squared = i * i;
     Console.WriteLine(i_squared);
     print_final_value = () => Console.WriteLine(i_squared);
     i_squared = 0;
}
print_final_value();

So what does this print? If you guessed 0 1 4 9 16 0, you’re right! When we create the closure, we don’t just get a copy of the variable values at the time the function is defined, we get the actual variables in the closure, and any modifications made to those variables are reflected in the closure’s environment.
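Because the closure holds the variable itself, it can write to it as well as read it, and the variable outlives the method call that declared it. Here is a minimal sketch of that idea; the MakeCounter helper and the CounterDemo class are made-up names for illustration, not something from the examples above:

```csharp
using System;

public static class CounterDemo
{
    // Hypothetical helper: returns a function that reports how many times it has been called.
    public static Func<int> MakeCounter()
    {
        int count = 0;          // local variable captured by the lambda below
        return () => ++count;   // the closure both reads and writes the captured variable
    }

    public static void Main()
    {
        Func<int> first = MakeCounter();
        Func<int> second = MakeCounter();

        Console.WriteLine(first());   // prints 1
        Console.WriteLine(first());   // prints 2: count survived between calls
        Console.WriteLine(second());  // prints 1: a second MakeCounter call gets its own count
    }
}
```

Nothing outside the returned function can reach count, which is the same data-hiding trick you would normally reach for a class to get.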

Of course, each time we go through the loop, we get a new lexical scope for the loop body, so the i_squared variable is a different variable for each time through the loop, despite having the same name. If you want to see a very contrived example of this:

Action a = null;
Action b = null;
 
for (int i = 1; i < 3; i++)
{
     int i_squared = i * i;
     if (i == 1)
     {
          a = () => Console.WriteLine(i_squared);
     }
     if (i == 2)
     {
          b = () => Console.WriteLine(i_squared);
     }
}
 
a();
b();

This prints out: 1 4. The a function gets the i_squared from the first iteration of the loop and the b function gets the i_squared from the second iteration. Because these are separate scopes, there are separate variables, and the two closures have different i_squared variables in their closure environments.
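One caveat worth seeing in code: the per-iteration behavior applies to variables declared inside the loop body. The counter i of a C# for loop is declared once for the whole loop, so closures that capture i directly all share that one variable. The sketch below contrasts the two cases; the class and method names are invented for this example:

```csharp
using System;
using System.Collections.Generic;

public static class LoopCaptureDemo
{
    // Captures the loop counter itself: every closure shares the single variable i.
    public static List<Func<int>> CaptureCounter()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            funcs.Add(() => i);
        }
        return funcs;
    }

    // Copies the counter into a variable declared in the loop body:
    // each iteration gets a fresh variable, as with i_squared above.
    public static List<Func<int>> CaptureCopy()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int copy = i;
            funcs.Add(() => copy);
        }
        return funcs;
    }

    public static void Main()
    {
        foreach (var f in CaptureCounter()) Console.WriteLine(f()); // prints 3, 3, 3
        foreach (var f in CaptureCopy()) Console.WriteLine(f());    // prints 0, 1, 2
    }
}
```

By the time any of the first batch of closures runs, the loop has finished and i is 3, which is why copying the counter into a body-scoped variable is the usual workaround.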

So, now that you’ve got your feet wet with closures, there are plenty of other things you can try with them (in C# or any other language that supports them):

  • Use a closure instead of a subclass. When you need specialized functionality and/or data-hiding, but you don’t want the overhead of introducing new types in your namespace, you can use a closure instead of a subclass. If you really wanted to, you could roll your own object-oriented type system just using closures, and then you’d have JavaScript.
  • Use currying to build a family of related functions. Currying is the process of taking a multiple-argument function and turning it into a chain of functions, each of which takes a single argument. This is handy when you want to “fix” one of the parameters of a function, but still allow the other parameters to be passed in as variables (fixing an argument this way is usually called partial application, and a curried function makes it trivial). For example, you could write a function for logging messages that takes a handle to a file and a string to write to that file. You could curry it and fix the handle to the log file, and then pass around a function that just takes one parameter (the message to write) and automatically writes to the file handle that you baked into the function earlier.
  • Multi-thread with closures. Instead of messing with locks and so forth, build a thread’s state in the lexical scope where the thread function itself is defined. This is more complicated than I’m making it sound, but using closure variables instead of global variables can make working with multi-threaded code simpler. There’s also a huge body of research on implicit parallelism, which has to do with the fact that certain language constructs are inherently thread-safe (for example, a closure with no side-effects inside of it) and can be parallelized by the compiler or the runtime. [1]
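To make the currying idea concrete, here is a rough sketch of the logging example. Everything named here (CurryDemo, Log, Curry) is hypothetical, and a StringWriter stands in for the log file so the sketch runs without touching the disk:

```csharp
using System;
using System.IO;

public static class CurryDemo
{
    // Hypothetical two-argument logging function: writes a message to the given writer.
    public static void Log(TextWriter writer, string message)
    {
        writer.WriteLine("[log] " + message);
    }

    // Turns a two-argument action into a chain of single-argument functions.
    public static Func<T1, Action<T2>> Curry<T1, T2>(Action<T1, T2> action)
    {
        // The inner lambda closes over both "action" and "first".
        return first => second => action(first, second);
    }

    public static void Main()
    {
        using (var writer = new StringWriter())   // stand-in for a real log file
        {
            Func<TextWriter, Action<string>> curriedLog = Curry<TextWriter, string>(Log);
            Action<string> logToWriter = curriedLog(writer);   // the writer is now baked in

            logToWriter("starting up");
            logToWriter("shutting down");

            Console.Write(writer.ToString());
        }
    }
}
```

The logToWriter function can be handed to any code that only knows how to pass a message, and the closure quietly carries the writer along with it.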

If you already knew about closures, hopefully this article renewed your faith in their importance, and, if you didn’t know about them before, then welcome to a new world. Closures are an elegant weapon…for a more civilized age.



  1. The PLINQ project at Microsoft is working towards making parallelism mostly implicit in .NET (when the developer gives the “hint” to parallelize a certain LINQ query).
