The Algebra of Data, and the Calculus of Mutation

Kalani Thielen - April 18th, 2009

With the spreading popularity of languages like F# and Haskell, many people are encountering the concept of an algebraic data type for the first time.  When that term is produced without explanation, it almost invariably becomes a source of confusion.  In what sense are data types algebraic?  Is there a one-to-one correspondence between the structures of high-school algebra and the data types of Haskell?  Could I create a polynomial data type?  Do I have to remember the quadratic formula?  Are the term-transformations of (say) differential calculus meaningful in the context of algebraic data types?  Isn’t this all just a bunch of general abstract nonsense?

We’ll investigate these questions, and perhaps demystify this important concept of functional languages.

Algebraic Expressions at the Level of Types

To understand the concept of algebraic data types, simply stated, we need to unify the set of concepts that we associate with algebra, and the set of concepts that we associate with data types.

In an algebraic expression, we might encounter a humble lone variable:

x

In a data type expression, we might encounter a humble lone type:

int

Algebraic expressions can also be compositions of other expressions, so for example we might see:

x*x

And similarly for data types:

int*int

To explain: the product of two types denotes the type of pairs of those types.  In some languages, this might instead be written as pair <int,int> but as we’ll soon see, it’s useful to think of this as a product.

Of course, the algebraic expression above could be simplified to x**2 (“x squared”) and so you could also write int**2.

Now, in an algebraic expression you might also see:

x+x

And the same concept exists for data types:

int+int

The sum of two types denotes the type of variants which can have values either of the left type or the right (but not both). The sum in the above type might seem superfluous, because whether a value of the above sum type has the left or the right type, it’ll be an int. However, there’s some additional useful information in knowing whether the left or right was chosen.

Consider, for example, a data type representing the balance of a bank account:

type balance = int + int

We might interpret values on the left as positive balances, and values on the right as negative balances.  So if we wrote a function to describe a balance for screen-reading, it might look like:

// pseudocode: assumes a sum type exposing is_left()/left()/right() accessors
string describe(balance b) {
   if (b.is_left()) {
      return "The bank owes you $" + to_string(b.left());
   } else {
      return "You owe the bank $" + to_string(b.right());
   }
}

For many people, sum types are less obvious than product types.  It may help to consider that, just as pair<int,int> corresponds to product types, boost::variant<int,int> corresponds to sum types.  C-style unions also serve a similar function.
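In Haskell, the sum int + int is the standard Either Int Int type, and the describe function above can be translated directly using pattern matching (the Balance name is ours, for illustration):

```haskell
-- int + int, as Haskell's built-in sum type
type Balance = Either Int Int

-- pattern matching replaces the is_left()/left()/right() accessors
describe :: Balance -> String
describe (Left n)  = "The bank owes you $" ++ show n
describe (Right n) = "You owe the bank $" ++ show n
```

Note that the compiler checks that both sides of the sum are handled, something the accessor-style pseudocode cannot guarantee.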

Now, we can use these type-composition functions to make arbitrarily complex expressions:

(int*int*string)+(date*(string+int))

Just as we could use them to make arbitrarily complex algebraic expressions:

(x*x*y)+(z*(y+x))

But there is perhaps a simpler algebraic concept to consider:

1

What does the natural number “one” mean at the level of types?  This is called the “unit type” and it has only one value, written ().  It doesn’t do much good to have a value of the unit type — you already know what it must be.  However, it can become useful in composition, for example if you consider the simple identity:

1+1=2

Now “two” becomes the type of bits (or boolean values) — because a value of this type could either be the (trivial) unit value on the left, or the one on the right.  This view may help you understand variant types better; consider:

int+int = 2*int

In other words, the sum type of two ints is the same thing as the product type of a boolean value and a single int (the boolean value tells you whether the int is the one on the left or the right of the sum).
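We can witness this identity in Haskell as a pair of inverse functions between Either Int Int and (Bool, Int) (the function names are ours, for illustration):

```haskell
-- int + int  ->  2 * int: remember which side the value came from
toPair :: Either Int Int -> (Bool, Int)
toPair (Left n)  = (False, n)
toPair (Right n) = (True, n)

-- 2 * int  ->  int + int: use the boolean to pick the side
fromPair :: (Bool, Int) -> Either Int Int
fromPair (False, n) = Left n
fromPair (True, n)  = Right n
```

Because toPair and fromPair invert each other, the two types carry exactly the same information.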

Furthermore, as you might expect:

1*int = int

In other words, having a pair of a unit value and an int is the same as having an int (the unit value adds no information here — you already know what it must be).

Now there’s one other natural number that deserves special consideration:

0

This is a special kind of type — it’s known as the “void” type (slightly different from the void type in programming languages like C; you’ll have to forgive this overloading of terms).  Unlike the unit type, which has a single value, the void type has no values.  It’s impossible to construct a value of this type.

That means, for example, that:

1+0=1

Because if you have a sum type with unit on the left and void on the right, no values of this type can possibly be on the right (again, it’s impossible to construct a value of the void type) — so it’s safe to assume that you can simplify this sum type to just be the unit type.
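Haskell ships exactly this empty type in Data.Void, and its absurd function discharges the impossible case; a small sketch of the identity 1 + 0 = 1:

```haskell
import Data.Void (Void, absurd)

-- 1 + 0 = 1: the Right case can never actually occur,
-- because no value of type Void can be constructed
simplify :: Either () Void -> ()
simplify (Left u)  = u
simplify (Right v) = absurd v  -- absurd :: Void -> a; vacuously total
```

The absurd function is the programming-language counterpart of the argument above: given an (impossible) value of the void type, it can promise a value of any type at all.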

By a similar argument:

1*0=0

Because to construct a pair, you have to construct both of its members — if you can’t construct one member of the pair (again, it’s impossible to construct a value of the void type) then you can’t construct the whole pair.

Now, these types that we’ve covered so far are quite useful, and it’s hopefully pretty clear how they map to simple algebraic expressions.  However, it might be pointed out that there are more complex types, and more complex algebraic expressions to consider (even though the expressions above can get quite complex).

For example, how might the type of a list of integers be described (a list<int>) as an algebraic expression?  To help motivate this answer, consider the possible lists of integers we might produce.  For example, a list of integers could be an empty list, or it might be just one integer, or it might be two, or three, etc.  Each “or” here should be thought of as a sum, so that (with the unit type representing the trivial empty list) we come to a type represented by an infinite sum:

1+int+int**2+int**3+…+int**n+…

Now, as an algebraic expression we might be happy with this, or we might convert it to the “big-sigma” notation.  However, in the world of types, this kind of infinite expansion is described with a recursive type.  A recursive type associates the expansion with a symbol and consolidates the type to an expression where that symbol may be used.  For the case of our list of ints, this is written:

μX.(1+int*X)

This can be pronounced “mu X, one plus int times X.”  Forget the “mu” if Greek symbols bother you; all it really means is “look out, here comes a recursive type.”  The most important idea is that the type should be thought of as the infinite expansion you get when you repeatedly substitute the entire recursive type for X (the recursion variable).  For example, performing that substitution once in the above type will produce:

1+int*(μX.(1+int*X))

and again:

1+int*(1+int*(μX.(1+int*X)))

Using the normal distributive properties of * and +, this can be reduced to:

1+int+int**2*(μX.(1+int*X))

And in the limit that we perform this substitution continually, we will produce the same infinite sum that we expected for this list of ints.

Although this formalism may seem somewhat removed from the language of types in F#, you can quickly recover it if you replace the keyword “type” with “μ”, “=” with “.”, and “|” with “+”, recognize that empty constructors are constructors on the unit type, and ignore constructor names.  For example:

type IntList = Nil | Cons of int*IntList

becomes:

μIntList.(1+int*IntList)

Again, the type of lists of integers that we have already identified (modulo the choice of “X” or “IntList” for the recursion symbol).
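The same declaration reads almost identically in Haskell (the language of the next article), and a simple function such as len — our name, for illustration — follows the two branches of the sum:

```haskell
-- μX.(1 + int*X): Nil is the unit (1) case, Cons the int*X case
data IntList = Nil | Cons Int IntList

-- one equation per summand of the recursive type
len :: IntList -> Int
len Nil         = 0
len (Cons _ xs) = 1 + len xs
```

Each clause of len matches one term of the sum 1 + int*X, so the shape of the function mirrors the shape of the type.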

This machinery of recursive types takes some getting used to, but it can produce the same infinite series that we expect in algebra, and it’s very convenient for recursive types with multiple branches of recursion, for example the type of binary trees of integers:

μX.(int+X*X)

In other words, a binary tree is either a leaf (storing an int) or it’s a branch (storing a pair of binary trees).
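As a sketch, here is the same tree type in Haskell together with a function that walks both recursive branches (the names IntTree and leaves are ours, for illustration):

```haskell
-- μX.(int + X*X): a leaf holds an Int, a branch holds two subtrees
data IntTree = Leaf Int | Branch IntTree IntTree

-- count the leaves: the X*X product means the recursion forks in two
leaves :: IntTree -> Int
leaves (Leaf _)     = 1
leaves (Branch l r) = leaves l + leaves r
```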

It seems like we can describe a very rich category of types with these tools (for more detail, see “the bible of type theory”).  What’s more, our existing knowledge of algebra can be ported to this world of types, where we find many familiar structures.  But this formalization of types goes even deeper, and there are more concepts left to unify.

∂ for Data

In the world of functional programming, destructive mutation is discouraged or even banned outright.  From a formal perspective, an update should produce a new copy of the original value, with both versions available for some later use.  This view has simplified many otherwise thorny problems, and it makes programs much simpler to reason about.

We have previously discussed ways to recover mutation-like functions, inspired by the most famous data structure of its kind: the Zipper.

In 2001, Conor McBride made a remarkable discovery about algebraic data types and the concept of a zipper, which he elaborated four years later (along with coauthors Abbott, Ghani, and Altenkirch).

Central to this discovery is the concept of a one-hole context.  A one-hole context can be thought of as a data type “with a hole in it.”  This is a concept closely linked to the concept of mutation, as a value can be “updated” by “plugging in” that hole.

Consider as a very simple example, a 3-tuple of integers.  Its type is int*int*int, or int**3.  Now, we might have a value of this type:

(98, 76, 54)

We might want to “update” one of these values, or at least single one out for updating.  We can do this by “poking a hole” in the data structure where a particular value goes.  In other words, we might choose to update the first, second, or third components of this 3-tuple (using “_” to represent a hole):

(_, 76, 54)
(98, _,  54)
(98, 76, _)

These three cases represent the only ways that we could “poke a hole” in this 3-tuple data, and we can capture that idea in a corresponding data type.  Because there are three ways to poke a hole, the type should be a three-way sum.  Each possibility leaves two integers (in different configurations), and so we can pretty straightforwardly deduce this type of one-hole contexts:

(int*int)+(int*int)+(int*int)

In other words, the first int*int corresponds to the type of “(_, 76, 54)“, the second int*int corresponds to the type of “(98, _, 54)“, and the third int*int corresponds to the type of “(98, 76, _)“.

It’s useful to simplify this type, and so we might shrink it (writing the squares explicitly):

int**2+int**2+int**2

and then combining like terms:

3*int**2

In other words, the type of one-hole contexts of 3-tuples of ints is one of three int-pairs.
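We can make this concrete with a Haskell sketch: a sum of three cases, each holding the two remaining Ints, plus a plug function that fills the hole back in (the constructor names Hole1/Hole2/Hole3 are ours, for illustration):

```haskell
-- int**3: a 3-tuple of Ints
type Triple = (Int, Int, Int)

-- 3 * int**2: three ways to poke a hole, each leaving two Ints behind
data TripleHole = Hole1 Int Int  -- (_, b, c)
                | Hole2 Int Int  -- (a, _, c)
                | Hole3 Int Int  -- (a, b, _)

-- "mutation" without mutation: plug a new value into the hole
plug :: Int -> TripleHole -> Triple
plug x (Hole1 b c) = (x, b, c)
plug x (Hole2 a c) = (a, x, c)
plug x (Hole3 a b) = (a, b, x)
```

Plugging a fresh value into a one-hole context produces an updated tuple while leaving the original untouched — exactly the non-destructive update the functional style demands.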

Now here’s an odd thing: we started out with the type:

int**3

and taking its one-hole context we derived the type:

3*int**2

That looks an awful lot like the derivative of differential calculus!

In fact, that’s exactly what McBride discovered: the derivative of an algebraic data type is its type of one-hole contexts!  Extended to recursive types (see McBride’s paper for more details), we can use the derivative to mechanically produce functions for updating and iterating over arbitrary data structures.
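As a taste of the recursive case, consider the list type μX.(1+int*X) from earlier: a one-hole context for an int inside a list is (roughly) a pair of lists — the elements before the hole and the elements after it — which is precisely the classic list zipper.  A sketch, with our own names, storing the leading elements in reverse as zippers conventionally do:

```haskell
-- a one-hole context in a list of Ints: the elements before the hole
-- (kept in reverse order) and the elements after it
data ListHole = ListHole [Int] [Int]

-- fill the hole to recover a whole list
plugList :: Int -> ListHole -> [Int]
plugList x (ListHole before after) = reverse before ++ [x] ++ after
```

Keeping the leading elements reversed makes moving the hole one step left or right a constant-time operation, which is what makes the zipper practical for localized updates.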

Further Reading

Predating McBride’s work, André Joyal explored the concept of combinatorial species (built on the concept of generating functions — a formalism similar in purpose to algebraic data types), where the use of the derivative was also discovered.

McBride is also the co-creator of Epigram, a programming language with a powerful theory of types.

In the next article, we’ll look at implementing this notion of differentiation of data types with a brief Haskell program.

Posted by Kalani Thielen in Computer Science