Mikrokosmos
Programming on the λ-calculus
Table of Contents
This is a tutorial in how to program on the untyped and simply-typed λ-calculus using the Mikrokosmos interpreter. It is divided on seven chapters of progressive difficulty.
- The λ-calculus: describes the lenguage.
- Boolean logic: implements logical gates.
- Numbers: implements aritmetical operators and comparisons.
- Combinatory logic: describes the SKI combinators.
- Data structures: implements lists and binary trees.
- Recursion: describes fixpoint operators and recursion.
- Types: describes the simply-typed λ-calculus and its logical interpretation.
1 The λ-calculus
1.1 The untyped λ-calculus
The untyped λ-calculus (sometimes called λ-calculus) is a formal system that is also a model of computation (in the same sense that Turing machines are, for instance). In fact, it is Turing-complete, that is, it has the property that every computable function, every function you could write in any programming language, could be written on the untyped λ-calculus. It was introduced in the 1930s by Alonzo Church, who was looking for a formal system able to express the notion of computability.
Its syntax, however, is very simple; any expression is either:
- a variable, \(x,y,z,\dots\) which may appear free or as a placeholder when defining a function;
- a lambda abstraction \(\lambda x. -\), which can be interpreted as a function taking \(x\) as an argument and returning what is written under \(-\), which in turn can depend on \(x\);
- or an application of two expressions \((\lambda x. M)\ N\) which applies the function \((\lambda x. M)\) over the argument \(N\). When a function is applied, all the ocurrences of \(x\) on \(M\) will be substituted by \(N\). This process is called a β-reduction, and it captures our usual notion of function application.
That is, in λ-calculus any function \(f\) can be written as \(\lambda x.f(x)\), and it can be applied to an argument and β-reduced as
\[ (\lambda x.f(x))\ y \longrightarrow f(y). \]
For example, the function that duplicates an element could be written as \((\lambda x.\ 2 \cdot x)\), and it would be applied to \(3\) as
\[ (\lambda x.\ 2 \cdot x)\ 3 \longrightarrow 2 \cdot 3 \longrightarrow 6. \]
In this chapter, however, we will restrict ourselves to very basic functions which do not involve arithmetic. How to define natural numbers, addition or multiplication will be topics for the next chapters. An introduction to the untyped lambda calculus can be found in the following video.
On this tutorial we will use a didactic λ-calculus interpreter, Mikroskosmos; which can be used directly from the browser. Every time you find a Mikrokosmos code snippet like the following, you can execute it by pressing evaluate.
1.2 A first example
In Mikrokosmos, it is possible to use the backslash character \
as a λ
to make the input easier. If you happen to have a greek keyboard
configured, however, feel free to write λ
! Mikrokosmos accepts both
sintaxes. For instance, the nested expression \(\lambda x.\lambda y.x\)
can be written as \x.\y.x
.
Our first example defines the identity function \(\lambda x. x\); which is a function that takes an argument \(x\) and returns it unchanged. Our first statement just after this definition applies this function to itself, and the result is again the identity. Keep in mind during this tutorial that it is perfectly possible to apply functions to other functions or to themselves; in fact, this is one of the core ideas of λ-calculus.
In its output, Mikrokosmos shows the evaluated term and possible names
for it. We will learn what those names mean later. In the previous
example, the evaluation of id id
to id
has followed this chain of
β-reductions and equivalences
\[ id\ id = (\lambda x.x)(\lambda y.y) \longrightarrow (\lambda y.y) = id. \]
Note that the specific name of a variable is irrelevant. They act only
as placeholders. A perfectly valid definition of id
would be id = \a.a
or id = \b.b
.
1.3 Some more examples
Function application is left distributive and parentheses can be
ommited when they are not necessary, that is, f g h
must be read as
(f(g))(h)
instead of f(g(h))
. This goes against the usual
mathematical convention, which always writes parentheses on functions,
that is, we write
\[
f\ x,\quad\text{ instead of }\quad f(x);
\]
and \[ f\ x\ y,\quad\text{ instead of }\quad (f(x))(y),\ \text{ or }\ f(x,y); \]
but this makes it less verbose to write multiple argument functions such as the constant function, which takes two arguments and returns the first one.
Note that, in λ-calculus, a "function taking two arguments" and "a function that takes an argument and returns a function taking the second" are conceptually the same. This equivalence is called currying in honor of Haskell Curry, the logician who developed this idea, previously discovered by Schönfinkel and Frege.
This is why we can see every function with two arguments as a function
taking only one argument and returning again a function taking the
second one. For example, the discard = const id
function is a function that
takes only one argument, discards it, and outputs id
. This is called
a partial application of a function.
A more useful example of function taking functions as arguments is the function composition, which takes two functions and returns a new one created by applying the two sequentially. This corresponds to the usual mathematical function composition \(f \circ g\).
1.4 Exercises
Exercise 1.1: Think what should be the result of evaluating the following expressions and then check it with the interpreter. Beware that the last one is a bit more difficult than the other two.
compose const id
compose id const
compose const const
Exercise 1.2: Write the lambda terms for
- function application, a function taking a function \(f\) and a term \(x\) as arguments and returning \(f(x)\); and
- flip, a function that takes a function on two arguments \(f\) and returns a \(g\) such that \(g(x,y) = g\ x\ y = f\ y\ x = f(y,x)\). In other words, \(\mathsf{flip}\ f\ x\ y = f\ y\ x\).
And test them on the interpreter.
2 Boolean logic
2.1 Booleans
We will start our programming journey on λ-calculus encoding
boolean logic. In boolean logic, we have two elements true
and false
;
and some logical gates such as
- the and gate, which outputs true if and only if its two inputs are true.
- the or gate, which outputs true if any of its inputs is true, or even if both are true.
- the not gate, which outpus true if and only if its input is false.
For example,
\[ \mathsf{not}\ (\mathsf{or}\ \mathsf{true}\ \mathsf{false}) = \mathsf{not}\ \mathsf{true} = \mathsf{false}. \]
Our intuition on what means to be a truth value is that truth values
can distinghish between two values (true or false) or two branches on
a program (if ... then ... else ...)
. We are going to use this
intuition to write an encoding of boolean values based on their
ability to choose between two branches. Maybe surprisingly, this
encoding will be also useful to write the usual boolean logic gates.
Here, a truth value is a function on two elements that chooses one of them.
- \(\mathsf{true}\ a\ b = a\)
- \(\mathsf{false}\ a\ b = b\)
This is called the Church encoding of the booleans, as it was described by Alonzo Church for the first time. This idea of defining a data structure based not on its content but on how it can be used will appear later, when dealing with more complex structures. The following are examples on how false and true work.
In particular, true
is exactly the same lambda term as const
. This is
a very usual phenomenon in λ-calculus, where the same function
can be interpreted differently in different contexts. Mikrokosmos
allows giving multiple names to the same term.
2.2 If-else
The advantage of this way of encoding the boolean values is that they can be easily used in combination with other λ-terms. In particular, the way to encode an if-else is almost trivial: it is already encoded on the λ-terms!
If we really want to write an if-else
function, it will be, quite
literally, the trivial one.
2.3 Logic gates
Usual operations on booleans can be defined too on this encoding and they will be surprisingly easy to construct if we think of booleans as functions choosing from two terms.
Exercise 2.1: Explain why this definition of the and
gate works.
Hint: think what happens when the first argument is true
.
What happens if it is false
?
The or
gate can be defined in a similar way.
And finally, the negation operator is only a way of interchanging the two truth values
2.4 Exercises and more logical gates
The boolean logic implication operator works also as a boolean gate, it can be defined as
\[(a \to b) \equiv (\neg a) \vee b,\]
that is, the implication is true if both are true or if the premise is false.
Exercise 2.2: Compute the complete operation table for the implication using the previous definition.
Exercise 2.3: Compute the following boolean clauses using lambda calculus
- \((\mathsf{true} \vee \mathsf{false})\to \mathsf{false}\).
- \(\mathsf{false} \to (\mathsf{false} \to \mathsf{false})\)
- \((\neg \mathsf{false} \wedge \neg \mathsf{true}) \to \mathsf{true}\)
Exercise 2.4: Define the xor gate as a lambda term. The xor of two boolean values must return a true if and only if exactly one of them are true. Check also its logic table. Hint: you may want to use the already defined `not`.
3 Numbers
3.1 Peano and the natural numbers
In the 19th century, Giuseppe Peano gave a definition of the natural numbers and an axiomatic theory of them based on only two contructors.
- The zero is a natural number, written as Z.
- The successor of a natural number is a natural number, written as S.
In these terms, the usual natural numbers will be
\[ Z,\ S\ Z,\ S\ (S\ Z),\ S\ (S\ (S\ Z)),\ \dots \]
The question is now how can we encode them on λ-calculus. We cannot simply write the two constructors on λ-calculus, but we can make the natural numbers depend on any two given constructors. This is again the same idea we used when we worked with boolean values: we do not care about the structure, but about how can it be used later.
Under this interpretation, a number \(n\) is really a function taking a function \(a\) as an argument and applying it \(n\) times over the argument \(b\). Interpret the following two evaluations.
Exercise 3.1: Define a function that takes a natural number and returns true if and only if the number is even. Hint: you may want to think of the given number as a function.
3.2 Addition and multiplication
The encoding of the addition and multiplication of natural numbers will profit from the interpretation of numbers as functions. We are actually encoding naturals as their induction principle: we can define a function by defining how it acts on zero and a successor.
The double
function will only change the successor for the
composition of the successor function with itself.
Exercise 3.2: Define a triple
function.
We are going now to define addition using this same principle. It
takes a successor and a zero, computes the first number as n s z
and then uses it as a zero on the interpretation of the second one.
Exercise 3.3: How would you define multiplication? Keep in mind that
you can use a number as a function. Keep also in mind the previous
exercises on double
and triple
. Spoilers below!
You may want to try this exercise before continuing.
There many possible ways of defining multiplication. Some of them can
use the repeated application of plus
to a number; but we are going
to define multiplication in a way that is similar to how we defined
double
previously. We are going to interpret the successor as the
n-fold application of successor.
3.3 The predecessor function
How to compute the predecessor of a number? We have not encoded negative numbers, so it could be a function returning zero whenever it tries to get the predecessor of zero. It is an insightful exercise to try to define it by yourself, but please, do not get too obsessed with it. The solution is certainly not easy.
The solution is the term that follows this paragraph. You probably expected something easier! Stephen Kleene, who was a student of Alonzo Church, discovered for the first time how to write a predecessor on lambda calculus while at the dentist. This discovery made Church start thinking that every intuitively computable function could be computed using lambda calculus, that is, that the notions of lambda-computable function and intuitively computable function would coincide.
But why does something like this even work? The main idea is to create a function that can be applied to a pair of numbers \(n\) times and in such a way that the first application renders \(1,0\), and any subsequent application acts as a successor on both numbers. We finally take only the second one. A detailed derivation of this function can be found here; but we will also study it in the Exercise 3.5.
Exercise 3.4: Use the predecessor function to define the minus
function. It should return the difference between two numbers. It
should return zero whenever the first number is smaller than the
second.
Exercise 3.5: Let's look again at the definition of pred
.
The core of the definition is the subexpression \g.(\h.h (g f))
,
which we will call inc f
. It is a sort of reversed composition.
Study what happens when we compose inc f
with itself and reimplement
pred
using inc
.
3.4 Predicates on natural numbers
This encoding even allows us to write predicates on natural numbers. The first predicate will be a function distinguishing a non-zero number from a zero. It will be used later to build more complex ones.
It is built by applying a const false
function n
times to a true
constant. Only if it is applied 0
times, it will return a true
value.
Using this predicate, we can build eq
and leq
, corresponding to \(==\) and \(\leq\).
4 Combinatory logic
4.1 SKI combinators
Combinatory logic provides a notation for λ-terms independent from quantified variables. Every λ-expression can be written in terms of three combinators, \(S,K,I\), which are defined as
- \(I = \lambda x.x\), the identity function;
- \(K = \lambda x.\lambda y.x\), the constant function;
- \(S = \lambda x.\lambda y.\lambda z. x z (y z)\), a generalized application.
The first one, the identity, can be also written as a function of \(S\) and \(K\).
The interesting property of this particular set of combinators is that every other lambda expression can be written in terms of them. We can see how a particular lambda expression is written in SKI calculus by turning on the ski mode of the interpreter.
Exercise 4.1: How are Church-encoded numerals represented with SKI combinators? Compute the first four or five numbers and try to come up with the general rule.
4.2 Schönfinkel combinators
Schönfinkel and Church defined different but equivalent sets of combinators. Some of the other classical combinators are
- function application,
C
; - function composition,
B
; - and duplication of an argument,
W
;
and they are defined as follows.
The Y combinator is related to recursion and we will discuss it on the following chapters.
5 Data structures
5.1 Tuples
Tuples can be easily defined from boolean logic. The main idea will be that the application of tuple to a function will result on the application of the function to its two components.
\[ \mathtt{tuple}(a,b)(f) \equiv f\ a\ b \]
With this idea, tuples and their two projections are defined as follows.
As seen in the example, we use true
and false
to select the first
or the second argument to the function. It is possible to use the same
idea to apply other functions.
5.2 Lists I: nil and cons
Data structures such as lists or binary trees can be represented using
the same principle we used to build naturals and booleans. We would
need two constructors to represent a list a nil
signaling the end of
the list and a cons
, joining an element to the head of the list. A
list would be something like this
\[ \mathtt{cons}\ 1\ (\mathtt{cons}\ 2\ (\mathtt{cons}\ 3\ \mathtt{nil})).\]
As we did with natural numbers, we are going to write a representation independent from the constructors, they are going to be passed as arguments. We need
nil
, the empty list;cons
, a function taking an element (head) and a list (tail) and returning a new list.
This interpretation makes easier to write folding functions for
lists. We can define a function on a list by simply giving the
interpretation for the nil
and a binary function as an interpretation
for the const
.
It is useful to encode this principle into a function called
fold
. We are going to define a summation \(\Sigma\) function and a
list product \(\Pi\) function on lists.
Exercise 5.1: Write the any
and all
functions. They are
functions that can be applied over lists of booleans.
all
returns true if the list is made up only of trues.any
returns true if there is at least one true on the list.
You may want to use the fold
function.
5.3 Lists II: map and filter
Map, filter and fold are the most famous examples of higher order functions on lists and a common example of the power of functional programming, which has its roots on lambda calculus.
- The map function applies a function
f
to every element on a list. - The filter function removes the elements of the list that do not satisfy a given predicate. It "filters" the list, leaving only elements that satisfy the predicate.
We are going to implement these functions using our previously defined
fold
.
Exercise 5.3: Write functions
- doubling the value of each number on a list.
- negating each value of a list of booleans.
Hint: you don't need to use lambda abstractions on these definitions.
Filter can be defined using a boolean to decide at each step whether to return a list with a head or return the tail ignoring the head, like this
Exercise 5.4: Write a function that, given any list, returns a list containing only the even numbers on the list.
5.4 Binary trees
Lists have been defined using two constructors and trees will be
defined using the same technique. The only difference with lists is
that the cons
constructor is going to be replaced by a node
constructor, which takes two trees as arguments. That is, a binary
tree is
- an empty tree.
- a node, containing a label, a left subtree, and a right subtree.
Defining functions using a fold-like combinator is again very simple due to the chosen representation. We are going to need also a variant of the usual function acting on three arguments, the label, the right node and the left node.
6 Recursion
6.1 Fixpoints
A fixpoint combinator is a higher-order function that, given a function \(f\), solves the equation
\[ x = f\ x \]
for \(x\), meaning that, if \(\mathtt{fix}\ f\) is the fixpoint of \(f\), the following sequence of equations holds
\[ \mathtt{fix}\ f = f (\mathtt{fix}\ f) = f ( f (\mathtt{fix}\ f)) = f ( f ( f (\mathtt{fix}\ f))) = \dots \]
The first surprising fact is that such a combinator actually exists.
In the following example, we define the fix
combinator; which we
will use later to define recursive functions. The problem it has is
that it cannot be evaluated without arguments into a closed form, so
we have to delay the evaluation of the expression when we bind it. To
do this, we use the Mikrokosmos !=
operator, which binds an
expression to a variable without simplifying it first.
This `fix` operator (which is more commonly called the Y combinator)
allows us to use the function we are defining on its own
definition. The function will be passed as the first argument to the
argument of fix, as f = fix (\f. ...)
. It is important to notice
that recursive functions, even if they work, cannot be evaluated alone
without entering an infinite beta-reduction loop.
In Mikrokosmos, we need the !=
operator when defining recursive
functions to prevent them from expanding.
Our first example is the factorial function.
The complexity of computing a factorial grows exponentially, and the
lambda calculus (and particularly, this encoding of natural numbers)
was not thought to be efficient. fact 6
will surely be too much for
the interpreter.
As a last example, we are going to define Fibonacci numbers.
6.2 Evaluation
The order in which evaluation is performed is crucial to determine if an expression will eventually terminate. Mikrokosmos evaluates every expression from left to right, that is, the arguments of a function are not evaluated until they are being actually used on the function. This is not the most efficient way: if the same argument appears twice in the body of the function, it will be evaluated twice! but it prevent some expressions taking from entering an inifinite loop.
For instance, fix
is a non-terminating term (it diverges); but if
it is used inside an ifelse
statement, it will be not evaluated at
all.
If false 2 fix
is evaluated, however, Mikrokosmos will enter an
infinite loop. Normally, your browser will allow you to stop the
Javascript script and reload the page.
Other examples of Mikrokosmos dealing with non terminating functions include infinite lists as in the following examples, where we take the first term of an infinite list without having to evaluate it completely or compare an infinite number arising as the fix point of the successor function with a finite number.
7 Types
7.1 Simply typed λ-calculus
Until now, we have been talking about untyped lambda calculus, but we are now going to deal with the simply-typed lambda calculus. The main differences are that
- every term has a type;
- only a subset of the lambda expressions can be written in simply-typed lambda calculus, the typable ones;
- every term normalizes, that is, every computation finishes;
- as a consequence, it is not Turing-complete.
The command :types on
activates types. Types are displayed with
every lambda expression, but certain lambda expressions which cannot
be typed cannot be used anymore. The fix
operator is an example.
A type is written as a set of type variables and arrows, where A -> B
represents the type of a function between A
and B
. Currying
works also with types, and a multiargument function must be written as
A -> B -> C
. The interpreter will always try to infer the
most general type, that is, it is preferible to have A -> B
than the
particular case A -> C -> D
where B
happens to be C -> D
.
Exercise 7.1: The library and
function is not typeable. Think why and
write a similar typeable and
function.
7.2 Propositions as types
What types are inhabited? It is easy to find an expression of the type
A -> A
, but it seems that there is no expression of type A -> B
.
We can reason that any expression of that type should be able to
transform any given input type onto any desired output type, and that
such an expression would not be possible.
The rules of lambda calculus are similar to the rules of the intuitionistic propositional logic; this means that a type will be inhabited if and only if the type, reading arrows as logical implications, is a tautology of propositional logic. That is, the axioms of intuistic propositional logic are
- every expression implies itself,
A -> A
. - we can discard any assumption to arrive at a conclusion
A -> B -> A
. - an assumption can be used multiple times to arrive at intermediate
conclusions,
(A -> B -> C) -> (A -> B) -> A -> C
.
And these are precisely the types of the SKI combinators. As any lambda expression can be written in terms of these combinators, every lambda expression of a type is actually a proof of the proposition the type represents.
We can define some logical connectives using only the implication. For
example, the negation of a proposition \(A\) would be a function taking
\(A\) and returning any given type. As we discussed earlier, this should
be impossible, so the existence of a function T -> B
where B
is a
free variable should be a proof of the type T
not being inhabited.
For example, we can write a proof of the modus ponens by presenting an inhabitant of the type \(A \to (A \to B) \to B\), where A and B are free type variables.
7.3 Products, unions and logic
Mikrokosmos supports product, union, unit and void types.
- The product type
A × B
is the type of the pairs of elements where the first one is of typeA
and the second one is of typeB
. - The union type
A + B
is a type that can contain an element of typeA
or an element of typeB
. It is a labeled disjoint union of the two types. - The unit type
⊤
has only one element, calledunit
or★
. - The void type
⊥
has no elements. Its utility will be discussed later.
They can be used with the following typed constructors
Constructor | Type | Description |
---|---|---|
(-,-) |
A → B → A × B |
Pairs |
fst |
(A × B) → A |
First projection |
snd |
(A × B) → B |
Second projection |
inl |
A → A + B |
First inclusion |
inr |
B → A + B |
Second inclusion |
caseof |
(A + B) → (A → C) → (B → C) → C |
Case analysis |
unit |
⊤ |
Unit element |
abort |
⊥ → A |
Empty function |
absurd |
⊥ → ⊥ |
Empty identity |
The following are examples of the use of typed constructors.
As an example, we could use the type ⊤ + ⊤
to write a new version
of the booleans. This new version will use caseof
as its ifelse
.
7.4 Conjunctions, disjunctions and proofs
These types complete the correspondence between intuitionistic logic and lambda calculus. A type is inhabited if and only if its proposition is provable.
Description | Type | Proposition | Description |
---|---|---|---|
Product type | A × B |
A ∧ B |
Logical conjunction |
Disjoint union type | A + B |
A ∨ B |
Logical disjunction |
Unit type | ⊤ |
⊤ |
True proposition |
Empty type | ⊥ |
⊥ |
False proposition |
Function to empty | A → ⊥ |
¬A |
Logical negation |
Following this correspondence, we are now able to write more complex proofs on intuitionistic logic. For example, the following is a proof of the trivial fact that
\[ A \wedge B \to B \wedge A; \]
and, at the same time, it is a function swapping the two terms of a pair.
We had defined the negation of a type as a function from that type
to the empty type. This makes sense, only if the type is not inhabited
it is possible to write an empty function. The abort
function is
useful if we want to use the principle of explosion, that is, from
falsity, everything follows. As an example, we will prove
\[ A \wedge \neg A \to B \]
and one of the deMorgan laws
\[ \neg (A \vee B) \to \neg A \wedge \neg B \]
as the following terms.
Note the use of abort
and absurd
as □
and ■
!
The characteristic difference of classical versus intuitionistic logic
is that \(A \vee \neg A\) and \(\neg \neg A \to A\) (the law of excluded
middle, LEM) are not provable on intuitionistic logic. It is not
possible to find an expression of type A + (A → ⊥)
,
which would correspond to \(A \vee \neg A\).
It is possible, however, to prove \(\neg \neg (A \vee \neg A)\).
Now that we are writing proofs, it is a good time to introduce a new
feature of the interpreter. Mikrokosmos can show you the derivation
tree of any of your proofs with @@
. For example, we will prove
that
\[ A \vee B \to B \vee A \]
as
7.5 Curry-Howard
The Curry-Howard isomorphism not only states the correspondence between types and propositions and between terms and proofs; but a full isomorphism with respect to term evaluation and proof simplification. That is, if a term corresponds to a proof, the fully evaluated term corresponds to a fully simplified proof.
Every simply typed λ-calculus term can be understood as a
derivation tree of the term it is proving. Mikrokosmos allows the user
to check the derivation tree of a given λ-term using @
. And
it can also simplify the derivation using @@
, as we did before. We
will use our previously defined function/proof swap
as an example.
In the first case, the function is not fully simplified, and a more complex derivation tree arises from it. The simplification of the term corresponds directly to a simplification of the derivation tree.