Adjunctions

Universal properties are a generalization of the notion of a Galois connection between two orderings. Or perhaps I should say: universal properties arise from adjunctions, and it is adjunctions that are a generalization of Galois connections. Adjunctions capture in an abstract categorical setting the idea of “optimal solutions to a problem”; and this idea is itself very general, capturing many of the structures underlying common patterns in programming (not to mention the rest of mathematics). Solutions to equations, products, limits of sequences of approximations, and minimality and maximality are just some of the instances of this powerful abstraction that we will make use of. In the preface to Categories for the Working Mathematician, Mac Lane wrote that “adjoint functors arise everywhere”.

Adjoint functors

Two functors {\mathsf{F} : \mathbb{D} \leadsto \mathbb{C}} and {\mathsf{G} : \mathbb{C} \leadsto \mathbb{D}} form an adjunction, written {\mathsf{F} \dashv \mathsf{G}}, if there is an isomorphism between the sets of arrows {\mathsf{F}(B) \rightarrow A} in {\mathbb{C}} and {B \rightarrow \mathsf{G}(A)} in {\mathbb{D}}. We say that {\mathsf{F}} is the left adjoint and {\mathsf{G}} the right adjoint. The essence of the isomorphism is captured by two natural transformations {\eta : \mathsf{Id} \mathbin{\stackrel{.}{\to}} \mathsf{G} \mathbin{\cdot} \mathsf{F}} in {\mathbb{D}} and {\epsilon : \mathsf{F} \mathbin{\cdot} \mathsf{G} \mathbin{\stackrel{.}{\to}} \mathsf{Id}} in {\mathbb{C}}, called the unit and counit of the adjunction; {\eta} is the image in {\mathbb{D}} of {\mathit{id}_{\mathsf{F}(B)} : \mathsf{F}(B) \rightarrow \mathsf{F}(B)} in {\mathbb{C}}, and conversely, {\epsilon} is the image in {\mathbb{C}} of {\mathit{id}_{\mathsf{G}(A)}} in {\mathbb{D}}. The unit and counit satisfy the laws

\displaystyle  \begin{array}{lcl} \epsilon_{\mathsf{F}(B)} \cdot \mathsf{F}(\eta_B) &=& \mathit{id}_{\mathsf{F}(B)} \\ \mathsf{G}(\epsilon_A) \cdot \eta_{\mathsf{G}(A)} &=& \mathit{id}_{\mathsf{G}(A)} \end{array}

From them one can construct the witnesses to the isomorphism for arbitrary arrows: for each arrow {f : \mathsf{F}(B) \rightarrow A} in {\mathbb{C}}, there is a unique arrow {g : B \rightarrow \mathsf{G}(A)} in {\mathbb{D}} such that {\epsilon_A \cdot \mathsf{F}(g) = f}, given by {g = \mathsf{G}(f) \cdot \eta_B}; and conversely, for each arrow {g : B \rightarrow \mathsf{G}(A)} in {\mathbb{D}}, there is a unique arrow {f : \mathsf{F}(B) \rightarrow A} in {\mathbb{C}} such that {\mathsf{G}(f) \cdot \eta_ B = g}, given by {f = \epsilon_B \cdot \mathsf{F}(g)}; and moreover, these two constructions are each other’s inverses.

Adjunctions from Galois connections

A preorder {(X,{\le})} forms a category: the objects of the category are the elements of the set~{X}, and between any two elements {x,y \in X}, there is a unique arrow if {x \le y}, and no arrow otherwise. That adjunctions are a generalization of Galois connections follows straightforwardly from the fact that there is at most one arrow between any two objects in a preorder category. Then monotonic functions {f : (X,{\le_X}) \rightarrow (Y,{\le_Y})} and {g : (Y,{\le_Y}) \rightarrow (X,{\le_X})} between preorders {(X,{\le_X})} and {(Y,{\le_Y})} form a Galois connection precisely if the sets of arrows {f(y) \rightarrow x} and {y \rightarrow g(x)} are isomorphic—that is, if both {f(y) \le_X x} and {y \le_Y g(x)} hold, or neither do, or in other words,

\displaystyle  f(y) \le_X x \Leftrightarrow y \le_Y g(x)

Adjoints of the diagonal functor

A very useful example of adjunctions arises in the definition of products—in the category {\mathbb{S}\mathrm{et}} of sets and total functions, for given types {A,B,C}, there is an isomorphism between the set of pair-generating functions, of type {A \rightarrow B \times C}, and their two projections, pairs of functions of types {A \rightarrow B} and {A \rightarrow C}. (Indeed, given functions {f:A \rightarrow B} and {g:A \rightarrow C}, one can construct the pair-generating function {\mathit{fork}(f,g) : A \rightarrow B \times C}; and conversely, given a pair-generating function {h : A \rightarrow B \times C}, one can construct its two projections {fst \cdot h : A \rightarrow B} and {snd \cdot h : A \rightarrow C}; and moreover, these two constructions are inverses.)

The “isomorphism between sets of arrows” can be elegantly expressed as an adjunction; since it concerns pairs of arrows, one side of the adjunction involves the product category {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}. The right adjoint is the product functor {(\times) : \mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et} \leadsto \mathbb{S}\mathrm{et}}, mapping an object in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}—that is, a pair of sets—to their cartesian product as an object in {\mathbb{S}\mathrm{et}}, and an arrow in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}—that is, a parallel pair of functions—to a function in {\mathbb{S}\mathrm{et}} acting pointwise on pairs. In the other direction, the left adjoint is the diagonal functor {\triangle : \mathbb{S}\mathrm{et} \leadsto \mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}, mapping an object {A} in {\mathbb{S}\mathrm{et}} to the object {(A,A)} in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}, and a function {f} to the pair of functions {(f,f)} as an arrow in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}. The adjunction {{\triangle} \dashv (\times)} amounts to the isomorphism

\displaystyle  \triangle A \rightarrow (B,C) \approx A \rightarrow {\times} (B,C)

or equivalently,

\displaystyle  (A \rightarrow B)\times(A \rightarrow C) \approx A \rightarrow (B\times C)

The unit and counit of the adjunction are {\eta : \mathsf{Id} \mathbin{\stackrel{.}{\to}} (\times) \mathbin{\cdot} \triangle} and {\epsilon : \triangle \mathbin{\cdot} (\times) \mathbin{\stackrel{.}{\to}} \mathsf{Id}}. In more familiar terms, the unit is a natural transformation in {\mathbb{S}\mathrm{et}}, so a polymorphic function; in fact, it’s the function of type {A \rightarrow A \times A} that we might call {\mathit{double}}. However, the counit is a natural transformation {(A \times B,A \times B) \rightarrow (A,B)} in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}, so not simply a (polymorphic) function; but arrows in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}} are pairs of functions, so we might write this {(\mathit{fst},\mathit{snd}) :: (A \times B \rightarrow A, A \times B \rightarrow B)}.

Then the “fork” operation is in fact one of the two witnesses to the isomorphism between the sets of arrows: given an arrow {\triangle A \rightarrow (B,C)} in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}, that is, a pair {(f,g)} of functions of types {(A \rightarrow B,A \rightarrow C)}, then {\mathit{fork}(f,g)} is an arrow {A \rightarrow {\times} (B,C)} in {\mathbb{S}\mathrm{et}}, that is, a function of type {A \rightarrow B \times C}, given by the construction above:

\displaystyle  \mathit{fork}(f,g) = (\times) (f,g) \cdot \mathit{double}

or, with more points,

\displaystyle  \mathit{fork} (f,g)\,a = (f\,a, g\,a)

The laws that the unit and counit satisfy are

\displaystyle  \begin{array}{lcl} (\mathit{fst},\mathit{snd}) \cdot \triangle \mathit{double} &=& \mathit{id} \\ (\times) (\mathit{fst},\mathit{snd}) \cdot \mathit{double} &=& \mathit{id} \end{array}

or, in more familiar terms,

\displaystyle  \begin{array}{lcl} \mathit{fst} \cdot \mathit{double} &=& \mathit{id} \\ \mathit{snd} \cdot \mathit{double} &=& \mathit{id} \\ \mathit{fork} (\mathit{fst},\mathit{snd}) &=& \mathit{id} \end{array}

The universal property of {\mathit{fork}} follows from the isomorphism between sets of arrows:

\displaystyle  \begin{array}{ll} & h = \mathit{fork}(f,g) \\ \Leftrightarrow & \qquad \{ \mbox{\(\mathit{fork}\)} \} \\ & h = (\times) (f,g) \cdot \mathit{double} \\ \Leftrightarrow & \qquad \{ \mbox{isomorphism between arrow sets} \} \\ & (\mathit{fst},\mathit{snd}) \cdot \triangle h = (f,g) \\ \Leftrightarrow & \qquad \{ \mbox{\(\triangle\)} \} \\ & (\mathit{fst},\mathit{snd}) \cdot (h,h) = (f,g) \\ \Leftrightarrow & \qquad \{ \mbox{composition in \(\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}\) is pointwise} \} \\ & (\mathit{fst} \cdot h,\mathit{snd} \cdot h) = (f,g) \\ \Leftrightarrow & \qquad \{ \mbox{equality of pairs is pointwise} \} \\ & \mathit{fst} \cdot h=f \land \mathit{snd} \cdot h=g \end{array}

The universal property of {\mathit{fork}} underlies all the useful laws of that operator.

Of course, the situation nicely dualizes too. Coproducts in {\mathbb{S}\mathrm{et}} arise from the isomorphism between the set of arrows {A+B \rightarrow C} and the pairs of arrows in {A \rightarrow C} and {B \rightarrow C}. Again, “pairs of arrows” suggest the product category; but this time, the diagonal functor is the right adjoint, with the coproduct functor {(+) : \mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et} \rightarrow \mathbb{S}\mathrm{et}} (which takes a pair of sets {(A,B)} to their disjoint union) as the left adjoint. That is, the adjunction is {(+) \dashv \triangle}, and the isomorphism is

\displaystyle  (+) (A,B) \rightarrow C \approx (A,B) \rightarrow \triangle C

The unit {\eta : \mathsf{Id} \mathbin{\stackrel{.}{\to}} \triangle \mathbin{\cdot} (+)} is a natural transformation {(A,B) \rightarrow (A+B,A+B)} in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}, that is, a pair of functions {\mathit{inl} : A \rightarrow A+B} and {\mathit{inr} : B \rightarrow A+B}. The counit {\epsilon : (+) \mathbin{\cdot} \triangle \mathbin{\stackrel{.}{\to}} \mathsf{Id}} is a natural transformation {A+A \rightarrow A} in {\mathbb{S}\mathrm{et}}, which we might call {\mathit{merge}}. The “join” of two functions with a common range is a witness to one half of the isomorphism—given an arrow {(f,g) : (A,B) \rightarrow \triangle C} in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}, then {\mathit{join} (f,g)} is an arrow {(+) (A,B) \rightarrow C} in {\mathbb{S}\mathrm{et}}, defined by

\displaystyle  \mathit{join} (f,g) = \mathit{merge} \cdot (+) (f,g)

The two laws that the unit and counit satisfy are:

\displaystyle  \begin{array}{lcl} \mathit{merge} \cdot (+) (\mathit{inl},\mathit{inr}) &=& \mathit{id} \\ \triangle \mathit{merge} \cdot (\mathit{inl},\mathit{inr}) &=& \mathit{id} \end{array}

or, perhaps more perspicuously,

\displaystyle  \begin{array}{lcl} \mathit{join} (\mathit{inl},\mathit{inr}) &=& \mathit{id} \\ \mathit{merge} \cdot \mathit{inl} &=& \mathit{id} \\ \mathit{merge} \cdot \mathit{inr} &=& \mathit{id} \end{array}

Another familiar example from functional programming is the notion of currying, which arises when one can construct the function space {A \Rightarrow B} (the type of functions from {A} to {B}, for each type {A} and {B}), such that there is an isomorphism between the sets of arrows {A \rightarrow (B \Rightarrow C)} and {A \times B \rightarrow C}. Here, the adjunction is {( \times B) \dashv (B \Rightarrow )}—in this case, both functors are endofunctors on {\mathbb{S}\mathrm{et}}. The unit and counit are natural transformations {\mathsf{Id} \mathbin{\stackrel{.}{\to}} (B \Rightarrow )\mathbin{\cdot}( \times B)} and {( \times B)\mathbin{\cdot}(B \Rightarrow ) \mathbin{\stackrel{.}{\to}} \mathsf{Id}}. We might call these {\mathit{pair}} and {\mathit{apply}}, since the first is a curried pair-forming operator, and the second applies a function to an argument:

\displaystyle  \begin{array}{lcl} \mathit{pair} &:& A \rightarrow (B \Rightarrow (A \times B)) \\ \mathit{apply} &:& (B \Rightarrow A) \times B \rightarrow A \end{array}

The laws they satisfy are as follows:

\displaystyle  \begin{array}{lcl} \mathit{apply} \cdot ( \times B) \mathit{pair} &=& \mathit{id} \\ (B \Rightarrow )\mathit{apply} \cdot \mathit{pair} &=& \mathit{id} \end{array}

or, in points,

\displaystyle  \begin{array}{lcl} \mathit{apply} (\mathit{pair}\,a,b) &=& (a,b) \\ \mathit{apply} \cdot \mathit{pair}\,f &=& f \end{array}

The isomorphism itself is witnessed by the two inverse functions

\displaystyle  \begin{array}{lcl} \mathit{curry}\,f &=& (B \Rightarrow ) f \cdot \mathit{pair} \\ \mathit{uncurry}\,g &=& \mathit{apply} \cdot ( \times B) g \end{array}

where {f : A \times B \rightarrow C} and {g : A \rightarrow (B \Rightarrow C)}.

About these ads

About jeremygibbons

Jeremy is Professor of Computing in Oxford University Department of Computer Science, and a fan of functional programming and patterns of computation.
This entry was posted in Uncategorized. Bookmark the permalink.

4 Responses to Adjunctions

  1. Derek Elkins says:

    While you haven’t yet mentioned (co)continuity of adjoints, which is one of the most valuable things about knowing that something is an adjoint, it’s nice to know that almost all the “high school algebra” rules that the naming and notation suggests follow immediately from (co)continuity of the adjoints defining a cartesian closed category, or indeed any symmetric monoidally closed category. (One of the ones that doesn’t is Ax1 ~ A, which is natural as the equivalent to that is an axiom of a monoidal category.)

    However, to do this to the fullest extent requires recognizing another adjunction, of some importance, that exists in any symmetric monoidally closed category, and one that is rarely mentioned in contexts like this for reasons I don’t understand. Namely, (->B)^op -| (->B). Admittedly, in the CCC case, this doesn’t introduce any new constructs, it’s just a consequence. In the monoidally closed category case, though, asserting this adjunction is the same as asserting a symmetry via uncurry (flip (curry id)), though this is a bit roundabout. Of course, even as a consequence this is useful since it immediately gives things like (0->B) ~ 1 and ((A+B)->C) ~ (A->C)x(B->C).

  2. I guess my unfamiliarity with (co-)continuity of adjoints is just a deeper manifestation of my similar unfamiliarity with the corresponding property of partial orders – that lower (upper) adjoints preserve joins (meets). I’m not aware of ever having had occasion to digest that…

  3. Alan Schmitt has pointed out that I was a bit glib in my definition of adjunction. It’s not enough for there just to be an isomorphism phi between the hom-sets F(B)->A and A->G(B); it must be natural in A and B:

      phi(f . F(h)) = phi(f) . h
      phi(k . f) = G(k) . phi(f)
    

    You need naturality to prove the two identity laws

      eps . F(eta) = id
      G(eps) . eta = id
    
  4. Pingback: The Abstract Interpretation Monad | Eran's blag

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s