Prompted by some recent work I’ve been doing on reasoning about monadic computations, I’ve been looking back at the work from the 1990s by Phil Trinder, Limsoon Wong, Leonidas Fegaras, Torsten Grust, and others, on monad comprehensions as a framework for database queries.
The idea goes back to the adjunction between extension and intension in set theory—you can define a set by its extension, that is by listing its elements:
or by its intension, that is by characterizing those elements:
Expressions in the latter form are called set comprehensions. They inspired a programming notation in the SETL language from NYU, and have become widely known through list comprehensions in languages like Haskell. The structure needed of sets or of lists to make this work is roughly that of a monad, and Phil Wadler showed how to generalize comprehensions to arbitrary monads, which led to the “do” notation in Haskell. Around the same time, Phil Trinder showed that comprehensions make a convenient database query language. The comprehension notation has been extended to cover other important aspects of database queries, particularly aggregation and grouping. Monads and aggregations have very nice algebraic structure, which leads to a useful body of laws to support database query optimization.
Just as a warm-up, here is a reminder about Haskell’s list comprehensions.
This (rather concocted) example yields the list of all values of the expression as is drawn from and from and such that is divisible by , namely .
To the left of the vertical bar is the term (an expression). To the right is a comma-separated sequence of qualifiers, each of which is either a generator (of the form , with a variable and a list expression ) or a filter (a boolean expression). The scope of a variable introduced by a generator extends to all subsequent generators and to the term. Note that, in contrast to the mathematical inspiration, bound variables need to be generated from some existing list.
The semantics of list comprehensions is defined by translation; see for example Phil Wadler’s Chapter 7 of The Implementation of Functional Programming Languages. It can be expressed equationally as follows:
(Here, denotes the empty sequence of qualifiers. It’s not allowed in Haskell, but it is helpful in simplifying the translation.)
Applying this translation to the example at the start of the section gives
More generally, a generator may match against a pattern rather than just a variable. In that case, it may bind multiple (or indeed no) variables at once; moreover, the match may fail, in which case it is discarded. This is handled by modifying the translation for generators to use a function defined by pattern-matching, rather than a straight lambda-abstraction:
or, more perspicuously,
It is clear from the above translation that the necessary ingredients for list comprehensions are , singletons, , and the empty list. The first three are the operations arising from lists as a functor and a monad, which suggests that the same translation might be applicable to other monads too. But the fourth ingredient, the empty list, does not come from the functor and monad structures; that requires an extra assumption:
Then the translation for list comprehensions can be generalized to other monads:
(so ). The actual monad to be used is implicit; if we want to be explicit, we could use a subscript, as in ““.
This translation is different from the one used in the Haskell language specification, which to my mind is a little awkward: the empty list crops up in two different ways in the translation of list comprehensions—for filters, and for generators with patterns—and these are generalized in two different ways to other monads (to the method of the class in the first case, and the method of the class in the second). I think it is neater to have a monad subclass with a single method subsuming both these operators. Of course, this does mean that the translation forces a monad comprehension with filters or possibly failing generators to be interpreted in a monad in the subclass rather than just —the type class constraints that are generated depend on the features used in the comprehension. (Perhaps this translation was tried in earlier versions of the language specification, and found wanting?)
Taking this approach gives basically the monad comprehension notation from Wadler’s Comprehending Monads paper; it loosely corresponds to Haskell’s do notation, except that the term is to the left of a vertical bar rather than at the end, and that filters are just boolean expressions rather than introduced using .
We might impose the law that is a “left” zero of composition, in the sense
or, in terms of comprehensions,
Informally, this means that any failing steps of the computation cleanly cut off subsequent branches. Conversely, we do not require that is a “right” zero too:
This would have the consequence that a failing step also cleanly erases any effects from earlier parts of the computation, which is too strong a requirement for many monads—particularly those of the “launch missiles now” variety. (The names “left-” and “right zero” make more sense when the equations are expressed in terms of the usual Haskell bind operator , which is a kind of sequential composition.)
Ringads and collection classes
One more ingredient is needed in order to characterize monads that correspond to “collection classes” such as sets and lists, and that is an analogue of set union or list append. It’s not difficult to see that this is inexpressible in terms of the operations introduced so far: given only collections of at most one element, any comprehension using generators of the form will only yield another such collection, whereas the union of two one-element collections will in general have two elements.
To allow any finite collection to be expressed, it suffices to introduce a binary union operator :
We require composition to distribute over union, in the following sense:
or, in terms of comprehensions,
For the remainder of this post, we will assume a monad in both and . Moreover, we will assume that is the unit of , and is both a left- and a right zero of composition. To stress the additional constraints, we will write “” for “” from now on. The intention is that such monads exactly capture collection classes; Phil Wadler has called these structures ringads. (He seems to have done so in an unpublished note Notes on Monads and Ringads from 1990, which is cited by some papers from the early 1990s. But Phil no longer has a copy of this note, and it’s not online anywhere… I’d love to see a copy, if anyone has one!)
(There are no additional methods; the class is the intersection of the two parent classes and , with the union of the two interfaces, together with the laws above.) I used roughly the same construction already in the post on Horner’s Rule.
As well as (finite) sets and lists, ringad instances include (finite) bags and a funny kind of binary tree (externally labelled, possibly empty, in which the empty tree is a unit of the binary tree constructor). These are all members of the so-called Boom Hierarchy of types—a name coined by Richard Bird, for an idea due to Hendrik Boom, who by happy coincidence is named for one of these structures in his native language. All members of the Boom Hierarchy are generated from the empty, singleton, and union operators, the difference being whether union is associative, commutative, and idempotent. Another ringad instance, but not a member of the Boom Hierarchy, is the type of probability distributions—either normalized, with a weight-indexed family of union operators, or unnormalized, with an additional scaling operator.
The well-behaved operations over monadic values are called the algebras for that monad—functions such that and . In particular, is itself a monad algebra. When the monad is also a ringad, necessarily distributes also over —there is a binary operator such that (exercise!). Without loss of generality, we write for ; these are the “reductions” of the Bird–Meertens Formalism. In that case, is a ringad algebra.
The algebras for a ringad amount to aggregation functions for a collection: the sum of a bag of integers, the maximum of a set of naturals, and so on. We could extend the comprehension notation to encompass aggregations too, for example by adding an optional annotation, writing say ““; although this doesn’t add much, because we could just have written “” instead. We could generalize from reductions to collection homomorphisms ; but this doesn’t add much either, because the map is easily combined with the comprehension—it’s easy to show the “map over comprehension” property
Leonidas Fegaras and David Maier develop a monoid comprehension calculus around such aggregations; but I think their name is inappropriate, because nothing forces the binary aggregating operator to be associative.
Note that, for to be well-defined, must satisfy all the laws that does— must be associative if is associative, and so on. It is not hard to show, for instance, that there is no on sets of numbers for which ; such an would have to be idempotent, which is inconsistent with its relationship with . (So, although denotes the sum of the squares of the odd elements of bag , the expression (with now a set) is not defined, because is not idempotent.) In particular, must be the unit of , which we write .
We can derive translation rules for aggregations from the definition
For empty aggregations, we have:
For filters, we have:
For generators, we have:
And for sequences of qualifiers, we have:
Putting all this together, we have:
We have seen that comprehensions can be interpreted in an arbitrary ringad; for example, denotes (the set of) the squares of the odd elements of (the set) , whereas denotes the bag of such elements, with a bag. Can we make sense of “heterogeneous comprehensions”, involving several different ringads?
Let’s introduced the notion of a ringad morphism, extending the familiar analogue on monads. For monads and , a monad morphism is a natural transformation —that is, a family of arrows, coherent in the sense that for —that also preserves the monad structure:
A ringad morphism for ringads is a monad morphism that also respects the ringad structure:
Then a ringad morphism behaves nicely with respect to ringad comprehensions—a comprehension interpreted in ringad , using existing collections of type , with the result transformed via a ringad morphism to ringad , is equivalent to the comprehension interpreted in ringad in the first place, but with the initial collections transformed to type . Informally, there will be no surprises arising from when ringad coercions take place, because the results are the same whenever this happens. This property is straightforward to show by induction over the structure of the comprehension. For the empty comprehension, we have:
For filters, we have:
And for sequences of qualifiers:
For example, if is the obvious ringad morphism from bags to sets, discarding information about the multiplicity of repeated elements, and a bag of numbers, then
and both yield the set of squares of the odd members of . As a notational convenience, we might elide use of the ringad morphism when it is “obvious from context”—we might write just even when is a bag, relying on the “obvious” morphism . This would allow us to write, for example,
(writing for the extension of a bag), instead of the more pedantic
There is a forgetful function from any poorer member of the Boom hierarchy to a richer one, flattening some distinctions by imposing additional laws—for example, from bags to sets, flattening distinctions concerning multiplicity—and I would class these forgetful functions as “obvious” morphisms. On the other hand, any morphisms in the opposite direction—such as sorting, from bags to lists, and one-of-each, from sets to bags—are not “obvious”, and so should not be elided; and similarly, I’m not sure that I could justify as “obvious” any morphisms involving non-members of the Boom Hierarchy, such as probability distributions.