The “Sieve of Eratosthenes” is often used as a nice example of the expressive power of streams (infinite lazy lists) and list comprehensions:

That is, takes a stream of candidate primes; the head of this stream is a prime, and the remaining primes are obtained by removing all multiples of from the candidates and sieving what’s left. (It’s also a nice unfold.)

Unfortunately, as Melissa O’Neill observes, this nifty program is not the true Sieve of Eratosthenes! The problem is that for each prime , every remaining candidate is tested for divisibility. O’Neill calls this bogus common version “trial division”, and argues that the Genuine Sieve of Eratosthenes should eliminate every multiple of without reconsidering all the candidates in between. That is, only of the natural numbers are touched when eliminating multiples of 2, less than of the remaining candidates for multiples of 3, and so on. As an additional optimization, it suffices to eliminate multiples of starting with , since by that point all composite numbers with a smaller nontrivial factor will already have been eliminated.

O’Neill’s paper presents a purely functional implementation of the Genuine Sieve of Eratosthenes. The tricky bit is keeping track of all the eliminations when generating an unbounded stream of primes, since obviously you can’t eliminate all the multiples of one prime before producing the next prime. Her solution is to maintain a priority queue of iterators; indeed, the main argument of her paper is that functional programmers are often too quick to use lists, when other data structures such as priority queues might be more appropriate.

O’Neill’s paper was published as a Functional Pearl in the Journal of Functional Programming, when Richard Bird was the handling editor for Pearls. The paper includes an epilogue that presents a purely list-based implementation of the Genuine Sieve, contributed by Bird during the editing process. And the same program pops up in Bird’s textbook Thinking Functionally with Haskell (TFWH). This post is about Bird’s program.

## The Genuine Sieve, using lists

Bird’s program appears in §9.2 of TFWH. It deals with lists, but these will be infinite, sorted, duplicate-free streams, and you should think of these as representing *infinite sets*, in this case sets of natural numbers. In particular, there will be no empty or partial lists, only properly infinite ones.

Now, the prime numbers are what you get by eliminating the composite numbers from the “plural” naturals (those greater than one), and the composite numbers are the proper multiples of the primes—so the program is cleverly cyclic:

We’ll come back in a minute to , which unions a set of sets to a set; but is the obvious implementation of list difference of sorted streams (hence, representing set difference):

and generates the multiples of starting with :

Thus, the composites are obtained by merging together the infinite stream of infinite streams . You might think that you could have defined instead , but this doesn’t work: this won’t compute the first prime without first computing some composites, and you can’t compute any composites without at least the first prime, so this definition is unproductive. Somewhat surprisingly, it suffices to “prime the pump” (so to speak) just with 2, and everything else flows freely from there.

Returning to , here is the obvious implementation of , which merges two sorted duplicate-free streams into one (hence, representing set union):

Then is basically a stream fold with . You might think you could define this simply by , but again this is unproductive. After all, you can’t merge the infinite stream of sorted streams into a single sorted stream, because there is no least element. Instead, we have to exploit the fact that we have a *sorted* stream of sorted streams; then the binary merge can exploit the fact that the head of the left stream is the head of the result, without even examining the right stream. So, we define:

This program is now productive, and yields the infinite sequence of prime numbers, using the genuine algorithm of Eratosthenes.

## Approximation Lemma

Bird uses the cyclic program as an illustration of the *Approximation Lemma*. This states that for infinite lists , ,

where

Note that is undefined; the function preserves the outermost constructors of a list, but then chops off anything deeper and replaces it with (undefined). So, the lemma states that to prove two infinite lists equal, it suffices to prove equal all their finite approximations.

Then to prove that does indeed produce the prime numbers, it suffices to prove that

for all , where is the th prime (so ). Bird therefore defines

and claims that

To prove the claim, he observes that it suffices for to be well defined at least up to the first composite number greater than , because then delivers enough composite numbers to supply , which will in turn supply , and so on. It is a “non-trivial result in Number Theory” (in fact, a consequence of Bertrand’s Postulate) that ; therefore it suffices that

where is the th composite number (so ) and . Completing the proof is set as Exercise 9.I of TFWH, and Answer 9.I gives a hint about using induction to show that is the result of merging with . (Incidentally, there is a typo in TFWH: both the body of the chapter and the exercise and its solution have “” instead of “”.)

Unfortunately, the hint in Answer 9.I is simply wrong. For example, it implies that (which equals ) could be constructed from (which equals ) and (which equals ); but this means that the 6 and 8 would have to come out of nowhere. Nevertheless, the claim in Exercise 9.I is valid. What should the hint for the proof have been?

## Fixing the proof

The problem is made tricky by the cyclicity. Tom Schrijvers suggested to me to break the cycle by defining

so that ; this allows us to consider in isolation from . In fact, I claim (following a suggestion generated by Geraint Jones and Guillaume Boisseau at our Algebra of Programming research group) that if where , then has a well-defined prefix consisting of all the composite numbers such that and is a multiple of some with , in ascending order, then becomes undefined. That’s not so difficult to see from the definition: contains the multiples of starting from ; all these streams get merged; but the result gets undefined (in ) once we pass the head of the last stream. In particular, has a well-defined prefix consisting of all the composites up to , then becomes undefined, as required.

However, I must say that I still find this argument unsatisfactory: I have no equational proof about the behaviour of . In fact, I believe that the Approximation Lemma is unsufficient to provide such a proof: works along the grain of , but against the grain of . I believe we want something instead like an “Approx While Lemma”, where is to as is to :

Thus, retains elements of as long as they satisfy , but becomes undefined as soon as they don’t (or when the list runs out). Then for all sorted, duplicate-free streams and unbounded streams (that is, ),

The point is that works conveniently for and and hence for too. I am now working on a proof by equational reasoning of the correctness (especially the productivity) of the Genuine Sieve of Eratosthenes…