Breadth-First Traversal

Recently Eitan Chatav asked in the Programming Haskell group on Facebook

What is the correct way to write breadth first traversal of a ${[\mathsf{Tree}]}$?

He’s thinking of “traversal” in the sense of the ${\mathit{Traversable}}$ class, and gave a concrete declaration of rose trees:

$\displaystyle \begin{array}{lcl} \mathbf{data}\;\mathsf{Tree}\;\alpha &=& \mathit{Tree}\; \{\; \mathit{root} :: \alpha, \mathit{children} :: [\mathsf{Tree}\;\alpha] \;\} \end{array}$

It’s an excellent question.

Breadth-first enumeration

First, let’s think about breadth-first enumeration of the elements of a tree. This isn’t compositional (a fold); but the related “level-order enumeration”, which gives a list of lists of elements, one list per level, is compositional:

$\displaystyle \begin{array}{lcl} \mathit{levels} &::& \mathsf{Tree}\;\alpha \rightarrow [[\alpha]] \\ \mathit{levels}\;t &=& [\mathit{root}\;t] : \mathit{foldr}\;(\mathit{lzw}\;(\mathbin{{+}\!\!\!{+}}))\;[\,]\;(\mathit{map}\;\mathit{levels}\;(\mathit{children}\;t)) \end{array}$

Here, ${\mathit{lzw}}$ is “long zip with”; it’s similar to ${\mathit{zipWith}}$, but returns a list as long as its longer argument:

$\displaystyle \begin{array}{llllcl} \mathit{lzw} &&&&::& (\alpha\rightarrow\alpha\rightarrow\alpha) \rightarrow [\alpha]\rightarrow[\alpha]\rightarrow[\alpha] \\ \mathit{lzw}&f&(a:x)&(b:y) &=& f\;a\;b : \mathit{lzw}\;f\;x\;y \\ \mathit{lzw}&f&x&[\,] &=& x \\ \mathit{lzw}&f&[\,]&y &=& y \end{array}$

(It’s a nice exercise to define a notion of folds for ${\mathsf{Tree}}$, and to write ${\mathit{levels}}$ as a fold.)

Given ${\mathit{levels}}$, breadth-first enumeration is obtained by concatenation:

$\displaystyle \begin{array}{lcl} \mathit{bf} &::& \mathsf{Tree}\;\alpha \rightarrow [\alpha] \\ \mathit{bf} &=& \mathit{concat} \cdot \mathit{levels} \end{array}$

Incidentally, this allows trees to be foldable, breadth-first:

$\displaystyle \begin{array}{lcl} \mathbf{instance}\;\mathit{Foldable}\;\mathsf{Tree}\;\mathbf{where} \\ \quad \mathit{foldMap}\;f = \mathit{foldMap}\;f \cdot \mathit{bf} \end{array}$

Relabelling

Level-order enumeration is invertible, in the sense that you can reconstruct the tree given its shape and its level-order enumeration.

One way to define this is to pass the level-order enumeration around the tree, snipping bits off it as you go. Here is a mutually recursive pair of functions to relabel a tree with a given list of lists, returning also the unused bits of the lists of lists.

$\displaystyle \begin{array}{lcl} \mathit{relabel} &::& (\mathsf{Tree}\;(),[[\alpha]]) \rightarrow (\mathsf{Tree}\;\alpha,[[\alpha]]) \\ \mathit{relabel}\;(t,(x:\mathit{xs}):\mathit{xss}) &=& \mathbf{let}\; (\mathit{us},\mathit{yss}) = \mathit{relabels}\; (\mathit{children}\;t,\mathit{xss}) \; \mathbf{in}\; (\mathit{Tree}\;x\;\mathit{us}, \mathit{xs}:\mathit{yss}) \medskip \\ \mathit{relabels} &::& ([\mathsf{Tree}\;()],[[\alpha]]) \rightarrow ([\mathsf{Tree}\;\alpha],[[\alpha]]) \\ \mathit{relabels}\;([],\mathit{xss}) &=& ([],\mathit{xss}) \\ \mathit{relabels}\;(t:\mathit{ts},\mathit{xss}) &=& \mathbf{let} \; (u,\mathit{yss}) = \mathit{relabel}\;(t,\mathit{xss}) \mathbin{;} (\mathit{us},\mathit{zss}) = \mathit{relabels}\;(\mathit{ts},\mathit{yss}) \; \mathbf{in} \; (u:\mathit{us},\mathit{zss}) \end{array}$

Assuming that the given list of lists is “big enough”—ie each list has enough elements for that level of the tree—then the result is well-defined. Then ${\mathit{relabel}}$ is determined by the equivalence

$\displaystyle \begin{array}{ll} & \mathit{relabel}\;(t,\mathit{xss}) = (u,\mathit{yss}) \\ \Leftrightarrow & \\ & \mathit{shape}\;u = \mathit{shape}\;t \land \mathit{lzw}\;(\mathbin{{+}\!\!\!{+}})\;(\mathit{levels}\;u)\;\mathit{yss} = \mathit{xss} \end{array}$

Here, the ${\mathit{shape}}$ of a tree is obtained by discarding its elements:

$\displaystyle \begin{array}{lcl} \mathit{shape} &::& \mathsf{Tree}\;\alpha \rightarrow \mathsf{Tree}\;() \\ \mathit{shape} &=& \mathit{fmap}\;(\mathit{const}\;()) \end{array}$

In particular, if the given list of lists is the level-order of the tree, and so is exactly the right size, then ${\mathit{yss}}$ will have no remaining elements, consisting entirely of empty levels:

$\displaystyle \mathit{relabel}\;(\mathit{shape}\;t, \mathit{levels}\;t) = (t, \mathit{replicate}\;(\mathit{depth}\;t)\;[\,])$

So we can take a tree apart into its shape and contents, and reconstruct the tree from such data.

$\displaystyle \begin{array}{lcl} \mathit{split} &::& \mathsf{Tree}\;\alpha \rightarrow (\mathsf{Tree}\;(), [[\alpha]]) \\ \mathit{split}\;t &=& (\mathit{shape}\;t, \mathit{levels}\;t) \medskip \\ \mathit{combine} &::& \mathsf{Tree}\;() \rightarrow [[\alpha]] \rightarrow \mathsf{Tree}\;\alpha \\ \mathit{combine}\;u\;\mathit{xss} &=& \mathit{fst}\;(\mathit{relabel}\;(u, \mathit{xss})) \end{array}$

Breadth-first traversal

This lets us traverse a tree in breadth-first order, by performing the traversal just on the contents. We separate the tree into shape and contents, perform a list-based traversal, and reconstruct the tree.

$\displaystyle \begin{array}{l} \mathbf{instance}\;\mathit{Traversable}\;\mathsf{Tree}\;\mathbf{where} \\ \quad \mathit{traverse}\;f\;t = \mathit{pure}\;(\mathit{combine}\;(\mathit{shape}\;t)) \circledast \mathit{traverse}\;(\mathit{traverse}\;f)\;(\mathit{levels}\;t) \end{array}$

This trick of traversal by factoring into shape and contents is explored in my paper Understanding Idiomatic Traversals Backwards and Forwards from Haskell 2013.

Inverting breadth-first enumeration

We’ve seen that level-order enumeration is invertible in a certain sense, and that this means that we perform traversal by factoring into shape and contents then traversing the contents independently of the shape. But the contents we’ve chosen is the level-order enumeration, which is a list of lists. Normally, one thinks of the contents of a data structure as simply being a list, ie obtained by breadth-first enumeration rather than by level-order enumeration. Can we do relabelling from the breadth-first enumeration too? Yes, we can!

There’s a very clever cyclic program for breadth-first relabelling of a tree given only a list, not a list of lists; in particular, breadth-first relabelling a tree with its own breadth-first enumeration gives back the tree you first thought of. In fact, the relabelling function is precisely the same as before! The trick comes in constructing the necessary list of lists:

$\displaystyle \begin{array}{lcl} \mathit{bflabel} &::& \mathsf{Tree}\;() \rightarrow [\alpha] \rightarrow \mathsf{Tree}\;\alpha \\ \mathit{bflabel}\;t\;\mathit{xs} &=& \mathbf{let}\;(u,\mathit{xss}) = \mathit{relabel}\;(t, \mathit{xs}:\mathit{xss})\;\mathbf{in}\;u \end{array}$

Note that variable ${\mathit{xss}}$ is defined cyclically; informally, the output leftovers ${\mathit{xss}}$ on one level also form the input elements to be used for relabelling all the lower levels. Given this definition, we have

$\displaystyle \mathit{bflabel}\;(\mathit{shape}\;t)\;(\mathit{bf}\;t) = t$

for any ${t}$. This program is essentially due to Geraint Jones, and is derived in an unpublished paper Linear-Time Breadth-First Tree Algorithms: An Exercise in the Arithmetic of Folds and Zips that we wrote together in 1993.

We can use this instead in the definition of breadth-first traversal:

$\displaystyle \begin{array}{l} \mathbf{instance}\;\mathit{Traversable}\;\mathsf{Tree}\;\mathbf{where} \\ \quad \mathit{traverse}\;f\;t = \mathit{pure}\;(\mathit{bflabel}\;(\mathit{shape}\;t)) \circledast \mathit{traverse}\;f\;(\mathit{bf}\;t) \end{array}$

Advertisements

About jeremygibbons

Jeremy Gibbons is Professor of Computing in Oxford University Department of Computer Science, and a fan of functional programming and patterns of computation.
This entry was posted in Uncategorized. Bookmark the permalink.

7 Responses to Breadth-First Traversal

1. Eitan Chatav says:

Thanks for the shoutout 🙂 Here’s my solution though I guess it may have worse spacetime complexity?

2. Minor tweaks made.

3. Eitan, I haven’t studied your solution, but the splitPlaces makes me a bit uneasy about the complexity. However, mine isn’t linear-time either: you would at least need to use an accumulating-parameter version of levels to achieve that.

4. George says:

What is the meaning of the symbol in a circle in the last line of code, replaced by ?? in the following: traverse ft = pure (bflabel (shape t)) ?? traverse f (bf t)

• Sorry, that wasn’t meant to be cryptic. It’s a fancy rendering of the “zap” of an Applicative functor, written “<*>” in ASCII.

• Why not fmap? I was under the impression that pure f x is the same as f x

5. Perhaps some angle brackets went missing from your question, but yes, it’s a law of the Applicative class (and its Functor superclass) that pure f <*> x = fmap f x.