
Higher Order Zippers

In this blog article we explore the meaning of zippers as well as higher-order zippers. The idea for this article came up today while discussing zippers for graphs with fasta on Freenode. Out of the blue, the question occurred to me: what does it mean to have a higher-order zipper? Not having thought it through completely, I hope this post will offer some clarity, and perhaps attract some feedback.

[EDIT: This is a new version as of 2007-07-11 that takes some comments from reddit and the comments here into account, cleaning up a bit and adding some more content]


For those new to the concept of zippers, I shall briefly explain what they are and how one can reason about them. The idea of a zipper is to provide an iterator that can be moved through an arbitrary functor.

A functor is basically a container of elements. For instance, [] in Haskell is a functor: if you apply it to a specific type, you get a container type (in this case a list) of that type. Other typical examples include Maybe, binary trees, and the N-ary tree introduced in Flatten Benchmark.

Because Haskell does not have mutability, it can at first seem inefficient to modify an element inside such a functor or container. That is why Huet invented the concept of a zipper. A zipper allows one to walk through such a container so that one specific element, namely the element currently in the ‘hole’, has O(1) access. Then, typically, there will be functions that allow one to move the hole we’re currently looking at.

First Order Zippers

To illustrate, I will first demonstrate the zipper for a list.

> {-# OPTIONS_GHC -fglasgow-exts #-}
> module Zipper where
> import Data.Maybe(fromJust)
>
> -- A zipper is the current element and a context that stores where this
> -- element is.
> data Zipper d a = Zipper a (d a) deriving (Eq, Ord, Show)
>
> -- The context for a list is a combination of the elements before and the
> -- elements after the current element.
> data DList a = DList [a] [a]
>   deriving (Eq, Ord, Show)
> type ZList a = Zipper DList a

The advantage of this zipper is that one can easily (with O(1) complexity) move the hole we’re looking at back and forth. It also provides an O(1) way of changing the current element. A final requirement is that it should be possible to reconstruct the entire data-type that has been ‘zipped’ from just the zipper; this process is often referred to as ‘unzipping’ the data structure. The methods to perform these operations, as well as to create a zipper from a list, are given here:

> enterList :: (Monad m) => [a] -> m (ZList a)
> enterList [] = fail "Can not zip an empty list"
> enterList (x:xs) = return $ Zipper x (DList [] xs)
>
> unzipList :: ZList a -> [a]
> unzipList (Zipper a (DList [] suf)) = a:suf
> unzipList z = unzipList . fromJust . prev $ z
>
> prev :: (Monad m) => ZList a -> m (ZList a)
> prev (Zipper a (DList [] tl)) = fail "Already at the first element"
> prev (Zipper a (DList (x:xs) tl)) = return $ Zipper x (DList xs (a:tl))
>
> next :: (Monad m) => ZList a -> m (ZList a)
> next (Zipper a (DList hd [])) = fail "Already at the last element"
> next (Zipper a (DList hd (x:xs))) = return $ Zipper x (DList (a:hd) xs)
>
> update :: (a -> a) -> Zipper d a -> Zipper d a
> update f (Zipper a d) = Zipper (f a) d
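
A quick usage sketch of these operations (the example list is mine, and I instantiate the monad to Maybe, so a failed move yields Nothing):

> -- Enter [1,2,3], step right to the 2, multiply it by 10, and rebuild the list.
> -- exampleList evaluates to Just [1,20,3].
> exampleList :: Maybe [Int]
> exampleList = do z <- enterList [1,2,3]
>                  z' <- next z
>                  return $ unzipList $ update (*10) z'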

In The Derivative of a Regular Type is a Type of One-Hole Contexts, Conor McBride shows us how it is possible to determine the context for any data-type using algebra. Basically, the context of a data-type is its derivative. How can one differentiate a data-type, however? This is done by using the algebraic notation of the data type. If one takes a list, the actual data type is: ‘List a = Nil | Cons a (List a)’. In algebraic notation, this would be: L = 1 + a·L. Namely, a list is a sum-type of a constant (1) and a tuple of an element ‘a’ and the list again. Well, this is not fully correct: a list is actually the least fixed point that satisfies this equation (namely L = μ L. 1 + a·L), but I will gloss over this. The context of a list is then the derivative of this equation, using standard differentiation as taught in algebra/calculus:

∂a L = ∂a(1) + ∂a(a·L) = ∂a(a)·L + a·∂a L = 1·L + a·∂a L = L + a·∂a L


=> data DList a = DNil [a] | DCons a (DList a)

By factoring out the [a] (because any DList will contain exactly one), one gets [a]·(μ dL. 1 + a·dL), i.e. [a]·[a]. In general, I believe that in an equation (μ x. S + T·x) one can factor out the S (as long as it doesn’t contain a reference to x, of course) to get S·(μ x. 1 + T·x). For exercise’s sake, we will derive the context for a binary tree:
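
For instance, zipping the list [1,2,3,4,5] with the hole at 3 gives Zipper 3 (DList [2,1] [4,5]): the prefix is stored reversed, so both neighbours of the hole sit at the heads of the two lists.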

> data Tree a = Leaf | Node a (Tree a) (Tree a)
>   deriving (Eq, Ord, Show)
> -- Tree a (or t for short)

∂a t = t² + 2·a·t·∂a t = t² × (μ x. 1 + 2·a·t·x) = t²·p where p = 1 + 2·a·t·p, i.e. p = [2·a·t]

Looking at this from an intuitive point of view, it makes sense. The context of an element in a tree consists of the left and right child-trees of the node the element is in, as well as the path we took to get there, saving the trees that we skipped over along the way to the element.

> data DTree a = DTree {leftChild  :: Tree a,
>                       rightChild :: Tree a,
>                       path       :: [SEither (Step a)]}
>   deriving (Eq, Ord, Show)
> type SEither a = Either a a
> -- The path through a btree to the current node, saving those trees we did not enter.
> data Step a = Step a (Tree a)
>   deriving (Eq, Ord, Show)
> type ZTree a = Zipper DTree a
>
> enterTree :: (Monad m) => Tree a -> m (ZTree a)
> enterTree Leaf = fail "Can not zip an empty btree"
> enterTree (Node x l r) = return $ Zipper x $ DTree l r []
>
> unzipTree :: (ZTree a) -> Tree a
> unzipTree (Zipper x (DTree l r [])) = Node x l r
> unzipTree dt = unzipTree . fromJust . up $ dt
>
> up :: (Monad m) => ZTree a -> m (ZTree a)
> up (Zipper x (DTree l r [])) = fail "Already at the top of a tree"
> up (Zipper x (DTree l r (p:pp))) =
>   case p of
>     Right (Step px pr) -> return $ Zipper px (DTree (Node x l r) pr pp)
>     Left  (Step px pl) -> return $ Zipper px (DTree pl (Node x l r) pp)
>
> left :: (Monad m) => ZTree a -> m (ZTree a)
> left (Zipper x (DTree Leaf pr pp)) = fail "Can not go further left"
> left (Zipper x (DTree (Node v l r) pr pp)) = return $ Zipper v $ DTree l r $ ((Right $ Step x pr):pp)
> right :: (Monad m) => ZTree a -> m (ZTree a)
> right (Zipper x (DTree pl Leaf pp)) = fail "Can not go further right"
> right (Zipper x (DTree pl (Node v l r) pp)) = return $ Zipper v $ DTree l r $ ((Left $ Step x pl):pp)

I will leave the implementation for an N-ary tree as an exercise for the reader.
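
As a quick sanity check of these operations (the example tree is mine), here is how they compose:

> -- Enter a two-node tree, move to the left child, relabel it, and rebuild.
> -- exampleZTree evaluates to Just (Node 'a' (Node 'B' Leaf Leaf) Leaf).
> exampleZTree :: Maybe (Tree Char)
> exampleZTree = do z <- enterTree (Node 'a' (Node 'b' Leaf Leaf) Leaf)
>                   z' <- left z
>                   return $ unzipTree $ update (const 'B') z'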

Second Order Zippers

Nothing introduced so far is really new. The question that occurred to me is: what would a higher-order zipper be? What does it mean to take the double derivative of a data-type? I will first show what the structure of the double derivative of a list looks like by working it out, and then give an intuition of what this data-type might mean. To make sure this intuition makes sense, we will then work out the double derivative of a binary tree, to ascertain that the intuition we had for the list is valid for other data-types. So first we define the generic form of a second-order zipper:

> type Zipper2 d2 a = Zipper (Zipper d2) a
>
> update2 :: (a -> a) -> Zipper2 d2 a -> Zipper2 d2 a
> update2 f (Zipper x (Zipper y d)) = Zipper x (Zipper (f y) d)

A second-order zipper is simply the zipper of the zipper of a datatype, parametrized with the second-order derivative of that data-type and the element type. So we can already see that a second-order zipper will contain two ‘a’s as well as this second-order derivative. So what is the second-order derivative of a list? Well, we had: [a]′ = [a]_h·[a]_t. Applying simple differentiation in the algebraic sense, we get: [a]″ = [a]_h·[a]_t′ + [a]_h′·[a]_t.

Second Order Zipper for a List

If we remember the meaning we gave to the two lists in the case of [a]′, we had the head part and the tail part of the list around the current element we’re looking at. Since we’re now deriving either one or the other, my intuition tells me that we’ll have either the prefix and suffix of the head part, or the prefix and suffix of the tail part. If we take the definition of [a]′ and expand it in [a]″, we get: [a]·[a]·[a] + [a]·[a]·[a]. Putting this in a Haskell data-type, with names based on my intuition, we get:

> -- DDList a = Prev (reverse of prefix of prefix)
> --                 (suffix of prefix)
> --                 suffix
> --          | Next (reverse of prefix)
> --                 (reverse of prefix of suffix)
> --                 (suffix of suffix)
> data DDList a = Prev [a] [a] [a] | Next [a] [a] [a]
>   deriving (Eq, Ord, Show)
> type ZZList a = Zipper2 DDList a
>
> openLeft :: (Monad m) => ZList a -> m (ZZList a)
> openLeft (Zipper r (DList [] tl)) = fail "No element to the left of the hole"
> openLeft (Zipper r (DList (l:hd) tl)) = return $ Zipper l (Zipper r (Prev hd [] tl))
>
> openRight :: (Monad m) => ZList a -> m (ZZList a)
> openRight (Zipper l (DList hd [])) = fail "No element to the right of the hole"
> openRight (Zipper l (DList hd (r:tl))) = return $ Zipper r (Zipper l (Next hd [] tl))
>
> unzipZList :: ZZList a -> ZList a
> unzipZList (Zipper l (Zipper r (Prev a [] c))) = Zipper r (DList (l:a) c)
> unzipZList z@(Zipper l (Zipper r (Prev a b c))) = unzipZList . fromJust . shrink $ z
> unzipZList (Zipper r (Zipper l (Next a [] c))) = Zipper l (DList a (r:c))
> unzipZList z@(Zipper r (Zipper l (Next a b c))) = unzipZList . fromJust . shrink $ z
>
> enlarge :: (Monad m) => ZZList a -> m (ZZList a)
> enlarge (Zipper l (Zipper r (Prev [] bs cs))) = fail "Can not move further to the left"
> enlarge (Zipper l (Zipper r (Prev (a:as) bs cs))) = return $ Zipper a (Zipper r (Prev as (l:bs) cs))
> enlarge (Zipper r (Zipper l (Next as bs []))) = fail "Can not move further to the right"
> enlarge (Zipper r (Zipper l (Next as bs (c:cs)))) = return $ Zipper c (Zipper l (Next as (r:bs) cs))
>
> shrink :: (Monad m) => ZZList a -> m (ZZList a)
> shrink (Zipper l (Zipper r (Prev as [] cs))) = fail "Can not further shrink"
> shrink (Zipper l (Zipper r (Prev as (b:bs) cs))) = return $ Zipper b (Zipper r (Prev (l:as) bs cs))
> shrink (Zipper r (Zipper l (Next as [] cs))) = fail "Can not further shrink"
> shrink (Zipper r (Zipper l (Next as (b:bs) cs))) = return $ Zipper b (Zipper l (Next as bs (r:cs)))
>
> switch :: ZZList a -> ZZList a
> switch (Zipper l (Zipper r (Prev as bs cs))) = Zipper r (Zipper l (Next as (reverse bs) cs))
> switch (Zipper r (Zipper l (Next as bs cs))) = Zipper l (Zipper r (Prev as (reverse bs) cs))
>
> splice :: ZZList a -> [a] -> ZZList a
> splice (Zipper l (Zipper r (Prev as _ cs))) bs = Zipper l (Zipper r (Prev as bs cs))
> splice (Zipper r (Zipper l (Next as _ cs))) bs = Zipper r (Zipper l (Next as bs cs))
>
> -- A more generic function (splice can be implemented in terms of this)
> modify :: ([a] -> [a]) -> ZZList a -> ZZList a
> modify f (Zipper l (Zipper r (Prev as bs cs))) = Zipper l (Zipper r (Prev as (f bs) cs))
> modify f (Zipper r (Zipper l (Next as bs cs))) = Zipper r (Zipper l (Next as (f bs) cs))

So what my intuition tells me is that where a Zipper gives O(1) mutation of an element, a Zipper2 gives O(1) splicing. After all, we can directly access the part that lies between the two singular elements stored in the Zipper2. Of course, to ascertain that this statement is valid for more than just lists, we should check it for other data-types. In this case, we shall do the experiment for binary trees. Again, N-ary trees are left as an exercise for the reader (feel free to put your version in a comment if you want to).
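
To make the splicing intuition concrete, here is a small usage sketch (the example data is mine). The two fingers end up on 2 and 3, and splice replaces the (initially empty) section between them:

> -- exampleSplice evaluates to Just [1,2,20,25,3,4,5].
> exampleSplice :: Maybe [Int]
> exampleSplice = do z <- enterList [1,2,3,4,5]
>                    z' <- next z >>= next -- first-order hole at the 3
>                    zz <- openLeft z'     -- second finger on the 2
>                    return $ unzipList $ unzipZList $ splice zz [20,25]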

Deriving the rules for a tree

Note: I redid the math with a cleaner technique, and it turns out I had made some mistakes in the original math, which is why my data-type was not as intuitive as I had hoped. The correct version makes much more sense :) Namely, I was missing a “p” in the last term.

Returning to the definitions, we had:

t = 1 + a·t²

∂a t = t² × (μ x. 1 + 2·a·t·x) = t²·p
where p = 1 + 2·a·t·p, i.e. p = [2·a·t]

So we derive:

∂a² t = 2·t·∂a t·p + t²·∂a p = D + R
where D = 2·t·∂a t·p = 2·t·(t²)·p·p
and R = t²·∂a p = t²·∂x[x]|x=2·a·t · ∂a(2·a·t) = …
… = t²·[2·a·t]·[2·a·t]·∂a(2·a·t) = t²·p²·(2·t + 2·a·∂a t) = t²·p²·(2·t + 2·a·t²·p) = …
… = 2·t·t²·p² + 2·a·t²·t²·p·p² = U + UD

So finally we get (where we label the different parts so that we can refer to them):

∂a² t = D + U + UD, where
D = 2·t₁·(t²)₂·p₃·p₄
U = 2·t₁·(t²)₂·p₃·p₄
UD = 2·a₁·(t²)₂L·(t²)₂R·p₃·(p²)₄

The first term, D, is basically a derivative of one of the left or right sub-trees of the tree we are currently looking at with our first-order zipper. More specifically, if we go down the left subtree, we save the right subtree in t₁. The second finger will then be pointing at a tree further down, so we need to save its children, namely (t²)₂. And obviously we have the two paths: one from the root of the tree to the first finger, p₃, and one from that finger down to the second finger, p₄.

The second term, U, is analogous, except that the first finger will now be at the bottom and the second finger nearer the top. This means that the path that goes from the first finger to the second finger, p₄, will be reversed. Additionally (at least I think this should be so for efficiency reasons), the path from the root, p₃, will only extend down to the second finger. Moving the second finger around is then O(1), as we can just pop a path-element from one of the paths and push it onto the other, and vice versa.

The final term, UD, should then obviously be a path that walks up along the tree and then down from another node. In this case, we still have the path from the root of the tree, p₃, but instead of going down to the first or the second finger, it goes down to the join-point of the two fingers (namely the place where the path stops going up and starts going down again). Because neither subtree of this node is stored in the path (as we’re deriving both sides), we need to save the element at this point, namely in a₁. The two paths, (p²)₄, then simply consist of the path up from the first finger and the path down to the second finger. We have two options here, as we might be going up from the left side of the join point or from the right side. Finally, the two pairs of trees, (t²)₂L and (t²)₂R, are obviously the subtrees of the two nodes our fingers are pointing at.

> data DDTree a = Down (SEither (Mono a)) | Up (SEither (Mono a)) | UpDown (SEither (Duo a))
>   deriving (Eq, Ord, Show)
> -- A single path element: the value at a node plus the subtree we skipped,
> -- tagged Left or Right depending on the side we came from.
> type Path a = SEither (Step a)
> data Mono a = Mono {fromRoot    :: [Path a],
>                     toSecond    :: [Path a],
>                     otherChild  :: Tree a,
>                     leftChild2  :: Tree a,
>                     rightChild2 :: Tree a,
>                     toBottom    :: DTree a}
>   deriving (Eq, Ord, Show)
>
> data Duo a = Duo {rootToJoin      :: [Path a],
>                   joinValue       :: a,
>                   leftleftChild   :: Tree a,
>                   leftrightChild  :: Tree a,
>                   rightleftChild  :: Tree a,
>                   rightrightChild :: Tree a,
>                   firstToRoot     :: [Path a],
>                   rootToSecond    :: [Path a]}
>   deriving (Eq, Ord, Show)

The operations for this tree are left as an exercise for the reader (mostly for conciseness and because I have other things to do; I might add them later). Suffice it to say that, intuitively, due to the way the lists are ordered (reversed path or not), moving the second finger will be easier than moving the first finger. Obviously, splicing will depend on whether we have a strictly unidirectional path or an up-down path.

I hope the above is clear; otherwise I can add figures, though if possible I’d prefer not to, to save time (and I don’t have a nice diagram editor :/).

Generating higher-order zippers

By now the pattern for creating higher-order zippers should be clear. Given the definition of a data-type as an algebraic entity, the first-order zipper would be:


type FirstOrderZipper d a = Zipper d a

where ‘d’ is the first derivative of the data-type in question. To then go to higher-order zippers, we simply apply the following meta-rule (note that this is a meta-rule, as there are no Haskell facilities to derive data-types that I am aware of):


Zipper d a -> Zipper (Zipper d') a

Where d’ obviously stands for the derivative of d. As a reader on reddit pointed out, this is “simply” having several pointers that point into a functional data-structure. However, due to the way these ‘pointers’ work (for instance, you cannot just get a second one from a data-type; you need to go higher-order to get more pointers), we get extra functionality, for instance the O(1) splicing mentioned previously. For completeness’ sake, I here show the rules that were introduced in McBride’s paper.

∂x x = 1
∂x y = 0 where x ≠ y
∂x 0 = 0
∂x(S + T) = ∂x S + ∂x T
∂x 1 = 0
∂x(S × T) = ∂x S × T + S × ∂x T
∂x(μ y.F) = μ z.∂x F|y=μ y.F + ∂y F|y=μ y.F × z where z is fresh
∂x(F|y=S) = ∂x F|y=S + ∂y F|y=S × ∂x S
∂x T = 0 when T does not contain x
μ z.(T + S × z) = T × (μ z. 1 + S × z) when T and S do not contain z
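
For instance, applying the meta-rule to Zipper2 gives a third-order zipper (my extrapolation, following the same pattern):

type Zipper3 d3 a = Zipper (Zipper (Zipper d3)) a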

ICFP Contest

Anyone in contact with me either online or IRL knows that I’m looking forward to the ICFP Contest with much anticipation. If possible, I will push for the #oasis channel on Freenode to be the de-facto discussion channel for different teams. Last year, there was a grass-roots movement whereby #oasis ended up to be this icfp-discussion channel, and it’d be nice to have it again.

Last year our team, Lazy Bottoms, made it into the top 5% (18th) and was mentioned several times in the results presentation. Hopefully this year we can make the top 10 or better. I think our team has some very smart members, noteworthy in the Haskell community, so I’m hopeful.

For more details about our team, see the Lazy Bottoms Team page. I think the double entendre of the name should be obvious to most people reading this blog.

Flatten Benchmark

In this blog article, we look at the benchmarks proposed in Kyle Smith’s blog regarding the flattening of lists in Scheme, but we perform the same experiment in Haskell. Obviously, for typing reasons, this will not be done on lists but on N-ary trees. It should be interesting to see how these algorithms perform in Haskell, especially given the very different semantics caused by laziness.

In fact, a very common idiom in Scheme, or even SML or OCaml, namely ensuring tail-call recursion, is not always a good idea in Haskell when working with data-structures. Tail-recursive functions are too strict in a sense, and as such they do not work well on infinite data-structures.


The first thing to do in Haskell, of course, is to define a module. We also define the data-type that will hold a list to be flattened. In this case, it is an N-ary tree: a sum-type of either a child holding a single element, or a parent holding a list of such trees. To keep it concise, I will give the data-constructors short names.

> module Flatten where
> import Control.Monad(liftM, liftM2)
> import System.CPUTime(getCPUTime)
> import Data.List(foldl')
> import System.Environment(getArgs)
> data Tree a = C a | P [Tree a]

As an example, we show the encoding of one of the ‘lists’ from Kyle’s blog. The notation is a bit verbose, but one could easily add a Read instance to fix this. Since this is not crucial, I will not do so at the moment (let’s just say: yay for VIM and some regexp-fu to turn the data into Haskelly data).

(define abc '(a (b (c (d (e (f (g (h (i (j (k (l (m n) o) p) q) r) s) t) u) v) w) x) y) z)) ; would become:

> abc = P [C 'a', P [C 'b', P [C 'c', P [C 'd', P [C 'e', P [C 'f', P [C 'g',
>           P [C 'h', P [C 'i', P [C 'j', P [C 'k', P [C 'l', P [C 'm', C 'n'], C 'o'], C 'p'], C 'q'], C 'r'], C 's'],
>           C 't'], C 'u'], C 'v'], C 'w'], C 'x'], C 'y'], C 'z']
> flat = P $ map C $ [1..26]
> flat_abc = P $ map C $ ['a'..'z']
> tree = P [C 'a', P [C 'b', P [C 'd', P [C 'i', C 'q', C 'r'], P [C 'j', C 's', C 't']],
>           P [C 'f', P [C 'k', C 'u', C 'v'], P [C 'l', C 'w', C 'x']]], P [C 'c', P [C 'g', P [C 'm', C 'y', C 'z'],
>           P [C 'n', C 'a', C 'b']], P [C 'h', P [C 'o', C 'c', C 'd'], P [C 'p', C 'e', C 'f']]]]
> treec = P [C 'a', P [C 'b', P [C 'd', P [C 'i', C 'q', C 'r'], P [C 'j', C 's', C 't']], P [C 'f', P [C 'k', C 'u', C 'v'],
>            P [C 'l', C 'w', C 'x']]], P [C 'c', P [C 'g', P [C 'm', C 'y', C 'z'], P [C 'n', C 'a', C 'b']], P [C 'h', P [C 'o', C 'c', C 'd'], P [C 'p', C 'e', C 'f']]]]

All the algorithms will by definition be slightly different, as we have one more case to check due to the design of our data-type. First, instead of checking whether the input is null, we check whether we have a child element. Second, because each level of the tree is a plain list, we can use standard list operations like map and fold.

Looking through the algorithms, a lot of them use side-effects or reverse the result at the end. Obviously, side-effects are not a standard idiom in Haskell. Reversing is not a good idiom for Haskell either, because reversing tends to destroy laziness. As such, the version that is most straightforward to code in Haskell is the one that uses ‘cons’ with an accumulator. The Scheme version required reversing at the end; we do not need to reverse the list because we use ‘foldr’, while the algorithms in Scheme used a fold-left mechanism. The reason we want foldr is that it respects laziness, while the reason Schemers and MLers want foldl is that it is tail-recursive.

I will add type declarations to the functions, but these are purely for documentation purposes and are not actually required by Haskell. The Scheme version mutates the result variable. Instead of doing this, we will take a more Haskelly approach (I was going to say ‘pure’, as in purely functional, but that could be misconstrued to mean purity of the algorithm) and thread the result through the calls.

> flatten :: Tree a -> [a]
> flatten t = f t [] -- Call the helper with the tree and an empty list
>   where -- The case where we have a list of nodes
>         f (P l) r = foldr f r l
>         -- The case where we have a single element
>         f (C a) r = a:r

To test, we simply run ‘ghci Flatten.lhs’; running ‘flatten abc’ then gives “abcdefghijklmnopqrstuvwxyz”. We also see that this definition respects laziness in the horizontal (but therefore not per se the vertical) direction. Namely, we construct an infinite flat tree of elements and see whether flattening succeeds. We will also construct an infinite vertical tree (obviously with some elements at each level, or it would take an infinite amount of time to reach the first element).

> infinite_flat_tree = P $ map C [0..]
> infinite_vert_tree = build [0..]
>   where build (x:xs) = P [C x, build xs]

Indeed, if we load Flatten.lhs in ghci and run flatten, everything works out fine (make sure to do something like ‘take 10 $ flatten infinite_flat_tree’, or your terminal will spam a lot of numbers). Similarly, it also handles the infinite vertical tree (which, the reader should note, has been built up in a way that ensures the creation of the tree itself is also lazy).
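
For instance (a sketch of such a session; the prompt will differ):

ghci> take 10 $ flatten infinite_flat_tree
[0,1,2,3,4,5,6,7,8,9]
ghci> take 5 $ flatten infinite_vert_tree
[0,1,2,3,4]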

Originally I had intended to try out all the different algorithms from Kyle’s blog, but it seems most of the trade-off is between using set! or reversing and other techniques to circumvent it. However, when using foldr and lazy structures, as long as you make sure to respect laziness, none of this is required, and there’s essentially only one way of coding this function, which is probably the way anyone would write it in Haskell. If any readers have other versions that might be applicable to Haskell, I am definitely interested in seeing them.

Therefore, to give this article some meaningful content, we will add some benchmarks. My computer is not that powerful: a simple Dell Latitude D610 laptop with 1GB of memory and a 1.6GHz Centrino.

First we need to generate some data-sets besides the ones above. The first one was far too small, and infinite data-sets will have obvious results if computed completely. To ensure the calculation actually occurs, I will take the last element of the computed list, forcing the entire list to be generated. To measure how complicated an N-ary tree is, Kyle introduces the concept of complexity. I have translated it to Haskell (if anyone spots algorithmic differences, feel free to point them out).

> complexity :: Tree a -> Integer
> complexity (C x) = 1
> complexity (P l) = foldl' (\c e -> 1 + c + complexity e) 0 l
> buildTestTree :: [Tree a] -> Int -> Tree a
> buildTestTree ts n = P $ take n $ cycle ts -- Cycle through the different trees,
>                                            -- take only 'n' in total
>                                            -- and create a parent node
> -- Thanks apfelmus from #haskell (P.s.: getCPUTime returns picoseconds)
> time :: IO a -> IO Integer
> time m = liftM2 subtract getCPUTime (m >> getCPUTime)
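
For instance, complexity (P [C 'a', C 'b']) is 4: the fold adds 1 for each child on top of that child’s own complexity of 1.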

Due to laziness, it is important that we force the creation of the input tree before actually timing the flattening of the tree. Since we need to know the complexity of the tree before we flatten it anyway, we can let the complexity function force the creation of the tree. Therefore, a test case simply becomes the following code:

> runTest :: (Show a) => Tree a -> IO ()
> runTest x = do
>   putStr "Complexity: "
>   t1 <- time $ print $ complexity $ x -- Force the creation of the tree
>   putStr "Last element: "
>   t2 <- time $ print $ last $ flatten x -- Time the flatten function
>   putStrLn $ "Times: " ++ show t1 ++ ", " ++ show t2

Finally, we can tie it all together with a little main that allows us to run this program with different complexities. While the Scheme version allowed the user to choose which subtrees to use to build up the big tree, I will not give this option since, in the end, the only metric used was complexity. I would like to briefly point out, however, that it might be interesting to test deeper trees, as the trees seem to have been limited to the depth of the trees defined at the beginning.

> main :: IO ()
> main = do
>   n <- liftM (read . head) getArgs
>   let t = buildTestTree [abc, flat_abc, treec] n
>   sequence_ $ replicate 10 $ runTest t -- Run the timing test 10 times

The results can be seen below in the figure. I hope this was useful.

Personal Note

Well…

It’s my birthday today.

Edit: I’ve changed the way my posts are displayed to have “Read more” links, to keep the top page cleaner. However, it’s still not what I want. For one, bloghelper leaves it as homework how to remove the “Read more!” links from pages that do not have an extended article. I will go back and edit the other posts to add this span. If anyone knows how to remove the read-mores from short posts that do not have an extended article, I’d be grateful. And if someone knows how to do it without just hiding the content (thereby reducing download time for readers), I’d be even more grateful.

Simplistic compiler in Haskell

Recently I gave a short intro course on functional programming languages to people at work, introducing some basic concepts. At the end, I added a very, very tiny compiler to show how easy it is to create a compiler in Haskell.

I thought it might be interesting for the people out there to see it as well. As mentioned, it is very minimalistic. Keeping with the trend of the previous post, I will ensure this blog post is proper literate Haskell code.


So first we create our module. We also import Control.Monad for liftM and liftM2. Finally, we import QuickCheck so we can create some easy tests at the end.


> module Alg where
> import Control.Monad
> import Test.QuickCheck

Next, we define the domain of our compiler. Our compiler will compile simple arithmetic expressions, which can be nested arbitrarily deep, to a stack machine. An expression consists of either a simple number, the addition of two expressions, the multiplication of two expressions, or the subtraction of two expressions. We derive some standard typeclasses that allow us to work with expressions easily in the GHC interpreter (for instance Show, to display them).


> data Exp = Simple Integer
>          | Mul Exp Exp
>          | Add Exp Exp
>          | Sub Exp Exp
>   deriving (Eq, Ord, Show)

Without compiling, we can already write a mini-interpreter that evaluates an expression straight away. One option for making this more generic is abstracting over the specific binary operator instead of creating a specific data-constructor for each, but I will leave that as an exercise (a possible shape is sketched right after the interpreter).


> interpretExp :: Exp -> Integer
> interpretExp (Simple i) = i
> interpretExp (Mul a b) = interpretExp a * interpretExp b
> interpretExp (Add a b) = interpretExp a + interpretExp b
> interpretExp (Sub a b) = interpretExp a - interpretExp b
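
For what it’s worth, here is one possible shape for that exercise (the names Exp', BinOp and interpretExp' are mine). Note that storing the operator as a function costs us the derived Eq, Ord and Show instances, which is one reason to keep a constructor per operator:

> -- A hypothetical generalization: one constructor for all binary operators.
> data Exp' = Simple' Integer | BinOp (Integer -> Integer -> Integer) Exp' Exp'
> interpretExp' :: Exp' -> Integer
> interpretExp' (Simple' i) = i
> interpretExp' (BinOp f a b) = f (interpretExp' a) (interpretExp' b)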

Next we define the codomain, or the target, of our compiler. For this I have opted for a very simple stack machine with only four instructions: either one pushes a number, or one applies an operator to the top two numbers on the stack. As for the stack, it is simply a list of numbers.


> data Op = OpPush Integer | OpAdd | OpMul | OpSub
>   deriving (Eq, Ord, Show)
> type Stk = [Integer]

We can also write an ‘interpreter’ for this stack-based language. First, we write the function that interprets a single stack operation against a given stack and returns a new stack. For clarity’s sake, I have not included error handling for when the stack holds fewer numbers than required.


> interpret :: Stk -> Op -> Stk
> interpret s (OpPush i) = i:s
> interpret (a:b:s) OpAdd = (a+b):s
> interpret (a:b:s) OpSub = (a-b):s
> interpret (a:b:s) OpMul = (a*b):s

To run a list of operations, one can simply fold over it, starting with an empty stack:


> run :: [Op] -> Stk
> run = foldl interpret []

Next, we define the compiler function. This compiles algebraic expressions to a list of stack operations. Notice that, to do this, we first compile the two sub-expressions and then emit the operation in question:


> compile :: Exp -> [Op]
> compile (Simple i) = [OpPush i]
> compile (Mul a b) = compile b ++ compile a ++ [OpMul]
> compile (Add a b) = compile b ++ compile a ++ [OpAdd]
> compile (Sub a b) = compile b ++ compile a ++ [OpSub]

The code is now done; in fact, one can simply type ‘ghci Alg.lhs’ and try it out. However, we will add a QuickCheck property so we can test the correctness of the compiler. Simply put, we require that interpreting an expression yields the same result as the top of the stack after compiling the expression and interpreting the stack operations. To enable this, we first need to define a QuickCheck instance for the domain, namely algebraic expressions. The code is a bit more complicated because it makes sure never to generate infinite expression trees, so I will not explain it in detail. I suggest those interested check the QuickCheck manual or the Haskell documentation.


> instance Arbitrary Exp where
>   arbitrary = sized tree'
>     where tree' 0 = liftM Simple arbitrary
>           tree' n | n > 0 =
>             oneof [liftM Simple arbitrary,
>                    liftM2 Mul subtree subtree,
>                    liftM2 Add subtree subtree,
>                    liftM2 Sub subtree subtree]
>             where subtree = tree' (n `div` 2)
>   coarbitrary (Simple i) = variant 0 . coarbitrary i
>   coarbitrary (Mul a b)  = variant 1 . coarbitrary a . coarbitrary b
>   coarbitrary (Add a b)  = variant 2 . coarbitrary a . coarbitrary b
>   coarbitrary (Sub a b)  = variant 3 . coarbitrary a . coarbitrary b

Now that we have an instance that generates arbitrary algebraic expressions, it’s time to write our test case. Namely, we require (trivially guarded by True ==>) that the result of interpreting the algebraic expression is the same as the top of the stack after compiling and interpreting the stack operations. We could add the additional requirement that the stack have exactly one element remaining in it.


> prop_compile tree = True ==> (head $ run $ compile tree) == interpretExp tree
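
Running the property should look roughly like this (output in the format of the classic QuickCheck library used here):

ghci> quickCheck prop_compile
OK, passed 100 tests.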

Well, I hope that was useful :)

A friend of mine suggested I add some examples. Once you have saved the above in a file named Alg.lhs, load it up in the interpreter with ‘ghci Alg.lhs’. Then you can try it out:


ghci> interpretExp (Add (Mul (Simple 5) (Simple 4)) (Simple 3))
23
ghci> compile (Add (Mul (Simple 5) (Simple 4)) (Simple 3))
[OpPush 3,OpPush 4,OpPush 5,OpMul,OpAdd]
ghci> let x = [OpPush 3,OpPush 4,OpPush 5,OpMul,OpAdd] in run x
[23]

Cheers!

As a final note, please feel free to leave comments or questions.

Generalized Matrix Multiplication

In this article I want to talk about generalized matrix multiplication.

As we all know, for 2-dimensional matrices there are two common ways of multiplying (I will ignore element-wise multiplication for the moment):

The cross-product:


[1 2     [5 6     [(1*5 + 2*7) (1*6 + 2*8)     [19 22
 3 4]  x  7 8]  =  (3*5 + 4*7) (3*6 + 4*8)]  =  43 50]

And the dot-product:


[1 2      [5 6
 3 4]  .*  7 8]  =  1*5 + 2*6 + 3*7 + 4*8  =  70

If we remember the rule for cross-multiplication, it states that we can multiply two matrices if their shared dimension is the same. Formally, we can multiply matrices:


k*n x n*l => k*l

We also know that multiplication of numbers is associative, hence there is no reason the shared part cannot be more or less than one dimension. In fact, if we re-examine the dot-product, we see it is just another instance of a cross-product:


1*(m*m) x (m*m)*1 => 1*1

Therefore, we can generalize this principle to enable the multiplication of a matrix of m dimensions by a matrix of n dimensions over a shared block of l dimensions, yielding a matrix of (m-l)+(n-l) dimensions, provided that, for dimensions:


a_1*...*a_(m-l)*a_(m-l+1)*...*a_m

x

b_1*...*b_l*b_(l+1)*...*b_n

it holds that:


Forall k: 1 .. l .
    a_(m-l+k) = b_k

And then we will have result, a matrix of the dimensions:


a_1*...*a_(m-l)*b_(l+1)*...*b_n
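
For example, multiplying a 2*3*4 matrix by a 3*4*5 matrix with l = 2 contracts the shared 3*4 block and yields a 2*5 matrix.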

If there are several values of l for which the above equation holds, it is obvious that there is more than one way to multiply the matrices. This is the reason why there is typically no agreed-upon way to multiply higher-dimensional matrices: there are too many different possibilities; the problem is underconstrained. However, if we add the required result dimensions as a constraint, we can solve this problem. Thanks to Haskell’s advanced type-system, we can do just that!


> {-# OPTIONS_GHC -fglasgow-exts -fallow-undecidable-instances -fallow-incoherent-instances -fallow-overlapping-instances #-}
> module Matrices where

First we define type level numbers:


> data One f a = One (f a) deriving (Eq, Show)
> data S n f a = S (n f a) (f a) deriving (Eq, Show)
> tPred :: S n f a -> (n f a)
> tPred = undefined
> type Two f = S One f
> type Three f = S (S One) f
> type Four f = S (S (S One)) f

Of course it is useful to be able to get the actual number encoded in the type information:


> class Number a where
>   tNumber :: a f x -> Int
>   tData :: a f x -> f x
> instance Number One where
>   tNumber = const 1
>   tData (One a) = a
> instance (Number n) => Number (S n) where
>   tNumber x = 1 + tNumber (tPred x)
>   tData (S n a) = a

Next, since we want to encode arbitrary dimensions, we need to be able to encode these dimensions at the type level, so we need lists of type-level numbers. Because we only care about lists of type-level numbers and not about more generic lists of other types, we can view the application of one number to another as consing those numbers together. Of course, there must be some way to tell when there are no numbers left; for this we introduce the marker Nil. Additionally, we need to be able to get the head and the tail of such a list. The head of a type-level list is trivial: it is just the type-level number itself. As for the tail, it is the data contained within the number.


> data Nil a = Nil a deriving (Eq, Show)

We want to also be able to determine the dimensions stored in such a list. For this we introduce another type-class:


> class Dimension d where
>   tDimension :: d a -> [Int]
> instance Dimension Nil where
>   tDimension = const []
> instance (Number na, Dimension nb) => Dimension (na nb) where
>   tDimension x = tNumber x : tDimension (tData x)

Let’s test this with a simple example:


> testNumberDisplay = do
>   let x = undefined :: (Two (Three (Four Nil))) ()
>   print $ tNumber x
>   print $ tDimension x
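
Running testNumberDisplay should print 2 (the outermost type-level number) followed by [2,3,4] (the full list of encoded dimensions).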

Now it is easy to store matrices as such constructs. Let’s add a utility function that constructs a matrix from a list:


> class ToMatrix m where
>   toMatrix :: [a] -> (m a)
>   toMatrixFold :: [a] -> (m a, [a])
>   toMatrix xs = fst $ toMatrixFold xs
> instance ToMatrix Nil where
>   toMatrixFold (x:xs) = (Nil x, xs)
> instance (ToMatrix m) => ToMatrix (One m) where
>   toMatrixFold as = (One b, xs)
>     where (b, xs) = toMatrixFold as
> instance (ToMatrix m, ToMatrix (n m)) => ToMatrix (S n m) where
>   toMatrixFold as = (S n b, xs)
>     where (b, as') = toMatrixFold as
>           (n, xs) = toMatrixFold as'

It would be nice if it were possible to display these matrices of items. For this we will create a convenience class that turns a matrix into a nested list.


> class FromMatrix m b | m -> b where
>   fromMatrix :: m -> b
> instance FromMatrix (Nil a) a where
>   fromMatrix (Nil x) = x
> instance (FromMatrix (m a) b) => FromMatrix (One m a) [b] where
>   fromMatrix (One a) = [fromMatrix a]
> instance (FromMatrix (m a) b,
>           FromMatrix (n m a) [b]) => FromMatrix (S n m a) [b] where
>   fromMatrix (S n a) = fromMatrix a : fromMatrix n

Now it’s time to test whether these different functionalities work as expected.


> testToMatrix = do
>   let as = [1..]
>   let x = toMatrix as :: (Two (Two Nil)) Int
>   print $ fromMatrix $ x
>   let y = toMatrix as :: (Three (Three (Three Nil))) Int
>   print $ fromMatrix $ y

Finally, now that we have matrix structures that can be created and displayed easily, it is time to create a matrix multiplication algorithm. However, to keep it generic (for instance, to enable the Floyd-Warshall algorithm), we will try to keep the actual ‘addition’ and ‘multiplication’ used generic. Therefore, we first introduce a semiring. The semiring for numbers is trivial; as for matrices, a corresponding semiring can be introduced as well.

First we define how to add matrices (treating them as higher-dimensional vectors) and how to left-multiply them by constants:


> class (Num a) => Vector m a where
>   (<+>) :: (m a) -> (m a) -> (m a)
>   (*>) :: a -> (m a) -> (m a)
> instance (Num a) => Vector Nil a where
>   (Nil a) <+> (Nil b) = Nil $ a + b
>   a *> (Nil b) = Nil $ a * b
> instance (Num a, Vector m a) => Vector (One m) a where
>   (One ma) <+> (One mb) = One $ ma <+> mb
>   a *> (One mb) = One $ a *> mb
> instance (Num a, Vector m a, Vector (n m) a) => Vector (S n m) a where
>   (S na ma) <+> (S nb mb) = S (na <+> nb) (ma <+> mb)
>   a *> (S nb mb) = S (a *> nb) (a *> mb)

Next we define the dot-product of two matrices, where the second one may have additional dimensions that are then treated as vectors.


> class (Vector v a) => DotMatrix m n v a | m n -> v where
>   (<.>) :: m a -> n a -> v a
> instance (Vector v a) => DotMatrix Nil v v a where
>   (Nil a) <.> vb = a *> vb
> instance (DotMatrix m n v a) => DotMatrix (One m) (One n) v a where
>   (One ma) <.> (One mb) = ma <.> mb
> instance (DotMatrix m n v a,
>           DotMatrix (o m) (o n) v a)
>          => DotMatrix (S o m) (S o n) v a where
>   (S na ma) <.> (S nb mb) = (na <.> nb) <+> (ma <.> mb)

A simple test can now be performed:


> testMultMatrix = do
>   let as = [1..]
>   let x = toMatrix as :: (Two (Two Nil)) Int
>   let y = toMatrix as :: (Two (Two (Three (Four Nil)))) Int
>   let z = x <.> y :: (Three (Four Nil)) Int
>   print $ fromMatrix $ x
>   print $ fromMatrix $ z

The dot-product works fine, even when working on non-scalars in the second matrix. Sadly, I cannot force the matrix cross-product to use the dot-product when required. As such, this is where I’m stuck. Comments are more than welcome.

The code I had tried can be seen below:

Finally, we can define the multiplication for matrices. If only this would work… sadly, I do not know how to force DotMatrix onto the right sub-matrices.


> class MultMatrix k l m a where
>   (<*>) :: k a -> l a -> m a
> instance (DotMatrix (k Nil) (k m) m a) => MultMatrix (k Nil) (k m) m a where
>   ma <*> mb = ma <.> mb
> instance (MultMatrix k l m a) => MultMatrix (One k) l (One m) a where
>   (One ma) <*> mb = One $ ma <*> mb
> instance (MultMatrix k l m a,
>           MultMatrix (o k) l (o m) a)
>          => MultMatrix (S o k) l (S o m) a where
>   (S na ma) <*> mb = S (na <*> mb) (ma <*> mb)