In this blog article, we look at the benchmarks as proposed in Kyle’s Smith’s blog regarding the flattening of lists in Scheme. However, we perform the same experiment in Haskell. Obviously, due to typing reasons, this will not be done on lists but N-ary trees. It should be interesting to see how these algorithms perform in Haskell, especially due to the very different semantics caused by laziness.
In fact, a very common idiom in Scheme, or even SML or O’Caml, namely ensuring tail-call recursion, is not always a good idea in Haskell when working with data-structures. They are too strict in a sense, and as such they do not work well on infinite data-structures.
The first thing to do in Haskell, of course, is to define a module. We also define the data-type that will hold a list to be flattened. In this case, it is an N-ary tree where the options for the data-type is sumtype of a child of a single element, or a parent of a list of this sum-type. To keep it concise, I will give the data-constructors short names.
> module Flatten where
> import Control.Monad(liftM, liftM2)
> import CPUTime(getCPUTime)
> import Data.List(foldl')
> import System(getArgs)
> data Tree a = C a | P [Tree a]
As example, we show the encoding of a one of the ‘lists’ from Kyle’s blog. The notation is a bit verbose, but one could easily add a read-instance to fix this. Since this is not crucial, I will not do so at the moment (let’s just say, yay for VIM and some regexp-fu to turn the data into Haskelly data).
(define abc ‘(a (b (c (d (e (f (g (h (i (j (k (l (m n) o) p) q) r) s) t) u) v) w) x) y) z)) ; would become:
> abc = P [C 'a', P [C 'b', P [C 'c', P [C 'd', P [C 'e', P [C 'f', P [C 'g',
> P [C 'h', P [C 'i', P [C 'j', P [C 'k', P [C 'l', P [C 'm', C 'n'], C 'o'], C 'p'], C 'q'], C 'r'], C 's'],
> C 't'], C 'u'], C 'v'], C 'w'], C 'x'], C 'y'], C 'z']
> flat = P $ map C $ [1..26]
> flat_abc = P $ map C $ ['a'..'z']
> tree = P [C 'a', P [C 'b', P [C 'd', P [C 'i', C 'q', C 'r'], P [C 'j', C 's', C 't']],
> P [C 'f', P [C 'k', C 'u', C 'v'], P [C 'l', C 'w', C 'x']]], P [C 'c', P [C 'g', P [C 'm', C 'y', C 'z'],
> P [C 'n', C 'a', C 'b']], P [C 'h', P [C 'o', C 'c', C 'd'], P [C 'p', C 'e', C 'f']]]]
> treec = P [C 'a', P [C 'b', P [C 'd', P [C 'i', C 'q', C 'r'], P [C 'j', C 's', C 't']], P [C 'f', P [C 'k', C 'u', C 'v'],
> P [C 'l', C 'w', C 'x']]], P [C 'c', P [C 'g', P [C 'm', C 'y', C 'z'], P [C 'n', C 'a', C 'b']], P [C 'h', P [C 'o', C 'c', C 'd'], P [C 'p', C 'e', C 'f']]]]
All the algorithms will by definition be slightly different as we have one more case to check due to the design of our data-type. First of all, instead of checking on whether the input is null, we will instead check when we have a child element. And secondly, because each level of the tree is a pure list, we can use standard list operations like map and fold.
Looking through the algorithms, a lot use side-effects or reversing at the end. Obviously, side-effects are not a standard idiom in Haskell. Reversing is not a good idiom for Haskell programming, because reversing tends to destroy laziness. As such, the version that is most straightforward to code in Haskell is the one that uses ‘cons’ with an accumulator. The version in scheme required reversing at teh end. We do not need to reverse the list at the end because we use ‘foldr’ while the algorithms in scheme used a fold-left mechanism. The reason why we want foldr is because it respects laziness, while the reason why schemers and mlers want foldl is because it is tail-call recursive.
I will add type-declarations to the function but these are purely for documentation purposes and are not actually required for Haskell. The scheme version uses a mutation on the result variable. Instead of doing this, we will take a more Haskelly approach (I was going to say pure, in the purely functional, but that could be misconstrued to mean pure as in purity of algorithm) and thread this result through the calls.
> flatten :: Tree a -> [a]
> flatten t = f t  -- Call helper with the tree and an empty list
> -- The case we have list of nodes
> where f (P l) r = foldr f r l
> -- The case where we have a single element
> f (C a) r = a:r
To test we simply run ‘ghci Flatten.lhs’ and then when we run ‘flatten abc’ we get “abcdefghijklmnopqrstuvwxyz. We also see that this definition respects lazyness in the horizontal (but therefore not perse the vertical) direction. Namely, we construct an infinite flat tree of elements and see whether it succeeds in flattening. We will also construct an infinite vertical tree (obviously with some elements at eachc level, or it will take an infinite amount of time to reach the first element.
> infinite_flat_tree = P $ map C [0..]
> infinite_vert_tree = build [0..]
> where build (x:xs) = P [C x, build xs]
Indeed, if we load Flatten.lhs in ghci and run flatten, everything works out fine (make sure to do something like: ‘take 10 $ flatten infinite_flat_tree’, or your terminal will spam a lot of numbers). Similarly, it also respects the infinitely vertical tree (which the reader should note, has been built up in a way to ensure the creation of the tree itself is also lazy).
Originally I had intended to try out all the different algorithms that were written in Kyle’s blog, but it seems most of the trade-off is between using set! as well as reversing and other techniques to circumvent this. However, when using foldr and lazy structures, as long as you make sure to respect laziness, this is not required and there’s only one way of coding this function, which is probably the way anyone would write it in Haskell. If any of the readers have feedback on other versions that might be applicable for Haskell, I am definitely interested in seeing their version.
Therefore, to give this article any meaningful content, we will add some benchmarks. My computer is not so powerful, it is a simple Dell Latitude D610 laptop with 1GB of memory and a Centrino of 1.6Ghz.
First we need to generate some data-sets besides the one above. The first one was far too small and infinite data-sets will have obvious results if computed completely. To ensure the calculation actually occurs, I will get th
e last element of the computed list to force the entire list to be generated. To measure how complicated an N-ary tree is, Kyle introduces the concept of complexity. I will translate this to Haskell (If anyone spots algorithmic differences, feel free to point them out).
> complexity :: Tree a -> Integer
> complexity (C x) = 1
> complexity (P l) = foldl' (\c e -> 1 + c + complexity e) 0 l
> buildTestTree :: [Tree a] -> Int -> Tree a
> buildTestTree ts n = P $ take n $ cycle ts -- Cycle through the different
> -- trees, take only 'n' in total
> -- and create a parent node
> -- Thanks apfelmus from #haskell (P.s: getCPUTime returns picoseconds)
> time m = liftM2 subtract getCPUTime (m >> getCPUTime)
Due to laziness, it is important that we force the creation of the input tree before actually timing the flattening of the tree. But since we need to know the complexity of the tree before we try to flatten it, we can let the complexity function force the creation of the tree. Therefore, to get a test,case, we simply get the following code:
> runTest :: (Show a) => Tree a -> IO ()
> runTest x = do
> putStr "Complexity: "
> t1 <- time $ print $ complexity $ x -- Force the creation of the tree
> putStr "Last element: "
> t2 <- time $ print $ last $ flatten x -- Time the flatten function
> putStrLn $ "Times: " ++ show t1 ++ ", " ++ show t2
Finally, we can tie it all together with a little main that will allow us to run this program to test different complexities. While the scheme version allowed the user to choose which subtrees to use to build up the big tree, I will not give this option as in the end, the only metric used was complexity. I would like to briefly point out, however, that it might be interesting to test deeper trees, as the trees seem to have been limited to the depth of trees defined at the beginning.
> main :: IO ()
> main = do
> n <- liftM (read . head) getArgs
> let t = buildTestTree [abc, flat_abc, treec] n
> sequence_ $ replicate 10 $ runTest t -- Run timing test 10 times
The results can be seen below in the figure. I hope this was useful.