It seems the concept of an “infinite array” in my previous blog post prompted some interest. Two very good suggestions came from Chung-chieh Shan. As usual, his code was minimal but powerful.

- Use an infinite list of exponentially growing arrays.
- Use the following data-structure:
`data Tree a = Tree a (Tree a) (Tree a)`

I have to admit that I am not yet fully clear on how to encode this in the above tree. Conceptually it should be trivial to encode, for instance, [0..] into such a binary structure; after all, it’s just bits. For the moment, however, it escapes me.

Another solution came up through a discussion on #haskell with dcoutts and sjanssen (Spencer Janssen). At first the idea was to deal with lazy arrays, but the discussion quickly turned to the concept of a generic caching policy, brought up by dcoutts. One can imagine a combinator library of caching policies. And then one would memoize a function as follows:

`memo :: CachingPolicy -> (i -> e) -> (i :->: e)`

With this structure, one would generate a memoized function that uses the caching policy to evict old entries. Obviously this would require the use of `unsafePerformIO` under the hood to mutate the data-structure. According to dcoutts, it would definitely make an interesting paper; all that is required is an application as a strong showcase.
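As a rough sketch of the idea (eliding the `CachingPolicy` argument entirely, and assuming the simplest possible policy of never evicting anything), the `unsafePerformIO` trick could look like this:

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)
import qualified Data.Map as Map
import System.IO.Unsafe (unsafePerformIO)

-- Sketch: memoization with the trivial "never evict" policy.
-- A real `memo` would take a CachingPolicy and consult it on each miss.
memo :: Ord i => (i -> e) -> (i -> e)
memo f = unsafePerformIO $ do
  cache <- newIORef Map.empty
  return $ \i -> unsafePerformIO $ do
    m <- readIORef cache
    case Map.lookup i m of
      Just e  -> return e                        -- cache hit
      Nothing -> do                              -- miss: compute and store
        let e = f i
        writeIORef cache (Map.insert i e m)
        return e
```

Each call to `memo f` here allocates a fresh cache, so the result should be bound once and reused; a real version would run the policy in the miss branch to decide what, if anything, to evict.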

Potentially, a thread could even be used to implement time-based eviction policies. However, here either a lock would have to be used, or the current `STM` system would have to be modified to allow STM transactions to run inside `unsafePerformIO`, something that is currently unsupported.

With the new concept of Type Families, it would even be possible to choose an appropriate data-structure based on the key-type, to guarantee optimal performance.
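As a purely hypothetical sketch (the class and method names below are made up for illustration), an associated type family could map `Int` keys to the Patricia-tree-backed `Data.IntMap`:

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Type)
import qualified Data.IntMap as IntMap

-- Hypothetical class: the memo-table representation is chosen
-- per key type by an associated type family.
class Memoizable k where
  type Table k :: Type -> Type
  tabulate :: (k -> v) -> [k] -> Table k v
  retrieve :: Table k v -> k -> Maybe v

-- Int keys get the Patricia-tree-backed IntMap.
instance Memoizable Int where
  type Table Int = IntMap.IntMap
  tabulate f ks = IntMap.fromList [ (k, f k) | k <- ks ]
  retrieve t k  = IntMap.lookup k t
```

Other instances could pick, say, the infinite binary tree above for naturals, or an ordinary `Data.Map` as the generic fallback.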

Chung-chieh Shan <—- I can’t believe it!!!! I know this guy!!!! I’ve known him since high school!

I didn’t know him that well, but it’s still surprising to discover the name of an old acquaintance!!

For infinite arrays, check out Patricia trees (big- and little-endian) and their implementation by Daan Leijen in Data.IntMap. See also Chris Okasaki’s publications on “random-access lists”.

For `data Tree a = Node a (Tree a) (Tree a)`, the encoding is easy, and it’s very obvious if you start from 1 instead of 0:

          1
      2       3
    4   5   6   7

In each case, a node’s two children are 2*n and 2*n+1. From that, you can trivially determine the path to a given node n: represent it in binary and remove the leading “1” bit. Its depth is the length of the resulting string, and the binary string itself encodes the path directly: “0” means go left, “1” means go right.
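That path construction translates almost literally into code; here is a small sketch (the function names are mine), using a tree whose node n simply stores n so the lookup can be checked against its own index:

```haskell
data Tree a = Node a (Tree a) (Tree a)

-- The infinite tree whose node n (1-indexed) holds n itself:
-- the children of node n are 2*n and 2*n+1.
heap :: Int -> Tree Int
heap n = Node n (heap (2 * n)) (heap (2 * n + 1))

-- Binary digits of n with the leading 1 removed, most significant
-- first: False = go left, True = go right.
path :: Int -> [Bool]
path = go []
  where
    go acc 1 = acc
    go acc m = go (odd m : acc) (m `div` 2)

-- Follow the path from the root and return the value found there.
fetch :: Tree a -> Int -> a
fetch t n = top (foldl step t (path n))
  where
    top  (Node x _ _)       = x
    step (Node _ l _) False = l
    step (Node _ _ r) True  = r
```

For example, 6 is 110 in binary; dropping the leading 1 leaves “10”: go right, then left.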

Hello Ryan,

Indeed you are right. It’s also possible to store it the other way, which makes lookup easier: test even/odd and then shift to the right. The key insight is basically that the bits are reversed between storing and looking up.
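Concretely, a minimal sketch of that reversed-bit scheme, reusing the `Tree` type from the post: `index` consumes the index bits least-significant first (even/odd, then shift right), and `nats` lays out [0..] in the matching order, so that `index nats n == n`:

```haskell
data Tree a = Tree a (Tree a) (Tree a)

-- Lookup reads the index bits least-significant first:
-- subtract 1, then even/odd picks the child and `div` 2 shifts right.
index :: Tree a -> Integer -> a
index (Tree x _ _) 0 = x
index (Tree _ l r) n = case (n - 1) `divMod` 2 of
  (q, 0) -> index l q
  (q, _) -> index r q

-- The naturals stored to match that addressing scheme.
nats :: Tree Integer
nats = go 0 1
  where
    go n s = Tree n (go l s') (go (l + s) s')
      where
        l  = n + s
        s' = 2 * s
```

Building the same tree with f n at each node instead of n then memoizes f over the naturals, which is one answer to the [0..] encoding question above.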

Conal,

Thanks for the tip, I’ll definitely take a look at those.