Seeing how lately the levenschtein distance is so popular, I thought I’d add a haskell version to the mix.
The code below is pretty self-explanatory, using the typical memoization pattern. I do wonder, however, what would happen if somehow GHC supported lazily-resizable arrays. Perhaps then you could do this on infinite lists and only ask the levenschtein distance up to a certain index.
module Lev whereimport Data.Array levenshtein :: (Eq a) => [a] -> [a] -> Integer levenshtein xs ys = arr ! (lenx, leny) where arr :: (Array (Int, Int) Integer) arr = array ((0,0), (lenx, leny)) [((ix, iy), worker ix iy) | ix <- [0..lenx], iy <- [0..leny] ] lenx = length xs leny = length ys arrx = listArray (1, lenx) xs arry = listArray (1, leny) ys worker 0 iy = fromIntegral iy worker ix 0 = fromIntegral ix worker ix iy = minimum [1 + arr ! (ix-1, iy), 1 + arr ! (ix, iy-1), cost ix iy + arr ! (ix-1, iy-1)] cost ix iy = if (arrx ! ix) == (arry ! iy) then 0 else 1

this is a little bit shorter and faster:
levenshtein sa sb = last $ foldl (transform sa) [0..length sa] sb
where
transform str xs@(x:xs’) c = res
where
res = x+1 : zipWith4 compute str xs xs’ res
compute c’ x y z = minimum [y+1, z+1, x + if c' == c then 0 else 1]
Interesting code
Rather cryptic, I must admit.
if c’ == c then 0 else 1
can also be written
fromEnum (c /= c’)
if you don’t like inline ifs
Small improvements on jorpic’s version:
levenshtein sa sb = last $ foldl transform [0..length sa] sb where
transform xs@(x:xs’) c = scanl compute (x+1) (zip3 sa xs xs’) where
compute z (c’, x, y) = minimum [y+1, z+1, x + fromEnum (c' /= c)]
What is the complexity of your implementation, given (li ! n) is in O(n) ? I would expect O(n^3) instead of the usual O(n^2).
Sorry, I just realized you were using Data.Array, not list, so you have constant-time random access.
It is also possible to have a much faster algorithm for similar sequences. See Lloyd Allison’s haskell implementation of a O(N*d) solution.
http://www.csse.monash.edu.au/~lloyd/tildeFP/Haskell/1998/Edit01/