
Programming Languages

TypeChecking Lambda Calculus using TyperMonad (courtesy of Saizan)

While I had planned to show how to use the typechecker monad presented in an earlier blog post by showing how to typecheck MiniML, I was positively surprised to find out that people had actually started using it due to its simplicity. More specifically, Saizan in the #haskell channel on Freenode had written a type-checker for a simple typed lambda calculus. With his permission, I present it here along with some modifications. His original code was a bit shorter, which goes to show how easy the monad is to use, but I've introduced a few helper functions that I should have introduced in the original blog post but didn't, because at the time I felt they would confuse the issue too much. As such, the contents of the actual typechecking are almost the same, except for a few details which I shall mention in the body of this blogpost.


I’m just going to walk straight through the code as it’s rather straightforward.

> module STLC where
> import TypeChecker
> import qualified Data.Map as M
> import Control.Monad
> import MonadScope
> import MonadEnv
> import MonadSupply

First we introduce a function that replaces type variables with other type variables within a type. This is rather straightforward: given an environment, replace variables as you find them.

> ------------------------------------------------------------------------------
> -- substVar : Replaces variables with other variables within a type
> ------------------------------------------------------------------------------
> substVar :: [(Var, Var)] -> MonoType -> MonoType
> substVar subst tx@(TyVar x) =
>   case x `lookup` subst of
>     Nothing -> tx
>     Just y  -> TyVar y
> substVar subst (TyConst n ts) =
>   TyConst n . map (substVar subst) $ ts
> substVar subst (TyFun ta tb) =
>   TyFun (substVar subst ta) (substVar subst tb)

Next is an important function, and something that Saizan missed in his original code. This is actually part of my TypeChecker.hs file, which I trimmed down for the previous blogpost. When you have a polytype (some type that says \forall variables. type) and you want to actually use it, you have to make a monotype out of it. Now, a polytype contains a monotype, but that monotype cannot simply be used as is. This is something that was missing in the original code, though in the case of this specific code it would not have made a difference. It is important to instantiate a polytype's quantified variables with fresh type-variables, or you can end up over-specifying a type. More specifically, I think this is what makes or breaks the monomorphism restriction, though if I'm wrong, feel free to point that out. Just to give you an idea, assume the following code (in -some- functional language):


let id x = x in
(id 'b', id 1)

What will happen when typechecking this code is that, in the scope of the body of that let-clause, the identifier 'id' will refer to a polytype (namely: forall a. a -> a). Now, if this polytype's forall bit is simply dropped and the monotype is used directly, we'll end up with the 'a' being unified with both Char and Integer. Hence, it is important to instantiate each use of 'id' to make sure that fresh variables are used instead of 'a'. The following function does that, using the functionality of substVar.

> ------------------------------------------------------------------------------
> -- instantiate : Instantiates a polytype, replacing its quantified
> -- type-variables with fresh ones
> ------------------------------------------------------------------------------
> instantiate :: (TyperMonad m) => PolyType -> m MonoType
> instantiate (TyPoly is t) = do
>   is' <- mapM (\_ -> supply) is
>   return $ substVar (zip is is') t
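
Just to make the fresh-variable behaviour concrete, here is a small illustration (idPoly and instantiateTwice are names I'm introducing for this post, they are not part of Saizan's code): instantiating forall a. a -> a twice should yield two arrow types over distinct fresh type-variables, so each use can later be unified independently.

> idPoly :: PolyType
> idPoly = TyPoly ["a"] (TyFun (TyVar "a") (TyVar "a"))
>
> instantiateTwice :: (TyperMonad m) => m (MonoType, MonoType)
> instantiateTwice = liftM2 (,) (instantiate idPoly) (instantiate idPoly)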

Because we have scoping, we want to find the type of a variable. The following code finds the poly-type of a let-bound (or lambda-bound) variable in the scope, and then makes sure to instantiate it properly so that we do not get the monomorphism restriction.

> ------------------------------------------------------------------------------
> -- findType : Finds the type of a variable within the current scope
> ------------------------------------------------------------------------------
> findType :: (TyperMonad m) => String -> m MonoType
> findType x = find x >>= (maybe (error $ "Variable not in scope: " ++ x) instantiate) >>= normalize

Obviously, since we’re dealing with scopes, we also have to create new scopes.
The following function comes directly from Saizan's code, except that I've factored out the dual of instantiate, namely generalize. Additionally, sometimes it's useful to get a polytype without generalizing any of the free variables in the monotype (for instance for patterns, though at the moment I can't recall why). Actually, Saizan corrected me after this post went public: lambda-bound variables should be non-generalized (which, looking back at my old code, makes sense, since they're also patterns).

> ------------------------------------------------------------------------------
> -- generalize : Generalizes a monotype to a polytype by generalizing over all
> -- of its free variables
> ------------------------------------------------------------------------------
> generalize :: (TyperMonad m) => MonoType -> m PolyType
> generalize = liftM toPoly . normalize

> ------------------------------------------------------------------------------
> -- nogeneralize : Turns a type into a polytype without generalizing any of
> -- its free variables (and thus, when instantiated, this can lead to the
> -- monomorphism restriction)
> ------------------------------------------------------------------------------
> nogeneralize :: (TyperMonad m) => MonoType -> m PolyType
> nogeneralize = liftM (TyPoly []) . normalize
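
To see the difference between the two, here is a tiny illustration (genExample and monoExample are only names for this post, nothing in the type-checker needs them): the first quantifies over the free 'a', the second leaves it unquantified, so every later instantiation of the second keeps talking about the very same 'a'.

> genExample, monoExample :: (TyperMonad m) => m PolyType
> genExample  = generalize   (TyFun (TyVar "a") mkIntType)
> monoExample = nogeneralize (TyFun (TyVar "a") mkIntType)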

> ------------------------------------------------------------------------------
> -- withinScope : Applies a TyperMonad action within a nested scope with a new
> -- variable with a given type
> ------------------------------------------------------------------------------
> withinScope :: (TyperMonad m) => String -> MonoType -> m a -> m a
> withinScope x t ma = do
>   pt <- nogeneralize t
>   scope (M.insert x pt) ma

One last little helper function to generate unique type-variables…

> supplyVar :: TyperMonad m => m MonoType
> supplyVar = liftM TyVar supply

… and we get to the actual lambda calculus. It's a very simple calculus, with only three forms: identifiers (variables), lambda abstractions, and function applications.

> ------------------------------------------------------------------------------
> -- Simple Typed Lambda Calculus
> ------------------------------------------------------------------------------
> type Id = String
> data Expr = Var Id
>           | Lambda Id Expr
>           | App Expr Expr

Here the code differs just -slightly- from Saizan's original. Instead of normalizing after checking each sub-expression, it is perfectly fine to normalize once at the end. Even if you were decorating the expression tree, you'd still have to wait until the end, because only then could all types be normalized. So there's no point, really, in normalizing before typechecking (and thus unification) is finished.

> typeCheck :: (TyperMonad m) => Expr -> m MonoType
> typeCheck e = check e >>= normalize
>   where check (Var x) = findType x
>         check (Lambda x b) = do
>           tx <- supplyVar
>           tb <- withinScope x tx $ check b
>           return $ TyFun tx tb
>         check (App a b) = do
>           ta <- check a
>           tb <- check b
>           te <- supplyVar
>           unify ta (TyFun tb te)
>           return te

There are a few alterations, but the gist of Saizan's code, which prompted this blog post, is all there, and I want to thank him for this great illustration of how to use the typer monad, since I had not thought of it myself at all. Notice that the only code that would change for a different type-checker is the code that came after the definition of the lambda calculus. Everything before that consists of generic helper functions that I did not introduce in the prior blog post, as it would not have been clear why they are useful, but they -should- go into TypeChecker.hs.

Finally, let’s show some examples that Saizan originally showed, and see how nicely they typecheck :)

> id' = Lambda "x" (Var "x")
> flip' = Lambda "f" (Lambda "a" (Lambda "x" (App (App (Var "f") (Var "x")) (Var "a"))))
> run m = evalTyperT M.empty m
>
> -- *STLC> run $ typeCheck (App flip' id')
> -- b -> (b -> g) -> g
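
One more term worth trying (omega is my name for it, it is not one of Saizan's original examples): self-application has no simple type, so typechecking it should run into the occurs check inside unify rather than produce a type.

> omega = Lambda "x" (App (Var "x") (Var "x"))
>
> -- *STLC> run $ typeCheck omega
> -- should fail with the occurs check ("Trying to unify recursive type ...")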

As usual, this blogpost is a literate haskell file, and if you have any comments, questions or suggestions, feel free to leave a comment on my blog :)

A Fractional Type for Haskell

In this short blog post, I introduce a utility module that I will use in another blog post. To keep the content there short, I decided to make a blogpost just for this module. As usual, the license for this module is BSD, and the author is the author of this blog post.

Basically, while working on continued fractions, I needed a way to compare fractions of arbitrary numbers to see if their integral parts coincided. Before I explain what I mean by that (if it's not already obvious), I will first show the notation that I use to represent fractions:

a :~: b represents the fraction a/b

Since this is a rather short post, let’s dive straight into the code.

As usual, this will be a literate Haskell post, and it will include the full comments from the original file :) So first, the header, which also specifies some details regarding the workings of this code:

> module Fraction where
> import Data.Ratio
>
> data Fraction a = {-# UNPACK #-}!a :~: {-# UNPACK #-}!a
>   deriving (Show)
>
> infix 7 :~:

As mentioned earlier, this code serves to allow comparing fractions on just their integral part. Unlike Data.Ratio in Haskell, this type allows 0 in the denominator. The reason for this is that the code that uses this type needs to be able to store these numbers and only filter them later. As such, we have the following rules (where nan = not a number):

  • ∀ a. (a > 0 ⇔ (a :∼: 0) = ∞) ∧ (a < 0 ⇔ (a :∼: 0) = −∞)
  • (0 :∼: 0) = nan
  • ∞ ≠ nan, −∞ ≠ nan, ∞ ≠ −∞, −∞ < ∞
  • ∀ a b: b ≠ 0 . (a :∼: b) ≠ nan ∧ (a :∼: b) ≠ ∞ ∧ (a :∼: b) ≠ −∞
  • ∀ a b: b ≠ 0 . −∞ < (a :∼: b) < ∞
  • ∀ a b c d: b ≠ 0, d ≠ 0 . (a :∼: b) = (c :∼: d) ⇔ a `div` b = c `div` d
  • ∀ a b c d: b ≠ 0, d ≠ 0 . (a :∼: b) < (c :∼: d) ⇔ a `div` b < c `div` d

It is important to notice that 'nan' cannot be compared to any other value. It can only be 'compared' for equality, but it is not part of the order defined on these numbers. Porting the rules above straight to code, and adding some simple functions to check whether a number is infinite or a nan, as well as some accessors, we get:

> unFraction :: Fraction a -> (a,a)
> unFraction (a :~: b) = (a,b)
>
> toIntegral (a :~: b) = a `div` b
>
> isInfinite (a :~: 0) = a /= 0
> isInfinite _ = False
>
> isNan (0 :~: 0) = True
> isNan _ = False
>
> instance (Eq a, Num a, Integral a) => Eq (Fraction a) where
>   -- (0 :~: 0) == (0 :~: 0) = True -- See blog-comments
>   (0 :~: 0) == _ = False
>   _ == (0 :~: 0) = False
>   (a :~: 0) == (c :~: 0) = signum a == signum c
>   (_ :~: 0) == (_ :~: _) = False
>   (_ :~: _) == (_ :~: 0) = False
>   (a :~: b) == (c :~: d) = a `div` b == c `div` d
>
> instance (Ord a, Num a, Integral a) => Ord (Fraction a) where
>   (0 :~: 0) `compare` _ = error "Can not compare NaN's"
>   _ `compare` (0 :~: 0) = error "Can not compare NaN's"
>   (a :~: 0) `compare` (c :~: 0) = compare (signum a) (signum c)
>   (a :~: 0) `compare` (_ :~: _) = compare a 0
>   (_ :~: _) `compare` (c :~: 0) = compare 0 c
>   (a :~: b) `compare` (c :~: d) = compare (a `div` b) (c `div` d)

Defining arithmetic for this type is rather easy. Since I only care about comparing the final numbers, we can use plain old arithmetic like any old rational type. This includes the instance for Fractional, which defines division. Finally, it is also rather easy to define a Real instance for this fraction type, since it is very easy to generate a rational number from it. The only problem, of course, is when 'toRational' is called on an infinity or a nan.

> instance (Num a, Integral a) => Num (Fraction a) where
>   (a :~: b) + (c :~: d) = (a*d + b*c) :~: (b*d)
>   (a :~: b) - (c :~: d) = (a*d - b*c) :~: (b*d)
>   (a :~: b) * (c :~: d) = (a*c) :~: (b*d)
>   negate (a :~: b) = (negate a :~: b)
>   abs (a :~: b) = (abs a :~: abs b)
>   signum (a :~: b) = (signum a :~: 1)
>   fromInteger x = fromInteger x :~: 1
>
> instance (Fractional a, Integral a) => Fractional (Fraction a) where
>   (a :~: b) / (c :~: d) = (a*d) :~: (b*c)
>   fromRational x = (fromInteger . numerator $ x) :~: (fromInteger . denominator $ x)
>
> instance (Real a, Integral a) => Real (Fraction a) where
>   toRational (a :~: b) = toRational a / toRational b
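
To tie this back to the rules listed earlier, here are a few quick checks as I would expect them to come out in GHCi (a hypothetical session, with the numeric literals defaulting to Integer):

> -- *Fraction> (7 :~: 2) == (6 :~: 2)       -- both have integral part 3
> -- True
> -- *Fraction> compare (1 :~: 0) (5 :~: 2)  -- infinity is greater than anything finite
> -- GT
> -- *Fraction> isNan (0 :~: 0)
> -- True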

Well, that's all folks; not really complicated stuff. Just thought I'd add a light blog entry in preparation for the entry on continued fractions, which will be reasonably math-intensive.

Simple Type Inference in Haskell

In this article I show how to build a simple Hindley-Milner type-inference engine in Haskell. I used this a while back in a MiniML frontend/interpreter. Due to the size of the full code, I will only show the type-inference engine and not the rest of the interpreter. I'll show some simple uses, and then in future articles I will show how this can be used more practically in a real compiler, or potentially expand the type-checker with new concepts.

For this code I used three custom monads that I wrote, which are actually quite useful in general. They will each go in their own module. Since this means that this article is not one literate Haskell file, I will not use literate style, except for the code belonging to the actual type checker module. Just copy-paste the other three files into appropriately named .hs files. The first monad is very simple and was actually written by Cale Gibbard. It takes as input an infinite list of data (for instance variable names), and whenever asked it will give out a fresh symbol. The license of all the code is a simple permissive license (I explicitly mention this because the first monad is Cale's work). The original code comes from MonadSupply; I extended it with a few more monad instances. Of interest is the function 'makeSupply', which was also added so that it's easy to generate an infinite list of unique symbols based on a list of prefixes and a list of suffixes that can be used to extend the symbols into new symbols.



{-# OPTIONS_GHC -fglasgow-exts -fallow-undecidable-instances #-}
module MonadSupply where
import Control.Monad
import Control.Monad.State
import Control.Monad.Reader

newtype SupplyT s m a = SupplyT { unSupplyT :: StateT [s] m a}
    deriving (Functor, Monad, MonadTrans, MonadIO)

newtype Supply s a = Supply (SupplyT s Maybe a)
    deriving (Functor, Monad, MonadSupply s)

class Monad m => MonadSupply s m | m -> s where
    supply :: m s

instance Monad m => MonadSupply s (SupplyT s m) where
    supply = SupplyT $ do
                 (x:xs) <- get
                 put xs
                 return x

instance MonadState s m => MonadState s (SupplyT x m) where
    get = SupplyT . lift $ get
    put v = SupplyT . lift $ put v

instance MonadReader r m => MonadReader r (SupplyT x m) where
    ask = SupplyT . lift $ ask
    local f a = SupplyT . local f . unSupplyT $ a

evalSupplyT (SupplyT s) supp = evalStateT s supp
evalSupply (Supply s) supp = evalSupplyT s supp

runSupplyT (SupplyT s) supp = runStateT s supp
runSupply (Supply s) supp = runSupplyT s supp

--------------------------------------------------------------------------------
--- makeSupply
--- Makes an infinite list
--------------------------------------------------------------------------------
-- makeSupply :: [[a]] -> [[a]] -> [[a]]
-- makeSupply inits tails =
-- let vars = inits ++ (map concat . sequence $ [vars, tails]) in vars
--------------------------------------------------------------------------------
--- Cleaner version supplied by TuringTest
--------------------------------------------------------------------------------
makeSupply :: [[a]] -> [[a]] -> [[a]]
makeSupply inits tails =
    let vars = inits ++ (liftM2 (++) vars tails) in vars
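
To get a feel for what makeSupply produces, here is a quick check (exampleNames is only an illustration, not part of the module's interface):

-- exampleNames should be the 26 single-letter names followed by "a'" and "b'"
exampleNames :: [String]
exampleNames = take 28 (makeSupply (map return ['a'..'z']) ["'"])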


The second monad is an environment monad which I made. It’s basically a fancy wrapper around a state monad that stores key-value pairs.



{-# OPTIONS_GHC -fglasgow-exts -fallow-undecidable-instances #-}
module MonadEnv where
import Control.Monad
import Control.Monad.State
import Control.Monad.Reader
import qualified Data.Map as M

type Memory k v = M.Map k v

newtype EnvT k v m a = EnvT { unEnvT :: StateT (Memory k v) m a }
    deriving (Functor, Monad, MonadTrans, MonadIO)

newtype Env k v a = Env (EnvT k v Maybe a)
    deriving (Functor, Monad, MonadEnv k v)

class (Ord k, Monad m) => MonadEnv k v m | m -> k, m -> v where
    getfor :: k -> m (Maybe v)
    putfor :: k -> v -> m ()

instance (Ord k, Monad m) => MonadEnv k v (EnvT k v m) where
    getfor k = EnvT $ do
                   s <- get
                   return $ M.lookup k s
    putfor k v = EnvT $ do
                     modify (\s -> M.insert k v s)

instance MonadState s m => MonadState s (EnvT k v m) where
    get = EnvT . lift $ get
    put v = EnvT . lift $ put v

instance MonadReader r m => MonadReader r (EnvT k v m) where
    ask = EnvT . lift $ ask
    local f a = EnvT . local f . unEnvT $ a

evalEnvT (EnvT s) ctx = evalStateT s ctx
evalEnv (Env s) ctx = evalEnvT s ctx
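
A tiny example of how getfor and putfor fit together, using Maybe as the base monad (envExample is only an illustration):

-- envExample should evaluate to Just (Just 42)
envExample :: Maybe (Maybe Int)
envExample = evalEnvT (putfor "x" 42 >> getfor "x") M.empty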

The final monad is one I used to scope behaviour during type-checking. It is not required, per se, for the pure typechecker, as you could put it on top of the typechecker monad in your compiler monad. However, I think the three fit nicely together (and I admit to being too lazy to rewrite the code with split monads). It is a scoping environment. I hope this is not too much code for one blog article; I plan to release a few more monads I have made in separate articles, so this already feels like a colossal article, but I'm rather intent on showing this type-checker :) .



{-# OPTIONS_GHC -fglasgow-exts -fallow-undecidable-instances #-}
module MonadScope where
import Control.Monad
import Control.Monad.State
import Control.Monad.Reader
import qualified Data.Map as M

type Context k v = M.Map k v
type Subst k v = Context k v -> Context k v

newtype ScopeT k v m a = ScopeT { unScopeT :: ReaderT (Context k v) m a }
    deriving (Functor, Monad, MonadTrans, MonadIO)

newtype Scope k v a = Scope (ScopeT k v Maybe a)
    deriving (Functor, Monad, MonadScope k v)

class (Ord k, Monad m) => MonadScope k v m | m -> k, m -> v where
    find :: k -> m (Maybe v)
    scope :: Subst k v -> m a -> m a

instance (Ord k, Monad m) => MonadScope k v (ScopeT k v m) where
    find k = ScopeT $ do
                 s <- ask
                 return $ M.lookup k s
    scope s act = ScopeT $ do
                      local s . unScopeT $ act

instance MonadState s m => MonadState s (ScopeT k v m) where
    get = ScopeT . lift $ get
    put v = ScopeT . lift $ put v

evalScopeT (ScopeT s) ctx = runReaderT s ctx
evalScope (Scope s) ctx = evalScopeT s ctx
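
And similarly for the scoping interface: find only sees what scope has pushed into the current (nested) context. Again, scopeExample is only an illustration:

-- scopeExample should evaluate to Just (Just 1)
scopeExample :: Maybe (Maybe Int)
scopeExample = evalScopeT (scope (M.insert "x" 1) (find "x")) M.empty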

Now that we have the above three modules out of the way, we can start looking at the typechecker. Obviously, we start with the definition of the module and the imports of these three modules, as well as a few other basic necessities.



> {-# LANGUAGE FlexibleContexts, GeneralizedNewtypeDeriving, TypeSynonymInstances, MultiParamTypeClasses #-}
> module TypeChecker where
>
> import MonadSupply
> import MonadScope
> import MonadEnv
> import Data.List(nub, (\\), intersperse)
> import qualified Data.Map as M
> import Control.Monad.Trans

Basic type definitions

Now that we have that in place, it is important to specify types. One should realize that a typechecker deals with two kinds of types: monotypes and polytypes. The difference is that a free variable in a monotype may only be unified with one specific type, while a polytype has variables that are parametrically polymorphic. More specifically, if we take the function "map :: (a -> b) -> [a] -> [b]", this is actually "map :: forall a b. (a -> b) -> [a] -> [b]". Thus it has a polytype. All top-level definitions and let-bound variables in Haskell are poly-typed. If map were mono-typed, then the first use of map would unify the 'a' and 'b' with specific types, and any other invocation would have to respect those types. So a polytype is a monotype that has been quantified over several variables. As for monotypes, there are two specific options: either the type is purely a type variable (TyVar), or it is a type constructor applied to several parameters. For convenience's sake, we also define a function type (TyFun). In essence it's just a TyConst with "->" as its Const, but it's always nice to make special cases ;) . For the actual type variables, we will be using strings.

Besides the above basic 'types', we also introduce a few useful type classes. Additionally, we define a few standard type creationers (I know, bad word, but 'type constructors' would be ambiguous) to define some standard types. Some of the functions are a bit simplistic and could be replaced by a simple call to the data constructor. These exist because originally my types were a bit more complicated: I had a typeclass for types since I wanted to be able to add extra information to them, so the type of types was actually 'MonoType mt', and then I would tie the knot by fixing the type declaration over whatever record contained the extra info. Ignore all that and just be glad that the functions all have nicely similar names :) .



> --- Type variable
> type Var = String
> --- Type constructor
> type Const = String
> --- MonoTypes
> data MonoType =
>     TyVar Var
>   | TyFun MonoType MonoType
>   | TyConst Const [MonoType]
>   deriving (Eq, Ord)
> --- PolyTypes
> data PolyType =
>     TyPoly [Var] MonoType
>   deriving (Eq, Ord)
>
> instance Show MonoType where
>   show (TyVar x) = x
>   show (TyFun ta tb) = "(" ++ show ta ++ ") -> " ++ show tb
>   show (TyConst "(,)" tis) = "(" ++ (concat . intersperse ", " . map show $ tis) ++ ")"
>   show (TyConst n []) = n
>   show (TyConst x tis) = x ++ " " ++ (unwords . map show $ tis)
>
> instance Show PolyType where
>   show (TyPoly ids mt) = (if null ids then "" else "forall " ++ unwords ids ++ ".") ++ show mt
>
> class HasVars a where
>   freeVars :: a -> [Var]
>   occurs :: Var -> a -> Bool
>   occurs x t = x `elem` freeVars t
>
> instance HasVars MonoType where
>   freeVars (TyVar a) = [a]
>   freeVars (TyFun ta tb) = nub $ freeVars ta ++ freeVars tb
>   freeVars (TyConst _ ts) = nub . concatMap freeVars $ ts
>   occurs x (TyVar y) = x == y
>   occurs x (TyFun ta tb) = occurs x ta || occurs x tb
>   occurs x (TyConst _ ts) = or . map (occurs x) $ ts
>
> instance HasVars PolyType where
>   freeVars (TyPoly vs t) = freeVars t \\ vs
>
> toPoly :: MonoType -> PolyType
> toPoly x = TyPoly (freeVars x) x
> fromPoly :: PolyType -> MonoType
> fromPoly (TyPoly fvs t) = t
>
> unitConstructor = "()"
> tupleConstructor = "(,)"
>
> mkTyVar :: Var -> MonoType
> mkTyVar x = TyVar x
> mkTyFun :: MonoType -> MonoType -> MonoType
> mkTyFun ta tb = TyFun ta tb
> mkTyConst :: Const -> [MonoType] -> MonoType
> mkTyConst c ts = TyConst c ts
>
> mkTupleType :: [MonoType] -> MonoType
> mkTupleType ts = mkTyConst tupleConstructor ts
> mkUnitType :: MonoType
> mkUnitType = mkTyConst unitConstructor []
> mkBoolType :: MonoType
> mkBoolType = mkTyConst "Bool" []
> mkIntType :: MonoType
> mkIntType = mkTyConst "Int" []
>
> mkFunType :: [MonoType] -> MonoType
> mkFunType (a:b:r) = mkTyFun a $ mkFunType (b:r)
> mkFunType [a] = a
> mkFunType [] = error "mkFunType: Trying to make a function with 0 args"
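
Just to make the representation concrete, here is how the type of map from above could be written down with these smart constructors (mapType is purely an illustration, nothing in the type-checker needs it):

> mapType :: PolyType
> mapType = TyPoly ["a", "b"] $
>   mkFunType [ mkTyFun (mkTyVar "a") (mkTyVar "b")
>             , mkTyConst "[]" [mkTyVar "a"]
>             , mkTyConst "[]" [mkTyVar "b"] ]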

Type Checker Monad

Now that we've defined some standard functions and smart constructors, we could play around with them, but I will leave that to the reader. Instead, let's move on to the meat of this blog entry. For this we define a new monad, the typer monad. The typer monad is actually a stack of monad transformers. As the bottom monad, we use the supply monad transformer, which ensures we have an unlimited supply of fresh names for the typer monad to use. On top of this, we put an environment monad (EnvT) that maps type variables to the appropriate MonoType's. Finally, because we want to have let-binding in whatever language we may use, we add a scoping monad (ScopeT), so that whenever the language uses (let x = ... in ...) we can have a polytype for 'x' and need not worry about the monomorphism restriction. With this stack, we define a custom monad class (TyperMonad) that has two basic operations besides those provided by the stack and the regular monad interface. These two actions allow us to unify two types and to normalize a type based on what has been unified so far. For instance, if we unify (TyVar "a") with (TyConst "Int" []), then whenever we normalize a type that has (TyVar "a") in it, we want to be sure that it becomes the Int type. We will show simple examples at the bottom of how this is used.



> -- Ident is the type for identifiers in a program
> type Ident = String
> -------------------------------------------------------------------------------
> --- TyperState:
> --- State of typer monad
> --- Keeps a map of tyvar -> MonoType substitutions
> --- Keeps a list of free tyvars
> --- Requires: tyvars used as keys do not appear in the values
> --- unify needs to ensure this at all times
> --------------------------------------------------------------------------------
> type TyperContext = M.Map Ident PolyType
> type TyperSubst = TyperContext -> TyperContext
>
> newtype TyperT m a = TyperT {
>     runTyperT :: ScopeT Ident PolyType (EnvT Var MonoType (SupplyT Var m)) a
>   } deriving (Functor, Monad, MonadIO)
>
> instance MonadTrans TyperT where
>   lift = TyperT . lift . lift . lift
>
> instance (Monad m) => MonadSupply Var (TyperT m) where
>   supply = TyperT . lift . lift $ supply
>
> instance (Monad m) => MonadScope Ident PolyType (TyperT m) where
>   find = TyperT . find
>   scope f act = TyperT . scope f . runTyperT $ act
>
> class (Monad m, MonadScope Ident PolyType m, MonadSupply Var m) => TyperMonad m where
>   unify :: MonoType -> MonoType -> m ()
>   normalize :: MonoType -> m MonoType

With the above interface for the TyperMonad given, it is time to actually define it. This is the real meat of the typer monad.



>
> problem :: (Monad m) => String -> String -> m ()
> problem phase message = fail $ phase ++ ": " ++ message
>
> instance (Monad m) => TyperMonad (TyperT m) where
>   ------------------------------------------------------------------------------
>   --- unify
>   ------------------------------------------------------------------------------
>   unify type1 type2 = TyperT $ do
>       type1' <- runTyperT $ normalize type1
>       type2' <- runTyperT $ normalize type2
>       unifier type1' type2'
>     where unifier (TyVar x) ty@(TyVar y) =
>             if x == y
>               then return ()
>               else lift $ putfor x ty
>           unifier (TyVar x) ty =
>             if x `occurs` ty
>               then problem "TypeChecker" ("Trying to unify recursive type " ++ show x ++ " = " ++ show ty)
>               else lift $ putfor x ty
>           unifier tx (TyVar y) =
>             if y `occurs` tx
>               then problem "TypeChecker" ("Trying to unify recursive type " ++ show y ++ " = " ++ show tx)
>               else lift $ putfor y tx
>           unifier tx@(TyConst nx px) ty@(TyConst ny py) =
>             if nx == ny && length px == length py
>               -- IMPORTANT: use unify and not unifier here,
>               -- or the code will recurse for:
>               --   foo x y = x y; bar x = foo bar bar
>               then runTyperT . sequence_ $ zipWith unify px py
>               else problem "TypeChecker" (show tx ++ " can not be inferred to " ++ show ty)
>           unifier (TyFun tax tbx) (TyFun tay tby) = do
>             runTyperT $ unify tax tay
>             runTyperT $ unify tbx tby
>           unifier tx ty =
>             problem "TypeChecker" (show tx ++ " can not be inferred to " ++ show ty)
>   ------------------------------------------------------------------------------
>   --- normalize
>   --- Normalizes a type by looking it up if it's a tyvar.
>   --- Applies a small optimization: it removes any extra unnecessary
>   --- mappings. For instance:
>   ---   [a->b, b->Int]: normalize a
>   ---   [a->b, b->Int]: b' <- normalize b
>   ---   [a->Int, b->Int]: return b'
>   ------------------------------------------------------------------------------
>   normalize tx@(TyVar x) = TyperT $ do
>     tx' <- lift $ getfor x
>     case tx' of
>       Nothing -> return tx
>       Just ty -> do
>         ty' <- runTyperT $ normalize ty
>         if ty == ty'
>           then return ty'
>           else do
>             lift $ putfor x ty'
>             return ty'
>   normalize tx@(TyConst n tl) = TyperT $ do
>     tl' <- runTyperT $ mapM normalize tl
>     return $ TyConst n tl'
>   normalize tx@(TyFun ta tb) = TyperT $ do
>     ta' <- runTyperT $ normalize ta
>     tb' <- runTyperT $ normalize tb
>     return $ TyFun ta' tb'
> ------------------------------------------------------------------------------
>
> evalTyperT :: (Monad m) => TyperContext -> TyperT m a -> m a
> evalTyperT ctx action =
>   let vars = makeSupply (map return ['a'..'z']) ["'"] in
>   evalSupplyT (evalEnvT (evalScopeT (runTyperT action) ctx) M.empty) vars

I will let you grok the code. If any bits are not clear, feel free to tell me so and I will expand upon them. Now we can easily do some experiments:


evalTyperT M.empty (do{unify (TyVar "a") mkIntType; normalize (mkFunType [TyVar "a", TyVar "a", TyVar "b"])})
=> (Int) -> (Int) -> b
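
And one that should go wrong: asking to unify a type variable with a type that contains it should trip the occurs check in unify and fail via problem, rather than return a type:

evalTyperT M.empty (do{unify (TyVar "a") (mkFunType [TyVar "a", mkIntType]); normalize (TyVar "a")})
=> should fail with the occurs-check error ("Trying to unify recursive type ...")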

You might notice that I included some extra machinery, namely the scoping and all that. In a future blog post I will show how this is actually used to typecheck a simple language (like MiniML). But for now, I will leave it at that, as this has already grown to become quite a big blogpost.

If you have any questions or comments, please feel free to post them at the bottom and I will make the necessary changes if required to clarify or improve the presentation.

A simple OO-system in Lua

Well, I recently (very recently) started to play around with Lua. I ordered the book before leaving for Colombia and found it when I was back. It took me just a day to read it, as the book is a light read and rather well-written, covering all the important topics.

One reason I’ve started to look at Lua is because it seems like a rather powerful language. Read more if you’re interested in why, and a first OO-system that I’ve rolled in it.


Powerful is a rather abstract word, and perhaps I should first explain why I consider Lua powerful, especially since I come from a rather different programming-language sphere (Haskell).
The reason that Lua appeals to me is that it seems to enable a way of meta-programming that cuts cleanly through your code. It takes a Scheme-like approach of being built up orthogonally from a small core. That being said, I find it more appealing than Scheme as, at least to me, it is easier to code things in. The meta-programming also seems to be more powerful than in Scheme.

Anyways, I started coding in it yesterday, wanting to roll an OO-system that would be easy to use. Additionally, I wanted a way to nicely dump my state and then be able to reload it after potentially modifying some of the classes I use.

For the serialization aspect, I took the serialization scheme from the Lua book and extended it to deal with custom serialization meta-methods as well as the possibility of keys being tables. Note that it's not yet finished; it needs a small piece of code to generate unique names, but that's rather trivial.

As for the OO system, this is a second version. My original version was rather primitive. One thing that I wanted was the ability to call the static super-method of a method. Now a simple way would be:


function Class:method(...)
  SuperClass.method(self, ...)
end

I did not like this idea, however, as it requires explicitly referring to the super-class. Certainly, for simple class trees this is a viable option. However, since I would like to introduce the concept of mixins, and in general don't like the verbosity of this, I introduced a 'static' way of defining __super (the super-method). Basically, whenever a method is defined, its function environment is modified to have a link to __super (as well as __class). The reason for having a link to __class is that this is the static class that the method belongs to, not the class of the object.

The (unfinished) code can be seen below. It is based on ideas from ClassesAndMethodsExample and InheritanceTutorial. As mentioned before, this is only my second day of hacking Lua, so I'm certain there are things that could be done much better. Suggestions, as always, are welcome.


local setmetatable = setmetatable
--------------------------------------------------------------------------------
-- Helper functions:
--------------------------------------------------------------------------------
function memoize(f)
local self = setmetatable({}, {__mode = "k"})
self.__index = function(self, k) local v = f(k); self[k] = v return v end
self.__call = function(self, k) return self[k] end
return self
end

--------------------------------------------------------------------------------
-- Class System
--------------------------------------------------------------------------------
-- Methods should be called on the klass object with as parameter the object itself!
--
Root = {
super = nil;
name = "Root";
new =
function(class, ...)
local obj = {class = class}
local meta = {
__index = class.methods,
__serialize = class.__serialize or Root.__serialize
}
setmetatable(obj, meta)
if (class.methods.init) then
class.methods.init(obj, ...)
end
return obj
end;
methods = {
classname = function(self)
return (self.class.name)
end;
isa = function(self, aClass)
local cur_class = self.class
while (nil ~= cur_class) do
if cur_class == aClass then
return true
else
cur_class = cur_class.super
end
end
return false
end
};
data = {};
__serialize = function(self, serializer, name)
serializer:write("setmetatable({class = " .. self:classname() .. "}, {\n")
serializer:write(" __index = " .. self:classname() .. ".methods,\n")
serializer:write(" __serialize = " .. self:classname() .. ".__serialize or Root.__serialize\n")
serializer:write("})\n")
for k,v in pairs(self) do -- save its fields
if k ~= "class" then
local kname
if serializer:isBasic(k) then
kname = serializer:basicSerialize(k)
elseif type(k) == "table" then
if saved[k] then
kname = saved[k]
else
kname = serializer:generateName()
serializer:serialize(kname, k)
end
end
local fname = string.format("%s[%s]", name, kname)
serializer:serialize(fname, v)
end
end
end
}

function Class(name, super)
super = super or Root
class = {
super = super,
name = name,
methods = {}
}

-- if class slot unavailable, check super class
-- if applied to argument, pass it to the class method new
setmetatable(class, {
__index = super,
__call = function(self,...) return self:new(...) end
})

setmetatable(class.methods, {
-- if instance method unavailable, check method slot in super class
__index = class.super.methods,
-- when defining a new method, set the environment for proper, static, class access and super-method access
__newindex = function (c, m, f)
if (class.super.methods[m]) then
setfenv(f, setmetatable({__class = class, __super = class.super.methods[m]}, {__index = getfenv(f) }))
else
setfenv(f, setmetatable({__class = class}, {__index = getfenv(f) }))
end
rawset(c,m,f)
end
})

return class
end

function Mixin(name, mixin, super)
-- return (name, nil, super)
end

function Singleton(name, super)
end

--------------------------------------------------------------------------------
-- Serializer
-- TODO: Make a way to easily define what attributes -not- to save
--------------------------------------------------------------------------------
Serializer = Class("Serializer")
function Serializer.methods:init(stream, temp, saved)
self.stream = stream or io.stdout
self.temp = temp or 0
self.saved = saved or {}
end

function Serializer.methods:basicSerialize(value)
if type(value) == "boolean" then
return tostring(value)
elseif type(value) == "number" then
return tostring(value)
elseif type(value) == "string" then
return string.format("%q", value)
end
end

function Serializer.methods:isBasic(value)
return type(value) == "boolean" or type(value) == "number" or type(value) == "string"
end

function Serializer.methods:write(...)
self.stream:write(...)
end

function Serializer.methods:serialize(name, value)
self:write(name, " = ")
if self:isBasic(value) then
self:write(self:basicSerialize(value), "\n")
elseif type(value) == "table" then
if self.saved[value] then -- value already saved?
self:write(self.saved[value], "\n") -- use its previous value
else
self.saved[value] = name -- save name for next time
serializer = (getmetatable(value) or self).__serialize or self.__serialize
serializer(value, self, name)
end
else
error("cannot save a " .. type(value))
end
end

function Serializer.methods:__serialize(serializer, name)
serializer:write("{}\n") -- create a new table
for k,v in pairs(self) do -- save its fields
local kname
if serializer:isBasic(k) then
kname = serializer:basicSerialize(k)
elseif type(k) == "table" then
if saved[k] then
kname = saved[k]
else
kname = serializer:generateName()
serializer:serialize(kname, k)
end
end
local fname = string.format("%s[%s]", name, kname)
serializer:serialize(fname, v)
end
end

function deserialize(o, stream)
stream = stream or io.stdin
end

--------------------------------------------------------------------------------
-- Test Cases
--------------------------------------------------------------------------------
Foo = Class("Foo")
function Foo.methods:init(...)
self.foo = 1
end

function Foo.methods:__serialize(serializer, name)
serializer:write("{foo = " .. self.foo .. "}\n")
end

Bar = Class("Bar", Foo)
function Bar.methods:init(...)
__super(self, ...)
self.bar = 2
end

function Bar.methods:boom()
if __super then __super(self) end
print "Bar:boom"
end

Bum = Class("Bum", Bar)
function Bum.methods:init(...)
if __super then __super(self, ...) end
self.bum = 3
end

function Bum.methods:boom()
if __super then __super(self) end
print "Bum:boom"
end

Continuations

Here I am again, delving into continuations for the fun of it. I am certain that by the time I have worked it out completely, I will have seen my error and my entire objection at the beginning of this entry will turn out to be flawed. But one never knows; besides, hopefully someone might get something useful out of this, besides me just rubberducking.

But let's start at the beginning. When one learns about call-with-current-continuation, one of the first things one wishes to do is actually save the continuation to use it later. A simple pattern for this is:

(define k (call/cc (lambda (k) k)))

Simple, no? Now it is very easy to go back there and to save a value into k: all you have to do is call it, (k 1). For those that do not understand what continuations do, let's look under the hood briefly (by which I mean, let's delve into the semantics; implementation details will be spared). A continuation is basically a place that is waiting to be filled by a value. For instance, if you have (+ 1 x), and you want the current continuation of x, you get (+ 1 []), an expression with one hole to plug a value into, but then turned into a function so you can call it.

Now going back to the above explanation, what happens under the hood?

(define k (call/cc (lambda (k) k)))
=> Call (lambda (k) k) with the hole in:
(define k []) or {\x -> (define k x)}
=> (lambda (k) k) returns this hole, and then call/cc
returns the value returned by its argument
=>
(define k {\x -> (define k x)})
; Yes I know, but this is the best syntax I can come up with

So if we call it later:
(k 1), we will go back to the return-continuation of
call/cc which is the define statement itself, and end up with
(define k 1)

Now, one idiom that many Schemers like to use instead of (call/cc (lambda (k) k)) is (call/cc call/cc), and indeed, in mzscheme it behaves just the same. At first I thought that it should not behave exactly the same, but as I was writing the explanation I figured out why it should. Therefore I will detail that here. Unlike last time, I will not try to derive the full CPS form of what goes on. So let's get started:


(define k (call/cc call/cc))
Let's call them cc1 and cc2 to differentiate:
(define k (cc1 cc2))
The hole []1 for cc1 is:
(define k []1) or {\x -> (define k x)}

How does it go:
(define k (cc1 cc2))
=> call cc2 with []1 or {\x -> (define k x)}
=> call {\x -> (define k x)} with []2 or
{\x -> (define k (cc1R x))}
Here I use cc1R as "returning" because by the time we create []2,
cc1 is already called.
This is the crucial aspect which I was missing first, when []2
is created, it's cc1 on the return path (aka already called and
being filled in with return value).
The behaviour of call/cc is that whatever value is returned to it,
it just returns itself.

So what has happened:
(define k []2) where []2 = {\x -> (define k (cc1R x))}
(k 1) => ([]2 1) => (define k (cc1R 1)) => (define k 1)

Delimited continuations

Playing around with delimited continuations in Scheme, I read on another blog that the implementation by Filinski is buggy and that mzscheme does it much better.

To try it out, simply perform this in mzscheme:

(require (lib "control.ss"))
(define f
(let ((x 0))
(reset
(begin
(set! x (+ x 1))
(cons x (shift k k))))))

Calling this several times does not lead to the expected result:

(f 1) => (1 . 1)
(f 1) => (1 . 1)
(f 1) => (1 . 1)

One would expect that calling it more than once would lead to an increment in the counter, as the let binding is only evaluated once and as such should be inside the closure that reset returns. Observationally, I was expecting the code to behave like this:

(define f
(let ((x 0))
(lambda (y)
(begin
(set! x (+ x 1))
(cons x y)))))

It seems that scheme48 displays the same behaviour:

,open escapes signals
,load /usr/lib/scheme48/misc/shift-reset.scm
...

Any comments regarding this would be more than welcome, as I am currently stumped on why this does not increment. I guess the only way to really explore this is to do the full CPS transformation as originally introduced by Olivier Danvy and Andrzej Filinski in "Abstracting Control" from 1990, although I will refer to the version from Chung-chieh Shan in "Shift to Control" from 2004, as I personally found it more legible (my background is EE, not CS). Because I cannot use overlines in html, I will represent the CPS-version of something by making it bold. (Note that this is taken from Shan's paper, his figure 1.)

1. x                 = (lambda (c) (c x))
2. (lambda (x) E) = (lambda (c) (c (lambda (x) E)))
3a. (E1 E2) = (lambda (c) (E1 (lambda (f) (E2 (lambda (x) ((f x) c))))))
3b. (E1 E2 E3) = (lambda (c) (E1 (lambda (f) (E2 (lambda (x) (E3 (lambda (y) ((f x y) c))))))))

4. (reset E) = (lambda (c) (c (E (lambda (v) v))))
5. (shift f E) = (lambda (c) (let ((f (lambda (x) (lambda (c2) (c2 (c x))))))
(E (lambda (v) v))))

Lastly, of course, one needs a way to sequence operations: begin.

6. (begin E1 E2) = (lambda (c) (E1 (lambda (x) (E2 c))))
;Derived through:
(begin E1 E2)
Equivalent code => ((lambda (_) E2) E1)
Apply Rule 3a => (lambda (c) ((lambda (_) E2)
(lambda (f) (E1 (lambda (x) ((f x) c))))))
Apply Rule 2 => (lambda (c) ((lambda (c2) (c2 (lambda (_) E2)))
(lambda (f) (E1 (lambda (x) ((f x) c))))))
Beta Reduction => (lambda (c) ((lambda (f) (E1 (lambda (x) ((f x) c))))
(lambda (_) E2)))
Beta Reduction => (lambda (c) (E1 (lambda (x) (((lambda (_) E2) x) c))))
Beta Reduction => (lambda (c) (E1 (lambda (x) (E2 c))))

Applying this to the function definition above:

(let ((x 0))
(reset
(begin
(set! x (+ x 1))
(cons x (shift k k)))))


Desugaring =>
((lambda (x)
(reset
(begin
(set! x (+ x 1))
(cons x (shift k k)))))
0)

Apply Rule 3a =>
(lambda (c) ((lambda (x)
(reset
(begin
(set! x (+ x 1))
(cons x (shift k k)))))

(lambda (f) (0 (lambda (x) ((f x) c))))))
Apply Rule 1 =>
(lambda (c) ((lambda (x)
(reset
(begin
(set! x (+ x 1))
(cons x (shift k k)))))

(lambda (f) ((lambda (c2) (c2 0)) (lambda (x) ((f x) c))))))
Beta Reduction =>
(lambda (c) ((lambda (x)
(reset
(begin
(set! x (+ x 1))
(cons x (shift k k)))))

(lambda (f) ((f 0) c))))
Apply Rule 2 =>
(lambda (c) ((lambda (c2) (c2 (lambda (x) (reset
(begin
(set! x (+ x 1))
(cons x (shift k k))))
)))
(lambda (f) ((f 0) c))))
Beta Reduction =>
(lambda (c) ((lambda (f) ((f 0) c))
(lambda (x) (reset
(begin
(set! x (+ x 1))
(cons x (shift k k))))
)))

Apply Rule 4 =>
(lambda (c) ((lambda (f) ((f 0) c))
(lambda (x) (lambda (c2) (c2 ((begin
(set! x (+ x 1))
(cons x (shift k k)))

(lambda (v) v)))))))
Beta Reduction =>
(lambda (c) (((lambda (x) (lambda (c2) (c2 ((begin
(set! x (+ x 1))
(cons x (shift k k)))

(lambda (v) v)))))
0) c))
Resugaring =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((begin
(set! x (+ x 1))
(cons x (shift k k)))

(lambda (v) v)))))
c))
Apply Rule 6 =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((lambda (c3)
((set! x (+ x 1))
(lambda (_) ((cons x (shift k k)) c3))))
(lambda (v) v)))))
c))
Beta Reduction =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_) ((cons x (shift k k))
(lambda (v) v)))))))
c))
Apply Rule 3b =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_) ((lambda (c4)
(cons
(lambda (f)
(x
(lambda (y)
((shift k k)
(lambda (z) ((f y z) c4))))))))
(lambda (v) v)))))))
c))
Beta Reduction =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_)
(cons
(lambda (f)
(x
(lambda (y)
((shift k k)
(lambda (z) ((f y z) (lambda (v) v)))))))))))))
c))
Apply Rule 1 =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_)
((lambda (c5) (c5 cps-cons))
(lambda (f)
(x
(lambda (y)
((shift k k)
(lambda (z) ((f y z) (lambda (v) v))))))))
)))))
c))
Beta Reduction =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_)
((lambda (f)
(x
(lambda (y)
((shift k k)
(lambda (z) ((f y z) (lambda (v) v)))))))
cps-cons))))))
c))
Beta Reduction =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_)
(x
(lambda (y)
((shift k k)
(lambda (z) ((cps-cons y z) (lambda (v) v)))))))))))
c))
Apply Rule 1 =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_)
((lambda (c) (c x))
(lambda (y)
((shift k k)
(lambda (z) ((cps-cons y z) (lambda (v) v)))))))))))
c))
Beta Reduction =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_)
((lambda (y)
((shift k k)
(lambda (z) ((cps-cons y z) (lambda (v) v)))))
x))))))
c))
Beta Reduction =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_)
((shift k k)
(lambda (z) ((cps-cons x z) (lambda (v) v)))))))))
c))
Apply Rule 5 =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_)
((lambda (c6) (let ((f (lambda (w) (lambda (c7) (c7 (c6 w))))))
(f (lambda (v) v))))
(lambda (z) ((cps-cons x z) (lambda (v) v)))))))))
c))
Apply Rule 1 =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_)
((lambda (c6) (let ((f (lambda (w) (lambda (c7) (c7 (c6 w))))))
((lambda (c8) (c8 f)) (lambda (v) v))))
(lambda (z) ((cps-cons x z) (lambda (v) v)))))))))
c))
Beta Reduction =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_)
((lambda (c6) (let ((f (lambda (w) (lambda (c7) (c7 (c6 w))))))
((lambda (v) v) f)))
(lambda (z) ((cps-cons x z) (lambda (v) v)))))))))
c))
Beta Reduction =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_)
((lambda (c6) (let ((f (lambda (w) (lambda (c7) (c7 (c6 w))))))
f))
(lambda (z) ((cps-cons x z) (lambda (v) v)))))))))
c))
De-CPS =>
(lambda (c) ((let ((x 0))
(lambda (c2) (c2 ((set! x (+ x 1))
(lambda (_)
((lambda (c6) (let ((f (lambda (w) (lambda (c7) (c7 (c6 w))))))
f))
(lambda (z) (cons x z)))))))))
c))
...unfinished due to...

pkhoung on irc.freenode.net made a good point after I was nearly done with the entire exercise. It seems, basically, that once you get to the shift part, the set! has already been executed. Putting this another way, the only context captured between shift and reset is the return path from shift to reset. Looking back at it, this makes sense, as the set! is not on the return path from shift to reset. The following code does work as expected.

(define f
(let ((x 0))
(reset
(let ((y (shift k k)))
(set! x (+ x 1))
x))))

Or even better, we can just pause a computation:

(define f
(reset
(begin
(shift k k)
1)))

Kernel Language

Recently on Lambda the Ultimate I read about an interesting language named Kernel. Basically it aspires to do what Scheme originally aspired to do, namely have an ultra-clean language design. Having read some more about Scheme as of late, especially the fact that macros create certain places that are not first-class, I found the idea of Kernel interesting. I fear, however, that if he does not change the lexical syntax somewhat, the language might die early. I found the naming convention he used rather illegible. True, naming convention is but a very, very small part of a language, but it does help a lot in getting a user community.

I have to read the pdf again as I only briefly skimmed it, but if I understood correctly those parts that I skimmed, the distinction between macro and function disappears completely, which I think is a desirable feature, at least in terms of purity.

If I have more time and read it again, I'll drop some more comments.