blog.poucet.org

Imported from Blogger

I have just imported all my old posts from http://notvincenz.blogspot.com.  Randomly sampling one entry, I noticed that some of the posts need to be re-edited a bit.  If you happen to notice any broken posts, please ping me so I can fix them.

Thanks!

Fibonacci in the Type System

I thought I would share an old piece of code that I’ve had lying around for a while.  Basically, it calculates Fibonacci numbers in the type system:

{-# LANGUAGE EmptyDataDecls, MultiParamTypeClasses,
    FunctionalDependencies, FlexibleInstances, UndecidableInstances #-}
module Fibonacci where

-- Type-level Peano naturals: Nat is zero, S is successor.
data Nat
data S a = S a

class Numeral a where
  value :: a -> Integer

prev :: S a -> a
prev = undefined

instance Numeral Nat where
  value _ = 0

instance (Numeral a) => Numeral (S a) where
  value x = 1 + (value . prev $ x)

-- Type-level addition: a + b = c, with c determined by a and b.
class Add a b c | a b -> c where
  add :: a -> b -> c

instance Add Nat b b where
  add = undefined

instance Add a b c => Add (S a) b (S c) where
  add = undefined

-- Type-level Fibonacci: fib a = b, with b determined by a.
class Fib a b | a -> b where
  fib :: a -> b

instance Fib Nat Nat where
  fib = undefined

instance Fib (S Nat) (S Nat) where
  fib = undefined

instance (Fib (S a) b, Fib a c, Add b c d) => Fib (S (S a)) d where
  fib = undefined

-- Computes fib 9 = 34, entirely via instance resolution.
main :: IO ()
main = print . value . fib . S . S . S . S . S
                           . S . S . S . S $ (undefined :: Nat)
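
Read back at the value level, the instances above mirror the ordinary Fibonacci over Peano naturals.  As a sketch for comparison (the Peano type and helper names below are my own, not part of the original snippet):

```haskell
-- Value-level analogue of the type-level program above.
data Peano = Z | Succ Peano

-- Mirrors: instance Add Nat b b / instance Add a b c => Add (S a) b (S c)
add :: Peano -> Peano -> Peano
add Z        b = b
add (Succ a) b = Succ (add a b)

-- Mirrors the three Fib instances.
fib :: Peano -> Peano
fib Z               = Z
fib (Succ Z)        = Succ Z
fib (Succ (Succ a)) = add (fib (Succ a)) (fib a)

-- Conversions, for convenience.
toInt :: Peano -> Integer
toInt Z        = 0
toInt (Succ n) = 1 + toInt n

fromInt :: Integer -> Peano
fromInt 0 = Z
fromInt n = Succ (fromInt (n - 1))
```

Here toInt (fib (fromInt 9)) gives 34, just like the type-level main above; the type checker performs exactly this computation, only with instance resolution instead of pattern matching.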

Setting up a mini-network

It has been a while since I last blogged, mostly due to the fact that I’ve been busy settling into my new place, travelling to IKEA, breaking IKEA mobility vans, buying furniture, etc.  However, I’ve started on a new little project, and I thought that I’d document it as I go along.

The project I’m working on is a little home network.  You could say that a home network is rather easy, and it is, but I’ve decided to make my life more complicated.  I want to learn more about routers, DNS, firewalls, traffic-shaping, etc., so I thought that this was the perfect project for that.  Additionally, I might soon get another computer with some colleagues at a colo, which will give me enough reliability (along with my linode box) to set up a set of nameservers for my own domain.

Currently I would like to:

Let’s start by defining the hardware I have:

  • Two ALIX engines, one with a 1GB CF (compact flash) card and one with a 4GB CF card.  I’m expecting another 4GB CF card to come soon, to play around with.  These both have 3 ethernet ports, which gives you a lot of possibilities to play with.  One also has a miniPCI WLAN card.  I bought antennas for it too; sadly the connectors I got to attach the two are wrong, so I’m waiting for new connectors.
  • One cablemodem; with my current contract I should be able to get 4 IPs from it.
  • One Netgear GS608, 8-Port Gigabit Switch.  I was going to get a 100 Megabit switch, but sadly they were out, so I decided to upgrade.
  • One Netgear ProSafe GS105 Gigabit Switch.
  • Several ethernet cables; it would be a rather pointless exercise without them, wouldn’t it? :)
  • A Kingston 19-in-1 Card Reader.  I saw it in the store, it was rather cheap, and I figured I might be able to use this to set up the CF cards.  Though I’m currently leaning towards PXE boot.
  • Finally, I still have an old linksys WRT54G wireless router.
  • Additionally, I have a macbook pro which I’m using to write this blog and to google random information, and a Dell Latitude D610 which I will use both as the primary installation device and as the laptop to connect to the serial port of the Alixen.

Now, originally, we had the WRT54G wireless router in our office, on the first floor.  Sadly, due to the incredible Swiss building style, we’re not able to get WIFI in the living room, which is exactly one floor up.  We’re investigating whether we can use the ISDN plugs that seem to be available in each room to route ethernet upstairs.  If that’s possible, then the Linksys router will go upstairs for WIFI in the living room, and I’ll use the ALIX engine with the WLAN card to supply our downstairs with WIFI.  Let’s hope it’ll work; otherwise we might have to get another WIFI bridge, which wouldn’t be very neat.

Currently, the plan is as follows:

  • Use the 5 port switch to get the most out of my public IP space.  I’ll probably attach my PS3 directly there, so I do not have to set up natting etc for online gaming.  Additionally, if I can route through the ISDN cables, I’ll use this to send a public IP straight to the linksys WRT54G WLAN router in the living room.
  • Use one of the Alix boxen as router, firewall and WLAN device.  For now the other ALIX will be for experimentation, though I’m tempted to set up VRRP.  The primary alix box will serve as router between the public IP space and the private IP space.  Since it has three ports, I might even go for a DMZ.
  • Use the 8 port switch as switch in the private IP space.

I’ve been contemplating quite a bit which OS to install on the Alix boxen. I’ve always worked with Ubuntu (well not quite always, way way long ago I had Mandrake, and prior to that Slackware). There are mini-distros available, but given the amount of space I have on the CFs and since I want this to be a learning exercise, I’ve decided to stay away from them.  This leaves me with the following choices:

  • Ubuntu.  I’m the most familiar with it, and generally it tends to be rather painless.  That being said, it usually installs with an X11 system, no matter which version you choose, which is of little use on a headless router.
  • FreeBSD or OpenBSD.  These are usually considered ideal for this, or so one would believe.  However, I have little experience with them.  It might make a good learning exercise, but on the other hand, I’m expecting to get a colo’d box soon with some colleagues that will run Debian, so it is best to capitalize on one system.
  • Debian, on the main ALIX box.  I’ve contemplated this a lot.  On my dell, I use ubuntu, and I quite like it.  However, one of the downsides of ubuntu is the frequency with which I get patches for it.  That is perfect for a device that one uses actively; for a server/router box, however, this is not ideal, especially on CF, where stability is key.  I’ve also considered FreeBSD or OpenBSD, just to learn more about them, but since the colo box I will share with my colleagues will most likely be debian, it makes sense to use debian as well; that way I can reuse the configs, scripts, etc.

For the PXE install, I followed the advice on this page.  Sadly, after a lot of struggling, I could get the device to start DHCP and actually begin copying files, but it would not boot.  I then moved on to trying to install plain old debian; however, this failed as well: it would boot properly, but then it would simply hang.  Perhaps the serial settings in minicom were what was screwing things up.

Therefore, I’ve moved on to try Voyage Linux, which seems to be a slimmed-down Debian specifically suited for this purpose, though it is more than just a slimmed-down version.  Installing it was a breeze; I just followed the instructions given in the README, and everything worked perfectly.

The first things I did were to install vim and screen, and to set up a proper .screenrc and .vimrc.

Looking further into how things are mounted, I noticed the following:

  • /var/lock and /var/run are mounted purely in memory with tmpfs.  This is done by the /etc/init.d/mountkernfs.sh script
  • /var/log and /var/tmp are mounted as aufs, which is a union file system that will write to memory while still showing what’s on disk.  They are set up by the /etc/init.d/voyage-sync script.  Additionally, this script also rsyncs files back to disk when shutting down.

This shows two different ways of not using the CF when not required, one which is purely transient and one which is non-transient but saves on CF writes.  Currently, I think the following setups might be good:

  • Use the overlay file system, aufs, for .ssh.  Do not sync automatically on this, but instead selectively do so.
  • Use tmpfs for /var/lib/apt.  I do not need my CF to be spammed with a bunch of stuff just because I occasionally want to install a package.
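
As a rough sketch, those two setups could look something like this in /etc/fstab.  (The aufs branch paths here are hypothetical and not taken from Voyage’s actual scripts; the real voyage-sync setup differs in the details.)

```
# purely transient: apt metadata lives in RAM only
tmpfs  /var/lib/apt  tmpfs  defaults  0  0
# overlay: writes go to RAM, the on-disk copy shines through read-only,
# and syncing back to CF is done selectively by hand (e.g. with rsync)
none   /root/.ssh    aufs   br=/dev/shm/ssh=rw:/root/.ssh=ro  0  0
```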

This article has remained as draft for far too long, so I will publish it and create a second article to continue.

A day in the life of an SRE

Yesterday was a crappy day.  I was on call, and for those that might not have heard, here is the link: http://googleblog.blogspot.com/2009/02/current-gmail-outage.html.  For those that did not know, I work as an SRE (Site Reliability Engineer) for GMail at Google.

On a better note, I have recently installed xmonad, and I’m really happy with it as it has increased my productivity.  Currently, I’m still using a rather standard configuration, except for some custom key-bindings.  I will probably post when I get a more customized configuration.

A SIGINT signal handler for lua

It has been a while since I last blogged, mostly due to the fact that I’ve been busy settling into my new place, travelling to IKEA, breaking IKEA mobility vans, buying furniture, etc.  I’ve started a little project on home networking, and I have a blog post in progress documenting the details.  But I thought I’d blog about something else in the meantime.

I’m currently on vacation in Barbados for Christmas and New Year, visiting my parents.  It would have been a great vacation, had it not been for several mishaps.  The first day I stepped on a sea urchin, which led to me having several bouts of fever over the next 2 weeks.  Additionally, I nearly landed on some coral cliffs after having the leash of my bodyboard snap and finding myself fighting pointlessly against a current that was too strong to fight even where I could stand.

Anyways, the reason I wanted to blog was because I’ve been contemplating focussing more effort again on writing personal software projects.  I have a bunch becoming stale on my laptop, and would like to share them better.  In the past, on my previous blog, I would show snippets inside of blog posts, and I’ll probably do that again today to share something I wrote a while back.  However, this is not a very robust solution, and it makes it a pain to share bigger projects or to have others use them.  I have a VPS host; however, the homepage I have on it is truly shameful.  Which brings up the question in this blog post: does anyone have any suggestions for a nice layout for a homepage?  It will probably be more organizational and less prose-based, as I can use my blog for prose.  Just a place to jot some details about myself, share various pieces of code (either in a darcs or a git repository), and put up some snippets with nice syntax highlighting.  Is the standard 3-column page still the hot meme, or is there something better?  And any decent suggestions on what color-scheme and typography to use?

So, coming back to the source code snippet.  A while back, while working on some lua code, I wanted the ability to register lua code as the CTRL+C handler.  The following piece of code does this by registering a C function as a signal handler for SIGINT; that handler sets a lua debug hook which fires at the next lua VM event and runs the registered lua function.  Because it may take a while before the debug hook actually gets evaluated, the C handler first sets SIGINT back to SIG_DFL, so that two ctrl+c’s will kill the program as expected.

The use of it is fairly trivial, after loading the module with require 'signal', you simply call signal.set(function() ... some code ... end). If you want to reset the original signal handler that was set prior to your first signal.set you simply call it with no parameters.

/*------------------------------------------------------------------------------
 * signal.c - Allows for the registration of a lua signal handler for SIGINT.
 *
 * Authors: Christophe Poucet
 * License: BSD
 * Created: September 23, 2007
 *
 * Copyright 2007-2009 © Christophe Poucet. All Rights Reserved.
 *----------------------------------------------------------------------------*/

#include <signal.h>
#include "lua.h"

#include "lauxlib.h"
#include "lualib.h"

static lua_State *globalL = NULL;

static void (*old_handler)(int) = NULL;

static void laction(int i);

static void lstop (lua_State *L, lua_Debug *ar) {
  (void)ar;  /* unused arg. */
  lua_sethook(L, NULL, 0, 0);
  lua_getfield(L, LUA_REGISTRYINDEX, "handler");
  if (lua_pcall(L, 0, 0, 0) != 0)
    lua_pop(L, 1);  /* pop the error message, if any */
  signal(SIGINT, laction);
}

static void laction (int i) {
  /* if another SIGINT occurs before lstop, terminate process(default action) */
  signal(i, SIG_DFL);
  lua_sethook(globalL, lstop, LUA_MASKCALL | LUA_MASKRET | LUA_MASKCOUNT, 1);
}

static int l_setsignal(lua_State *L) {
  globalL = L;
  luaL_checkany(L, 1);
  if (lua_isnil(L, 1)) {
    lua_pushnil(L);
    lua_setfield(L, LUA_REGISTRYINDEX, "handler");
    if (old_handler != NULL) {
      signal(SIGINT, old_handler);
      old_handler = NULL;
    }
  } else if (lua_isfunction(L, 1)) {
    lua_pushvalue(L, 1);
    lua_setfield(L, LUA_REGISTRYINDEX, "handler");
    if (old_handler != NULL) {
      signal(SIGINT, laction);
    } else {
      old_handler = signal(SIGINT, laction);
    }
  } else {
    luaL_error(L, "The argument should be either nil or a function");
  }
  return 0;
}

static const struct luaL_Reg s_signal [] = {
  {"set", l_setsignal},
  {NULL, NULL} /* sentinel */
};

int luaopen_signal(lua_State *L) {
  luaL_register(L, "signal", s_signal);
  return 1;
}

Chrome to be released soon

As posted on blogoscoped, Google Chrome will be released soon, and I have to admit that I’m very excited about it.  In the spirit of Google, Chrome takes what works and improves upon it in an incremental and robust fashion.  The cartoon does a good job of showing what features it will include.

My favourite features are the independent tabs and better multithreading.  While I think firefox is great, it’s starting to hang often for me as I have a bunch of tabs open, some with rather heavy web-apps.  One app hanging causes my entire browser to hang and breaks my work-flow.  Additionally, at times firefox crashes and then you’ve just lost everything you were busy with.

ICFP Contest 2008

For those that are unaware, in exactly one month the 3 day ICFP contest will be held again. This contest has been held for 10 years already and has seen a steady increase in participants, especially in the last two years. Curiously, this ties in with my previous post as it seems that this contest has reached critical mass in 2006 to become really big in 2007.

For those wondering what the contest is about, it is hard to specify a specific challenge since the contest is so different each year.  The idea is that a problem gets put online and that you have 72 hours to solve it.  Each year, a different university designs the problem.  Usually this involves one or more professors, a bunch of PhD students and a plethora of master students slaving away for 6 months trying to make sure the problem is not too trivial, nor too hard, and is, of course, interesting.

This contest has several plus points when compared to more traditional programming contests of shorter duration.  Unlike other contests, this one is not simply about pure programming or trivial problem solving, but additionally requires algorithmic insight, creativity and good engineering principles.  Nonetheless, clever programming is still required to solve the subproblems of the final solution.  Another aspect that I personally prefer over other programming contests is that typically the solution is not just an aha-erlebnis, but usually involves tweaking of the solution to optimize the final outcome.  In essence, the final scoring function for the participants is a smooth gradient instead of a sudden step function.

Some typical example problems from the past include:

  • Writing a solid-geometry ray-tracer that is scripted by a postscript-like language. (2000)
  • Finding optimal driving instructions for a little race car driving over a race-track with a semi-realistic physics model. (2003)
  • Finding smart collaborative algorithms for a nest of ants that fight against another nest of ants in an artificial environment, and then pouring that into a very convoluted “ant-assembly” language. (2004)
  • Determining good tactics for cops and robbers playing on a graph trying to outsmart each other. (2005)

The last two years, the focus has switched a bit:

In 2006, CMU came out with a whole ‘umix’ system that they had compiled for a custom VM that the participants first had to implement.  Once inside this ‘umix’ system, there were a bunch of really nifty puzzles to solve.  From a technological standpoint, the implementation of that umix system was very impressive.  I think this might also be one of the reasons why 2007 saw such a boom in participants.

In 2007, given the success of the 2006 formula, a system was designed where DNA would map through RNA to a set of drawing instructions.  If one then executed these drawing instructions, pictures would emerge that contained different types of puzzles.  To be honest, I think the 2007 contest was largely an attempt to outdo the 2006 one, and by doing so, they undermined the fun of the competition a bit.  If one looks at the gradient of the scores of this contest, it shows a very flat and low beginning, and then a sudden ramp-up in the top 17 positions.  This means that 852 teams all had nearly the same score.

I hope that this year they will have a more traditional style of contest.  While puzzles might be fun, they are too much of a yes/no answer and thus allow people little room to tweak and optimize their algorithms to get better scores, which I have found is always the funnest part of the contest: tweaking with the solution in the final day or two days after having built the necessary code and tools to get started.

I also wonder how many people will participate this year.  Last year there was a big jump in the number of registered competitors, where registration is completely optional and can be delayed until the final day.  Due to the way the competition turned out, there was also a huge letdown compared to 2006, so I wonder whether the people that joined due to the 2006 hype will look past the singular fluke to the fun this competition has brought for 10 years now.

This will be the fifth year I participate and the third year as my team Lazy Bottoms.  The members of the team this year are nearly all different from last year’s, but I’m definitely confident in their abilities and look forward to collaborating with them on this challenge.

There are several tactics that I have found to be useful for this contest:

  • Come prepared.

This one should be obvious.  Make sure you have decided what tools, compilers and libraries you will use.  Make sure that everyone is using compatible libraries.  Determine what you will use for source code management; personally I use darcs on an external machine with a single account and several allowed public keys (one for each team member).  Also make sure that everyone is able to connect to the box in question and that all permissions are set up correctly.

  • Communicate effectively.

One of those obvious mantras that is mentioned in every management and GTD book.  If your team is distributed across the globe like mine is, make sure to have reliable and efficient means of communication.  Typically, we use IRC for communicating, since that is where we actually met and since it is a reasonably efficient means of communicating in a group.  While one might opt for fancier means, such as video-conferencing, that is not always an option if not every member has those facilities.  Additionally, IRC is easier to ignore when working on a problem, and easier to read up on when we wish to find out what has been discussed.

  • Be organized.

While IRC works well for the different communication threads that occur while working on a problem, it does not work well from an organizational perspective.  Due to the short time that the competition offers and the sometimes serious software engineering that is required to get the necessary tools in place, it is important to have everyone on the same page.  For this I prefer using some collaborative editor, for instance gobby; this year we’ll go for google-docs.  That way the meeting becomes each person writing their part of the meeting notes and then going over it together to hash out any inconsistencies.  This could potentially also work with one note-taker, but given the limited size of ICFP teams (4-6 people, usually), it does not harm to have multiple people writing in the same document.  The advantage of this model is that you do not have to reiterate 10 times over a discussion, and you’re sure nothing will get missed by the one person having to write up the meeting notes after the meeting, since they are written and discussed as part of the meeting.

  • Know when to cut your losses.

On a tight schedule it’s important to know when to cut losses.  In 2007, part of our team spent a whole day on one particular problem, namely the DNA compiler.  In the end, they could not get it to work and were completely stuck on it.  Another team member (in this case me) then stepped in and took a completely fresh stab at the problem, using a working implementation in python we had, and solved it in several hours.  I do not consider myself particularly smarter than the other team members; on the contrary, we had some very smart people on the team.  The major problem here was that a set of people was so stuck on getting one implementation working when instead we should’ve started fresh and let someone else work on it.  This eventually happened, as mentioned, but should’ve occurred a lot earlier.  A pair of fresh eyes or a new attempt at a problem from scratch can work miracles at times.

If one looks over the previously mentioned techniques, it quickly becomes obvious that they are also useful in a general software engineering environment.  After all, the ICFP contest is like a small software project with a very tight time-schedule.

Anyways, I’m very excited about the upcoming contest and hope the style will be more like the earlier contests instead of the ones of 2006 and 2007.

Cognitive Surplus

I found an interesting post a while back on the concept of cognitive surplus. The core of this post was basically a reference to a video which I will place here:

[blip.tv ?posts_id=862384&dest=-1]

It does make you wonder about the future. Indeed, there are many small projects that carve out cognitive surplus, and the question then becomes how do you engage people to give some of their cognitive surplus to your project.

What I also wonder is how much this is a chicken-and-egg situation.  After all, if you look at social sites, which carve out a part of the cognitive surplus (albeit somewhat mindlessly), they mostly operate on the possibility of meeting other people and thus require enough people to take off.  A critical mass of sorts is required to launch such systems.

But even when we look away from those types of sites to sites that provide more useful information, they nonetheless require a critical mass to ensure that the amount of information found is large enough to attract people to contribute themselves.

The question then becomes how to attract the early adopters so as to create the required critical mass to make it of interest to the public, as well as how big the critical mass is for a given system.

I presume that early adopters are probably reached more easily if the interface of the web-application is intuitive, innovative and interesting.  And obviously, the problem you’re trying to address should be of relevance to the user.

Lazy memoization

It seems the concept of an “infinite array” in my previous blog post prompted some interest. Two very good suggestions came from Chung-chieh Shan. As usual, his code was minimal but powerful.

  • Use an infinite list of exponentially growing arrays.
  • Use the following data-structure: data Tree a = Tree a (Tree a) (Tree a)

I have to admit that I am not fully clear yet on how to encode it in the above tree.  Conceptually it should be trivial to encode, for instance, [0..] into such a binary structure; after all, it’s just bits.  However, momentarily it escapes me.
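
For what it’s worth, here is one standard way to spell out the “it’s just bits” idea, sketched after the fact (this exact code was not part of the discussion): position n in the tree is reached by peeling off one bit of the index per level, and memoization is then just fmap over the tree of all naturals.

```haskell
data Tree a = Tree a (Tree a) (Tree a)

instance Functor Tree where
  fmap f (Tree x l r) = Tree (f x) (fmap f l) (fmap f r)

-- Each step consumes one "bit" of the index: odd offsets go left,
-- even offsets go right.
index :: Tree a -> Integer -> a
index (Tree x _ _) 0 = x
index (Tree _ l r) n = case (n - 1) `divMod` 2 of
  (q, 0) -> index l q
  (q, _) -> index r q

-- The lazy tree of all naturals, laid out so that index nats n == n.
nats :: Tree Integer
nats = go 0 1
  where go n s = Tree n (go (n + s) (2 * s)) (go (n + 2 * s) (2 * s))

-- Memoize a function over the naturals by mapping it over nats.
memo :: (Integer -> a) -> (Integer -> a)
memo f = index (fmap f nats)

-- Example: Fibonacci with the recursive calls going through the
-- memoized version, so each value is computed only once.
fib :: Integer -> Integer
fib = memo f
  where f 0 = 0
        f 1 = 1
        f n = fib (n - 1) + fib (n - 2)
```

Since fib is a constant applicative form, the partially evaluated tree is shared between calls, which is exactly the lazy memoization the post is after.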

Another solution came up through a discussion on #haskell with dcoutts and sjanssen(Spencer Janssen). At first the idea was to deal with lazy arrays, but quickly the discussion turned to the concept of a generic caching policy, brought up by dcoutts. One can imagine a combinator library of caching policies. And then one would memoize a function as follows:

memo :: CachingPolicy -> (i -> e) -> (i :->: e)

With this structure, one would generate a memoized function that uses the caching policy to evict old entries. Obviously this would require the use of unsafePerformIO under the hood to mutate the data-structure. It would definitely be an interesting paper, according to dcoutts, all that is required is an application as a strong showcase.

Potentially, a thread could even be used to do time-based eviction policies. However, here either a lock would have to be used, or the current STM system would have to be modified to allow for STM transactions to run in unsafePerformIO, something that is currently unsupported.

With the new concept of Type Families, it would even be possible to choose an appropriate data-structure based on the key-type, to guarantee optimal performance.

Blog move

As much as I like blogger, and the fact that I can use my google account, I have finally decided that I will be moving to a new wordpress blog.

I will see how this experiment goes. This blog and the old entries will remain here, and if at some future point I find wordpress unsatisfactory, I can always come back.

One thing that I will miss is the fact that in blogger I could insert javascript to use analytics.google.com.  However, I disliked certain minor things about blogger, for instance the incredible amount of whitespace it would add between list items as well as before or after code-snippets.  I also disliked the supported designs, though granted, that could be tweaked.

WordPress feels very snappy and fresh, and it seems to come with built-in latex support. We will see where this new blog goes and whether I decide to stay there or return here. However, for now I feel like trying it out.