blog.poucet.org

Changes

I haven’t written to this blog in a very long time. A lot has changed in the meantime.

I think the last time I wrote, I was still a Gmail Site Reliability Engineer. Since then I’ve worked on abuse fighting for Google+ and Blogger, and I am now working on Google Maps.

My interests have also shifted. While I still love Haskell, lately I’ve really been enjoying writing JavaScript in my private time (and sometimes professionally too, though I’m still a backend engineer). So I hope to revamp my site soon.

Haskell Platform on Mac OS X

The other day I was trying to install ghc-core, which requires pcre.

I had previously installed MacPorts, as it lets you install a variety of Unix applications and tools. Unfortunately, I just could not get the build to work. After installing pcre with port and some fiddling with paths to get cabal to find the MacPorts-installed libraries, I came up with the following incantation:

LIBRARY_PATH=/usr/lib:/opt/local/lib sudo cabal install --extra-lib-dirs=/opt/local/lib --extra-include-dirs=/opt/local/include ghc-core

However, this unfortunately did not work:
Linking dist/build/ghc-core/ghc-core ...
Undefined symbols:
"_iconv_open", referenced from:
_hs_iconv_open in libHSbase-4.2.0.2.a(iconv.o)
"_iconv_close", referenced from:
_hs_iconv_close in libHSbase-4.2.0.2.a(iconv.o)
"_iconv", referenced from:
_hs_iconv in libHSbase-4.2.0.2.a(iconv.o)
ld: symbol(s) not found
collect2: ld returned 1 exit status
cabal: Error: some packages failed to install:
ghc-core-0.5.4 failed during the building phase.

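After some Googling, I was finally able to find the issue: the libiconv that MacPorts installs into /opt/local/lib exports its symbols under different names (libiconv_open rather than iconv_open), so it clashes with the system libiconv that GHC’s base library expects.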

So I decided to completely uninstall macports.

As my friend put it on IRC:

kormat> unix on osx is a ghetto

Fortunately, there is an alternative to MacPorts: Homebrew.

It was a breeze to install:
ruby -e "$(curl -fsSL https://gist.github.com/raw/323731/install_homebrew.rb)"
sudo brew install pcre

After that, cabal installing ghc-core worked as expected =)

Migration

I migrated my Linode box from an old version of Ubuntu to a new one. Unfortunately it was a bit of a pain, as a plain migration did not work. So I had to install a new system, set up my blog again, and copy over the MySQL database. This is also the reason why this site was down for a day.

I have also imported my old posts from blogspot.com, as I had long planned to do. For now you will find them close to illegible, since a lot of the source-code formatting was lost in the import. I’ll try to fix those, time permitting.

Setting up iptables to throttle incoming ssh

Since I seem to be getting a lot of ssh attempts against my firewall at home, I decided today to set up some iptables rules.

It took me quite a while to figure it out, since I needed to set up some modprobe options.

First, I raised the number of recent hits that can be counted per IP address to 250 (the module’s default is 20, and I believe the maximum is 255).

cat /etc/modprobe.d/options
options ipt_recent ip_pkt_list_tot=250

Then I created a firewall script:

cat firewall.sh

#!/bin/sh
ipt=/sbin/iptables

set -x

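# Require the public device name; quote "$1" so the test is well-formed when it is empty.
if [ -z "$1" ] ; then
  echo "usage: $0 <public device>" >&2
  exit 1
fi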

# Clear rules
$ipt -D INPUT -i $1 -p TCP --dport ssh -m state --state NEW -j "$1"-SSH 2>/dev/null

# Set up an ssh and blacklist chain.
$ipt -F "$1"-SSH 2>/dev/null
$ipt -F "$1"-BLACKLIST 2>/dev/null
$ipt -X "$1"-SSH 2>/dev/null
$ipt -X "$1"-BLACKLIST 2>/dev/null

$ipt -N "$1"-SSH
$ipt -N "$1"-BLACKLIST

# Make sure that we update the recency of the packet, and then drop them.  The timing is controlled by the ssh chain.
$ipt -A "$1"-BLACKLIST -m recent --name BLACKLIST --set
$ipt -A "$1"-BLACKLIST -j DROP

# In the ssh chain, incoming connections from BLACKLIST hosts are dropped.  The timer is restarted every time we get a packet within 600 s.
$ipt -A "$1"-SSH -m recent --update --name BLACKLIST --seconds 600 --hitcount 1 -j DROP

# Create several counting buckets.
$ipt -A "$1"-SSH -m recent --set --name "$1"-BUCKET1
$ipt -A "$1"-SSH -m recent --set --name "$1"-BUCKET2
$ipt -A "$1"-SSH -m recent --set --name "$1"-BUCKET3
$ipt -A "$1"-SSH -m recent --set --name "$1"-BUCKET4

# Blacklist if:
#   More than 2 connections in 10 seconds
#   More than 14 connections in 120 seconds
#   More than 79 connections in 600 seconds
#   More than 250 connections in 1800 seconds
$ipt -A "$1"-SSH -m recent --update --name "$1"-BUCKET1 --seconds   10 --hitcount   3 -j "$1"-BLACKLIST
$ipt -A "$1"-SSH -m recent --update --name "$1"-BUCKET2 --seconds  120 --hitcount  15 -j "$1"-BLACKLIST
$ipt -A "$1"-SSH -m recent --update --name "$1"-BUCKET3 --seconds  600 --hitcount  80 -j "$1"-BLACKLIST
$ipt -A "$1"-SSH -m recent --update --name "$1"-BUCKET4 --seconds 1800 --hitcount 250 -j "$1"-BLACKLIST

# All other ssh access is allowed.
$ipt -A "$1"-SSH -j ACCEPT

# Allow packets that belong to existing connections.
$ipt -D INPUT -i $1 -m state --state RELATED,ESTABLISHED -j ACCEPT 2>/dev/null
$ipt -A INPUT -i $1 -m state --state RELATED,ESTABLISHED -j ACCEPT

# Allow all packets from loopback device.
$ipt -D INPUT -i lo -j ACCEPT 2>/dev/null
$ipt -A INPUT -i lo -j ACCEPT

# Redirect all incoming ssh connections to the chain of the same name.
$ipt -A INPUT -i $1 -p TCP --dport ssh -m state --state NEW -j "$1"-SSH

# What remains has no right to continue.
$ipt -D INPUT -i $1 -j DROP 2>/dev/null
$ipt -A INPUT -i $1 -j DROP
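
To get a feel for when the thresholds in the script trip, here is a small Haskell sketch that replays a list of connection times against the four buckets. It is only a simplified model and not how the recent module is implemented: it assumes a single source address, counts the current attempt as a hit, and treats a hit as inside the window when it is at most that many seconds old. The names buckets, trips and firstBlacklisted are made up for this illustration.

```haskell
module Main where

-- (window in seconds, number of hits that triggers the blacklist),
-- copied from the four BUCKET rules in the script.
buckets :: [(Int, Int)]
buckets = [(10, 3), (120, 15), (600, 80), (1800, 250)]

-- Does the attempt at time `now` trip this bucket, given every attempt
-- seen so far (including the one at `now`)?  Simplified model.
trips :: [Int] -> Int -> (Int, Int) -> Bool
trips seen now (window, hits) =
  length [t | t <- seen, now - t <= window] >= hits

-- First time (in seconds) at which a source with the given ascending
-- connection times would be sent to the blacklist chain, if ever.
firstBlacklisted :: [Int] -> Maybe Int
firstBlacklisted = go []
  where
    go _ [] = Nothing
    go seen (now : rest)
      | any (trips seen' now) buckets = Just now
      | otherwise                     = go seen' rest
      where
        seen' = now : seen

main :: IO ()
main = do
  print (firstBlacklisted [0, 2, 4])       -- Just 4: three attempts within 10 s
  print (firstBlacklisted [0, 10 .. 3600]) -- Nothing: one attempt every 10 s stays under every limit
  print (firstBlacklisted [0, 8 .. 400])   -- Just 112: the 15th attempt within 120 s
```

In this model a client that connects once every 10 seconds is never blacklisted, while one that connects every 8 seconds is caught by the second bucket after about two minutes.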

Finally, I hooked it into my /etc/network/interfaces so that it is called when my main (public) interface comes up. This assumes firewall.sh can be found on the PATH that ifup uses; otherwise give its full path:
auto eth0
iface eth0 inet dhcp
up firewall.sh eth0

I hope this helps someone.

To my family and friends

I think this is probably one of the best XKCD comics. I wish I had had this in the past when friends or family would ask me questions about computers (usually about Windows, even though I am an avid Linux user).

This blog is obsolete

It seems I am still getting traffic on this blog.

I have moved my blog to http://blog.poucet.org

A fresh start

Those who know my prior two blogs will know this is not the first time that I have moved my blog. While the move from the first blog to the second (Blogger to WordPress) was due to technical reasons, this time I am moving because I want to finally consolidate all my content in one place.

I have had a virtual host for a while now, and never truly utilized it. I’ve been using Linode and I have to say that I am very happy with the service and the quality of the user interface. Having recently bought my own domain (yes, the one at the top in your address bar), I decided to finally run my setup completely myself.

So what am I running, and what do I want to write about?  Well for that I have to take a step back.

My last blog focused a lot on Haskell, for that is a language I am very passionate about.  Unfortunately, I have had less and less time to actually devote to it.  In the meantime, my eyes have opened to whole new worlds.

While I have used Linux for about 10 years now, it was always a means to an end. Finally stripped of my academic bubble, I have come to appreciate the more technical aspects of it. That is why I have started devouring a variety of books on the topic.

Additionally, since I was never formally trained as a software engineer, having a background in electrical engineering, I decided to purchase a few books on becoming a better programmer. It is one thing to be confident that you can implement an efficient kernel algorithm in your language of choice. It is another entirely to be able to tackle big projects and properly design software in layers. I am slowly starting to realize that.

While reading The Passionate Programmer by Chad Fowler, I decided that I wanted to blog more on these types of topics. The combination of technical and psychological/sociological elements is one that has always fascinated me.

So that is what I hope to achieve on this blog: a place to reflect on books such as the one above, a combination of software development and career development, speckled with little technical tidbits that I learn along the way or that interest me.

It is somewhat ironic that I am writing this now, when I am back in Leuven, the city where I used to study, to dot the last i’s of my Ph.D. thesis. Writing is not my forte, though I admit that is mostly down to laziness. And as with everything, I hope to improve upon it. Perhaps it is that drive to constantly improve myself that makes me enjoy reading books like Chad Fowler’s.

I have to say that starting this blog has definitely been rife with the Paradox of Choice, with questions such as what theme to use and whether to import the content of my old blog here.

Simple questions are often the ones that take one the longest to answer, down to what name to give to a variable.

Perhaps I should open this up for discussion instead. You, nonexistent reader of this new blog, feel free to reply in the comments:

Should I import my old content, even though it was mostly directed towards one very specific technical niche, namely Haskell?

And for those wondering about the original question, the answer is WordPress on lighttpd. I have already created a DNS entry for a wiki as well, where I can jot down interesting links I might want to discuss or unfinished content. However, I am still deciding on what wiki software to use (again, choices… :) though I am heavily leaning towards MoinMoin).

Finally, if anyone has an interesting suggestion for what to put on the main webpage (http://www.poucet.org), comments are always welcome.  Discussion is the only way to learn new things, for otherwise we grow stale.

Note to self: Don’t be afraid of writing your own opinions, instead of sticking to purely technical things.

Flattening Data.Map

While at the Haskell Hackathon, I decided to work on adaptive containers, an initiative started by Don Stewart to make Data.Map and Data.IntMap more memory-compact. I tackled this project together with Nicolas Pouillard (ertai): he worked on IntMap while I worked on Map. You can find the code at: http://patch-tag.com/repo/adaptive-containers.

The main motivation for these adaptive containers is that, in a polymorphic container, it is impossible to flatten (or unpack) small data-types directly into the memory used by the container’s data-constructor. You end up with extra pointers for each value inside a container (specifically two: one from the container to the data-type, and then one from the data-type to the unboxed value).

Update: A video was posted where Don Stewart presents the basic concepts and where I briefly present the associative adaptive containers mentioned in this article.

It is very easy to create flattened maps now. All you have to do is instantiate the following typeclass:

class AdaptMap k a where
  data Map k a

  tip :: Map k a
  unsafeBin :: Size -> k -> a -> Map k a -> Map k a -> Map k a
  mapMap :: c -> (Size -> k -> a -> Map k a -> Map k a -> c) -> Map k a -> c
  mapMap2 :: (AdaptMap k b)
          => c
          -> (Size -> k -> a -> Map k a -> Map k a -> c) -- Right is Tip
          -> (Size -> k -> b -> Map k b -> Map k b -> c) -- Left is Tip
          -> (Size -> k -> a -> Map k a -> Map k a ->
              Size -> k -> b -> Map k b -> Map k b -> c)
          -> Map k a
          -> Map k b
          -> c
  mapMap2 t b1 b2 c t1 t2 = mapMap tip1 bin1 t1
    where tip1                = mapMap t b2 t2
          bin1 s1 k1 i1 l1 r1 = mapMap tip2 bin2 t2
            where tip2                = b1 s1 k1 i1 l1 r1
                  bin2 s2 k2 i2 l2 r2 = c s1 k1 i1 l1 r1 s2 k2 i2 l2 r2

Notice that the two-tree deconstruction mapMap2 has a default definition written entirely in terms of the single-tree deconstruction mapMap, so you only need to worry about tip, unsafeBin and mapMap. Also note that this representation assumes your implementation will be similar to the original one: a size-annotated binary tree made of tips and bins.

An example would be:

instance AdaptMap Int Int where
  data Map Int Int  = TipIntInt
                    | BinIntInt
                      {-# UNPACK #-} !Size
                      {-# UNPACK #-} !Int
                      {-# UNPACK #-} !Int
                      !(Map Int Int)
                      !(Map Int Int)

  tip                              = TipIntInt
  unsafeBin                        = BinIntInt
  mapMap t b TipIntInt             = t
  mapMap t b (BinIntInt s k i l r) = b s k i l r
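
As a rough illustration of how generic code can sit on top of such an instance, the sketch below repeats a cut-down copy of the class (without mapMap2) and the Int instance so that it compiles on its own, and then writes three helpers using only tip, unsafeBin and mapMap. The helper names size, lookup and insert are my own for this example and are not taken from the adaptive-containers code; in particular, this insert does no rebalancing, which a real map would need.

```haskell
{-# LANGUAGE TypeFamilies, MultiParamTypeClasses, FlexibleInstances #-}
module Main where

import Prelude hiding (lookup)

type Size = Int

-- Cut-down copy of the class from this post (mapMap2 left out).
class AdaptMap k a where
  data Map k a
  tip       :: Map k a
  unsafeBin :: Size -> k -> a -> Map k a -> Map k a -> Map k a
  mapMap    :: c -> (Size -> k -> a -> Map k a -> Map k a -> c) -> Map k a -> c

instance AdaptMap Int Int where
  data Map Int Int = TipIntInt
                   | BinIntInt {-# UNPACK #-} !Size
                               {-# UNPACK #-} !Int
                               {-# UNPACK #-} !Int
                               !(Map Int Int)
                               !(Map Int Int)
  tip                              = TipIntInt
  unsafeBin                        = BinIntInt
  mapMap t _ TipIntInt             = t
  mapMap _ b (BinIntInt s k i l r) = b s k i l r

-- Hypothetical helpers, written once for every instance.
size :: AdaptMap k a => Map k a -> Size
size = mapMap 0 (\s _ _ _ _ -> s)

lookup :: (Ord k, AdaptMap k a) => k -> Map k a -> Maybe a
lookup key = go
  where
    go = mapMap Nothing step
    step _ k v l r = case compare key k of
      LT -> go l
      GT -> go r
      EQ -> Just v

-- Unbalanced insert, for illustration only.
insert :: (Ord k, AdaptMap k a) => k -> a -> Map k a -> Map k a
insert key val = go
  where
    go = mapMap (unsafeBin 1 key val tip tip) step
    step s k v l r = case compare key k of
      LT -> let l' = go l in unsafeBin (size l' + size r + 1) k v l' r
      GT -> let r' = go r in unsafeBin (size l + size r' + 1) k v l r'
      EQ -> unsafeBin s k val l r

main :: IO ()
main = do
  let m = foldr (\k -> insert k (k * 10)) tip [3, 1, 2 :: Int] :: Map Int Int
  print (size m)     -- 3
  print (lookup 2 m) -- Just 20
  print (lookup 7 m) -- Nothing
```

Because the helpers never mention a concrete constructor, the same code would work unchanged for any other instance of the class.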

After we got the basic code working, I thought it would be interesting to flatten nodes inside of Data.Adaptive.Map further, so I created a different instance for Int32 (to stay as close as possible to the original keys and values). The definition is much hairier:

instance AdaptMap Int32 Int32 where
  data Map Int32 Int32  = TipInt32Int32
                        | BinInt32Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          !(Map Int32 Int32)
                          !(Map Int32 Int32)
                        | BinTipTipInt32Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                        | BinBinTipInt32Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          !(Map Int32 Int32)
                          !(Map Int32 Int32)
                        | BinTipBinInt32Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          !(Map Int32 Int32)
                          !(Map Int32 Int32)
                        | BinBinBinInt32Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          !(Map Int32 Int32)
                          !(Map Int32 Int32)
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          !(Map Int32 Int32)
                          !(Map Int32 Int32)

  tip                              = TipInt32Int32
  unsafeBin s k v TipInt32Int32 TipInt32Int32 = BinTipTipInt32Int32 s k v
  unsafeBin s k v (BinInt32Int32 s1 k1 v1 l1 r1) TipInt32Int32
    = BinBinTipInt32Int32 s k v s1 k1 v1 l1 r1
  unsafeBin s k v TipInt32Int32 (BinInt32Int32 s2 k2 v2 l2 r2)
    = BinTipBinInt32Int32 s k v s2 k2 v2 l2 r2
  unsafeBin s k v (BinInt32Int32 s1 k1 v1 l1 r1) (BinInt32Int32 s2 k2 v2 l2 r2)
    = BinBinBinInt32Int32 s k v s1 k1 v1 l1 r1 s2 k2 v2 l2 r2
  unsafeBin s k v l r
    = BinInt32Int32 s k v l r

  mapMap t b TipInt32Int32                              = t
  mapMap t b (BinInt32Int32 s k v l r)                  = b s k v l r
  mapMap t b (BinTipTipInt32Int32 s k v)                =
    b s k v TipInt32Int32 TipInt32Int32
  mapMap t b (BinBinTipInt32Int32 s k v s1 k1 v1 l1 r1) =
    b s k v (BinInt32Int32 s1 k1 v1 l1 r1) TipInt32Int32
  mapMap t b (BinTipBinInt32Int32 s k v s2 k2 v2 l2 r2) =
    b s k v TipInt32Int32 (BinInt32Int32 s2 k2 v2 l2 r2)
  mapMap t b (BinBinBinInt32Int32 s k v s1 k1 v1 l1 r1 s2 k2 v2 l2 r2) =
    b s k v (BinInt32Int32 s1 k1 v1 l1 r1) (BinInt32Int32 s2 k2 v2 l2 r2)

The reason for this is that I’ve read in the past that flattening 3 tree nodes into one can give you very good cache locality.  While you are recreating nodes on deconstruction, these will be cache local anyways, so it should not be that expensive to access the data in these.
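
The effect of this fusing on the node count can be sketched with a much smaller toy version. The code below is not the Int32 instance above: it is a keys-only tree with just two fused shapes (a node with two tips, and a node with two ordinary children), and all names in it (Tree, bin, view, fromSorted, elems, nodes) are invented for the illustration. The smart constructor bin plays the role of unsafeBin and view plays the role of mapMap.

```haskell
module Main where

-- Simplified, keys-only tree.  Leaf and Bin3 are the fused shapes.
data Tree
  = Tip
  | Bin  Int Tree Tree                    -- ordinary node
  | Leaf Int                              -- node whose children are both Tip
  | Bin3 Int Int Tree Tree Int Tree Tree  -- node fused with its two Bin children

-- Smart constructor: fuse whenever the children have a matching shape.
bin :: Int -> Tree -> Tree -> Tree
bin k Tip Tip                       = Leaf k
bin k (Bin kl ll lr) (Bin kr rl rr) = Bin3 k kl ll lr kr rl rr
bin k l r                           = Bin k l r

-- Deconstructor: always presents a node as key, left child, right child,
-- recreating the fused children on the fly.
view :: c -> (Int -> Tree -> Tree -> c) -> Tree -> c
view t _ Tip                        = t
view _ b (Bin k l r)                = b k l r
view _ b (Leaf k)                   = b k Tip Tip
view _ b (Bin3 k kl ll lr kr rl rr) = b k (Bin kl ll lr) (Bin kr rl rr)

-- Build a balanced tree from an ascending list.
fromSorted :: [Int] -> Tree
fromSorted [] = Tip
fromSorted xs = bin k (fromSorted ls) (fromSorted rs)
  where
    (ls, k : rs) = splitAt (length xs `div` 2) xs

-- Number of keys, counted through the logical view.
elems :: Tree -> Int
elems = view 0 (\_ l r -> 1 + elems l + elems r)

-- Number of constructors actually stored (Tip not counted).
nodes :: Tree -> Int
nodes Tip                      = 0
nodes (Leaf _)                 = 1
nodes (Bin _ l r)              = 1 + nodes l + nodes r
nodes (Bin3 _ _ ll lr _ rl rr) = 1 + sum (map nodes [ll, lr, rl, rr])

main :: IO ()
main = do
  let t = fromSorted [1 .. 7]
  print (elems t) -- 7
  print (nodes t) -- 5
```

Seven keys end up in five stored constructors here, because the root is fused with its two children; code that goes through view still sees an ordinary binary tree.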

To see this in action (apologies about the long data-constructors), the following image is a representation of fromList $ zip [1..100] [1..100]. While the image is not fully legible, it does show that the number of nodes used to represent this is much smaller.

[image: 100-nodes]

In contrast, the simpler Int version of the same data-structure has many more nodes:

[image: 100-nodes2]

Bootstrapping cabal

Often I have to install cabal onto a machine where I just have GHC 6.8.2 running. I’ve found the following little script quite practical for installing the basics. Once you have those installed, it is very easy to install other Haskell packages with cabal.

#!/bin/bash
#
# Copyright 2009 Christophe Poucet. All Rights Reserved.
# Author: <christophe (dot) poucet (at) gmail (dot) com> (Christophe Poucet)
# License: BSD3

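# Abort on the first failing step, so a broken download or build is never installed.
set -e

webroot=http://hackage.haskell.org/packages/archive

# Download, build and install one package from Hackage, then clean up.
install_package() {
  local name=$1
  local version=$2

  wget "$webroot/$name/$version/$name-$version.tar.gz"
  tar xvzf "$name-$version.tar.gz"
  cd "$name-$version"
  runghc Setup configure
  runghc Setup build
  sudo runghc Setup install
  cd ..
  rm -rf "$name-$version"
  rm -f "$name-$version.tar.gz"
}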

install_package mtl 1.1.0.2
install_package parsec 3.0.0
install_package network 2.2.0.1
install_package HTTP 4000.0.4
install_package zlib 0.5.0.0
install_package Cabal 1.6.0.2
install_package cabal-install 0.6.2

Staying with the times… (?)

I’ve decided to try out twittering. I know that I’m seriously late to the hype-wave, but I think it might be a nice way to share links that I find interesting. I have a Delicious account too, but it is not as conversational. I’ve also had a FriendFeed for a while, so I’ve linked all my services together there:

http://friendfeed.com/poucet

http://twitter.com/poucet