blog.poucet.org Rotating Header Image

Uncategorized

Changes

I haven’t written to this blog in SOO long. A lot has changed in the meantime.

I think the last time I wrote, I was still a GMail Site Reliability Engineer. Since then I’ve worked on abuse fighting for Google+ and Blogger and am now working on Google Maps.

My interests have also shifted. While I still love Haskell, as of late I’m really enjoying doing javascript in my private time (and sometimes professionally too, though I’m still a backend engineer). So I hope to revamp my site soon.

Haskell Platform on Mac OSX

The other day I was trying to install ghc-core, which requires pcre.

I had previously installed macports as this allows you to install a variety of unix applications and tools. Unfortunately, I just could not get it to work. After some fiddling with paths to try to get it to find the macports installed libraries (after having port installed pcre), I came up with the following incantation:

LIBRARY_PATH=/usr/lib:/opt/local/lib sudo cabal install --extra-lib-dirs=/opt/local/lib --extra-include-dirs=/opt/local/include ghc-core

However, this unfortunately did not work:
Linking dist/build/ghc-core/ghc-core ...
Undefined symbols:
"_iconv_open", referenced from:
_hs_iconv_open in libHSbase-4.2.0.2.a(iconv.o)
"_iconv_close", referenced from:
_hs_iconv_close in libHSbase-4.2.0.2.a(iconv.o)
"_iconv", referenced from:
_hs_iconv in libHSbase-4.2.0.2.a(iconv.o)
ld: symbol(s) not found
collect2: ld returned 1 exit status
cabal: Error: some packages failed to install:
ghc-core-0.5.4 failed during the building phase.

After some Googling, I was finally able to find the issue

So I decided to completely uninstall macports.

As my friend put it on IRC:

kormat> unix on osx is a ghetto

Fortunately, there is an alternative to macports: Homebrew.

It was a breeze to install:
ruby -e "$(curl -fsSL https://gist.github.com/raw/323731/install_homebrew.rb)"
sudo brew install pcre

After that, cabal installing ghc-core worked as expected =)

Migration

I migrated my linode box from an old version of Ubuntu to a new one.  Unfortunately it was a bit of a pain as a plain migration did not work.  So I had to install a new system and then reset up my blog and copy over the MySQL database.  This is also the reason why this site was down for a day.

I have also imported my old blogs from blogspot.com as I had planned to do once.  You will find that for now they are completely illegible since unfortunately a lot of formatting was lost in the source code.  I’ll try to fix those, time permitting.

Setting up iptables to throttle incoming ssh

So I decided today, since I seem to be getting a lot of ssh attempts to my firewall at home to set up some iptable rules.

It took me quite a while to figure it out, since I needed to set up some modprobe options.

First, I set that I can count up to 250 (I think the maximum is 256) recent ip hits.

cat /etc/modprobe.d/options
options ipt_recent ip_pkt_list_tot=250

Then I created a firewall script:

cat firewall.sh

#!/bin/sh
ipt=/sbin/iptables

set -x

if [ -z $1 ] ; then
echo "$0 <public device>"
exit
fi

# Clear rules
$ipt -D INPUT -i $1 -p TCP --dport ssh -m state --state NEW -j "$1"-SSH 2>/dev/null

# Set up an ssh and blacklist chain.
$ipt -F "$1"-SSH 2>/dev/null
$ipt -F "$1"-BLACKLIST 2>/dev/null
$ipt -X "$1"-SSH 2>/dev/null
$ipt -X "$1"-BLACKLIST 2>/dev/null

$ipt -N "$1"-SSH
$ipt -N "$1"-BLACKLIST

# Make sure that we update the recency of the packet, and then drop them.  The timing is controlled by the ssh chain.
$ipt -A "$1"-BLACKLIST -m recent --name BLACKLIST --set
$ipt -A "$1"-BLACKLIST -j DROP

# In the ssh chain, incoming connections from BLACKLIST hosts are dropped.  The timer is restarted everytime we get a packet within 600 s.
$ipt -A "$1"-SSH -m recent --update --name BLACKLIST --seconds 600 --hitcount 1 -j DROP

# Create several counting buckets.
$ipt -A "$1"-SSH -m recent --set --name "$1"-BUCKET1
$ipt -A "$1"-SSH -m recent --set --name "$1"-BUCKET2
$ipt -A "$1"-SSH -m recent --set --name "$1"-BUCKET3
$ipt -A "$1"-SSH -m recent --set --name "$1"-BUCKET4

# Blacklist if:
#   More than 2 connections in 10 seconds
#   More than 14 connections in 120 seconds
#   More than 79 connections in 600 seconds
#   More than 250 connections in 1800 seconds
$ipt -A "$1"-SSH -m recent --update --name "$1"-BUCKET1 --seconds   10 --hitcount   3 -j "$1"-BLACKLIST
$ipt -A "$1"-SSH -m recent --update --name "$1"-BUCKET2 --seconds  120 --hitcount  15 -j "$1"-BLACKLIST
$ipt -A "$1"-SSH -m recent --update --name "$1"-BUCKET3 --seconds  600 --hitcount  80 -j "$1"-BLACKLIST
$ipt -A "$1"-SSH -m recent --update --name "$1"-BUCKET4 --seconds 1800 --hitcount 250 -j "$1"-BLACKLIST

# All other ssh access is allowed.
$ipt -A "$1"-SSH -j ACCEPT

# Allow packets that belong to existing connections.
$ipt -D INPUT -i $1 -m state --state RELATED,ESTABLISHED -j ACCEPT 2>/dev/null
$ipt -A INPUT -i $1 -m state --state RELATED,ESTABLISHED -j ACCEPT

# Allow all packets from loopback device.
$ipt -D INPUT -i lo -j ACCEPT 2>/dev/null
$ipt -A INPUT -i lo -j ACCEPT

# Redirect all incoming ssh connections to the chain of the same name.
$ipt -A INPUT -i $1 -p TCP --dport ssh -m state --state NEW -j "$1"-SSH

# What remains has no right to continue.
$ipt -D INPUT -i $1 -j DROP 2>/dev/null
$ipt -A INPUT -i $1 -j DROP

Finally, I set it up in my /etc/network/interfaces, that this should be called for my main interface (my public one):
auto eth0
iface eth0 inet dhcp
up firewall.sh eth0

I hope this helps anyone.

To my family and friends

I think this is probably one of the best XKCD comics.  I wish I had had this in the past when people or family would ask me questions about computers (usually windows, being an avid Linux user).

This blog is obsolete

It seems I am still getting traffic on this blog.

I have moved my blog to http://blog.poucet.org

Flattening Data.Map

While at the haskell hackathon, I decided to work on adaptive containers, which is an initiative that had been started by Don Stewart, to get Data.Map and Data.IntMap more memory compact. Together with Nicolas Pouillard(ertai) we tackled this project. He worked on IntMap while I worked on Map. You can find the code at: http://patch-tag.com/repo/adaptive-containers.

The main motivation for these adaptive containers it that it is impossible to flatten (or unpack) small data-types directly into the memory used by the data-constructor, so you end up with extra pointers for each value inside a container (specifically 2 pointers, 1 from the container to the data-type, and then one from the data-type to the unboxed one).

Update: A video was posted where Don Stewart presents the basic concepts and where I briefly present the associative adaptive containers mentioned in this article.

It is very easy to create flattened maps now, all you have to do is instantiate the following typeclass:

class AdaptMap k a where
  data Map k a

  tip :: Map k a
  unsafeBin :: Size -> k -> a -> Map k a -> Map k a -> Map k a
  mapMap :: c -> (Size -> k -> a -> Map k a -> Map k a -> c) -> Map k a -> c
  mapMap2 :: (AdaptMap k b)
          => c
          -> (Size -> k -> a -> Map k a -> Map k a -> c) -- Right is Tip
          -> (Size -> k -> b -> Map k b -> Map k b -> c) -- Left is Tip
          -> (Size -> k -> a -> Map k a -> Map k a ->
              Size -> k -> b -> Map k b -> Map k b -> c)
          -> Map k a
          -> Map k b
          -> c
  mapMap2 t b1 b2 c t1 t2 = mapMap tip1 bin1 t1
    where tip1                = mapMap t b2 t2
          bin1 s1 k1 i1 l1 r1 = mapMap tip2 bin2 t2
            where tip2                = b1 s1 k1 i1 l1 r1
                  bin2 s2 k2 i2 l2 r2 = c s1 k1 i1 l1 r1 s2 k2 i2 l2 r2

Notice that the case for two tree deconstruction is completely defined based on single tree deconstruction, so you only need to worry about tip, unsafeBin and mapMap.  Also note that this representation assumes you will have an implementation that is similar to the original one.

An example would be:

instance AdaptMap Int Int where
  data Map Int Int  = TipIntInt
                    | BinIntInt
                      {-# UNPACK #-} !Size
                      {-# UNPACK #-} !Int
                      {-# UNPACK #-} !Int
                      !(Map Int Int)
                      !(Map Int Int)

  tip                              = TipIntInt
  unsafeBin                        = BinIntInt
  mapMap t b TipIntInt             = t
  mapMap t b (BinIntInt s k i l r) = b s k i l r

After we got the basic code working, I thought it would be interesting to flatten nodes inside of Data.Adaptive.Map further, so I created a different instance for Int32 (to get as similar as possible to the original keys and values).  The definition is much hairier:

instance AdaptMap Int32 Int32 where
  data Map Int32 Int32  = TipInt32Int32
                        | BinInt32Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          !(Map Int32 Int32)
                          !(Map Int32 Int32)
                        | BinTipTipInt32Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                        | BinBinTipInt32Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          !(Map Int32 Int32)
                          !(Map Int32 Int32)
                        | BinTipBinInt32Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          !(Map Int32 Int32)
                          !(Map Int32 Int32)
                        | BinBinBinInt32Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          !(Map Int32 Int32)
                          !(Map Int32 Int32)
                          {-# UNPACK #-} !Size
                          {-# UNPACK #-} !Int32
                          {-# UNPACK #-} !Int32
                          !(Map Int32 Int32)
                          !(Map Int32 Int32)

  tip                              = TipInt32Int32
  unsafeBin s k v TipInt32Int32 TipInt32Int32 = BinTipTipInt32Int32 s k v
  unsafeBin s k v (BinInt32Int32 s1 k1 v1 l1 r1) TipInt32Int32
    = BinBinTipInt32Int32 s k v s1 k1 v1 l1 r1
  unsafeBin s k v TipInt32Int32 (BinInt32Int32 s2 k2 v2 l2 r2)
    = BinTipBinInt32Int32 s k v s2 k2 v2 l2 r2
  unsafeBin s k v (BinInt32Int32 s1 k1 v1 l1 r1) (BinInt32Int32 s2 k2 v2 l2 r2)
    = BinBinBinInt32Int32 s k v s1 k1 v1 l1 r1 s2 k2 v2 l2 r2
  unsafeBin s k v l r
    = BinInt32Int32 s k v l r

  mapMap t b TipInt32Int32                              = t
  mapMap t b (BinInt32Int32 s k v l r)                  = b s k v l r
  mapMap t b (BinTipTipInt32Int32 s k v)                =
    b s k v TipInt32Int32 TipInt32Int32
  mapMap t b (BinBinTipInt32Int32 s k v s1 k1 v1 l1 r1) =
    b s k v (BinInt32Int32 s1 k1 v1 l1 r1) TipInt32Int32
  mapMap t b (BinTipBinInt32Int32 s k v s2 k2 v2 l2 r2) =
    b s k v TipInt32Int32 (BinInt32Int32 s2 k2 v2 l2 r2)
  mapMap t b (BinBinBinInt32Int32 s k v s1 k1 v1 l1 r1 s2 k2 v2 l2 r2) =
    b s k v (BinInt32Int32 s1 k1 v1 l1 r1) (BinInt32Int32 s2 k2 v2 l2 r2)

The reason for this is that I’ve read in the past that flattening 3 tree nodes into one can give you very good cache locality.  While you are recreating nodes on deconstruction, these will be cache local anyways, so it should not be that expensive to access the data in these.

To see this in action (apologies about the long data-constructors), the following image is a representation of fromList $ zip [1..100] [1..100].  While the image is not fully clear, it is clear that the number of nodes used to represent this is much smaller.

100-nodes

In contrast, the simpler Int version of the same data-structure has many more nodes:

100-nodes2

Bootstrapping cabal

Often I have to install cabal onto a machine where I just have GHC 6.8.2 running. I’ve found the following little script quite practical to install the basics.  Once you have that installed, it is very easy to install other haskell packages with cabal.

#!/bin/bash
#
# Copyright 2009 Christophe Poucet. All Rights Reserved.
# Author: <christophe (dot) poucet (at) gmail (dot) com> (Christophe Poucet)
# License: BSD3

webroot=http://hackage.haskell.org/packages/archive

install_package() {
  local name=$1
  local version=$2

  wget $webroot/$name/$version/$name-$version.tar.gz
  tar xvzf $name-$version.tar.gz
  cd $name-$version
  runghc Setup configure
  runghc Setup build
  sudo runghc Setup install
  cd ..
  rm -rf $name-$version
  rm -f $name-$version.tar.gz
}

install_package mtl 1.1.0.2
install_package parsec 3.0.0
install_package network 2.2.0.1
install_package HTTP 4000.0.4
install_package zlib 0.5.0.0
install_package Cabal 1.6.0.2
install_package cabal-install 0.6.2

Staying with the times… (?)

I’ve decided to try out twittering.  I know that I’m seriously lagging on the hype-wave, but I think it might be a nice way to share links that I find interesting.  I have delicious too, but it is not as conversational.  I’ve also had a friendfeed for a while, so I’ve linked all my services together there:

http://friendfeed.com/poucet

http://twitter.com/poucet

Imported from Blogger

I have just imported all my old blogs from http://notvincenz.blogspot.com, randomly sampling one entry, I noticed that some of the blogs need to be reeditted a bit.  If you happen to notice any broken blogs, please ping me so I can fix it.

Thanks!