Visualizing the Metropolis Algorithm

August 21, 2016

Let’s say you’re doing some sort of Bayesian analysis. You’ll have a prior P(\theta) over your model parameters \theta . You get some data D , and you want to update on this data to get the posterior P(\theta\vert D) , the updated distribution of the model parameters given D . Let’s wheel out Bayes’ theorem and work out how to calculate P(\theta\vert D) , which for convenience we’ll call \pi(\theta) :

\pi(\theta) = P(\theta\vert D)=\dfrac{P(D\vert\theta)P(\theta)}{P(D)}

If we can calculate the likelihood P(D\vert \theta) from our model, computing the numerator P(D\vert \theta)P(\theta) is relatively straightforward. And once we have the numerator, we’ve specified our posterior \pi(\theta) up to the normalization constant P(D) .

Computing the normalization constant is trickier. P(D) is the probability of seeing this data in the model, which means we have to integrate over all possible values of \theta :

P(D) = \int_{\Theta} P(D\vert\theta)P(\theta)\text{d}\theta

In most cases1, this won’t have a closed-form solution, and deterministic numerical integration can scale poorly with increasing dimensionality.
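To make this concrete, here’s a minimal sketch of computing the evidence by direct numerical integration, for a toy model of my own choosing (not from the post): 7 heads in 10 coin flips, with a Uniform(0, 1) prior over the bias \theta . In one dimension this is easy; the point is that grid-based schemes like this blow up as the dimension of \theta grows.

```python
import math

# Toy model (illustrative, not the post's): D = 7 heads in 10 flips,
# prior P(theta) = Uniform(0, 1), so the integrand is just the likelihood.
def likelihood(theta, heads=7, flips=10):
    return math.comb(flips, heads) * theta**heads * (1 - theta)**(flips - heads)

# Trapezoidal rule over a 1-D grid of theta values.
N = 10_000
grid = [i / N for i in range(N + 1)]
evidence = sum(
    (likelihood(grid[i]) + likelihood(grid[i + 1])) / 2 * (1 / N)
    for i in range(N)
)
# For a uniform prior this integral has a known closed form: 1 / (flips + 1).
```

A grid this fine is cheap in one dimension, but the same resolution in d dimensions needs N^d evaluations, which is why deterministic quadrature scales so poorly.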

Monte Carlo integration

Let’s assume we can’t easily compute this integral; we can turn to Monte Carlo methods to estimate it instead. If we can draw samples \theta^{(1)}, \dots, \theta^{(N)} from the prior P(\theta) , we can estimate the integral by simply averaging the likelihood over those samples:

\begin{split} P(D) &= \int_{\Theta} P(D\vert \theta)P(\theta)\text{d}\theta \\ &\approx \dfrac{1}{N}\sum_{i=1}^N P(D\vert\theta^{(i)}), \qquad \theta^{(i)} \sim P(\theta)\end{split}

By the law of large numbers, our estimate will converge to the true value of the integral as N goes to infinity. But this kind of direct sampling only works if we can actually draw from the distribution in question, and many Bayesian models use arbitrarily complex distributions over \theta , including the posterior \pi(\theta) itself, that we can’t easily sample.
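The estimator above can be sketched in a few lines of Python. This reuses a toy model of my own (7 heads in 10 flips, uniform prior), not anything from the post:

```python
import math
import random

# Toy model (illustrative): likelihood of 7 heads in 10 flips given bias theta.
def likelihood(theta, heads=7, flips=10):
    return math.comb(flips, heads) * theta**heads * (1 - theta)**(flips - heads)

random.seed(42)
N = 100_000
# theta^(i) ~ P(theta); the Uniform(0, 1) prior makes sampling trivial here.
samples = (random.random() for _ in range(N))
# Monte Carlo estimate: average the likelihood over the prior draws.
evidence_mc = sum(likelihood(t) for t in samples) / N
# The exact evidence for this model is 1/11, so the estimate should be close.
```

The estimate converges at a rate of O(1/\sqrt{N}) regardless of the dimension of \theta , which is exactly why Monte Carlo methods are attractive when quadrature is infeasible.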


Free monads in category theory

Part II of the series Free monads.
August 2, 2016

Forgetting how to multiply

It’s probably easiest to understand what a free monad is if we first understand forgetful functors1.

In category theory, a functor maps between categories, mapping objects to objects and morphisms to morphisms in a way that preserves compositionality2.

A forgetful functor is just a functor that discards some of the structure or properties of the input category.

For example, a unital ring is a tuple (R, +, \cdot, 0, 1) , where R is a set, and (+, \cdot) are binary operations with identity elements (0, 1) respectively.

Let’s denote the category of all unital rings and their homomorphisms by \bf{Ring} , and the category of all non-unital rings, or rngs, and their homomorphisms by \bf{Rng} . We can now define a forgetful functor: \it{I}: \bf{Ring} \rightarrow \bf{Rng} , which just drops the multiplicative identity.

Similarly, we can define another forgetful functor \it{A}: \bf{Rng} \rightarrow \bf{Ab} , which maps from the category of rngs to the category of abelian groups. \it{A} discards the multiplicative binary operation, mapping each rng to its underlying additive abelian group, and each rng homomorphism to the same function viewed as a group homomorphism.

Forgetting monoids

The forgetful functor \it{A} forgets ring multiplication. What happens if instead you forget addition? You get monoids! We can define a monoid as a triple (S, \cdot, e) , where S is a set, \cdot is an associative binary operation on S , and e is the neutral element of that operation.

The forgetful functor \it{M}: \bf{Ring} \rightarrow \bf{Mon} maps from the category of rings to the category of monoids, \bf{Mon} , in which the objects are monoids, and the morphisms are monoid homomorphisms.

Monoid homomorphisms map between monoids in a way that preserves their monoidal structure. Given \mathcal{X} , a monoid defined by (X, *, e) , and \mathcal{Y} , a monoid defined by (Y, *', f) , a function \it{\phi}: \mathcal{X} \rightarrow \mathcal{Y} is a monoid homomorphism iff:

it preserves compositionality3:

\begin{equation}\phi(a * b) = \phi(a) *' \phi(b), \quad \forall a, b \in X\end{equation}

and maps the identity element: \begin{equation}\phi(e) = f\end{equation}
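These two laws are easy to check for a concrete example. Here’s a small Python sketch (my example, not the post’s): \phi = \text{len} , from the monoid of lists under concatenation to the monoid of non-negative integers under addition.

```python
# Monoid homomorphism check for a concrete example (illustrative):
# X = (lists, ++, []),  Y = (non-negative integers, +, 0),  phi = len.
def phi(xs):
    return len(xs)

a, b = [1, 2, 3], [4, 5]

# Law 1: phi(a * b) == phi(a) *' phi(b), where * is ++ and *' is +.
law1 = phi(a + b) == phi(a) + phi(b)

# Law 2: phi maps the identity of X to the identity of Y: phi([]) == 0.
law2 = phi([]) == 0
```

Both laws hold for every pair of lists, not just this one: the length of a concatenation is always the sum of the lengths, and the empty list has length zero.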


Free monads in 7 easy steps

Part I of the series Free monads.
July 29, 2016

In the next part of this series, we’ll discuss free monads from a category theory perspective. This first installment will be decidedly more practical, focusing on how you actually use free monads in Haskell, and what advantages they bring. This means I’ll gloss over a lot of the mechanics of how free monads actually work, and make this more of a “get it done” kind of post.

The finest imperative language?

Simon Peyton Jones famously said that Haskell was the world’s finest imperative programming language. When I started writing Haskell, however, I literally tacked -> IO () onto the end of all my functions and went to town writing imperative code.

Of course, not all the functions I wrote returned an IO (), but because that was the only tool I had to interact with the outside world, a great deal of my code, far more than necessary, simply lived inside the IO monad.

Here’s a pretty representative example, from a program I wrote that implemented anonymized network chat1:

keyExchangeHandler :: Either String PeerData -> MVar ServerState -> IpAddress -> Int -> Handle -> IO ()
keyExchangeHandler p state ip portNumber handle = case p of
  Right peerData -> do
    s <- takeMVar state
    let peer = Participant (publicKey peerData) (nonce peerData) ip portNumber
    let newPs = peer:(peers s)
    if length newPs == groupSize s
       then forkIO $ sendPeerList state
       else forkIO $ putStrLn "Waiting for peers"
    putMVar state s{ peers = newPs }
  Left e -> putStrLn $ "Could not parse public key: " ++ e

The semantics of the application logic are completely mixed up with implementation-level details, and having all these functions just return IO () meant that I wasn’t exploiting a lot of the safety that the type system could provide. It was essentially Haskell-flavoured Java2! You can take a look at what the code looked like back then here.

Alright, so we don’t want a bunch of opaque IO () functions, but wouldn’t refactoring to use the free monad be really painful?

Not at all! Here’s how you use free monads, in just seven easy steps!


How does HaxlSharp work?

June 11, 2016

I wrote a C# version of Haxl called HaxlSharp. The original Haxl paper by Marlow et al., brilliantly titled There is no Fork: an Abstraction for Efficient, Concurrent, and Concise Data Access, provides a great overview of Haxl. This post will focus more on the differences between the original Haskell implementation and my C# implementation.

What is Fetch<>?

In HaxlSharp, we use query syntax to combine functions that return Fetch<> objects. Let’s say we have these three Fetch<> objects/functions:

Fetch<string> a;
Fetch<int> b;
Func<string, Fetch<int>> c;

We can combine them like this:

from x in a
from y in b
from z in c(x)
select y + z;

Fetch<> is actually a free monad that collects lambda expression trees instead of lambda functions.

It divides these expression trees into groups of expressions that can be written applicatively, using a version of the ApplicativeDo algorithm that was recently merged into GHC1.
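To get an intuition for that grouping, here’s a toy Python sketch of the dependency analysis (my own illustration, not HaxlSharp’s actual implementation): statements whose dependencies are already bound can be fetched together in one applicative round, while a statement that uses an earlier result must wait for a later round.

```python
# Toy sketch (illustrative, not HaxlSharp's code) of splitting monadic binds
# into applicative "rounds" based on which bound variables they depend on.
# Mirrors the query above: from x in a; from y in b; from z in c(x).
statements = [
    ("x", "a", set()),       # from x in a     -- no dependencies
    ("y", "b", set()),       # from y in b     -- no dependencies
    ("z", "c(x)", {"x"}),    # from z in c(x)  -- depends on x
]

rounds, bound = [], set()
pending = list(statements)
while pending:
    # Everything whose dependencies are already bound goes in this round.
    batch = [s for s in pending if s[2] <= bound]
    rounds.append([name for name, _, _ in batch])
    bound |= {name for name, _, _ in batch}
    pending = [s for s in pending if s not in batch]
# rounds ends up as [["x", "y"], ["z"]]: a and b can be fetched
# concurrently in the first round, then c(x) runs in the second.
```

The real algorithm works on expression trees and handles much more general dependency shapes, but the core idea is the same: maximize the size of each applicative batch.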


What's wrong with async/await?

June 10, 2016

Async/await is great for writing sequential-looking code when you’re only waiting for a single asynchronous request at a time. But we often want to combine information from multiple data sources, like different calls on the same API, or multiple remote APIs.

The async/await abstraction breaks down in these situations1. To illustrate, let’s say we have a blogging site, and a post’s metadata and content are retrieved using separate API calls. We could use async/await to fetch both these pieces of information:

public async Task<PostDetails> GetPostDetails(int postId)
{
    var postInfo = await FetchPostInfo(postId);
    var postContent = await FetchPostContent(postId);
    return new PostDetails(postInfo, postContent);
}

Here, we’re making two successive await calls, which means execution will be suspended at the first request, FetchPostInfo, and the second request, FetchPostContent, will only start once the first has completed.

But FetchPostContent doesn’t require the result of FetchPostInfo, which means we could have started both these requests concurrently! Async/await lets us write asynchronous code in a nice, sequential-looking way, but it doesn’t let us write concurrent code like this.
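The same shape of problem, and the manual fix, can be sketched in Python with asyncio (the fetch functions here are hypothetical stand-ins for the post’s C# API calls; in C# the analogous tool is Task.WhenAll):

```python
import asyncio

# Hypothetical stand-ins for the two independent API calls.
async def fetch_post_info(post_id):
    await asyncio.sleep(0.05)          # simulated network latency
    return {"id": post_id}

async def fetch_post_content(post_id):
    await asyncio.sleep(0.05)
    return "post content"

async def get_post_details(post_id):
    # gather starts both coroutines concurrently, instead of awaiting
    # them one after the other as in the sequential version above.
    info, content = await asyncio.gather(
        fetch_post_info(post_id),
        fetch_post_content(post_id),
    )
    return info, content

info, content = asyncio.run(get_post_details(1))
```

Note that concurrency here had to be requested explicitly with gather; nothing about the sequential-looking version warned us we were serializing two independent requests, which is the gap abstractions like Haxl aim to close.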