# Generalized Algebraic Data Types I

*Fun with Functional C#*.

*May 9, 2016*

This is the first of two articles on GADTs. This first part will be a general introduction to GADTs and their utility, while the second part will show how we can wrangle GADT behaviour out of C#.

The canonical GADT introduction involves a demonstration of the inadequacy of algebraic data types. But since this is written from a C# perspective, and C# doesn’t have GADTs, we’ll start with a brief introduction to vanilla ADTs.

## Algebraic Data Types

Algebraic data types allow us a sort of type-level composition that’s more rigorous than what we have in C#. There are two ways to compose types in this algebra: products and sums, which are roughly analogous^{1} to products and sums over the integers.

### Product types

Product types allow us to combine two or more types into one compound type. In Haskell, we can combine two types into a pair:

`data Pair a b = Pair a b`

Even though there are some important differences, at first this doesn’t seem all too different from what we might write in C#:

```
class Pair<A, B> {
A a;
B b;
}
```

We can now encode tuples of arbitrary size by nesting `Pair`

constructors. For example:

```
triple :: Pair (Pair String Int) Int
triple = Pair (Pair "one" 2) 3
```

is equivalent to:

```
triple' :: (String, Int, Int)
triple' = ("one", 2, 3)
```

We’ll use the more compact tuple notation from now on.

### Sum types

While a product type `(a, b)`

is effectively a type that is *both* `a`

and `b`

, a sum type `Sum a b`

is a type that can only be *either* `a`

or `b`

. In Haskell, we represent sum types like this:

`data Either a b = Left a | Right b`

## An expression evaluator

We’re now ready for the canonical GADT introduction.

Let’s say we want to represent expressions for a simple calculator. We can do this with ADTs:

```
data Expr = Val Int
| Add Expr Expr
| Mult Expr Expr
```

We can now represent expressions like `6 * ((3 * 4) + (5 + 9))`

:

`expr1 = Mult (Val 6) (Add (Mult (Val 3) (Val 4)) (Add (Val 5) (Val 9)))`

Now we can very easily write an evaluator that will work for arbitrarily complex expressions:

```
eval :: Expr -> Int
eval (Val x) = x
eval (Add e1 e2) = eval e1 + eval e2
eval (Mult e1 e2) = eval e1 * eval e2
```

```
ghci> eval expr1
156
```

Because the evaluation and representation are separate, we can write multiple evaluators, like a pretty printer:

```
pretty :: Expr -> String
pretty (Val x) = show x
pretty (Add e1 e2) = "(" ++ pretty e1 ++ " + " ++ pretty e2 ++ ")"
pretty (Mult e1 e2) = "(" ++ pretty e1 ++ " * " ++ pretty e2 ++ ")"
```

```
ghci> pretty expr1
"(6 * ((3 * 4) + (5 + 9)))"
```

### Extending the expression

The allure of this style is clear; we have a declarative representation of our data, and we can interpret it in various ways by simply defining a function from `Expr -> a`

, for any `a`

we choose.

Let’s extend our expression type, and see if we can maintain this elegant style.

```
data Expr' = IntVal Int
| BoolVal Bool
| AddInt Expr' Expr'
| MultInt Expr' Expr'
| GreaterThan Expr' Expr'
```

Now we have a greater than operation that can evaluate to a boolean, allowing us to represent expressions like `3 + (2 * 4) > 9 * (9 + 2)`

, which should ideally evaluate to `False`

.

How would we go about writing our evaluator? Well, we could either return an `Int`

or a `Bool`

, so our type should be `Either Int Bool`

.

But hang on, we have `MultInt Expr' Expr'`

, which means `MultInt`

accepts *anything* of type `Expr'`

, even `BoolVal`

! We could write something like:

`MultInt (GreaterThan (AddInt (IntVal 3) (IntVal 4)) (IntVal 2)) (IntVal 9))`

This is effectively `((3 + 4) > 2) * 9`

, which simplifies to `False * 9`

, which is clearly nonsensical. This is a valid value of `Expr'`

, however, so we need to represent the possibility that the evaluation can fail by wrapping the return type in a `Maybe`

:

```
eval' :: Expr' -> Maybe (Either Int Bool)
eval' (AddInt e1 e2) = case (eval' e1, eval' e2) of
(Just (Left i1), Just (Left i2)) -> Just (Left $ i1 + i2)
_ -> Nothing
eval' (MultInt e1 e2) = case (eval' e1, eval' e2) of
(Just (Left i1), Just (Left i2)) -> Just (Left $ i1 * i2)
_ -> Nothing
eval' (GreaterThan e1 e2) = case (eval' e1, eval' e2) of
(Just (Left i1), Just (Left i2)) -> Just (Right $ i1 > i2)
eval' (IntVal x) = Just (Left x)
eval' (BoolVal b) = Just (Right b)
```

This works as expected:

```
ghci> -- 3 > 2:
ghci> eval' $ GreaterThan (IntVal 3) (IntVal 2)
Just (Right True)
ghci> -- (3 + 5) > 2:
ghci> eval' $ GreaterThan (AddInt (IntVal 3) (IntVal 5)) (IntVal 2)
Just (Right False)
ghci> -- (5 > 2) * 6:
ghci> eval' $ MultInt (GreaterThan (IntVal 5) (IntVal 2)) (IntVal 6)
Nothing
```

But we’ve completely lost the elegance and obviousness of the initial implementation. And this is a relatively simple evaluator; a more complex evaluator will be almost entirely consumed with complex and error-prone logic for validating expressions.

## A better solution?

A better solution would be to make invalid expressions fail to typecheck, and offload all this tedious logic to the compiler:

```
data BetterExpr t =
IntVal Int
| BoolVal Bool
| MultInt (BetterExpr Int) (BetterExpr Int)
| AddInt (BetterExpr Int) (BetterExpr Int)
| GreaterThan (BetterExpr Int) (BetterExpr Int)
```

But these invalid expressions still typecheck:

```
ghci> :t GreaterThan (BoolVal True) (IntVal 3)
GreaterThan (BoolVal True) (IntVal 3) :: BetterExpr t
```

We can hack around this by defining our own value constructors that enforce the correct types, and only exporting those. That way, users of our module won’t be able to construct invalid types:

```
module Expression (BetterExpr, boolVal, intVal, greaterThan, multInt) where
intVal :: Int -> BetterExpr Int
intVal = IntVal
boolVal :: Bool -> BetterExpr Bool
boolVal = BoolVal
greaterThan :: BetterExpr Int -> BetterExpr Int -> BetterExpr Bool
greaterThan = GreaterThan
-- etc
```

```
ghci> greaterThan (boolVal True) (intVal 3)
"Couldn't match type ‘Bool’ with ‘Int’
Expected type: BetterExpr Int
Actual type: BetterExpr Bool"
```

## Evaluating better

Let’s try to evaluate our `BetterExpr`

:

```
evaluate :: BetterExpr t -> t
evaluate (IntVal i) = i
evaluate (BoolVal b) = b
-- etc
```

Unfortunately, this doesn’t compile. This is because the type parameter `t`

in `BetterExpr t`

isn’t at all related to the types mentioned in the constructors! For instance, we could define:

`IntVal 3 :: BetterExpr Bool`

This typechecks! We can think of `IntVal`

and other value constructors as functions that return a `BetterExpr t`

:

```
-- The value constructors of this type:
data BetterExpr t =
IntVal Int
| BoolVal Bool
| MultInt (BetterExpr Int) (BetterExpr Int)
| AddInt (BetterExpr Int) (BetterExpr Int)
| GreaterThan (BetterExpr Int) (BetterExpr Int)
-- can be written as:
IntVal :: Int -> BetterExpr t
BoolVal :: Bool -> BetterExpr t
MultInt :: BetterExpr Int -> BetterExpr Int -> BetterExpr t
AddInt :: BetterExpr Int -> BetterExpr Int -> BetterExpr t
GreaterThan :: BetterExpr Int -> BetterExpr Int -> BetterExpr t
```

This syntax makes the problem clear: all value constructors underspecify their return type. This syntax also makes the solution obvious- just specify the return types! This is exactly what GADTs allow us to do.

### GADTs

We need to turn on a language extension to use GADTs:

`{-# LANGUAGE GADTs #-}`

Now we can write type declarations exactly how we want:

```
data BetterExpr t where
IntVal :: Int -> BetterExpr Int
BoolVal :: Bool -> BetterExpr Bool
MultInt :: BetterExpr Int -> BetterExpr Int -> BetterExpr Int
AddInt :: BetterExpr Int -> BetterExpr Int -> BetterExpr Int
GreaterThan :: BetterExpr Int -> BetterExpr Int -> BetterExpr Bool
```

Invalid expressions that previously typechecked are now rejected by the compiler:

```
ghci> MultInt (GreaterThan (IntVal 3) (IntVal 2)) (IntVal 2)
"Couldn't match type ‘Bool’ with ‘Int’
Expected type: BetterExpr Int
Actual type: BetterExpr Bool"
```

and writing our evaluator is again extremely straightforward:

```
eval :: BetterExpr t -> t
eval (IntVal i) = i
eval (BoolVal b) = b
eval (MultInt e1 e2) = eval e1 * eval e2
eval (AddInt e1 e2) = eval e1 + eval e2
eval (GreaterThan e1 e2) = eval e1 > eval e2
```

```
ghci> eval $ MultInt (IntVal 3) (IntVal 2)
6
ghci> eval $ GreaterThan (IntVal 3) (IntVal 2)
True
```

That’s all for today! Next time we’ll implement GADTs in C#.

Or more formally, they’re equivalent up to an isomorphism in that they both form a semiring, with types as the operands, product/ sum composition as the binary operator, and

`Unit`

/`Void`

as the respective identity elements.↩