Efficient type-level division for Haskell (23 May 2010)

Efficient type-level division for Haskell

23 May 2010 (haskell programming math) (1 comment)

Another approach I wanted to explore to Encsé's 9-digit problem was creating a type-level solution in Haskell. First I'll describe what that would entail, and then present a solution to one of its sub-problems: testing special cases of divisibility in a more efficient way than the naïve implementation of repeating subtractions.

Type-level programming

Basically, a type-level Haskell program is to regular Haskell programs what C++ template metaprograms are to C++ proper. Conrad Parker wrote a great introduction to type-level programming using functional dependencies in issue 6 of The Monad.Reader journal, titled Type-Level Instant Insanity. You encode values as types, and then construct an expression whose type, as inferred by the Haskell compiler, encodes the output of your computation. To actually implement the equivalent of a function, typeclasses are used.

For example, take the following (value-level) Haskell definition of length: (deliberately not using any builtin types, and using unary representation for simplicity):

module ValueLength where
        
data Nat = Z | S Nat      
data List a = Nil | a ::: (List a)
infixr 6 :::

length :: List a -> Nat
length Nil = Z
length (x ::: xs) = S (length xs)

(note that due to technical limitations, both : and :: are reserved names in Haskell, and so we have to use ::: for our cons)

To lift this definition to the level of types, we first need a way to represent actual numbers and lists as types. This is pretty straightforward using algebraic datatypes. Since we are ultimately only going to be interested in the type of expressions and not their value, no constructors are defined.

{-# LANGUAGE EmptyDataDecls, TypeOperators,
             FunctionalDependencies,
             MultiParamTypeClasses, UndecidableInstances #-}
module TypeLength where

data Z
data S n

data Nil
data x ::: xs
infixr 6 :::

– Continued below

Of course, System F doesn't allow for meta-types, only kinds; so we can't prohibit meaningless types like S Nil or Z ::: Z (contrast this with the declaration of the S constructor in the first example). Thus, type-level programming in Haskell has to be untyped.

Now comes the clever bit: lifting the definition of length. Basically, what we are looking for is a way to map types to types. Typeclasses with functional dependencies are one possibility of achieving this (type families are another one). First of all, each n-ary operator is rewritten into an n+1-ary relation between its operands and its result: instead of thinking about length Nil = 0, we say Nil and 0 are in the relation Length. We declare a class for each function, and one instance for each equation:

class Length xs n | xs -> n
instance Length Nil Z

The xs -> n part of the class declaration is where the magic comes from. It means that if there are types xs, n₁ and n₂ such that there are instances of Length xs n₁ and Length xs n₂, then the types n₁ and n₂ must be the same. To demonstrate why this is useful, consider the following function definition:

liftedLength :: Length xs n => xs -> n
liftedLength = undefined

What is the type of liftedLength (undefined :: Nil)? The declaration of liftedLength says that the answer is whatever type n such that an instance of Length Nil n exists. But the functional dependency of Length says that there can be at most one such n. Consulting the list of instances above, we find instance Length Nil Z, and thus, n has to be the type Z. We can test this using the interpreter GHCi:

*TypeLength> :t liftedLength (undefined :: Nil)
liftedLength (undefined :: Nil) :: Z

So far, so good. But what about the length of non-empty lists?

To translate the other equation of length, we need a way to tackle recursion. Viewing the recursive definition of length through our relational glasses, it is simply an inductive definition of the relation Length:

instance (Length xs n) => Length (x ::: xs) (S n)

Again, we can try this out with GHCi, by asking for the length of [0, 0]:

*TypeLength> :t liftedLength (undefined :: Z ::: Z ::: Nil)
liftedLength (undefined :: Z ::: Z ::: Nil) :: S (S Z)

Towards a type-level solution of the 9-digit problem

Now that the basics are covered, it should be clear what a type-level solution to the original 9-digit problem should entail. I've decided to directly translate Lőry's solution because it doesn't use any fancy infrastructure like list monads. Take a moment to look at its code because I'm going to explain what I had to change to write its type-level counterpart.

The first version, using unary arithmetics like in the introduction above, was, unsurprisingly, unusably slow (anything beyond the 3-digit sub-problem of the 9-digit problem was hopeless), so I looked around for something better. Péter Diviánszky pointed me to the type-level package, which, promising enough, features decimal arithmetics.

However, the direct transliteration of the value function that converts from a list of digits into its decimal value, and testing for divisibility using type-level's DivMod both proved to be highly inefficient. So instead, I decided to write a faster divisibility checker that exploits the fact that for the 9-digit problem, the divisor is always less than the base of the representation (since we're using base-10, and checking for divisibility by 2, ..., 9).

The basis of the algorithm is how you do division with pen & paper: you take the most significant digit, calculate its reminder, then glue this in front of the next digit, and so on. Here's the Haskell code for a very simple implementation of this idea:

q `divides` []                  = True
q `divides` (0:ds)              = q `divides` ds
q `divides` (d:ds) | d ≥ q      = q `divides` ((d-q):ds)
                   | otherwise  = case ds of
                                    [] -> False
                                    (d':ds) -> q `divides` ((d * 10 + d'):ds)

The repeated subtraction part may seem scary, but we're only ever glueing digits less than q in front of the next digit, so at most 9 subtractions are needed to eliminate a digit.

To actually lift this algorithm to the world of types requires writing quite a bit of boilerplate code, because the most efficient way to branch on d ≥ q is to write out all the 9² cases. Fortunately, my girlfriend just wrote a report on Template Haskell for a university assignment the other day, so I learned of its existence and was able to use it to generate most of the instances of my Divides class.

Results

Using this special-case division algorithm, I was able to push the program to 7 digits, but solving the 9-digit problem proper is still out of reach, as GHC eats up all my memory before dying (but I'm curious if it will work, unchanged, on my 7-Gig box at work). But the 7-digit problem is still roughly 10,000 times larger than the 3-digit one, so the improvement is evident.

You can check out the full type-level 9-digit program (including the fast divisibility checker) on GitHub, at http://github.com/gergoerdi/typeprog.

« From Register Machines to Brainfuck, part 1

All posts

Compiling Haskell into C++ template metaprograms »

solrize 2010-05-25 00:18:00

I left a more "efficient" algorithm as a comment on Encsé's blog. Using such an approach should cut down the amount of division a lot.