Contents

Blog tags RSS

Entries tagged language

How I learned about Java's lack of type erasure the hard way

3 December 2011 (programming language android java) (2 comments)

This week, I started playing around with the Android platform. I've been eyeing Yeti, an ML dialect that compiles to JVM and features structural typing. Even though I only have very limited experience in ML (mostly just reading the snippets in the Okasaki, and having helped Maya once with some F# code), it has to be better than Java, right?

After getting the hang of Yeti (integrating it into Android's ant build system, writing Hello World, etc.) I wanted to write something more substantial to evaluate the language for serious use. However, yesterday evening I started getting weird errors as soon as I started creating non-trivial closures in member functions. This is the chronicle of how I tracked down the problem with just some rudimentary Java knowledge and lots of reverse engineering.

Continue reading »

The case for compositional type checking

23 October 2010 (programming haskell language ELTE) (1 comment)

This is an based on a chapter of the M.Sc. thesis I am writing at ELTE, supervised by Péter Diviánszky.

For my M.Sc. thesis, I've been working on writing a compositional type checker for Haskell 98. The basic idea is to extend Olaf Chitil's compositional type system with ad-hoc polymorphism, Haskell 98's major extension to the Hindley-Milner type system. In this post, I'm showing the motivation behind wanting to go compositional.

A property shared by both commonly-used algorithms for doing Hindley-Milner type inference, W and M, is that both W and M infer the type of composite expressions by inferring one subexpression (in some sense, the “first” one) and using its results in inferring the type of the “next” one. They are linear in the sense that partial results are threaded throughout the type inference.

The effect of linearity on type inference is that certain sub-expressions (those that are processed earlier) can have greater influence on the typing of other subexpressions. This is bad because it imposes a hierarchy on the subexpressions that is determined solely by the actual type checking algorithm, not by the type system; thus, it can lead to misleading error messages for the programmer.

For example, let's take the following definition of a Haskell function:

foo x = (toUpper x, not x)

There are two ways to typecheck this definition using W: either we first typecheck toUpper x, using the context {x :: α}, resulting in the type equation α ~ Char, then checking not x with {x :: Char}, or do it the other way around, by first looking at not x, then as a result recursing into toUpper x with the context {x :: Bool}.

GHC, it seems, does the former, resulting in the following error message:

Couldn't match expected type `Bool' against inferred type `Char'
In the first argument of `not', namely `x'
In the expression: not x
In the expression: (toUpper x, not x)

Whereas Hugs 98 does the latter:

ERROR "test.hs":1 - Type error in application
*** Expression : toUpper x
*** Term : x
*** Type : Bool
*** Does not match : Char

The problem is that they are both misleading, because there is nothing wrong with either not x or toUpper x by itself. The problem only comes from trying to unify their respective views on the type of x.

A compositional type checker, in contrast, descends into toUpper x and not x using the same context, {x :: α}. The first one results in the typing (which is defined to be not just the type of an expression, but also a mapping of monomorphic variables to their types) {x :: Char} ⊢ Char, and the second one in {x :: Bool} ⊢ Bool. Only afterwards are these two typings tried to be unified.

This is better because it becomes meaningful to talk about the typing of a subexpression. For the example above, my work-in-progress compositional type checker can report errors with an (IMO) much more helpful message:

input/test.hs:1:8-25:
(toUpper x, not x)
Cannot unify `Char' with `Bool' when unifying `x':
      toUpper x    not x
      Char         Bool
x ::  Char         Bool

Of course, the devil's in the details — but that's what my thesis will be about.

From Register Machines to Brainfuck, part 2

7 September 2010 (programming haskell language) (1 comment)

This is a continuation of my previous post on register machines vs. Brainfuck programs. We left off at Brainfuck's supposed Turing-completeness.

Now, the most straightforward way to prove Turing-completeness of a given language is to write a compiler that takes a program written in a language that is already known to be Turing-complete, and creates a program written in the language to be proved Turing-complete, that simulates the original program. So an obvious way to prove that Brainfuck is a Turing-complete language is to compile register machine programs into Brainfuck. This has the added advantage that a programmer having some experience in real-world assembly programming can easily write register machine programs, which can then be compiled into (horribly inefficient and over-complicated, as we'll see) Brainfuck programs.

Important note: Of course, to really prove, in a mathematical sense, that Brainfuck is Turing-complete, we would first have to define formal operational semantics for register machines and Brainfuck programs to be even able to argue about simulating one with the another. In this post, I will appeal to intuition instead.

Continue reading »

From Register Machines to Brainfuck, part 1

6 September 2010 (programming haskell language) (1 comment)

The Brainfuck programming language stays a somewhat current topic at the office ever since Maya's 9-digit program, so much so, that now we've even started designing our own Brainfuck computer using first principle logic gates only. But more on that later. Today and in my next post, I'm going to write about compiling register machines to Brainfuck.

Register machines

The register machine is an abstract mathematical machine much like Turing machines or finite automata, and it can be shown to be Turing-complete. On the other hand, it models very closely how real-world computers operate. The program of a register machine (RM for short) is a labelled list of instructions, each operating on one of a finite number of memory cells called registers, holding values of natural numbers. The allowed instructions in the specific formulation that we'll be using here is:

Note that this set of operations is redundant in that both clr and mov can be easily implemented in terms of the others — they are included here only for clarity later on.

For example, to add the contents of register a to b, we can write the following program using a temporary register z:

  1. clr z
  2. jz a 7
  3. dec a
  4. inc z
  5. inc b
  6. jmp 2
  7. jz z 11
  8. dec z
  9. inc a
  10. jmp 7
  11. Done.

The Haskell code for parsing register machine programs and an instantiator for a simple little macro language is available on GitHub.

Brainfuck

Brainfuck is a joke programming language, one designed to be as much of a pain to use as possible, while also being powerful enough to solve real problems with it (languages like that are commonly called Turing tar-pits). The link above explains the intentionally minimalistic syntax and the semantics of the language in detail. The sketchy version is that there is a linear array of memory cells and a pointer moving on this array, and the operations can move the pointer and change the contents of the currently pointed cell:

You can find my parser, interpreter and x86 compiler for Brainfuck here.

In the next post, we will prove that Brainfuck is Turing-complete by translating register machine programs into Brainfuck. As you can see, there are two differences between RM and Brainfuck: one is that RM uses random access for the registers, whereas Brainfuck can access only one register at a time; the other, more major one is that you can't change the order of execution arbitrarily in Brainfuck. This is why we will have to use some trickery to come up with a translation scheme.

If you're feeling impatient, you can, of course, take a peek in the meantime at my RM → Brainfuck compiler in GitHub.

Compiling Haskell into C++ template metaprograms

16 May 2010 (haskell programming language)

After Lőri posted his C++ template meta-program solution to the 9-digit problem, Encsé asked me to help him understand it, being the first real-world metaprogram he's encountered. To help both him and myself, I rewrote, function-by-function, his code into Haskell.

This rewriting process was surprisingly straightforward, and the resulting code, while being in a very strict one-to-one correspondence with the original C++ code, was immensely more readable. So this, naturally, lead to the idea of a simple functional programming language that could be compiled into C++ compile-time metaprograms.

A couple of days later, I had a small accident and had to stay in bed for a week, which gave me plenty of time to throw together a proof-of-concept compiler from a Haskell-like language into C++ TMP.

And so MetaFun was born.

MetaFun is structured into three separate modules, one for parsing and typechecking the input language called Kiff (for Keep It Fun & Functional), one for unparsing C++ metaprograms, and the third one is, of course, the compiler itself. I plan to re-use the Kiff module whenever I come up with an idea for another new throwaway functional compiler.

Older entries:

Entries from all tags