Contents

Blog tags RSS

Entries tagged programming

How I learned about Java's lack of type erasure the hard way

3 December 2011 (programming language android java) (2 comments)

This week, I started playing around with the Android platform. I've been eyeing Yeti, an ML dialect that compiles to JVM and features structural typing. Even though I only have very limited experience in ML (mostly just reading the snippets in the Okasaki, and having helped Maya once with some F# code), it has to be better than Java, right?

After getting the hang of Yeti (integrating it into Android's ant build system, writing Hello World, etc.) I wanted to write something more substantial to evaluate the language for serious use. However, yesterday evening I started getting weird errors as soon as I started creating non-trivial closures in member functions. This is the chronicle of how I tracked down the problem with just some rudimentary Java knowledge and lots of reverse engineering.

Continue reading »

Developing iPhone applications in Haskell — a tutorial

13 February 2011 (programming haskell iphone) (5 comments)

I couldn't find a step-by-step tutorial explaining the process of developing iOS applications in Haskell, so after finally getting "Hello World" to run on an iPhone, I decided to write this tutorial. I should also credit Lőry, who did the iOS side of the work.

The basic overview of what we're going to do in this tutorial is the following:

Writing the backend in Haskell

For this tutorial, we will simulate an intricate Haskell backend with the a simple function. For your real application, this is where you go all out with your Haskell-fu.

module Engine where
      
import Data.Char (ord)
      
engine :: String -> Either String [Int]
engine s | length s < 10 = Left "String not long enough"
         | otherwise = Right $ map ord s
      

FFI bindings

To interface our backend with the frontend developed in Objective-C (or C or C++ or...), we need to represent the input and output of our Haskell function in terms of simple C types. For the function engine, a straightforward API would be, in pseudo-C:

bool engine (in string s, out string error, out int[] result)
      

Of course, we have to use char *'s for strings, and pointers for out parameters and arrays, so our real API will be:

int engine (char* s, char* *error, int* *result, int *result_length)
      

with engine returning 0 on success (Right) and non-zero on failure (Left).

The Haskell FFI representation of this signature is:

foreign export ccall "engine"
  engineC :: CString -> Ptr CString -> Ptr (Ptr CInt) -> Ptr CInt -> IO CInt
    

The next step requires us to actually define engineC that does all the necessary marshalling. We simply evaluate engine and then set the appropriate out-parameters.

module Engine.FFI (engineC) where
    
import Foreign
import Foreign.C
import Control.Monad (zipWithM_)
    
foreign export ccall "engine"
  engineC :: CString -> Ptr CString -> Ptr (Ptr CInt) -> Ptr CInt -> IO CInt
engineC s ptrErr ptrptrResult ptrLen = 
  do
    s' <- peekCString s    
    case engine s' of
      Left err -> do
        cErr <- newCString err
        poke ptrErr cErr
        return 1
      Right result -> do
        pokeList ptrptrResult ptrLen $ map fromIntegral result
        return 0      
    
pokeList :: Storable a => Ptr (Ptr a) -> Ptr CInt -> [a] -> IO (Ptr a)
pokeList ptrptrList ptrLen xs = 
  do
    let len = length xs
    ptr <- mallocBytes $ len * elemSize          
    let ptrs = iterate (`plusPtr` elemSize) ptr
    zipWithM_ poke ptrs xs 
    poke ptrptrList ptr
    poke ptrLen $ fromIntegral len
    return ptr
  where 
    elemSize = sizeOf $ head xs
    

Compiling Haskell to C

The next step is compiling our Haskell project into C, so that we can use Apple's SDK to compile that for the iPhone, and also call engine from other code, like the Objective-C parts that make up the frontend.

Unlike GHC, Jhc doesn't compile individual modules. Instead, it compiles every used definition (but only those) and the runtime into a single C source file. Although we are not going to run our Haskell program directly, and instead call to it from the frontend, Jhc still needs a main function in the source code. So let's create a Main module which we will compile with Jhc:

module Main where

import Engine.FFI

main :: IO ()
main = return ()    
    

We can compile this module into a C file containing the code for engineC and everything else it uses (including imported packages):

jhc -fffi -fjgc --cross -mle32 -C EngineMain.hs -o EngineMain.jhc.c

The -fffi flag turns on FFI support and makes Jhc generate the engine function from the foreign export declaration and the definition of engineC. Note that there is no name clash between engine the C function (defined as engineC in Haskell-land) and engine the Haskell definition. I think in this particular example it is cleaner to use the same name for both.

The -fjgc flag generates GC code. Note that we will also need to enable the GC code in the next step, when compiling the C sources.

The --cross -mle32 flags are important because they instruct Jhc to target little-endian, 32-bit CPUs which is what the ARM is.

Compiling the generated C source code

Everything up to this point can be done without Apple's SDK, and in fact you can run Jhc on any platform you wish. From here on, however, we will use the iOS SDK to compile to ARM.

To compile EngineMain.jhc.c, we first need to set some preprocessor macros:

You also need some important C compiler flags (you can ignore the warning settings if you'd like):

-std=gnu99 -falign-functions=4 -ffast-math -fno-strict-aliasing -marm -Wextra -Wall -Wno-unused-parameter

-marm is very important because otherwise, GCC (or Clang) and Jhc step on each other's toes, leading to strange crashes seemingly out of nowhere.

Creating the frontend and accessing the backend

You can use the standard SDK to create the frontend; I will not cover that here in detail. You also need to create a header file containing the signature of our exported function. The code generated by Jhc also contains initialization and finalization routines that need to be called before and after calling any functions defined in Haskell:

extern void hs_init (int *argc, char **argv[]);
extern void hs_exit (void);

extern int engine (char* s, char* *msgError, int* *result, int *len);
      

You also need to manage the memory returned by the backend. The call to mallocBytes in the marshalling code is compiled into vanilla malloc, so you can simply call free after you're done with it.

To make initialization and memory management easier, Lőry has created a sample XCode project that wraps the C API to use Objective-C types. You can find the tarball here.

The case for compositional type checking

23 October 2010 (programming haskell language ELTE) (1 comment)

This is an based on a chapter of the M.Sc. thesis I am writing at ELTE, supervised by Péter Diviánszky.

For my M.Sc. thesis, I've been working on writing a compositional type checker for Haskell 98. The basic idea is to extend Olaf Chitil's compositional type system with ad-hoc polymorphism, Haskell 98's major extension to the Hindley-Milner type system. In this post, I'm showing the motivation behind wanting to go compositional.

A property shared by both commonly-used algorithms for doing Hindley-Milner type inference, W and M, is that both W and M infer the type of composite expressions by inferring one subexpression (in some sense, the “first” one) and using its results in inferring the type of the “next” one. They are linear in the sense that partial results are threaded throughout the type inference.

The effect of linearity on type inference is that certain sub-expressions (those that are processed earlier) can have greater influence on the typing of other subexpressions. This is bad because it imposes a hierarchy on the subexpressions that is determined solely by the actual type checking algorithm, not by the type system; thus, it can lead to misleading error messages for the programmer.

For example, let's take the following definition of a Haskell function:

foo x = (toUpper x, not x)

There are two ways to typecheck this definition using W: either we first typecheck toUpper x, using the context {x :: α}, resulting in the type equation α ~ Char, then checking not x with {x :: Char}, or do it the other way around, by first looking at not x, then as a result recursing into toUpper x with the context {x :: Bool}.

GHC, it seems, does the former, resulting in the following error message:

Couldn't match expected type `Bool' against inferred type `Char'
In the first argument of `not', namely `x'
In the expression: not x
In the expression: (toUpper x, not x)

Whereas Hugs 98 does the latter:

ERROR "test.hs":1 - Type error in application
*** Expression : toUpper x
*** Term : x
*** Type : Bool
*** Does not match : Char

The problem is that they are both misleading, because there is nothing wrong with either not x or toUpper x by itself. The problem only comes from trying to unify their respective views on the type of x.

A compositional type checker, in contrast, descends into toUpper x and not x using the same context, {x :: α}. The first one results in the typing (which is defined to be not just the type of an expression, but also a mapping of monomorphic variables to their types) {x :: Char} ⊢ Char, and the second one in {x :: Bool} ⊢ Bool. Only afterwards are these two typings tried to be unified.

This is better because it becomes meaningful to talk about the typing of a subexpression. For the example above, my work-in-progress compositional type checker can report errors with an (IMO) much more helpful message:

input/test.hs:1:8-25:
(toUpper x, not x)
Cannot unify `Char' with `Bool' when unifying `x':
      toUpper x    not x
      Char         Bool
x ::  Char         Bool

Of course, the devil's in the details — but that's what my thesis will be about.

From Register Machines to Brainfuck, part 2

7 September 2010 (programming haskell language) (1 comment)

This is a continuation of my previous post on register machines vs. Brainfuck programs. We left off at Brainfuck's supposed Turing-completeness.

Now, the most straightforward way to prove Turing-completeness of a given language is to write a compiler that takes a program written in a language that is already known to be Turing-complete, and creates a program written in the language to be proved Turing-complete, that simulates the original program. So an obvious way to prove that Brainfuck is a Turing-complete language is to compile register machine programs into Brainfuck. This has the added advantage that a programmer having some experience in real-world assembly programming can easily write register machine programs, which can then be compiled into (horribly inefficient and over-complicated, as we'll see) Brainfuck programs.

Important note: Of course, to really prove, in a mathematical sense, that Brainfuck is Turing-complete, we would first have to define formal operational semantics for register machines and Brainfuck programs to be even able to argue about simulating one with the another. In this post, I will appeal to intuition instead.

Continue reading »

From Register Machines to Brainfuck, part 1

6 September 2010 (programming haskell language) (1 comment)

The Brainfuck programming language stays a somewhat current topic at the office ever since Maya's 9-digit program, so much so, that now we've even started designing our own Brainfuck computer using first principle logic gates only. But more on that later. Today and in my next post, I'm going to write about compiling register machines to Brainfuck.

Register machines

The register machine is an abstract mathematical machine much like Turing machines or finite automata, and it can be shown to be Turing-complete. On the other hand, it models very closely how real-world computers operate. The program of a register machine (RM for short) is a labelled list of instructions, each operating on one of a finite number of memory cells called registers, holding values of natural numbers. The allowed instructions in the specific formulation that we'll be using here is:

Note that this set of operations is redundant in that both clr and mov can be easily implemented in terms of the others — they are included here only for clarity later on.

For example, to add the contents of register a to b, we can write the following program using a temporary register z:

  1. clr z
  2. jz a 7
  3. dec a
  4. inc z
  5. inc b
  6. jmp 2
  7. jz z 11
  8. dec z
  9. inc a
  10. jmp 7
  11. Done.

The Haskell code for parsing register machine programs and an instantiator for a simple little macro language is available on GitHub.

Brainfuck

Brainfuck is a joke programming language, one designed to be as much of a pain to use as possible, while also being powerful enough to solve real problems with it (languages like that are commonly called Turing tar-pits). The link above explains the intentionally minimalistic syntax and the semantics of the language in detail. The sketchy version is that there is a linear array of memory cells and a pointer moving on this array, and the operations can move the pointer and change the contents of the currently pointed cell:

You can find my parser, interpreter and x86 compiler for Brainfuck here.

In the next post, we will prove that Brainfuck is Turing-complete by translating register machine programs into Brainfuck. As you can see, there are two differences between RM and Brainfuck: one is that RM uses random access for the registers, whereas Brainfuck can access only one register at a time; the other, more major one is that you can't change the order of execution arbitrarily in Brainfuck. This is why we will have to use some trickery to come up with a translation scheme.

If you're feeling impatient, you can, of course, take a peek in the meantime at my RM → Brainfuck compiler in GitHub.

Older entries:

Entries from all tags