Entries tagged programming
How I learned about Java's lack of type erasure the hard way
3 December 2011 (programming language android java) (2 comments)This week, I started playing around with the Android platform. I've been eyeing Yeti, an ML dialect that compiles to JVM and features structural typing. Even though I only have very limited experience in ML (mostly just reading the snippets in the Okasaki, and having helped Maya once with some F# code), it has to be better than Java, right?
After getting the hang of Yeti (integrating it into Android's ant build system, writing Hello World, etc.) I wanted to write something more substantial to evaluate the language for serious use. However, yesterday evening I started getting weird errors as soon as I started creating non-trivial closures in member functions. This is the chronicle of how I tracked down the problem with just some rudimentary Java knowledge and lots of reverse engineering.
Continue reading »Developing iPhone applications in Haskell — a tutorial
13 February 2011 (programming haskell iphone) (5 comments)I couldn't find a step-by-step tutorial explaining the process of developing iOS applications in Haskell, so after finally getting "Hello World" to run on an iPhone, I decided to write this tutorial. I should also credit Lőry, who did the iOS side of the work.
The basic overview of what we're going to do in this tutorial is the following:
- Write the backend of our application in Haskell
- Create FFI bindings for our backend
- Use Jhc to compile the Haskell code into vanilla C
- Write the front-end using the official iOS SDK, using Objective-C
- Create wrappers on the Objective-C side to make resource managment easier
- Compile and link it all using Objective-C and C compilers targeting the iOS devices
Writing the backend in Haskell
For this tutorial, we will simulate an intricate Haskell backend with the a simple function. For your real application, this is where you go all out with your Haskell-fu.
module Engine where
import Data.Char (ord)
engine :: String -> Either String [Int]
engine s | length s < 10 = Left "String not long enough"
| otherwise = Right $ map ord s
FFI bindings
To interface our backend with the frontend developed in Objective-C (or C or C++ or...), we need to represent the input and output of our Haskell function in terms of simple C types. For the function engine, a straightforward API would be, in pseudo-C:
bool engine (in string s, out string error, out int[] result)
Of course, we have to use char *'s for strings, and pointers for out parameters and arrays, so our real API will be:
int engine (char* s, char* *error, int* *result, int *result_length)
with engine returning 0 on success (Right) and non-zero on failure (Left).
The Haskell FFI representation of this signature is:
foreign export ccall "engine"
engineC :: CString -> Ptr CString -> Ptr (Ptr CInt) -> Ptr CInt -> IO CInt
The next step requires us to actually define engineC that does all the necessary marshalling. We simply evaluate engine and then set the appropriate out-parameters.
module Engine.FFI (engineC) where
import Foreign
import Foreign.C
import Control.Monad (zipWithM_)
foreign export ccall "engine"
engineC :: CString -> Ptr CString -> Ptr (Ptr CInt) -> Ptr CInt -> IO CInt
engineC s ptrErr ptrptrResult ptrLen =
do
s' <- peekCString s
case engine s' of
Left err -> do
cErr <- newCString err
poke ptrErr cErr
return 1
Right result -> do
pokeList ptrptrResult ptrLen $ map fromIntegral result
return 0
pokeList :: Storable a => Ptr (Ptr a) -> Ptr CInt -> [a] -> IO (Ptr a)
pokeList ptrptrList ptrLen xs =
do
let len = length xs
ptr <- mallocBytes $ len * elemSize
let ptrs = iterate (`plusPtr` elemSize) ptr
zipWithM_ poke ptrs xs
poke ptrptrList ptr
poke ptrLen $ fromIntegral len
return ptr
where
elemSize = sizeOf $ head xs
Compiling Haskell to C
The next step is compiling our Haskell project into C, so that we can use Apple's SDK to compile that for the iPhone, and also call engine from other code, like the Objective-C parts that make up the frontend.
Unlike GHC, Jhc doesn't compile individual modules. Instead, it compiles every used definition (but only those) and the runtime into a single C source file. Although we are not going to run our Haskell program directly, and instead call to it from the frontend, Jhc still needs a main function in the source code. So let's create a Main module which we will compile with Jhc:
module Main where
import Engine.FFI
main :: IO ()
main = return ()
We can compile this module into a C file containing the code for engineC and everything else it uses (including imported packages):
The -fffi flag turns on FFI support and makes Jhc generate the engine function from the foreign export declaration and the definition of engineC. Note that there is no name clash between engine the C function (defined as engineC in Haskell-land) and engine the Haskell definition. I think in this particular example it is cleaner to use the same name for both.
The -fjgc flag generates GC code. Note that we will also need to enable the GC code in the next step, when compiling the C sources.
The --cross -mle32 flags are important because they instruct Jhc to target little-endian, 32-bit CPUs which is what the ARM is.
Compiling the generated C source code
Everything up to this point can be done without Apple's SDK, and in fact you can run Jhc on any platform you wish. From here on, however, we will use the iOS SDK to compile to ARM.
To compile EngineMain.jhc.c, we first need to set some preprocessor macros:
- _GNU_SOURCE
- NDEBUG
- _JHC_GC=_JHC_GC_JGC (turns on the generated GC code)
- _JHC_STANDALONE=0 (this disables the main function defined in our module)
You also need some important C compiler flags (you can ignore the warning settings if you'd like):
-marm is very important because otherwise, GCC (or Clang) and Jhc step on each other's toes, leading to strange crashes seemingly out of nowhere.
Creating the frontend and accessing the backend
You can use the standard SDK to create the frontend; I will not cover that here in detail. You also need to create a header file containing the signature of our exported function. The code generated by Jhc also contains initialization and finalization routines that need to be called before and after calling any functions defined in Haskell:
extern void hs_init (int *argc, char **argv[]);
extern void hs_exit (void);
extern int engine (char* s, char* *msgError, int* *result, int *len);
You also need to manage the memory returned by the backend. The call to mallocBytes in the marshalling code is compiled into vanilla malloc, so you can simply call free after you're done with it.
To make initialization and memory management easier, Lőry has created a sample XCode project that wraps the C API to use Objective-C types. You can find the tarball here.
The case for compositional type checking
23 October 2010 (programming haskell language ELTE) (1 comment)This is an based on a chapter of the M.Sc. thesis I am writing at ELTE, supervised by Péter Diviánszky.
For my M.Sc. thesis, I've been working on writing a compositional type checker for Haskell 98. The basic idea is to extend Olaf Chitil's compositional type system with ad-hoc polymorphism, Haskell 98's major extension to the Hindley-Milner type system. In this post, I'm showing the motivation behind wanting to go compositional.
A property shared by both commonly-used algorithms for doing Hindley-Milner type inference, W and M, is that both W and M infer the type of composite expressions by inferring one subexpression (in some sense, the “first” one) and using its results in inferring the type of the “next” one. They are linear in the sense that partial results are threaded throughout the type inference.
The effect of linearity on type inference is that certain sub-expressions (those that are processed earlier) can have greater influence on the typing of other subexpressions. This is bad because it imposes a hierarchy on the subexpressions that is determined solely by the actual type checking algorithm, not by the type system; thus, it can lead to misleading error messages for the programmer.
For example, let's take the following definition of a Haskell function:
There are two ways to typecheck this definition using W: either we first typecheck toUpper x, using the context {x :: α}, resulting in the type equation α ~ Char, then checking not x with {x :: Char}, or do it the other way around, by first looking at not x, then as a result recursing into toUpper x with the context {x :: Bool}.
GHC, it seems, does the former, resulting in the following error message:
In the first argument of `not', namely `x'
In the expression: not x
In the expression: (toUpper x, not x)
Whereas Hugs 98 does the latter:
*** Expression : toUpper x
*** Term : x
*** Type : Bool
*** Does not match : Char
The problem is that they are both misleading, because there is nothing wrong with either not x or toUpper x by itself. The problem only comes from trying to unify their respective views on the type of x.
A compositional type checker, in contrast, descends into toUpper x and not x using the same context, {x :: α}. The first one results in the typing (which is defined to be not just the type of an expression, but also a mapping of monomorphic variables to their types) {x :: Char} ⊢ Char, and the second one in {x :: Bool} ⊢ Bool. Only afterwards are these two typings tried to be unified.
This is better because it becomes meaningful to talk about the typing of a subexpression. For the example above, my work-in-progress compositional type checker can report errors with an (IMO) much more helpful message:
(toUpper x, not x)
Cannot unify `Char' with `Bool' when unifying `x':
toUpper x not x
Char Bool
x :: Char Bool
Of course, the devil's in the details — but that's what my thesis will be about.
From Register Machines to Brainfuck, part 2
7 September 2010 (programming haskell language) (1 comment)This is a continuation of my previous post on register machines vs. Brainfuck programs. We left off at Brainfuck's supposed Turing-completeness.
Now, the most straightforward way to prove Turing-completeness of a given language is to write a compiler that takes a program written in a language that is already known to be Turing-complete, and creates a program written in the language to be proved Turing-complete, that simulates the original program. So an obvious way to prove that Brainfuck is a Turing-complete language is to compile register machine programs into Brainfuck. This has the added advantage that a programmer having some experience in real-world assembly programming can easily write register machine programs, which can then be compiled into (horribly inefficient and over-complicated, as we'll see) Brainfuck programs.
Important note: Of course, to really prove, in a mathematical sense, that Brainfuck is Turing-complete, we would first have to define formal operational semantics for register machines and Brainfuck programs to be even able to argue about simulating one with the another. In this post, I will appeal to intuition instead.
Continue reading »From Register Machines to Brainfuck, part 1
6 September 2010 (programming haskell language) (1 comment)The Brainfuck programming language stays a somewhat current topic at the office ever since Maya's 9-digit program, so much so, that now we've even started designing our own Brainfuck computer using first principle logic gates only. But more on that later. Today and in my next post, I'm going to write about compiling register machines to Brainfuck.
Register machines
The register machine is an abstract mathematical machine much like Turing machines or finite automata, and it can be shown to be Turing-complete. On the other hand, it models very closely how real-world computers operate. The program of a register machine (RM for short) is a labelled list of instructions, each operating on one of a finite number of memory cells called registers, holding values of natural numbers. The allowed instructions in the specific formulation that we'll be using here is:
- Increment (inc) r, Decrement (dec) r
- Clear (clr) r: Sets the value of register r to 0
- Copy (mov) rdest rsrc: Overwrite the contents of register rdest with the contents of register rsrc
- Jump to (jmp) label: Continue execution from the instruction marked with label
- Jump if (jz) r is zero to label: If the contents of register r is non-zero, continue execution as normal. If it is zero,
- Output (out) r, Input (in) r: Not strictly necessary for the theory to work out, it's convenient to include serial input/output in the model
Note that this set of operations is redundant in that both clr and mov can be easily implemented in terms of the others — they are included here only for clarity later on.
For example, to add the contents of register a to b, we can write the following program using a temporary register z:
- clr z
- jz a 7
- dec a
- inc z
- inc b
- jmp 2
- jz z 11
- dec z
- inc a
- jmp 7
- Done.
The Haskell code for parsing register machine programs and an instantiator for a simple little macro language is available on GitHub.
Brainfuck
Brainfuck is a joke programming language, one designed to be as much of a pain to use as possible, while also being powerful enough to solve real problems with it (languages like that are commonly called Turing tar-pits). The link above explains the intentionally minimalistic syntax and the semantics of the language in detail. The sketchy version is that there is a linear array of memory cells and a pointer moving on this array, and the operations can move the pointer and change the contents of the currently pointed cell:
- Increment cell (+), Decrement cell (-)
- Move pointer left (<) or right (>)
- Loop ([]): Repeatedly execute the statements between the [ and the ], as long as the value of the pointed cell is non-zero at the start of each iteration.
- Input (,), Output (.): These modify or print the contents of the currently pointed cell.
You can find my parser, interpreter and x86 compiler for Brainfuck here.
In the next post, we will prove that Brainfuck is Turing-complete by translating register machine programs into Brainfuck. As you can see, there are two differences between RM and Brainfuck: one is that RM uses random access for the registers, whereas Brainfuck can access only one register at a time; the other, more major one is that you can't change the order of execution arbitrarily in Brainfuck. This is why we will have to use some trickery to come up with a translation scheme.
If you're feeling impatient, you can, of course, take a peek in the meantime at my RM → Brainfuck compiler in GitHub.
Older entries:
- 23 May 2010: Efficient type-level division for Haskell (4 comments)
- 16 May 2010: Compiling Haskell into C++ template metaprograms
- 19 April 2010: Unary arithmetics is even slower than you'd expect (8 comments)
- 22 February 2010: The B Method for Programmers (Part 2)
- 16 February 2010: The B Method for Programmers (Part 1)
- 8 February 2010: The downfall of Soviet microprocessor technology and Lisp (1 comment)
- 7 May 2009: Relativisztikus hiperszámítógépek (3 comments)
- 17 March 2009: Eliminating magic strings from ELisp's (interactive)
- 3 March 2009: Composite pattern a gyakorlatban (4 comments)
- 25 February 2009: λ: A tiltott kalkulus (1 comment)
- 22 January 2009: Type inference for CLazy (1 comment)
- 12 January 2009: CLazy interpreter and compiler (2 comments)
- 17 October 2008: Generikus programozás (18 comments)
- 21 September 2008: Hacktivity 2008 (8 comments)
- 4 September 2008: Programozás visszavezetéssel (4 comments)
- 2 June 2008: Incompetent fuckwits (3 comments)
- 25 May 2008: Tooltips for SLIME
- 16 April 2008: Structure and Interpretation of Computer Programs
- 27 October 2007: Sapkakiválasztási axióma (14 comments)
- 12 October 2007: K stands for Kwality (6 comments)
- 8 July 2007: Goblin
- 8 November 2006: Ruby on Rails a seggem
- 13 October 2006: Activity log (1 comment)
- 8 October 2006: Registering Windows file types with NSIS (9 comments)
- 22 September 2006: Queueing timeouts in JavaScript
- 24 June 2006: Programming by Google