Cactus

How I learned about Java's lack of type erasure the hard way

3 December 2011 (programming language android java)

This week, I started playing around with the Android platform. I've been eyeing Yeti, an ML dialect that compiles to the JVM and features structural typing. Even though I only have very limited experience in ML (mostly just reading the snippets in the Okasaki, and having helped Maya once with some F# code), it has to be better than Java, right?

After getting the hang of Yeti (integrating it into Android's ant build system, writing Hello World, etc.) I wanted to write something more substantial to evaluate the language for serious use. However, yesterday evening I started getting weird errors as soon as I started creating non-trivial closures in member functions. This is the chronicle of how I tracked down the problem with just some rudimentary Java knowledge and lots of reverse engineering.

I'll illustrate the problem with a very small repro case here; however, a large part of the puzzle was actually minimizing the code that still exhibits the bug.

The bug manifested itself by my application crashing with a java.lang.NoSuchFieldError exception while running a function defined inside a class method. Here is a simplified version of the code:

module hu.erdi.Hello.yetihello;

import android.app.Activity;
import android.os.Bundle;

class HelloYeti extends Activity
  var x = 42,
  var y = 0,
  
  void onCreate(Bundle savedInstanceState)
    super#onCreate(savedInstanceState);
    f _ = (y := x);
    f (),
end;

()

And here is the exception I get when trying to run it on the Android emulator:

Uncaught handler: thread main exiting due to uncaught exception
java.lang.NoSuchFieldError: hu.erdi.Hello.HelloYeti.$0
        at hu.erdi.Hello.HelloYeti.f(hu/erdi/Hello/yetihello.yeti)
        at hu.erdi.Hello.HelloYeti.onCreate(hu/erdi/Hello/yetihello.yeti:14)
        ...

While that field name, hu.erdi.Hello.HelloYeti.$0, does seem suspicious, disassembling the classfile created by the Yeti compiler with jclassinfo shows there is in fact such a field:

$ jclassinfo --fields bin/classes/hu/erdi/Hello/HelloYeti.class 
[FIELDS]                                                                        
yeti.lang.Num $0 
yeti.lang.Num $1

Looking at the types of these two fields, a good working theory is that these correspond to the member variables x and y in the source. So $0 should be x, and thus it's probably accessed by the code compiled from the y := x statement in the function f. The error message, then, means there is no x field in whatever f tries to read it from.
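As an aside, the same kind of field listing can be cross-checked from inside Java using reflection. This is only a minimal sketch and not part of the original investigation: FieldLister, describeFields and the nested Sample class are hypothetical names, and Sample merely imitates the shape of the compiled class (two fields named $0 and $1), using java.lang.Number in place of yeti.lang.Num.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class FieldLister {
    // Hypothetical stand-in for the compiled HelloYeti class:
    // two fields with compiler-generated-looking names.
    static class Sample {
        Number $0 = 42;
        Number $1 = 0;
    }

    // Returns "<type> <name>" for every field declared directly on cls,
    // sorted so the result does not depend on reflection order.
    static List<String> describeFields(Class<?> cls) {
        List<String> out = new ArrayList<>();
        for (Field f : cls.getDeclaredFields()) {
            if (f.isSynthetic()) {
                continue; // skip fields the Java compiler adds on its own
            }
            out.add(f.getType().getName() + " " + f.getName());
        }
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        for (String line : describeFields(Sample.class)) {
            System.out.println(line);
        }
    }
}
```

Pointing describeFields at a real compiled class instead of Sample would give roughly the same information as the jclassinfo listing, assuming the class can be loaded on the desktop JVM.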

My first hunch was that f probably mixes up the closure, so I looked at the disassembled version of both the definition of f and its call site:

public void onCreate(android.os.Bundle)
        ;; ...
	5 iconst_1
	6 anewarray java.lang.Object
	9 dup
	10 iconst_0
	11 aload_0
	12 checkcast hu.erdi.Hello.HelloYeti
	15 aastore
	16 astore_2
	17 aload_2
	18 aconst_null
	19 invokestatic hu.erdi.Hello.HelloYeti.f(java.lang.Object[], java.lang.Object)
        ;; ...

static java.lang.Object f(java.lang.Object[], java.lang.Object)
	0 aload_0
	1 iconst_0
	2 aaload
	3 aload_0
	4 iconst_0
	5 aaload
	6 getfield hu.erdi.Hello.HelloYeti.$0
	9 putfield hu.erdi.Hello.HelloYeti.$1
        ;; ...

I looked up the opcodes in the JVM spec and reconstructed what happens. The f function gets its closure in the first argument as an array of java.lang.Objects, and its "real" arguments follow (a single () value in this case, represented as a Java null). Both the call site and the function definition seem to agree on this convention, so the problem is not in the two mixing up the protocol.

My next idea was to write the same thing (packing the closure into an array) in Java and see what it gets compiled into. The code I wrote was:

package hu.erdi.Hello;

import android.app.Activity;
import android.os.Bundle;

public class Hello extends Activity
{
    int x = 42;
    int y = 0;

    public void onCreate (Bundle savedInstance)
    {
        super.onCreate (savedInstance);
        Object[] closure = new Object[1];
        closure[0] = this;

        f (closure, null);
    }

    static void f (Object[] closure, Object unused)
    {
        Hello _this = (Hello)closure[0];
        _this.y = _this.x;
    }
}

Let's disassemble this version of f:

static void f(java.lang.Object[], java.lang.Object)
	0 aload_0
	1 iconst_0
	2 aaload

The only difference in this particular version is that the Java compiler emits a checkcast to hu.erdi.Hello.Hello right after the aaload, where the Yeti-generated f has none; in the actual code I was working with, there were more substantial differences related to how stuff is laid out on the stack. At this point, I started to suspect that maybe the JVM needs types at runtime to access fields, so I wanted to test this theory. That meant taking the compiled version of the Java code and just removing that single checkcast instruction.

Unfortunately, the only JVM assembler I could find was Jasmin, which uses a somewhat different syntax than jclassinfo, so there was no roundtripping. But based on the handful of examples bundled with Jasmin, I managed to rewrite the program and then try it in the Android emulator.

This experiment immediately validated my suspicion: even though the pointer on the stack points to an object of dynamic type hu.erdi.Hello.Hello, the JVM needs the static cast before it can find its field. Why it does so, I don't know: since the JVM only admits single inheritance, the field offsets should be independent of the actual type.

And thus we come to the solution: I wrote a trivial patch to the Yeti compiler to always checkcast after unpacking from the closure. Using that patch, my code runs just fine on Android.
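To make the shape of the fix concrete, here is a desktop-only Java sketch of "cast right after unpacking from the closure". It is not the Yeti patch itself, and the class name ClosureCast is made up for illustration; it only mirrors the calling convention seen in the disassembly (closure array first, real argument second) without any Android dependency.

```java
public class ClosureCast {
    int x = 42;
    int y = 0;

    // Captured values arrive in an Object[]; the "real" argument follows.
    static Object f(Object[] closure, Object unit) {
        // The explicit cast is what javac compiles into a checkcast
        // instruction; it sits between the array load and the field access.
        ClosureCast self = (ClosureCast) closure[0];
        self.y = self.x;
        return null;
    }

    public static void main(String[] args) {
        ClosureCast obj = new ClosureCast();
        f(new Object[] { obj }, null);
        System.out.println(obj.y); // prints 42
    }
}
```

Running javap -c on the compiled class should show the checkcast between the aaload and the field instructions. The cast also means that a wrong object in the closure array fails early with a ClassCastException instead of at the field access.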

