jzbrooks

Dalvik Bytecode

In the spirit of understanding the machine you’re programming, it’s important to understand how Android’s machinery differs from the Java Virtual Machine. Understanding the Android compiler toolchain and ART requires it.

Android and the Java Virtual Machine

Android programs are not executed by a Java Virtual Machine. The Android Runtime (ART) is ultimately responsible for how Android app code is executed on Android devices. Why? Mobile devices have different constraints than other machines, specifically less memory and often more task switching.

Let’s take a peek at Dalvik bytecode. We’ll generate it from the existing JVM bytecode with the Android dexer, D8.

java -jar $R8_HOME/d8.jar --output . Hello_worldKt.class

This generates a classes.dex file, which is ultimately what’s delivered inside APKs to end users’ devices. Next we’ll peek inside with dexdump, a tool that ships with the Android SDK.

$ANDROID_BUILD_TOOLS_HOME/dexdump -d classes.dex produces:

Processing 'classes.dex'...
Opened 'classes.dex', DEX version '035'
Class #0            -
  Class descriptor  : 'LHello_worldKt;'
  Access flags      : 0x0011 (PUBLIC FINAL)
  Superclass        : 'Ljava/lang/Object;'
  Interfaces        -
  Static fields     -
  Instance fields   -
  Direct methods    -
    #0              : (in LHello_worldKt;)
      name          : 'main'
      type          : '([Ljava/lang/String;)V'
      access        : 0x0019 (PUBLIC STATIC FINAL)
      code          -
      registers     : 3
      ins           : 1
      outs          : 2
      insns size    : 13 16-bit code units

000174: Hello_worldKt.main:([Ljava/lang/String;)V
000184: const-string v0, "args" // string@0011
000188: invoke-static {v2, v0}, Lkotlin/jvm/internal/Intrinsics;.checkParameterIsNotNull:(Ljava/lang/Object;Ljava/lang/String;)V // method@0002
00018e: sget-object v0, Ljava/lang/System;.out:Ljava/io/PrintStream; // field@0000
000192: const-string v1, "Hello, world" // string@0003
000196: invoke-virtual {v0, v1}, Ljava/io/PrintStream;.println:(Ljava/lang/Object;)V // method@0001
00019c: return-void
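The "insns size" figure in the method summary can be checked against the listing above. The sketch below assumes the instruction widths given in the Dalvik bytecode reference (const-string and sget-object occupy two 16-bit code units, the invoke instructions three, and return-void one). The class and method names are made up for illustration.

```java
import java.util.Map;

// Hypothetical helper: re-derives dexdump's "insns size" from the listing above.
class CodeUnits {
    // Widths in 16-bit code units (assumed from the Dalvik instruction formats).
    static final Map<String, Integer> WIDTHS = Map.of(
        "const-string", 2,
        "sget-object", 2,
        "invoke-static", 3,
        "invoke-virtual", 3,
        "return-void", 1);

    // Sums the widths of the given instruction mnemonics.
    static int totalUnits(String... mnemonics) {
        int total = 0;
        for (String mnemonic : mnemonics) {
            total += WIDTHS.get(mnemonic);
        }
        return total;
    }

    public static void main(String[] args) {
        int units = totalUnits(
            "const-string", "invoke-static", "sget-object",
            "const-string", "invoke-virtual", "return-void");
        // Prints: 13 code units, 26 bytes
        System.out.println(units + " code units, " + (units * 2) + " bytes");
        // The first instruction sits at 0x184, so the body ends at 0x19e.
        System.out.println("body ends at 0x" + Integer.toHexString(0x184 + units * 2));
    }
}
```

The byte offsets in the left column of the listing advance by the same amounts: 4 bytes after each two-unit instruction and 6 bytes after each invoke.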

For starters, there are four fewer bytecode instructions for the same program in DEX. In this simple example, the reduction can be attributed to the fact that the JVM is a stack-based machine while ART is a register-based machine (notice the register count in the method summary). The JVM is built around a stack abstraction, and its bytecode pushes data onto its operand stack and pops it back off. DEX stores data in registers (v0, v1, etc.) and operates on them directly. There’s no stack abstraction in DEX bytecode. This design is largely due to the memory constraints of mobile devices, especially in the early days of Android.
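To make the difference concrete, here is a toy model of both styles computing c = a + b. It is only a sketch: the class and method names are invented, and nothing here is real JVM or ART code. The comments name the JVM and Dalvik instructions that each step is meant to mirror.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model (not real JVM or ART code): computes c = a + b in both styles
// and counts how many instructions each one needs.
class StackVsRegister {
    // Stack style, mirroring JVM bytecode. Returns {result, instruction count}.
    static int[] runStack(int a, int b) {
        int[] locals = {a, b, 0};
        Deque<Integer> stack = new ArrayDeque<>();
        int executed = 0;
        stack.push(locals[0]); executed++;                 // iload_0
        stack.push(locals[1]); executed++;                 // iload_1
        stack.push(stack.pop() + stack.pop()); executed++; // iadd
        locals[2] = stack.pop(); executed++;               // istore_2
        return new int[] {locals[2], executed};
    }

    // Register style, mirroring Dalvik bytecode. Returns {result, instruction count}.
    static int[] runRegister(int a, int b) {
        int[] v = {a, b, 0};
        int executed = 0;
        v[2] = v[0] + v[1]; executed++;                    // add-int v2, v0, v1
        return new int[] {v[2], executed};
    }

    public static void main(String[] args) {
        int[] s = runStack(2, 3);
        int[] r = runRegister(2, 3);
        System.out.println("stack:    result=" + s[0] + " instructions=" + s[1]);
        System.out.println("register: result=" + r[0] + " instructions=" + r[1]);
    }
}
```

Both versions produce 5, but the stack version takes four steps where the register version takes one. The trade-off is that each register instruction has to name its operands, so individual instructions tend to be wider.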

Another important way the JVM differs from ART is in its support for certain bytecodes and APIs. “Android’s Java 8 Support” and “Android’s Java 9, 10, 11, and 12 Support” are concise and informative introductions to the subject that are worth a read.

Interestingly, certain optimizations, like preferring StringBuilder to concatenation, are implemented entirely in the Kotlin compiler (kotlinc) and are applied regardless of target platform. javac has long done the same rewrite for Java sources; since JDK 9 it emits an invokedynamic-based concatenation instead.
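As a source-level illustration of what that rewrite amounts to (this is not actual compiler output, and the names are invented), the two methods below return the same string. The second spells out the StringBuilder chain that the first is turned into.

```java
// Illustration only: hand-written equivalent of a concatenation rewrite.
class ConcatSketch {
    // What the programmer writes.
    static String viaConcat(String name, int count) {
        return "Hello, " + name + " x" + count;
    }

    // Roughly what the compiler emits: one builder, one append per operand.
    static String viaBuilder(String name, int count) {
        return new StringBuilder()
            .append("Hello, ")
            .append(name)
            .append(" x")
            .append(count)
            .toString();
    }

    public static void main(String[] args) {
        System.out.println(viaConcat("world", 2));
        System.out.println(viaBuilder("world", 2));
    }
}
```

Using a single builder avoids allocating an intermediate String for every + in the expression.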

Multidex

If you’ve ever wondered where the 64k method limit comes from, look no further than the Dalvik bytecode format. Specifically, the invoke-kind instructions limit method references to 16 bits. Sixteen bits can encode only 2^16 (65,536) distinct values, so method indices run from 0 to 65,535. The format doesn’t prohibit a dex file from containing more methods than that. However, it’s impossible to invoke a method you can’t reach.
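The arithmetic is easy to check. A 16-bit field has 65,536 bit patterns, so the largest index it can hold is 65,535 (0xffff); the method@0002 operand in the listing earlier is one such index. The sketch below uses invented names and simply shows what happens when a larger index is squeezed into 16 bits.

```java
// Sketch with hypothetical names: the limits of a 16-bit method index.
class MethodIndex {
    static final int INDEX_BITS = 16;

    // Number of distinct indices a 16-bit field can express.
    static int addressable() {
        return 1 << INDEX_BITS;
    }

    // Largest index that still fits.
    static int maxIndex() {
        return (1 << INDEX_BITS) - 1;
    }

    // What a 16-bit field would hold if asked to store a larger index.
    static int truncate(int methodIndex) {
        return methodIndex & 0xFFFF;
    }

    public static void main(String[] args) {
        System.out.println("addressable methods: " + addressable());
        System.out.println("largest index: 0x" + Integer.toHexString(maxIndex()));
        // Index 65536 does not fit and would alias index 0.
        System.out.println("65536 truncated to 16 bits: " + truncate(65536));
    }
}
```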

Thankfully, if your minSdk is ≥ 21, the dex method limit is a thing of the past. ART is capable of loading multiple dex files from an APK and compiling them into a single .oat file during ahead-of-time compilation at install time. If you have more than 64k methods in your project, run a debug build and analyze the APK. You’ll find multiple *.dex files.
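Since an APK is a zip archive, one way to see those files without opening an IDE is to list the archive entries. The following is a minimal sketch using only java.util.zip; the class name is invented, and it assumes the usual naming of classes.dex, classes2.dex, and so on at the root of the archive.

```java
import java.io.IOException;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.ZipFile;

// Hypothetical helper: lists the dex files at the root of an APK.
class DexCounter {
    static List<String> dexEntries(String apkPath) throws IOException {
        try (ZipFile apk = new ZipFile(apkPath)) {
            return apk.stream()
                .map(entry -> entry.getName())
                // Assumed naming: classes.dex, classes2.dex, classes3.dex, ...
                .filter(name -> name.matches("classes\\d*\\.dex"))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DexCounter <path-to-apk>");
            return;
        }
        List<String> dexFiles = dexEntries(args[0]);
        dexFiles.forEach(System.out::println);
        System.out.println(dexFiles.size() + " dex file(s)");
    }
}
```

Run it against the APK produced by a debug build; more than one line of output means the build was split across multiple dex files.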

You can learn more about Dalvik and ART at https://source.android.com/devices/tech/dalvik.