Monday, November 26, 2007

VMs are everywhere

One of the interesting things about Android is that Google is using its own virtual machine, called Dalvik. Dalvik is not the same as the Java VM, but JVM code can be converted to Dalvik code using a tool included with the Android SDK. This started me thinking about how virtual machines have taken off. You have the JVM, Microsoft's CLR, and Flash forming the big three, with many others such as YARV (Ruby's upcoming VM) and Parrot (Perl 6's VM) also starting to take off. While VMs have the advantage of being processor and operating system independent, ultimately the work has to be carried out by native machine code. There are three basic ways of doing this: interpretation, JIT, and re-compilation.

Interpretation is the slowest of these techniques, though when combined with selective compilation (which is essentially what HotSpot does) it can actually be among the fastest. The idea is that the VM decodes each bytecode instruction and carries it out every time the code is executed, so the translation work is repeated on every run. HotSpot improves on this by detecting the parts of the code that run frequently and compiling those sections into highly optimized native machine language. When you consider that most code runs rarely, this can be very efficient.
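To make the idea concrete, here is a minimal sketch of an interpreter that counts how often a piece of code runs and compiles it once it becomes hot. Everything in it is made up for illustration: the toy bytecode, the HOT_THRESHOLD value, and the MixedModeVM class are assumptions, and a Python function stands in for native machine code. It is not how HotSpot is actually implemented.

```python
# Toy stack-machine bytecode for (2 + 3) * 4: a list of (opcode, argument) pairs.
PROGRAM = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]

HOT_THRESHOLD = 3  # hypothetical cutoff; real VMs tune this very differently


def interpret(code):
    """Decode and execute every instruction, every time the code runs."""
    stack = []
    for op, arg in code:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


def compile_to_function(code):
    """Translate the bytecode once into a Python function (stand-in for native code)."""
    exprs = []
    for op, arg in code:
        if op == "push":
            exprs.append(repr(arg))
        elif op in ("add", "mul"):
            right, left = exprs.pop(), exprs.pop()
            symbol = "+" if op == "add" else "*"
            exprs.append(f"({left} {symbol} {right})")
        else:
            raise ValueError(f"unknown opcode: {op}")
    return eval(f"lambda: {exprs.pop()}")


class MixedModeVM:
    """Interprets code until it has run HOT_THRESHOLD times, then compiles it."""

    def __init__(self):
        self.run_counts = {}  # name -> number of interpreted runs so far
        self.compiled = {}    # name -> compiled function, once hot

    def run(self, name, code):
        if name in self.compiled:
            return self.compiled[name]()  # fast path: no decoding at all
        self.run_counts[name] = self.run_counts.get(name, 0) + 1
        if self.run_counts[name] >= HOT_THRESHOLD:
            self.compiled[name] = compile_to_function(code)
        return interpret(code)
```

Calling vm.run("f", PROGRAM) five times on a fresh MixedModeVM interprets the first three runs and uses the compiled function for the last two. Code that only ever runs once or twice never pays the compilation cost.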

JIT, which stands for just-in-time compilation, compiles the classes (or, in many VMs, individual methods) into native machine language as each one is needed. There is an initial delay when a class is first used, but once compiled the class will run fairly fast. This has largely become the standard way of building virtual machines. Because the compilation happens while the program is running and has to be quick, heavy optimization of the native code is often skipped.
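The compile-on-first-use pattern can be sketched in a few lines. This is only an illustration under stated assumptions: the SOURCES table and the LazyCompiler class are hypothetical, and Python's built-in compile() produces Python bytecode rather than native machine code, so it only stands in for the real translation step.

```python
# Hypothetical "classes": each name maps to code that has not been compiled yet.
SOURCES = {
    "Greeter": "def call(name):\n    return 'hello ' + name\n",
    "Doubler": "def call(x):\n    return x * 2\n",
}


class LazyCompiler:
    """Compiles a unit the first time it is requested and caches the result."""

    def __init__(self, sources):
        self.sources = sources
        self.cache = {}          # unit name -> ready-to-call function
        self.compile_count = 0   # how many units have been compiled so far

    def load(self, unit):
        if unit not in self.cache:
            # First use: pay the one-time compilation delay.
            code_obj = compile(self.sources[unit], unit, "exec")
            namespace = {}
            exec(code_obj, namespace)
            self.cache[unit] = namespace["call"]
            self.compile_count += 1
        # Every later use goes straight to the cached result.
        return self.cache[unit]
```

With jit = LazyCompiler(SOURCES), the first jit.load("Doubler") compiles that unit and later calls reuse it, while "Greeter" is never compiled unless something asks for it.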

Finally there is re-compilation, more commonly called ahead-of-time compilation. This is essentially a compiler that takes virtual machine bytecodes as input and outputs native machine language. This could be a good way of distributing code, as the translation from the virtual machine's bytecode to native code could be done as part of the program's installation. There is no overhead when the program is run, as the compilation was done beforehand. While the initial installation of the program on the machine would take longer, most people wouldn't notice, and it only has to be done once.
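As a rough sketch of translating at install time, the example below converts a toy bytecode program once, stores the result on disk, and later runs the stored result without touching the bytecode again. The bytecode format and the translate, install, and run_installed functions are all hypothetical, and generated Python source stands in for native machine code.

```python
import os
import tempfile

# Toy stack-machine bytecode for (2 + 3) * 4.
BYTECODE = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]


def translate(code):
    """Turn toy bytecode into Python source text (stand-in for native code)."""
    exprs = []
    for op, arg in code:
        if op == "push":
            exprs.append(repr(arg))
        elif op in ("add", "mul"):
            right, left = exprs.pop(), exprs.pop()
            symbol = "+" if op == "add" else "*"
            exprs.append(f"({left} {symbol} {right})")
        else:
            raise ValueError(f"unknown opcode: {op}")
    return f"RESULT = {exprs.pop()}\n"


def install(code, path):
    """Install step: translate once and store the output on disk."""
    with open(path, "w") as f:
        f.write(translate(code))


def run_installed(path):
    """Run step: load the stored translation; the bytecode is never consulted."""
    namespace = {}
    with open(path) as f:
        exec(f.read(), namespace)
    return namespace["RESULT"]


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "app_translated.py")
    install(BYTECODE, target)      # slow part, done once
    print(run_installed(target))   # every later launch skips translation
```

The cost of translate() is paid only inside install(); every later call to run_installed() does no translation work at all.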
