JVM Architecture In Depth

[Figure: JVM architecture]

The JVM architecture above may look complicated, but we’ll break it into three parts (subsystems):

  • ClassLoaders Subsystem
  • Runtime Data Area
  • Execution Engine

Before I tell you about the JVM and its subsystems, let us look at where the real magic begins. The JVM needs some input to process, and that input is a .class file. This .class file contains bytecode, the instruction set of the JVM (a sort of interface between hardware and software), and is generated by the Java compiler, javac, located in the bin directory of your JDK (javac.exe on Windows).

[Figure: compiler]

Command used:

      javac <filename>.java

This step produces the .class file that the JVM needs as its input.
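To get a feel for what javac actually produces, here is a minimal sketch that reads the first eight bytes of a compiled class file. Every .class file starts with the magic number 0xCAFEBABE, followed by the minor and major version of the class file format. The class name ClassFileHeader and the method readHeader are made up for this illustration, and the sketch reads the class file of java.lang.String simply because it is always available.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeader {

    /**
     * Returns { magic, minorVersion, majorVersion } of the class file that
     * defines the given top-level class. (Nested classes are not handled here.)
     */
    static int[] readHeader(Class<?> type) {
        String resource = type.getSimpleName() + ".class"; // resolved relative to the class's package
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("class file not found: " + resource);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();           // u4: always 0xCAFEBABE
            int minor = data.readUnsignedShort(); // u2: minor version
            int major = data.readUnsignedShort(); // u2: major version
            return new int[] { magic, minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader(String.class);
        System.out.printf("magic=0x%08X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```

The major version tells you which Java release the class file targets; for example, 52 corresponds to Java 8.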

Now let us begin with the first subsystem:

1. ClassLoaders Subsystem

The ClassLoader class uses a delegation model to search for classes and resources. Each instance of ClassLoader has an associated parent class loader. Whenever a ClassLoader instance is asked to find a class or resource, it delegates the search to its parent class loader before attempting to find the class or resource itself. The virtual machine’s built-in class loader, called the ‘Bootstrap ClassLoader’, does not itself have a parent but may serve as the parent of a ClassLoader instance.

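In short: whenever we try to load a class, the System ClassLoader first delegates the request to the Extension ClassLoader, which in turn delegates it to the Bootstrap ClassLoader. If the Bootstrap ClassLoader cannot find the class, the request falls back to the Extension ClassLoader and finally to the System ClassLoader. The first loader in that order that finds the class loads it into the JVM.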
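The delegation model can be observed with a small experiment. The sketch below defines a hypothetical RecordingLoader (not part of any library) whose findClass method only records the names it is asked for. Because the inherited loadClass method asks the parent first, findClass is reached only when the whole parent chain has failed.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationDemo {

    /** Hypothetical loader that records the names it is asked to find itself. */
    static class RecordingLoader extends ClassLoader {
        final List<String> findCalls = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findCalls.add(name); // reached only after the parent chain could not load the class
            throw new ClassNotFoundException(name);
        }
    }

    /** Returns the class names that fell through to the child loader. */
    static List<String> namesReachingChild(String... classNames) {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        for (String name : classNames) {
            try {
                loader.loadClass(name); // inherited loadClass delegates to the parent first
            } catch (ClassNotFoundException e) {
                // no loader in the chain could find this class
            }
        }
        return loader.findCalls;
    }

    public static void main(String[] args) {
        // prints [no.such.Type]
        System.out.println(namesReachingChild("java.lang.String", "no.such.Type"));
    }
}
```

java.lang.String never reaches the child loader because a parent resolves it, while the made-up name no.such.Type travels up the whole chain and comes back down.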

1.1 Loading:

  • Bootstrap ClassLoader:
    • The bootstrap class loader loads the core Java libraries located in the /jre/lib directory, specifically rt.jar (this layout applies to Java 8 and earlier; since Java 9 the core classes ship as modules instead).
    • This ClassLoader is written in native code (C/C++).
    • Bootstrap is the root of the hierarchy: it has no parent, and it is the parent of the Extension ClassLoader.
  • Extension ClassLoader:
    • It loads classes from the JDK extensions directory, /jre/lib/ext.
    • Extension ClassLoader is implemented by the sun.misc.Launcher$ExtClassLoader class.
    • System ClassLoader is the child of this ClassLoader.
  • System ClassLoader:
    • This loader is responsible for loading the classes found on the Java classpath.
    • It is implemented by the sun.misc.Launcher$AppClassLoader class.
    • The classpath can be set through an environment variable, for example on Windows:

      set CLASSPATH=D:\myprogram
      java org.mypackage.HelloWorld
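To see this hierarchy on your own machine, you can walk the parent chain starting from the System ClassLoader. This is a minimal sketch with a made-up class name (LoaderChain). The exact loader class names printed depend on the Java version, so no output is shown here. The Bootstrap ClassLoader is represented as null in the Java API, which is why the loop stops there.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {

    /** Lists the loader class names from the System ClassLoader up to the bootstrap loader. */
    static List<String> chain() {
        List<String> names = new ArrayList<>();
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent(); // move one level up the hierarchy
        }
        names.add("bootstrap (represented as null)");
        return names;
    }

    public static void main(String[] args) {
        chain().forEach(System.out::println);
        // Core classes are loaded by the bootstrap loader, so this prints null.
        System.out.println("String loaded by: " + String.class.getClassLoader());
        // The classpath the System ClassLoader searches.
        System.out.println("classpath: " + System.getProperty("java.class.path"));
    }
}
```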

1.2 Linking:

Linking a class or interface involves ‘verifying’ and ‘preparing’ that class or interface, its direct superclass, its direct superinterfaces, and its element type (if it is an array type), if necessary. ‘Resolution’ of symbolic references in the class or interface is an optional part of linking.

  • Verification: Ensures that the binary representation of a class or interface is structurally correct. If an attempt by the JVM to verify a class or interface fails, an instance of LinkageError or one of its subclasses (typically VerifyError) is thrown.
  • Preparation: Creates the static fields of a class or interface and initializes them to their default values (for example 0, false or null).
  • Resolution: It is the process of locating classes, interfaces, fields, and methods referenced symbolically from a type’s constant pool, and replacing those symbolic references with direct references.
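The default values assigned during preparation are normally invisible, but a static initializer that runs "too early" can observe them. In this minimal sketch (the class PreparationDemo and its members are invented for the illustration), the initializer of early runs before the initializer of value, so it still sees the default 0.

```java
class PreparationDemo {
    static int early = peek(); // runs first, in textual order
    static int value = 7;      // has not been assigned yet when peek() runs

    static int peek() {
        return value;          // still 0, the default set during preparation
    }

    public static void main(String[] args) {
        // prints early=0, value=7
        System.out.println("early=" + early + ", value=" + value);
    }
}
```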

1.3 Initialization:

This step readies a class for its first active use. Initialization assigns the static variables their actual values, replacing the default values allotted during preparation.

  • Initialization of a class consists of two steps:
    1. Initializing the class’s direct superclass (if any), if the direct superclass hasn’t already been initialized.
    2. Executing the class’s class initialization method, if it has one.
  • Initialization of an interface does not require initialization of its superinterfaces. Only the interface’s own initialization method is executed, if it has one.
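The difference between loading a class and initializing it, and the superclass-first order, can be made visible with static initializer blocks. The sketch below uses made-up classes (InitOrderDemo, Parent, Child). It first loads Child with Class.forName and initialize set to false, then triggers initialization by reading a static field.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static { LOG.add("Parent initialized"); }
    }

    static class Child extends Parent {
        static int value = 42; // not a compile-time constant, so reading it is an active use
        static { LOG.add("Child initialized"); }
    }

    static List<String> run() {
        try {
            // initialize=false: load the class without running its static initializers
            Class.forName(InitOrderDemo.class.getName() + "$Child", false,
                    InitOrderDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        LOG.add("Child loaded");
        int v = Child.value;   // first active use: Parent is initialized, then Child
        LOG.add("value read: " + v);
        return LOG;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Run once, this prints "Child loaded", "Parent initialized", "Child initialized" and "value read: 42", in that order: nothing is initialized at load time, and the direct superclass is initialized before the class itself.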

2. Runtime Data Area

  • Method Area: It stores per-class structures such as the runtime constant pool, field and method data (including static fields), and the code of methods. The JVM specification describes it as logically part of the heap, although HotSpot keeps it in non-heap memory (Metaspace since Java 8). The Method Area is created during JVM startup and is shared among all threads.
  • Heap Area: This area is used to store objects (class instances) and arrays. The heap is created during JVM startup and is shared among all threads. The JVM throws OutOfMemoryError when an allocation needs more heap memory than can be made available.
    Garbage collection is a built-in JVM feature that reclaims objects that are no longer reachable, so that their heap memory can be reused for new objects.
  • Stack Area: Each thread has its own stack, which works in LIFO (last in, first out) order. Whenever a method is invoked, a new block (a stack frame) is pushed onto the stack; it holds the method’s local variables, including references to objects on the heap. Once the method finishes, its frame is popped and that stack space is reused by later method calls on the same thread.
  • PC Registers: As in operating systems, a PC (program counter) register keeps track of the instruction currently being executed. In the JVM, a PC register is created every time a new thread is created, and it holds a pointer to the bytecode instruction that thread is currently executing.
  • Native Methods: The JVM supports native methods and maintains a separate native method stack for them. When a thread invokes a native method, it enters a new world in which the structures and security restrictions of the Java virtual machine no longer hamper its freedom. Native methods are commonly written in C.
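Two of these areas are easy to poke at from plain Java. The following sketch (class and method names are invented for the illustration) prints the heap limits reported by Runtime and counts how many frames fit on a thread's stack before the JVM throws StackOverflowError. The numbers depend on your JVM settings, so treat them as indicative only.

```java
class RuntimeAreasDemo {
    private static int depth;

    private static void recurse() {
        depth++;   // one more frame on the current thread's stack
        recurse();
    }

    /** Counts nested calls until the current thread's stack is exhausted. */
    static int framesUntilOverflow() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // the stack of this thread is full; the frames are popped as the error unwinds
        }
        return depth;
    }

    public static void main(String[] args) throws InterruptedException {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap bytes:  " + rt.maxMemory());
        System.out.println("heap in use:     " + (rt.totalMemory() - rt.freeMemory()));
        System.out.println("frames (main):   " + framesUntilOverflow());

        // A second thread gets its own, separate stack.
        Thread worker = new Thread(() ->
                System.out.println("frames (worker): " + framesUntilOverflow()));
        worker.start();
        worker.join();
    }
}
```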

3. Execution Engine

The execution engine reads the Java bytecode one instruction at a time. Each bytecode instruction consists of a 1-byte opcode followed by zero or more operands. The execution engine fetches one opcode, executes it with its operands, and then moves on to the next opcode. To do this, the bytecode must first be turned into a form the machine can understand, which can be done in two ways:

  • Interpreter: Reads, interprets and executes the bytecode instructions one by one. Interpreting a single instruction is quick, but executing a whole program this way is slow, because every instruction is interpreted again each time it runs. This is the general disadvantage of interpreted languages, and bytecode basically runs like an interpreted language.
  • JIT (Just-In-Time) compiler: The JIT compiler compensates for the disadvantages of the interpreter. The execution engine runs as an interpreter first and, at the appropriate time, the JIT compiler compiles the bytecode of a whole method into native code. After that, the execution engine no longer interprets the method but executes the native code directly. Executing native code is much faster than interpreting instructions one by one, and because the native code is stored in a cache, the compiled method can be executed quickly on later calls.
  • Garbage Collector: It runs as a daemon thread started by the JVM and frees the heap space taken by objects that are no longer referenced.
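The fetch-and-execute loop of an interpreter can be sketched in a few lines. The toy below is not how a real JVM is implemented; it is a hypothetical ToyInterpreter that understands only four instructions. The opcode values themselves (bipush, iadd, imul, ireturn) are the real ones from the JVM instruction set.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyInterpreter {
    // Real JVM opcode values; everything else in this class is a simplification.
    static final int BIPUSH = 0x10, IADD = 0x60, IMUL = 0x68, IRETURN = 0xAC;

    static int execute(byte[] code) {
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack
        int pc = 0;                                // index of the next instruction
        while (true) {
            int opcode = code[pc++] & 0xFF;        // fetch the 1-byte opcode
            switch (opcode) {
                case BIPUSH:                       // one operand: a signed byte to push
                    stack.push((int) code[pc++]);
                    break;
                case IADD:                         // no operands: pop two ints, push the sum
                    stack.push(stack.pop() + stack.pop());
                    break;
                case IMUL:                         // no operands: pop two ints, push the product
                    stack.push(stack.pop() * stack.pop());
                    break;
                case IRETURN:                      // return the int on top of the stack
                    return stack.pop();
                default:
                    throw new IllegalArgumentException(
                            "unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
    }

    public static void main(String[] args) {
        // (2 + 3) * 4
        byte[] code = { BIPUSH, 2, BIPUSH, 3, IADD, BIPUSH, 4, IMUL, (byte) IRETURN };
        System.out.println(execute(code)); // prints 20
    }
}
```

Every instruction goes through the switch each time it is executed, which is exactly the per-instruction overhead that a JIT compiler removes for frequently used code.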

Q. So how does the JVM decide whether to use the interpreter or the JIT compiler, and when?
A. The JVM decides based on how frequently the code is used. Code that is executed often is compiled by the JIT compiler, because once bytecode has been compiled into native code it is stored in the cache and can be executed quickly.
On the other hand, code that is executed just once is handled by the interpreter, which executes its instructions one by one.
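You can watch this decision happen with a method that is called very often. The sketch below (JitDemo and its methods are made up for the example) runs a small loop many times and uses the standard CompilationMXBean to report how much time the JIT compiler spent while it ran. Whether and when a method gets compiled depends on the JVM and its settings, so the timing figures will differ from machine to machine.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

class JitDemo {

    static long square(long x) {
        return x * x;
    }

    /** Small method that becomes "hot" when it is called many times. */
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += square(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean(); // null if there is no JIT
        boolean timed = jit != null && jit.isCompilationTimeMonitoringSupported();
        long before = timed ? jit.getTotalCompilationTime() : 0;

        long result = 0;
        for (int round = 0; round < 20_000; round++) {
            result = sumOfSquares(1_000); // repeated calls make the method a JIT candidate
        }

        System.out.println("result = " + result); // 332833500
        if (timed) {
            long spent = jit.getTotalCompilationTime() - before;
            System.out.println(jit.getName() + " spent about " + spent + " ms compiling");
        }
    }
}
```

On HotSpot you can also start the program with java -XX:+PrintCompilation JitDemo to get a log line for each method as it is compiled.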
