Come and walk into the JVM

2022-07-06 11:38:00 Geek Yunxi

Understanding the JVM in depth

Every developer who uses Java knows that Java bytecode runs in the JRE (Java Runtime Environment). The JVM is the core component of the JRE: it analyzes and executes Java bytecode. Java programmers can usually develop large applications and class libraries without knowing how the JVM runs. Even so, a solid understanding of the JVM gives you a better grasp of Java and helps you solve problems that look simple but are hard to crack.

In this article I therefore introduce how the JVM works, its internal structure, how Java bytecode is executed and in what order, and some common JVM errors together with their solutions. Finally, I briefly introduce the new features that Java SE 7 brings.

Virtual machine

The JRE consists of the Java API and the JVM. The JVM loads a Java application through the class loader (Class Loader) and executes it using the Java API.

A virtual machine (VM: Virtual Machine) is a software implementation of a machine that executes programs the way a physical machine does. Java was originally designed to run on a virtual machine rather than on a physical machine in order to achieve WORA (Write Once, Run Anywhere), although this goal has been almost forgotten. As a result, the JVM can execute Java bytecode on any hardware without the Java code having to be adapted.

The basic characteristics of the JVM are:

  • Stack-based virtual machine: Popular computer architectures such as Intel x86 and ARM are register-based, whereas the JVM executes on a stack.
  • Symbolic reference: All types other than the primitive types (that is, classes and interfaces) are referred to through symbolic references rather than explicit memory addresses.
  • Garbage collection: An instance of a class is created explicitly by user code but destroyed automatically by garbage collection.
  • Platform independence through clearly defined primitive types: In traditional languages such as C/C++, the size of an int varies from platform to platform. The JVM fixes the size of every primitive type, which keeps code compatible across platforms.
  • Network byte order: The binary representation of a Java class file uses network byte order. To stay platform independent between the little-endian Intel x86 and big-endian platforms such as the RISC series, a fixed byte order has to be defined. The JVM chose the byte order used by network transfer protocols, which is big endian.
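To make the byte-order point concrete, here is a minimal sketch that encodes the same 32-bit value in both byte orders. The class name ByteOrderDemo and the sample value 0x12345678 are my own choices for illustration; only the standard java.nio classes are used.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ByteOrderDemo {
    // Encodes a 32-bit value with the given byte order and returns the bytes as hex text.
    public static String encode(int value, ByteOrder order) {
        byte[] bytes = ByteBuffer.allocate(4).order(order).putInt(value).array();
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        int value = 0x12345678; // arbitrary sample value
        // Big endian (network byte order): most significant byte first.
        System.out.println("big endian:    " + encode(value, ByteOrder.BIG_ENDIAN));
        // Little endian (x86 order): least significant byte first.
        System.out.println("little endian: " + encode(value, ByteOrder.LITTLE_ENDIAN));
    }
}
```

The big-endian line prints 12 34 56 78, which is the order in which a class file stores multi-byte values; the little-endian line prints the same bytes reversed.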

Sun developed the Java language, but anyone can develop and ship a JVM implementation as long as it follows the JVM specification. That is why there are many JVM implementations today, including the Oracle HotSpot JVM and the IBM JVM. The Dalvik VM used by Google is also a kind of JVM implementation, although it does not fully follow the JVM specification. Unlike the stack-based Java virtual machines, the Dalvik VM is register-based, and Java bytecode is converted into the register-based instruction set that the Dalvik VM uses.

Java bytecode

To achieve WORA, the JVM uses Java bytecode, an intermediate language between Java (the user language) and machine language. Java bytecode is the smallest unit in which a Java program is deployed.

Before introducing Java bytecode, let's first take a look at it in practice. The following case is one that I ran into in a real development project.

Symptom

A program that used to work well stopped running after a class library was updated, and threw the following exception:

Exception in thread "main" java.lang.NoSuchMethodError: com.nhn.user.UserAdmin.addUser(Ljava/lang/String;)V
    at com.nhn.service.UserService.add(UserService.java:14)
    at com.nhn.service.UserService.main(UserService.java:19)

The program code is shown below. It was not changed when the class library was updated:

// UserService.java
…
public void add(String userName) {
    admin.addUser(userName);
}

The library source code after and before the update compares as follows:

// UserAdmin.java - Updated library source code
…
public User addUser(String userName) {
    User user = new User(userName);
    User prevUser = userMap.put(userName, user);
    return prevUser;
}
// UserAdmin.java - Original library source code
…
public void addUser(String userName) {
    User user = new User(userName);
    userMap.put(userName, user);
}

In short, addUser() returned void before the update and returns a User instance after it. The program code does not use the return value of addUser(), so it was left unchanged.

At first glance com.nhn.user.UserAdmin.addUser() still exists, so why does a NoSuchMethodError appear?

Problem analysis

The main reason is that the program code was not recompiled against the updated class library. In other words, although the source code appears to call addUser() without caring about its return value, the compiled class file records exactly which return type the called method must have.

The exception message shows this:

java.lang.NoSuchMethodError: com.nhn.user.UserAdmin.addUser(Ljava/lang/String;)V

The NoSuchMethodError was raised because the method "com.nhn.user.UserAdmin.addUser(Ljava/lang/String;)V" could not be found. Look at "Ljava/lang/String;" and the trailing "V". In Java bytecode notation, "L<classname>;" denotes an instance of a class, so this addUser method takes one java/lang/String object as its parameter. The parameter of the library's addUser() method has not changed, so that part matches. The final "V" in the message denotes the return type of the method. In Java bytecode notation, "V" means that the method returns nothing. The message therefore says that a method com.nhn.user.UserAdmin.addUser that takes one java.lang.String parameter and returns nothing could not be found.

Because the program code was compiled against the previous version of the class library, the class file states that a method returning "V" is to be called. After the library changed, the method returning "V" no longer exists. In its place there is a method whose return type is "Lcom/nhn/user/User;". That is why the NoSuchMethodError above occurred.
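The return-type matching described above can be observed without recompiling anything. The sketch below is only an approximation of what the JVM does at link time: it uses java.lang.invoke.MethodHandles, whose findVirtual lookup also matches the full descriptor, including the return type. The nested User and UserAdmin classes are hypothetical stand-ins for the library classes in the case, and DescriptorLookupDemo and resolves are names I made up.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.HashMap;
import java.util.Map;

public class DescriptorLookupDemo {
    // Hypothetical stand-ins for the library classes in the case above.
    public static class User {
        final String name;

        User(String name) {
            this.name = name;
        }
    }

    public static class UserAdmin {
        private final Map<String, User> userMap = new HashMap<>();

        // The "updated" library method: it returns User instead of void.
        public User addUser(String userName) {
            return userMap.put(userName, new User(userName));
        }
    }

    // Tries to resolve addUser(String) with the given return type.
    public static boolean resolves(Class<?> returnType) {
        MethodType type = MethodType.methodType(returnType, String.class);
        try {
            MethodHandles.lookup().findVirtual(UserAdmin.class, "addUser", type);
            return true;
        } catch (NoSuchMethodException | IllegalAccessException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        MethodType oldType = MethodType.methodType(void.class, String.class);
        // The descriptor that the stale caller was compiled against.
        System.out.println("old descriptor: " + oldType.toMethodDescriptorString());
        System.out.println("resolves with void return: " + resolves(void.class));
        System.out.println("resolves with User return: " + resolves(User.class));
    }
}
```

The lookup with a void return type fails even though a method named addUser that takes a String exists, which mirrors the situation of the stale UserService class file.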

Note

The error occurred because the developer did not recompile the program code against the new class library. Even so, the library provider shares the responsibility. The addUser() method, which had no return value, was a public method, and it was later changed to return a User. That is a clear change to the method signature and means the library is not compatible with its previous version, so the library provider should have announced the change in advance.

Let's go back to Java bytecode. Java bytecode is the basic element of the JVM, and the JVM itself is an emulator that executes Java bytecode. The Java compiler does not translate a high-level language into machine language (CPU instructions) the way a C/C++ compiler does. Instead, it translates the Java language that developers understand into the Java bytecode that the JVM understands. Because Java bytecode is platform independent, it can run on any hardware on which a JVM (more precisely, a JRE) is installed, even when the CPU or the operating system differs. A class file developed and compiled on a Windows PC can be executed on a Linux machine without any adjustment. The compiled code is about the same size as the source code, which makes it easy to transfer and execute Java bytecode over a network.

A Java class file is a binary file, so its instructions are hard for us to read directly. To make class files manageable, JVM vendors provide javap, a disassembler. Running javap produces a readable sequence of Java instructions. In the case above, running javap -c on the program code gives the following instruction sequence for the UserService.add() method:

public void add(java.lang.String);
  Code:
   0:   aload_0
   1:   getfield        #15; //Field admin:Lcom/nhn/user/UserAdmin;
   4:   aload_1
   5:   invokevirtual   #23; //Method com/nhn/user/UserAdmin.addUser:(Ljava/lang/String;)V
   8:   return

In these Java instructions, the addUser() method is invoked by the instruction at offset 5, "5: invokevirtual #23". It means that the method at index 23 will be called, and javap annotates the index with the method it refers to. invokevirtual is one of the most frequently used opcodes in Java bytecode and is used to call a method. Java bytecode has four opcodes that invoke a method: invokeinterface, invokespecial, invokestatic, and invokevirtual. Their meanings are as follows:

  • invokeinterface: Invokes an interface method
  • invokespecial: Invokes an initializer, a private method, or a method defined in a superclass
  • invokestatic: Invokes a static method
  • invokevirtual: Invokes an instance method

The Java bytecode instruction set consists of opcodes (OpCode) and operands (Operand). An opcode such as invokevirtual takes a 2-byte operand.

If the program code in the case above is recompiled against the updated class library and the bytecode is disassembled again, the result is as follows:

public void add(java.lang.String);
  Code:
   0:   aload_0
   1:   getfield        #15; //Field admin:Lcom/nhn/user/UserAdmin;
   4:   aload_1
   5:   invokevirtual   #23; //Method com/nhn/user/UserAdmin.addUser:(Ljava/lang/String;)Lcom/nhn/user/User;
   8:   pop
   9:   return

As you can see, the method that #23 refers to is now one whose return type is "Lcom/nhn/user/User;".

In the disassembled output above, what does the number in front of each instruction mean?

It is a byte offset. This is probably why the code executed by the JVM is called "bytecode": opcodes such as aload_0, getfield, and invokevirtual are each represented by a single byte (aload_0 = 0x2a, getfield = 0xb4, invokevirtual = 0xb6). Consequently, Java bytecode can have at most 256 opcodes.

Opcodes such as aload_0 and aload_1 take no operands, so the byte after aload_0 is the opcode of the next instruction. getfield and invokevirtual, however, take a 2-byte operand. The getfield instruction at byte 1 is therefore followed by the next instruction at byte 4, skipping 2 bytes. Viewed in a hex editor, the bytecode looks like this:

  2a b4 00 0f 2b b6 00 17 57 b1
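To check the offsets by hand, the ten bytes above can be decoded with a few lines of Java. This is a toy sketch, not a real disassembler: the class name TinyDisassembler is made up, and it handles only the six opcodes that occur in this method (0x2a aload_0, 0x2b aload_1, 0x57 pop, 0xb1 return, 0xb4 getfield, 0xb6 invokevirtual).

```java
import java.util.ArrayList;
import java.util.List;

public class TinyDisassembler {
    // Decodes only the six opcodes that occur in the add() method above.
    public static List<String> disassemble(int[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0; // byte offset of the current instruction
        while (pc < code.length) {
            int opcode = code[pc];
            switch (opcode) {
                case 0x2a: lines.add(pc + ": aload_0"); pc += 1; break;
                case 0x2b: lines.add(pc + ": aload_1"); pc += 1; break;
                case 0x57: lines.add(pc + ": pop"); pc += 1; break;
                case 0xb1: lines.add(pc + ": return"); pc += 1; break;
                case 0xb4: // opcode followed by a 2-byte constant pool index
                    lines.add(pc + ": getfield #" + index(code, pc));
                    pc += 3;
                    break;
                case 0xb6: // opcode followed by a 2-byte constant pool index
                    lines.add(pc + ": invokevirtual #" + index(code, pc));
                    pc += 3;
                    break;
                default:
                    throw new IllegalArgumentException(
                            "opcode not handled in this sketch: 0x" + Integer.toHexString(opcode));
            }
        }
        return lines;
    }

    // Reads the big-endian 2-byte operand that follows the opcode at pc.
    static int index(int[] code, int pc) {
        return (code[pc + 1] << 8) | code[pc + 2];
    }

    public static void main(String[] args) {
        int[] code = {0x2a, 0xb4, 0x00, 0x0f, 0x2b, 0xb6, 0x00, 0x17, 0x57, 0xb1};
        for (String line : disassemble(code)) {
            System.out.println(line);
        }
    }
}
```

The output reproduces the offsets 0, 1, 4, 5, 8, and 9 and the indexes #15 and #23 from the javap listing of the recompiled class.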

In Java bytecode, a class instance is written as "L<classname>;" and void as "V". Other types have their own notations as well. The following table lists the type notations in Java bytecode.

Table 1: Type notation in Java bytecode

Java bytecode    Type         Description
B                byte         Single byte
C                char         Unicode character
D                double       Double-precision floating point
F                float        Single-precision floating point
I                int          Integer
J                long         Long integer
L<classname>;    reference    Instance of class <classname>
S                short        Short integer
Z                boolean      Boolean
[                reference    One array dimension

Table 2: Examples of Java bytecode notation

Java code                                     Java bytecode notation
double d[][][]                                [[[D
Object mymethod(int i, double d, Thread t)    mymethod(IDLjava/lang/Thread;)Ljava/lang/Object;

Section 4.3 "Descriptors" of The Java Virtual Machine Specification, Second Edition describes this notation in detail, and chapter 6, "The Java Virtual Machine Instruction Set", introduces many more instructions.
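The two notations in Table 2 can be reproduced from running code. A small sketch, assuming only the standard library: Class.getName() returns the bytecode-style name for array classes, and MethodType.toMethodDescriptorString() builds a method descriptor. The class and method names below are my own. Note that a real descriptor has no separators between the parameter types.

```java
import java.lang.invoke.MethodType;

public class DescriptorDemo {
    // Bytecode-style name of a three-dimensional double array.
    public static String arrayDescriptor() {
        return double[][][].class.getName();
    }

    // Descriptor of a method shaped like: Object mymethod(int i, double d, Thread t)
    public static String methodDescriptor() {
        return MethodType.methodType(Object.class, int.class, double.class, Thread.class)
                .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(arrayDescriptor());  // [[[D
        System.out.println(methodDescriptor()); // (IDLjava/lang/Thread;)Ljava/lang/Object;
    }
}
```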

Class file format

Before explaining the class file format, let's look at a problem that often occurs in Java web applications.

Symptom

When a JSP was written and run on Tomcat, the JSP did not execute and the following error appeared:

Servlet.service() for servlet jsp threw exception org.apache.jasper.JasperException: Unable to compile class for JSP Generated servlet error:
The code of method _jspService(HttpServletRequest, HttpServletResponse) is exceeding the 65535 bytes limit

Problem analysis

The message differs slightly from one web application server to another, but the core of it is the same: the 65535-byte limit. The limit is defined by the JVM and means that the body of a single method cannot be larger than 65535 bytes.

I will first introduce the 65535-byte limit and then explain in detail why it exists.

In Java bytecode, the "goto" and "jsr" instructions represent a branch and a jump respectively.

goto [branchbyte1] [branchbyte2]
jsr [branchbyte1] [branchbyte2]

Both instructions are followed by a 2-byte operand, and 2 bytes can express an offset of at most 65535. To support a wider range of branches, Java bytecode additionally defines "goto_w" and "jsr_w", which take a 4-byte branch offset.

goto_w [branchbyte1] [branchbyte2] [branchbyte3] [branchbyte4]
jsr_w [branchbyte1] [branchbyte2] [branchbyte3] [branchbyte4]

With these two instructions, a branch can express an offset far beyond 65535, so branching alone would not limit a Java method to 65535 bytes. Because of various other limits in the class file format, however, a Java method still cannot exceed 65535 bytes. Let's look at the class file format to see the other reasons.

Java The general structure of the class file is as follows :

ClassFile {
    u4 magic;
    u2 minor_version;
    u2 major_version;
    u2 constant_pool_count;
    cp_info constant_pool[constant_pool_count-1];
    u2 access_flags;
    u2 this_class;
    u2 super_class;
    u2 interfaces_count;
    u2 interfaces[interfaces_count];
    u2 fields_count;
    field_info fields[fields_count];
    u2 methods_count;
    method_info methods[methods_count];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}

This structure comes from section 4.1, "The ClassFile Structure", of The Java Virtual Machine Specification, Second Edition.

The first 16 bytes of the UserService.class file discussed earlier look like this in hexadecimal:

ca fe ba be 00 00 00 32 00 28 07 00 02 01 00 1b

Going through these values shows the concrete format of a class file.

  • magic: The first 4 bytes of a class file are the magic number, a predefined value that identifies a Java class file. As shown above, the value is always 0xCAFEBABE. In other words, a file whose first 4 bytes are 0xCAFEBABE can be regarded as a Java class file. "CAFEBABE" is a playful magic number that alludes to "JAVA".
  • minor_version, major_version: The next 4 bytes give the version of the class. As shown above, 0x00000032 means class version 50.0. A class file compiled by JDK 1.6 has version 50.0, and one compiled by JDK 1.5 has version 49.0. A JVM must stay backward compatible, that is, it must still run class files compiled for lower versions. If a class file of a higher version is run on a JVM of a lower version, a java.lang.UnsupportedClassVersionError occurs.
  • constant_pool_count, constant_pool[]: The constant pool information of the class follows the version. When the JVM loads the class file, this information is placed in the runtime constant pool, which is part of the method area. Memory allocation is covered later. The constant_pool_count of the UserService.class file above is 0x0028, so by definition the constant_pool array has (40-1) = 39 entries.
  • access_flags: A 2-byte field holding the modifiers of the class, that is, whether it is public, final, abstract, or an interface.
  • this_class, super_class: The constant_pool indexes of the entries that describe the current class and its superclass.
  • interfaces_count, interfaces[]: interfaces_count is the number of interfaces the class implements, and each entry of interfaces[] is a constant_pool index for one of those interfaces.
  • fields_count, fields[]: The number of fields of the class and the array of field information. Field information includes the field name, type, modifiers, and constant_pool index.
  • methods_count, methods[]: The number of methods of the class and the array of method information. Method information includes the method name, the types and number of parameters, the return type, the modifiers, the constant_pool index, the executable code of the method, and exception information.
  • attributes_count, attributes[]: The attribute_info structure has various attributes, which are used by field_info and method_info respectively.
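The first fields of the structure can be read back from the 16 bytes quoted above. The following is a minimal sketch rather than a class file parser: ClassHeaderReader and parse are names I made up, and the code stops after constant_pool_count instead of walking the constant pool entries.

```java
import java.nio.ByteBuffer;

public class ClassHeaderReader {
    // The first 16 bytes of UserService.class as quoted above.
    public static final byte[] HEADER = {
        (byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe, 0x00, 0x00, 0x00, 0x32,
        0x00, 0x28, 0x07, 0x00, 0x02, 0x01, 0x00, 0x1b
    };

    // Returns {magic, minor_version, major_version, constant_pool_count}.
    public static long[] parse(byte[] bytes) {
        ByteBuffer in = ByteBuffer.wrap(bytes); // ByteBuffer is big endian by default
        long magic = in.getInt() & 0xFFFFFFFFL; // u4
        int minor = in.getShort() & 0xFFFF;     // u2
        int major = in.getShort() & 0xFFFF;     // u2
        int constantPoolCount = in.getShort() & 0xFFFF; // u2
        return new long[] {magic, minor, major, constantPoolCount};
    }

    public static void main(String[] args) {
        long[] h = parse(HEADER);
        System.out.printf("magic=0x%X version=%d.%d constant_pool_count=%d%n",
                h[0], h[2], h[1], h[3]);
    }
}
```

For the bytes above this prints magic=0xCAFEBABE version=50.0 constant_pool_count=40. The remaining six bytes are the start of the constant pool: tag 07 (a class entry pointing at index 2), then tag 01 (a UTF-8 string) with length 0x001b = 27, which is the length of "com/nhn/service/UserService".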

The javap program prints the class file format in a readable form. Analyzing the UserService.class file with "javap -verbose" gives the following output:

Compiled from "UserService.java"

public class com.nhn.service.UserService extends java.lang.Object
  SourceFile: "UserService.java"
  minor version: 0
  major version: 50
  Constant pool:
const #1 = class        #2;     //  com/nhn/service/UserService
const #2 = Asciz        com/nhn/service/UserService;
const #3 = class        #4;     //  java/lang/Object
const #4 = Asciz        java/lang/Object;
const #5 = Asciz        admin;
const #6 = Asciz        Lcom/nhn/user/UserAdmin;;
// … omitted - constant pool continued …

{
// … omitted - method information …

public void add(java.lang.String);
  Code:
   Stack=2, Locals=2, Args_size=2
   0:   aload_0
   1:   getfield        #15; //Field admin:Lcom/nhn/user/UserAdmin;
   4:   aload_1
   5:   invokevirtual   #23; //Method com/nhn/user/UserAdmin.addUser:(Ljava/lang/String;)Lcom/nhn/user/User;
   8:   pop
   9:   return
  LineNumberTable:
   line 14: 0
   line 15: 9
  LocalVariableTable:
   Start  Length  Slot  Name   Signature
   0      10      0    this       Lcom/nhn/service/UserService;
   0      10      1    userName       Ljava/lang/String;
// … Omitted - Other method information …
}

For reasons of length, only part of the output is shown. The full output presents all kinds of information, including the constant pool and the contents of each method.

The 65535-byte limit on method size comes from the method_info structure. As the "javap -verbose" output above shows, method_info contains the code (Code), the line number table (LineNumberTable), and the local variable table (LocalVariableTable). The lengths of the line number table, the local variable table, and the exception table (exception_table) inside the Code attribute are each fixed at 2 bytes. The size of a method therefore cannot exceed what the line number table, local variable table, and exception table can cover, which is 65535 bytes.

Although many people have complained about the method size limit and the JVM specification itself says that the limit may be extended, there has been no clear progress so far. The JVM specification has almost the entire contents of a class file loaded into the method area, so extending the method length would be a serious challenge for backward compatibility.

What happens to a malformed class file produced by a bug in the Java compiler? And what happens if a class file is corrupted while being transferred over a network or copied?

To cope with such cases, the loading process of the Java class loader is designed to be very strict. The JVM specification describes the process in detail.

Note

How can we check that a JVM carries out the whole class file verification process correctly? How can we check that different JVM implementations conform to the JVM specification? For this purpose, Oracle provides a dedicated test tool, the TCK (Technology Compatibility Kit). The TCK verifies conformance to the JVM specification by running a very large number of test cases, including many invalid class files generated in various ways. Only a JVM that passes the TCK may be called a JVM.

Besides the TCK there is the JCP (Java Community Process; http://jcp.org), which reviews new Java technical specifications. To complete a JSR (Java Specification Request) in the JCP, a detailed specification document, a reference implementation (RI), and a TCK must be provided. Users who want to use a new Java technology proposed as a JSR must either obtain a license from the RI provider or implement it themselves and test the implementation with the TCK.

JVM structure

The following figure shows how a Java program is executed:

Figure 1: Java code execution process

The class loader loads Java bytecode into the runtime data area, and the execution engine executes the Java bytecode.

Class loading

Java provides dynamic loading: a class is loaded and linked when it is first referenced at runtime, not at compile time. The JVM's class loader is responsible for this dynamic loading. Java class loaders have the following characteristics:

  • Hierarchy: Class loaders in Java are organized in a hierarchy of parent-child relationships. The Bootstrap class loader is at the top of the hierarchy and is the parent of all class loaders.
  • Delegation model: Based on the hierarchy, loading is delegated between class loaders. When a class needs to be loaded, the parent class loader is asked first whether it has already loaded the class. If the parent has loaded it, that class is used as is and is not loaded again. If not, the current class loader loads the class.
  • Visibility limit: A child class loader can see the classes of its parent class loader, but not the other way around.
  • No unloading: A class loader can load a class but cannot unload it. Classes can be unloaded only by discarding the class loader itself.

Each class loader has its own namespace in which it stores the classes it has loaded. When a class loader needs to load a class, it first checks by FQCN (Fully Qualified Class Name) whether the class already exists in its own namespace. In the JVM, classes with the same FQCN are still considered different classes if they live in the namespaces of two different class loaders. Being in different namespaces means that the classes were loaded by different class loaders.

The following figure illustrates the class loader delegation model:

Figure 2: Class loader delegation model

When the JVM asks a class loader to load a class, the loader always looks for the class in the following order: the class loader cache, the parent class loader, and then itself. That is, it first checks whether the class is already in its cache. If not, it asks the parent class loader. If no loader up to the Bootstrap class loader has the class, the current class loader finds the class file in the file system and loads it.

  • Bootstrap class loader: Created when the JVM starts. It loads the Java APIs, including the Object class. Unlike the other class loaders, which are implemented in Java code, the Bootstrap class loader is implemented in native code.
  • Extension class loader: Loads the extension classes other than the basic Java APIs. It also loads various security extensions.
  • System class loader: Whereas the Bootstrap and Extension class loaders load JVM runtime components, the system class loader loads the application classes. It loads the classes on the user-specified CLASSPATH.
  • User-defined class loader: A class loader created directly by the application code.
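The hierarchy described above can be inspected at runtime by following getParent() from a class's loader. A minimal sketch with made-up names (LoaderChain, chain): the concrete loader class names differ between JDK versions, and the API documents that an implementation may represent the bootstrap class loader as null, which is what the sketch assumes.

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChain {
    // Walks from the loader of the given class up through its parents.
    public static List<String> chain(Class<?> type) {
        List<String> names = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // Assumption: the bootstrap loader is represented as null (as in HotSpot).
        names.add("bootstrap (represented as null)");
        return names;
    }

    public static void main(String[] args) {
        System.out.println("application class: " + chain(LoaderChain.class));
        System.out.println("java.lang.String:  " + chain(String.class));
    }
}
```

An application class typically shows a chain of several loaders, while java.lang.String, as part of the core Java API, reports only the bootstrap entry.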

Frameworks such as web application servers (WAS: Web Application Server) use this so that web applications and enterprise applications run independently, each in its own class loading space. In other words, the class loader delegation model guarantees the independence of applications. Different WAS products customize their class loaders in slightly different ways, but all of them build on the class loader hierarchy.

If a class loader finds a class that has not been loaded yet, the class is loaded and linked as shown in the following figure:

Figure 3: Class loading steps

Each step is described below:

  • Loading: The class is read from a file and loaded into JVM memory.
  • Verifying: Checks that the loaded class conforms to the Java language specification and the JVM specification. This is the most complex and time-consuming step of the class loading process. Most JVM TCK test cases check whether loading an invalid class file produces the expected verification error.
  • Preparing: Prepares a data structure that allocates the memory the class needs and that describes the fields, methods, and interfaces defined in the class.
  • Resolving: Converts all symbolic references in the constant pool of the class into direct references.
  • Initializing: Initializes the class variables to their proper values. Static initializers are executed and static fields are set to their configured values.

The JVM specification defines these tasks but allows flexibility in when they are performed at runtime.
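The last step, initialization, is the easiest one to observe from Java code, because a static initializer runs only when the class is first actively used. The sketch below records the order of events. LazyInitDemo, Helper, and EVENTS are names I made up, and the sketch shows only the timing of initialization, not of loading or verification.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyInitDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Helper {
        // Static initializer: runs once, when Helper is first actively used.
        static {
            EVENTS.add("Helper initialized");
        }

        static void touch() {
            EVENTS.add("Helper.touch()");
        }
    }

    public static List<String> run() {
        EVENTS.add("before first use");
        Helper.touch(); // first active use triggers the static initializer
        return EVENTS;
    }

    public static void main(String[] args) {
        for (String event : run()) {
            System.out.println(event);
        }
    }
}
```

Although Helper is declared in the same source file, "Helper initialized" is recorded after "before first use", because nothing touches Helper until run() calls it.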

Runtime data area


Figure 4: Runtime data area structure

The runtime data area is the memory area that the operating system assigns to the JVM program while it runs. It can be divided into 6 parts: the PC register, the JVM stack, and the native method stack, which are created separately for each thread, and the heap, the method area, and the runtime constant pool, which are shared by all threads.

  • PC register: Each thread has one PC (Program Counter) register, created when the thread starts. The PC register holds the address of the JVM instruction currently being executed.
  • JVM stack: Each thread has one JVM stack, created when the thread starts. The units stored in it are called stack frames (Stack Frame). The JVM only pushes stack frames onto the JVM stack or pops them off it. When an exception is thrown, each line of the stack trace printed by a method such as printStackTrace() represents one stack frame.
    Figure 5: JVM stack structure

    • Stack frame: Whenever a method is executed in the JVM, a stack frame is created for it and pushed onto the JVM stack of the current thread. When the method ends, the stack frame is removed from the JVM stack. A stack frame holds the local variable array, the operand stack, and a reference to the runtime constant pool of the class that the running method belongs to. The sizes of the local variable array and the operand stack are determined at compile time, so the size of a method's stack frame is fixed.
    • Local variable array: Indexing starts at 0. For an instance method, index 0 holds a reference to the instance of the class that the method belongs to. The parameters passed to the method are stored from index 1, and the method's own local variables follow them.
    • Operand stack: The actual working space of a method. Each method moves data between the operand stack and the local variable array, and pushes or pops the results of calling other methods. The compiler can calculate how much space the operand stack needs, so its size is also determined at compile time.
  • Native method stack: A stack for native code written in a language other than Java. In other words, it is used to execute C/C++ code called through JNI (Java Native Interface). Depending on the language, a C stack or a C++ stack is created.

  • Method area: Shared by all threads and created when the JVM starts. It stores the runtime constant pool, field and method information, static variables, and the bytecode of the methods of every class and interface that the JVM has loaded. JVM vendors implement the method area in different ways. In Oracle's HotSpot JVM it is called the Permanent Area or the Permanent Generation (PermGen). The JVM specification does not require garbage collection of the method area, so it is optional for JVM implementers.

  • Runtime constant pool: The memory area that corresponds to the constant_pool table of the class file format. It is part of the method area, but it plays such a central role in JVM operation that the JVM specification describes it separately. In addition to the constants of each class and interface, it holds all references to methods and fields. When a method or field is referenced, the JVM uses the runtime constant pool to look up its actual address in memory.

  • Heap: Stores all class instances and objects, and is the target of garbage collection. When JVM performance tuning is discussed, the heap size settings usually come up. JVM vendors decide how to organize the heap and whether to garbage-collect it.
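The relationship between method calls and stack frames on the JVM stack can be seen by counting stack trace elements. A minimal sketch with made-up names (StackFrameDemo, depth, depthViaHelper): it relies on Throwable.getStackTrace(), and the API notes that a virtual machine may omit frames in some circumstances, so treat the numbers as indicative.

```java
public class StackFrameDemo {
    // Number of frames on the current thread's stack, counted from this method.
    public static int depth() {
        return new Throwable().getStackTrace().length;
    }

    // Same measurement, taken one call deeper.
    public static int depthViaHelper() {
        return depth();
    }

    public static void main(String[] args) {
        System.out.println("direct call:     " + depth());
        System.out.println("via one helper:  " + depthViaHelper());
        // Each printed element corresponds to one stack frame.
        for (StackTraceElement frame : new Throwable().getStackTrace()) {
            System.out.println("  at " + frame);
        }
    }
}
```

Calling through one extra method adds exactly one frame to the count, which matches the description of a frame being pushed for every method invocation and popped when it ends.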

Let's return to the disassembled bytecode discussed earlier:

public void add(java.lang.String);
  Code:
   0:   aload_0
   1:   getfield        #15; //Field admin:Lcom/nhn/user/UserAdmin;
   4:   aload_1
   5:   invokevirtual   #23; //Method com/nhn/user/UserAdmin.addUser:(Ljava/lang/String;)Lcom/nhn/user/User;
   8:   pop
   9:   return

Compare the decompiled bytecode above with our common bytecode based on x86 The difference between the machine code of the architecture , Although they have a similar format 、 opcode , But there is an obvious difference :Java There is no register name in bytecode 、 Memory address or operand offset . As mentioned before ,JVM The stack model is used , So it doesn't need x86 Registers used in the architecture . because JVM Manage memory by yourself , therefore Java Used in bytecode like 15、23 Such an index value rather than a direct memory address . above 15 and 23 Points to the location in the constant pool of the current class ( namely UserService class ). That is to say JVM Create a constant pool for each class , And store the reference of the real object in the constant pool .

The explanation of each line of code above is as follows :

  • aload_0: Pushes element 0 of the local variable array onto the operand stack. In an instance method, element 0 of the local variable array is always this, the reference to the current instance.
  • getfield #15: Pushes the field referenced by entry 15 of the current class's constant pool onto the operand stack. Entry 15 refers to the UserAdmin admin field. Because admin is a class instance, its reference is what gets pushed. (The field is read from the object whose reference aload_0 pushed, and that reference is consumed in the process.)
  • aload_1: Pushes element 1 of the local variable array onto the operand stack. Method parameters are stored in the local variable array starting at element 1, so the reference to the String userName argument passed to add() is pushed.
  • invokevirtual #23: Calls the method referenced by entry 23 of the current class's constant pool. The references pushed by getfield #15 and aload_1 are passed to the call. When the call completes, its return value is pushed onto the operand stack.
  • pop: Pops the return value of the invokevirtual call from the operand stack. As described earlier, the code compiled against the previous version of the class library had no return value, so no pop was needed there.
  • return: Completes the method.
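For orientation, Java source along the following lines would compile to the six instructions just described. This is a reconstruction, not the article's original listing: the class, field, and method names come from the bytecode comments, while the bodies of `User` and `UserAdmin` (including the `users` list) are stubs assumed only so that the example runs.

```java
import java.util.ArrayList;
import java.util.List;

class UserService {
    final UserAdmin admin = new UserAdmin();

    // aload_0, getfield (admin), aload_1, invokevirtual (addUser), pop, return
    public void add(String userName) {
        admin.addUser(userName); // the returned User is discarded, hence the pop
    }

    public static void main(String[] args) {
        UserService service = new UserService();
        service.add("alice");
        System.out.println(service.admin.users.size()); // prints 1
    }
}

// Stub types: only the names and the addUser signature are taken from the bytecode.
class UserAdmin {
    final List<User> users = new ArrayList<>();

    User addUser(String userName) {
        User user = new User(userName);
        users.add(user);
        return user;
    }
}

class User {
    final String name;

    User(String name) {
        this.name = name;
    }
}
```

Compiling this with `javac` and running `javap -c UserService` should show the same instruction sequence for `add`, although the constant pool indexes (15 and 23 in the listing above) and the package names will most likely differ.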

The following figure should make the explanation above easier to follow:


Figure 6: Example of Java bytecode loaded in the runtime data area

In this example, the values in the local variable array never change, so the figure above shows only the changes to the operand stack. In most real methods, however, the local variable array changes as well. Data moves between the local variable array and the operand stack through load instructions (aload, iload) and store instructions (astore, istore).
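The interplay between the local variable array and the operand stack can be sketched as a toy stack machine. This is only an illustration: the opcode constants below are invented for the sketch and are not real JVM opcode encodings, and a real JVM works on typed stack frames rather than a `Deque<Integer>`.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class StackMachineSketch {
    // Hypothetical opcodes that mimic iload, istore, iadd and return.
    static final int ILOAD = 0, ISTORE = 1, IADD = 2, RETURN = 3;

    /** Runs the code against the given local variable array and returns that array. */
    static int[] run(int[] code, int[] locals) {
        Deque<Integer> operandStack = new ArrayDeque<>();
        int pc = 0; // index of the next instruction to read
        while (true) {
            int opcode = code[pc++];
            switch (opcode) {
                case ILOAD:  // local variable -> operand stack
                    operandStack.push(locals[code[pc++]]);
                    break;
                case ISTORE: // operand stack -> local variable
                    locals[code[pc++]] = operandStack.pop();
                    break;
                case IADD:   // pop two values, push their sum
                    operandStack.push(operandStack.pop() + operandStack.pop());
                    break;
                case RETURN:
                    return locals;
                default:
                    throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        int[] locals = {2, 3, 0};
        // Equivalent to: locals[2] = locals[0] + locals[1]
        int[] code = {ILOAD, 0, ILOAD, 1, IADD, ISTORE, 2, RETURN};
        run(code, locals);
        System.out.println(locals[2]); // prints 5
    }
}
```

Note that no instruction names a register or a memory address: every operand is either an index into the local variable array or an implicit position on the operand stack.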

This section has walked through the runtime constant pool and the JVM stack. While the JVM is running, every class instance is allocated on the heap, and class information (including User, UserAdmin, UserService, and String) is stored in the method area.

Execution engine

The bytecode that the class loader has loaded into the runtime data area is executed by the execution engine. The execution engine reads Java bytecode one instruction at a time, much as a CPU executes machine instructions one by one. Each bytecode instruction consists of a one-byte opcode and optional operands. The execution engine reads an opcode, executes it with its operands, and then reads and executes the next one.

However, Java bytecode is written in a form that humans can still read, unlike the language a machine executes directly. The JVM's execution engine must therefore convert bytecode into instructions the machine can execute. It commonly does this in one of two ways:

  • Interpreter: Reads, interprets, and executes bytecode instructions one by one. Because it handles one instruction at a time, it can interpret each bytecode quickly, but executing the interpreted result is slow. All interpreted languages share this disadvantage. The "language" called bytecode is, at its core, run by an interpreter.
  • JIT (Just-In-Time) compiler: The JIT compiler was introduced to make up for the interpreter's shortcomings. The execution engine first runs as an interpreter, and at an appropriate time the JIT compiler compiles the whole bytecode of a method into native code. From then on, the execution engine no longer interprets that method but runs the native code directly. Executing native code is much faster than interpreting instructions one by one, and because the native code is kept in a cache, the compiled code can be run again quickly.

However, compiling code with the JIT compiler takes longer than interpreting the instructions one by one. If code is executed only once, interpreting it is likely to perform better. For this reason, a JVM with a JIT compiler tracks how often each method is executed and compiles only the methods whose frequency exceeds a certain level.
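The frequency check can be pictured with a toy model. The sketch below is not how any real JVM is implemented: the class `HotMethodSketch`, the threshold of 3, and the "interpreted"/"native" labels are all assumptions chosen to show the idea of counting invocations and switching execution mode once a method becomes hot.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class HotMethodSketch {
    // Assumed threshold for this sketch; real JVMs use their own values and heuristics.
    static final int COMPILE_THRESHOLD = 3;

    private final Map<String, Integer> invocationCounts = new HashMap<>();
    private final Set<String> compiledMethods = new HashSet<>();

    /** Simulates one call and reports how that call was executed. */
    String invoke(String methodName) {
        if (compiledMethods.contains(methodName)) {
            return "native"; // cached native code is reused
        }
        int count = invocationCounts.merge(methodName, 1, Integer::sum);
        if (count >= COMPILE_THRESHOLD) {
            compiledMethods.add(methodName); // "compile" for all later calls
        }
        return "interpreted";
    }

    public static void main(String[] args) {
        HotMethodSketch engine = new HotMethodSketch();
        for (int i = 1; i <= 5; i++) {
            // Calls 1 to 3 are interpreted, calls 4 and 5 use the compiled code.
            System.out.println("call " + i + ": " + engine.invoke("add"));
        }
    }
}
```

A method that is called only once or twice never pays the compilation cost in this model, which is the trade-off the paragraph above describes.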


Figure 7: Java compiler and JIT compiler

The JVM specification does not constrain how the execution engine must operate. JVM implementations therefore use a variety of techniques in their execution engines and have introduced different kinds of JIT compilers to improve performance.

Most JIT compilers run as shown in the following figure:


Figure 8: JIT compiler

The JIT compiler first converts the bytecode into an intermediate representation (IR), optimizes it, and then converts the IR into native code.

The JIT compiler used by the Oracle Hotspot VM is called the Hotspot compiler. It gets its name because it uses profiling to find the "hot spots", the code that most deserves to be compiled, and then compiles those hot spots into native code. If a compiled method is no longer called frequently, that is, if it is no longer a hot spot, the Hotspot VM removes its native code from the cache and runs it in interpreter mode again. The Hotspot VM comes in two variants, the Server VM and the Client VM, and the two use different JIT compilers.


Figure 9: Hotspot Client VM and Server VM

The Client VM and the Server VM use the same runtime environment, as the figure above shows. They differ in the JIT compiler they use. The Server VM achieves better performance by applying more, and more complex, optimization techniques.
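To see which JIT compiler a given JVM reports, the standard `java.lang.management` API can be queried. The following is a small sketch; the class name `JitInfoSketch` and the helper `describeJit()` are invented here, and the text that gets printed depends entirely on the JVM in use.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

final class JitInfoSketch {
    /** Describes the JIT compiler of the running JVM, if it reports one. */
    static String describeJit() {
        // Returns null when the JVM has no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            return "no JIT compiler reported";
        }
        String description = "JIT compiler: " + jit.getName();
        if (jit.isCompilationTimeMonitoringSupported()) {
            // Approximate accumulated time spent compiling, in milliseconds.
            description += ", total compilation time: "
                    + jit.getTotalCompilationTime() + " ms";
        }
        return description;
    }

    public static void main(String[] args) {
        System.out.println(describeJit());
    }
}
```

Running the sketch on different JVMs, or on the same JVM with different VM options, is one way to observe that the choice of JIT compiler is a property of the implementation rather than of the Java program.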

The IBM VM introduced AOT (Ahead-Of-Time) compiler technology in IBM JDK 6. With this technology, multiple JVMs can share compiled native code through a shared cache. In other words, code compiled by the AOT compiler can be used directly by another JVM without recompilation. In addition, the IBM JVM offers a fast way to execute code by using the AOT compiler to precompile it into the JXE (Java Executable) file format.

Most Java performance improvements come from improving the execution engine. Optimization techniques, JIT compilation among them, keep being introduced, and JVM performance keeps improving as a result. The biggest difference between the earliest JVMs and the latest ones lies in the execution engine.

The Hotspot compiler has been part of the Oracle Hotspot VM since Java 1.3, and a JIT compiler has been part of the Android Dalvik VM since Android 2.2.

Note

Other languages that compile to an intermediate language similar to bytecode work the same way: their VM executes the intermediate code much as the JVM executes bytecode, and it uses techniques such as JIT compilation to run faster. For Microsoft's .NET languages, the runtime VM is called the CLR (Common Language Runtime). The CLR executes CIL (Common Intermediate Language), a bytecode-like language, and provides both an AOT compiler and a JIT compiler. So if you write source code in C# or VB.NET, the compiler translates it into CIL, and the CLR executes the CIL using its JIT compiler. The CLR also has garbage collection and, like the JVM, runs as a stack-based machine.

Conclusion

You can use Java without knowing how it was built, and many programmers have developed great applications and libraries without studying the JVM in depth. But if you understand the JVM, you will understand Java more deeply, and that understanding helps when solving problems like the case discussed in this article.

Beyond what is described here, the JVM has many more features and technical details. The JVM specification leaves JVM developers a good deal of flexibility, so that they can apply a variety of techniques to build better-performing implementations. Garbage collection, in particular, is used by most programming languages that offer VM-like capabilities and is a widely used way to improve performance. Because plenty of detailed material about it already exists, it is not covered in depth here.

Original author: Se Hoon Park, Messaging Platform Development Team, NHN Corporation.

Original site

Copyright notice
This article was created by [Geek Yunxi]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/187/202207060912541730.html