
Bytecode structure of the class file

2022-07-07 00:46:00 Java programmer Zhou Yu


One. Brief introduction

Write a simple Demo.java program as follows:

package com.lijiankun24.classpractice;

public class Demo {

    private int m;

    public int inc() {
        return m + 1;
    }
}

Compile Demo.java with the javac command to generate the Demo.class file:

$ javac Demo.java

Then open the generated Demo.class file in an editor that can display hexadecimal, as shown below
 [Image: hexadecimal view of Demo.class]

As you can see, the file is displayed as a sequence of hexadecimal digits. This long sequence of bytes follows the format defined by the Java Virtual Machine Specification
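If no hex editor is at hand, a similar view can be produced with a few lines of Java. The following is only a minimal sketch: the class name HexDump and the default path Demo.class are assumptions made for illustration, not part of the original article.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

final class HexDump {
    /** Formats bytes as rows of 16 two-digit hex values, each row prefixed by its offset. */
    static String dump(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i % 16 == 0) {
                if (i > 0) {
                    sb.append('\n');
                }
                sb.append(String.format("%08x:", i));
            }
            // Mask with 0xFF because Java bytes are signed.
            sb.append(String.format(" %02x", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: Demo.class is in the current directory unless a path is given.
        String path = args.length > 0 ? args[0] : "Demo.class";
        System.out.println(dump(Files.readAllBytes(Paths.get(path))));
    }
}
```

Run against a compiled class file, the first row should begin with ca fe ba be, the magic number discussed later.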

Two. Java Virtual Machine Specification

The Java Virtual Machine Specification defines the structure of the Java virtual machine, the class file format, the bytecode instructions, and so on. You can refer to the copy of the Java Virtual Machine Specification on GitHub

2.1 Java virtual machine

  1. The Java virtual machine has two major features: platform independence and language independence. This article focuses on the foundation of language independence: the .class file structure

  2. The Java virtual machine is a virtual computer. Like a real computer, it has its own complete hardware architecture, such as a processor, a stack and registers, together with a corresponding instruction set. The main difference is that the processor and memory of the virtual machine are simulated in software, whereas those of a real computer are physical

  3. The Java Virtual Machine Specification describes the overall architecture of the Java virtual machine, its memory areas, garbage collection, the .class file structure, the class loading mechanism and the instruction set. This article covers the .class file structure; the other topics can be found in relevant books and in the specification itself

2.2 Class file structure

A .class file is a binary stream whose basic unit is the 8-bit byte. The data items are arranged in the file in strict order and compactly, with no separators in between, so almost everything stored in a .class file is data the program needs, with no gaps

  1. A .class file stores data in a pseudo-structure similar to a C struct. It contains only two kinds of data: unsigned numbers and tables

  2. Unsigned numbers are the most basic data type. u1, u2, u4 and u8 denote unsigned numbers of 1, 2, 4 and 8 bytes respectively. They can describe numbers, index references, quantity values, or string values encoded in UTF-8

  3. A table is a composite data structure made up of unsigned numbers or other tables. By convention, table names end in "_info"

  4. The .class file also has the concept of a collection. A collection is a group of data items of the same type, generally a leading counter followed by several consecutive items of that type. The counter gives the number of items in the collection, and the items hold the actual data

  5. The whole .class file is essentially one table, made up of the data items shown in the table below
     [Image: table of the data items in a .class file]

  6. The table above can be divided into the following seven parts:

  • Magic number and class file version
  • Constant pool
  • Access flags
  • Class index, parent index and interface indexes
  • Field table collection
  • Method table collection
  • Attribute table collection
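The unsigned number types and the counter-plus-items layout of a collection described above can be sketched in Java as follows. This is an illustration under stated assumptions: the class name Unsigned and its method names are hypothetical, and values are read big-endian (most significant byte first), which is the byte order the class file format uses.

```java
final class Unsigned {
    /** Reads a 1-byte unsigned number. */
    static int u1(byte[] b, int off) {
        return b[off] & 0xFF;
    }

    /** Reads a 2-byte unsigned number, most significant byte first. */
    static int u2(byte[] b, int off) {
        return (u1(b, off) << 8) | u1(b, off + 1);
    }

    /** Reads a 4-byte unsigned number; a long is used because Java's int is signed. */
    static long u4(byte[] b, int off) {
        return ((long) u2(b, off) << 16) | u2(b, off + 2);
    }

    /** Reads a collection of u2 items: a leading u2 counter followed by that many u2 values. */
    static int[] u2Collection(byte[] b, int off) {
        int count = u2(b, off);
        int[] items = new int[count];
        for (int i = 0; i < count; i++) {
            items[i] = u2(b, off + 2 + 2 * i);
        }
        return items;
    }
}
```

For example, Unsigned.u4 applied to the first four bytes of a class file yields 0xCAFEBABE, and u2Collection matches the shape of the interface index collection described in section 3.4.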

Three. The class file

We use Demo.class as an example to explain the seven parts of a .class file

3.1 Magic number and class file version

3.1.1 Concept introduction

Four points need to be introduced for the magic number and the class file version:

  1. Magic number: bytes 1 - 4 of the .class file. Its only purpose is to identify whether the file is a class file that the virtual machine can accept. Its fixed value is 0xCAFEBABE ("cafe babe"). If the magic number of a class file is not 0xCAFEBABE, the virtual machine refuses to run it

  2. Minor version number (minor version): bytes 5 - 6 of the .class file, the minor version number written by the JDK that compiled the .class file

  3. Major version number (major version): bytes 7 - 8 of the .class file, the major version number written by the JDK that compiled the .class file

  4. Note: a newer JDK is backward compatible with .class files from older versions, but a virtual machine cannot run a .class file of a newer version than its own. For example, a .class file compiled with JDK 1.5 can be run by a JDK 1.7 virtual machine, but not by a JDK 1.4 virtual machine. The minor and major version numbers of each JDK release are shown in the table below

 [Image: table of JDK releases and their minor and major version numbers]

3.1.2 Example

In the Demo.class file above, the magic number is 0xCAFEBABE, the minor version is 0x0000 and the major version is 0x0034 (decimal 52), so Demo.class was compiled with JDK 1.8
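The check described in this example can be sketched in Java. The class name ClassHeader and its members are hypothetical names chosen for illustration. The mapping "major version minus 44" is an assumption that matches the well-known values (48 for JDK 1.4, 49 for JDK 1.5, 51 for JDK 1.7, 52 for JDK 1.8).

```java
import java.nio.ByteBuffer;

final class ClassHeader {
    final int minor;
    final int major;

    private ClassHeader(int minor, int major) {
        this.minor = minor;
        this.major = major;
    }

    /** Reads the first 8 bytes of a class file: magic number, minor version, major version. */
    static ClassHeader parse(byte[] classBytes) {
        ByteBuffer in = ByteBuffer.wrap(classBytes); // ByteBuffer is big-endian by default
        if (in.getInt() != 0xCAFEBABE) {             // bytes 1 - 4
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = in.getShort() & 0xFFFF;          // bytes 5 - 6
        int major = in.getShort() & 0xFFFF;          // bytes 7 - 8
        return new ClassHeader(minor, major);
    }

    /** Returns N for a class file targeting JDK 1.N, e.g. 8 for major version 52. */
    int jdkRelease() {
        return major - 44;
    }

    /** A virtual machine can run class files up to its own major version, not newer ones. */
    boolean runnableOn(int vmMajorVersion) {
        return major <= vmMajorVersion;
    }
}
```

With the header of Demo.class, parse reports major version 52, jdkRelease() gives 8, and runnableOn(51) is false, which mirrors the compatibility rule in point 4 of section 3.1.1.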

3.2 Constant pool

3.2.1 Concept introduction

Immediately after the version numbers comes the constant pool. The constant pool can be thought of as the resource repository of the class file, and it is usually one of the data items that takes up the most space in a class file.

The constant pool is a collection made up of two parts: the constant pool counter and the constant pool itself

  1. Constant pool counter (constant_pool_count): a u2 unsigned number

  2. Constant pool (constant_pool): immediately after the counter comes the constant pool content of this .class file. The data stored in the constant pool falls into two kinds: literals and symbolic references

  • Literals: text strings and constant values declared as final
  • Symbolic references: a concept closer to compiler theory, covering three kinds of constants: 1) fully qualified names of classes and interfaces, 2) names and descriptors of fields, 3) names and descriptors of methods

  3. There are 14 types of constant in the constant pool. Each constant is a table with its own structure. What the 14 types have in common is that each begins with a u1 tag (see the table below for the values) indicating which constant type it is
 [Image: table of constant pool tag values]

3.2.2 Example

In the Demo.class file above, the constant pool starts at offset 0x0008.

  1. First comes the constant pool counter (constant_pool_count). Its value is 0x0013 (decimal 19). Constant pool indexes start at 1 rather than 0, so Demo.class has 18 constants in total

  2. cp_info constant_pool[1]: offset 0x000A, content 0x0A0004000F. The tag 0x0A indicates a CONSTANT_Methodref_info constant. 0x0004 is an index pointing to constant 4 in the constant pool (the class that declares the method); 0x000F is an index pointing to constant 15 (the name and descriptor of the method). The structure of CONSTANT_Methodref_info is as follows:

 [Image: structure of CONSTANT_Methodref_info]

  3. cp_info constant_pool[2]: offset 0x000F, content 0x0900030010. The tag 0x09 indicates a CONSTANT_Fieldref_info constant. 0x0003 is an index pointing to constant 3 in the constant pool; 0x0010 is an index pointing to constant 16. The structure of CONSTANT_Fieldref_info is as follows:

 [Image: structure of CONSTANT_Fieldref_info]

  4. cp_info constant_pool[3]: offset 0x0014, content 0x070011. The tag 0x07 indicates a CONSTANT_Class_info constant, and the index 0x0011 points to constant 17 in the constant pool. The structure of CONSTANT_Class_info is as follows:

 [Image: structure of CONSTANT_Class_info]

  5. cp_info constant_pool[4]: offset 0x0017, content 0x070012. The tag 0x07 indicates a CONSTANT_Class_info constant, and the index 0x0012 points to constant 18 in the constant pool.

  6. cp_info constant_pool[5]: offset 0x001A, content 0x0100016D. The tag 0x01 indicates a CONSTANT_Utf8_info constant. 0x0001 is the number of bytes occupied by the UTF-8 encoded string; 0x6D is the content of that 1-byte string: m. The structure of CONSTANT_Utf8_info is as follows:

 [Image: structure of CONSTANT_Utf8_info]

  7. cp_info constant_pool[6]: offset 0x001E, content 0x01000149. The tag 0x01 indicates a CONSTANT_Utf8_info constant. 0x0001 is the number of bytes occupied by the UTF-8 encoded string; 0x49 is the content of that 1-byte string: I.

  8. cp_info constant_pool[7]: offset 0x0022, content 0x0100063C696E69743E. The tag 0x01 indicates a CONSTANT_Utf8_info constant. 0x0006 means the string is 6 bytes long; 0x3C696E69743E is the content of that 6-byte string: <init>.

We have analysed 7 constants above; the remaining ones are similar. From the leading u1 tag you know the type and table structure of a constant, and therefore its length and meaning. You can also view the contents of the .class file with the "javap -verbose" command, as shown in the figure below:
 [Image: javap -verbose output showing the constant pool of Demo.class]
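The walk through the first seven constants can be reproduced with a short Java sketch. The class name ConstantPoolWalker and the output format are hypothetical. DEMO_HEAD holds only the bytes quoted in this section (the 8-byte header, the counter and constants 1 to 7), so it is an excerpt of Demo.class rather than the full file. The entry sizes per tag follow the class file format as I understand it for the 14 constant types, and Utf8 entries are decoded as plain UTF-8, which is an approximation that holds for ASCII names like these.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ConstantPoolWalker {
    // Bytes quoted in the text: header, constant_pool_count, constants 1 to 7.
    static final byte[] DEMO_HEAD = {
        (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x34,
        0x00, 0x13,
        0x0A, 0x00, 0x04, 0x00, 0x0F,
        0x09, 0x00, 0x03, 0x00, 0x10,
        0x07, 0x00, 0x11,
        0x07, 0x00, 0x12,
        0x01, 0x00, 0x01, 0x6D,
        0x01, 0x00, 0x01, 0x49,
        0x01, 0x00, 0x06, 0x3C, 0x69, 0x6E, 0x69, 0x74, 0x3E
    };

    /** Returns one line per entry: index, offset, tag and, for Utf8 entries, the text. */
    static List<String> walk(byte[] b, int poolOffset, int entriesToRead) {
        List<String> out = new ArrayList<>();
        int off = poolOffset + 2; // skip the u2 constant_pool_count
        for (int i = 1; i <= entriesToRead; i++) {
            int tag = b[off] & 0xFF;
            int size;
            String text = "";
            switch (tag) {
                case 1: { // CONSTANT_Utf8_info: tag, u2 length, then the bytes
                    int len = ((b[off + 1] & 0xFF) << 8) | (b[off + 2] & 0xFF);
                    text = " " + new String(b, off + 3, len, StandardCharsets.UTF_8);
                    size = 3 + len;
                    break;
                }
                case 7: case 8: case 16: // Class, String, MethodType: tag + one u2
                    size = 3;
                    break;
                case 15: // MethodHandle: tag + u1 + u2
                    size = 4;
                    break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 18:
                    size = 5; // Integer, Float, the three *ref types, NameAndType, InvokeDynamic
                    break;
                case 5: case 6: // Long, Double: tag + 8 bytes
                    size = 9;
                    break;
                default:
                    throw new IllegalArgumentException("unknown tag " + tag);
            }
            out.add(String.format("#%d @0x%04X tag=%d%s", i, off, tag, text));
            off += size;
            if (tag == 5 || tag == 6) {
                i++; // Long and Double occupy two constant pool indexes
            }
        }
        return out;
    }
}
```

Calling ConstantPoolWalker.walk(ConstantPoolWalker.DEMO_HEAD, 8, 7) reports the same offsets as the manual analysis: 0x000A, 0x000F, 0x0014, 0x0017, 0x001A, 0x001E and 0x0022.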

3.3 Access flags

3.3.1 Concept introduction

The constant pool is followed by a u2 access flag field (access_flags), which identifies access information at the class or interface level: whether this Class is a class or an interface, whether it is declared public, whether it is declared abstract, and, if it is a class, whether it is declared final. See the table below for the flag values and their meanings

 [Image: table of access flag values and meanings]

3.3.2 Example

In Demo.class the access flags are 0x0021. The table above has no entry with the value 0x0021, because the access flags stored in the bytecode file are formed by combining several flags from the table with a bitwise OR. ACC_SUPER (0x0020) and ACC_PUBLIC (0x0001) combine to 0x0021, which means the class is public and uses the newer semantics of the invokespecial bytecode instruction
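The bitwise OR combination, and the reverse step of splitting a stored value back into individual flags, can be sketched like this. The class name ClassAccessFlags is hypothetical; the constants are the class-level flag values I know from the class file format, which should be checked against the table above.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassAccessFlags {
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_FINAL = 0x0010;
    static final int ACC_SUPER = 0x0020;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT = 0x0400;
    static final int ACC_SYNTHETIC = 0x1000;
    static final int ACC_ANNOTATION = 0x2000;
    static final int ACC_ENUM = 0x4000;

    private static final int[] VALUES = {
        ACC_PUBLIC, ACC_FINAL, ACC_SUPER, ACC_INTERFACE,
        ACC_ABSTRACT, ACC_SYNTHETIC, ACC_ANNOTATION, ACC_ENUM
    };
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE",
        "ACC_ABSTRACT", "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM"
    };

    /** Splits a combined access_flags value into the names of the flags that are set. */
    static List<String> decode(int flags) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < VALUES.length; i++) {
            if ((flags & VALUES[i]) != 0) {
                names.add(NAMES[i]);
            }
        }
        return names;
    }
}
```

Here ACC_PUBLIC | ACC_SUPER evaluates to 0x0021, and decode(0x0021) returns the two flag names in table order.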

3.4 Class index, parent index and interface indexes

3.4.1 Concept introduction

These three items in the .class file determine the inheritance relationships of the class.

  1. Class index: a u2 value used to determine the fully qualified name of this class.

  2. Parent index: a u2 value used to determine the fully qualified name of the parent class of this class.

  3. Interface indexes: a collection of u2 values describing which interfaces the class implements. The implemented interfaces are stored in the order in which they appear after the implements keyword, from left to right. The collection has two parts. The first is the interface counter (interfaces_count), a u2 value. The second is the interface index table, which immediately follows the counter and holds the interface information. If a class implements no interfaces, the counter is 0 and the interface index table takes up no bytes.

3.4.2 Example

In Demo.class, the class index, parent index and interface indexes are as follows:

  1. Class index: offset 0x00B3, content 0x0003. It points to constant 3 in the constant pool, a CONSTANT_Class_info. Constant 3 in turn points to constant 17, a UTF-8 encoded string whose value is com/lijiankun24/classpractice/Demo, the fully qualified name of this class

  2. Parent index: offset 0x00B5, content 0x0004. It points to constant 4 in the constant pool, a CONSTANT_Class_info. Constant 4 in turn points to constant 18, whose value is java/lang/Object, the fully qualified name of the parent class

  3. Interface indexes: offset 0x00B7, content 0x0000. Because Demo does not implement any interfaces, the interface counter is 0 and no interface indexes follow.
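The two-step lookup used above (class index to CONSTANT_Class_info, then to the UTF-8 string) can be sketched as follows. This is a simplified model, not a parser: the class name ClassNameResolver and the two maps standing in for the constant pool are assumptions for illustration only.

```java
import java.util.Map;

final class ClassNameResolver {
    /**
     * Resolves a class index to its name.
     * classInfos maps the index of each CONSTANT_Class_info to its name index;
     * utf8s maps the index of each CONSTANT_Utf8_info to its string value.
     */
    static String resolve(Map<Integer, Integer> classInfos, Map<Integer, String> utf8s, int classIndex) {
        Integer nameIndex = classInfos.get(classIndex);
        if (nameIndex == null) {
            throw new IllegalArgumentException("constant " + classIndex + " is not a CONSTANT_Class_info");
        }
        String name = utf8s.get(nameIndex);
        if (name == null) {
            throw new IllegalArgumentException("constant " + nameIndex + " is not a CONSTANT_Utf8_info");
        }
        return name;
    }
}
```

With the entries from this example (3 pointing to 17, 4 pointing to 18), resolving index 3 gives com/lijiankun24/classpractice/Demo and resolving index 4 gives java/lang/Object.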

3.5 Field table collection

3.5.1 Concept introduction

The field table collection describes the variables declared in an interface or class. The fields here include class-level variables (declared static) and instance-level variables (not declared static), but not local variables declared inside methods.

The field table collection has two parts: the field counter and the field table. The counter gives the number of fields. Each field in the table is represented by a field_info structure, whose layout is as follows:

 [Image: structure of field_info]
The fixed data items of the field table end at descriptor_index. After descriptor_index comes an attribute table collection that stores additional information; a field can describe zero or more extra items there. The attribute table collection has two parts: the attribute counter and the attribute table.

The field table collection does not list fields inherited from the parent class, but it may contain fields that are not declared in the Java code. For example, to let an inner class access its outer class, the compiler automatically adds a field that points to the outer class instance

3.5.2 Example

In Demo.class the field table collection starts at offset 0x00B9, and its content is 0x0001 0002 0005 0006 0000.

  1. 0x0001 is the field counter: there is only 1 field

  2. field_info fields[0]: offset 0x00BB, content 0x0002 0005 0006 0000. Analysing this data according to the structure of the field table:

  • 0x0002 is the access flags of this field; 0x0002 means it is private
  • 0x0005 is the name index of the field. It points to constant 5 in the constant pool, the UTF-8 string m
  • 0x0006 is the descriptor index of the field. It points to constant 6 in the constant pool, the UTF-8 string I. The descriptor I denotes a field of type int
  • 0x0000 is the attribute counter of the field m. It is 0, so this field has no additional attribute information.
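The four u2 values of this field_info entry, and the meaning of a descriptor such as I, can be sketched in Java. The class name FieldInfoSketch and its members are hypothetical. The descriptor letters are the standard field descriptor codes as I know them (B, C, D, F, I, J, S, Z, L...; for classes and [ for arrays).

```java
import java.nio.ByteBuffer;

final class FieldInfoSketch {
    final int accessFlags;
    final int nameIndex;
    final int descriptorIndex;
    final int attributesCount;

    private FieldInfoSketch(int accessFlags, int nameIndex, int descriptorIndex, int attributesCount) {
        this.accessFlags = accessFlags;
        this.nameIndex = nameIndex;
        this.descriptorIndex = descriptorIndex;
        this.attributesCount = attributesCount;
    }

    /** Reads the four fixed u2 items of a field_info starting at the given offset. */
    static FieldInfoSketch parse(byte[] b, int off) {
        ByteBuffer in = ByteBuffer.wrap(b, off, 8);
        return new FieldInfoSketch(
                in.getShort() & 0xFFFF, in.getShort() & 0xFFFF,
                in.getShort() & 0xFFFF, in.getShort() & 0xFFFF);
    }

    /** Converts a field descriptor such as "I" or "[Ljava/lang/String;" to a Java type name. */
    static String typeName(String descriptor) {
        int dims = 0;
        while (descriptor.charAt(dims) == '[') {
            dims++; // each leading '[' adds one array dimension
        }
        String base;
        switch (descriptor.charAt(dims)) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'Z': base = "boolean"; break;
            case 'L': // L<internal class name>;
                base = descriptor.substring(dims + 1, descriptor.length() - 1).replace('/', '.');
                break;
            default:
                throw new IllegalArgumentException("bad descriptor: " + descriptor);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }
}
```

Parsing the bytes 0002 0005 0006 0000 gives access flags 0x0002, name index 5, descriptor index 6 and an attribute counter of 0, and typeName("I") returns int.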

3.6 Method table collection

3.6.1 Concept introduction

The field table collection is followed by the method table collection, which represents the methods of the class or interface.

The method table collection is almost the same as the field table collection above. The first 2 bytes are the method counter, followed by the actual method entries. Each method in the table is represented by a method_info structure, whose layout is as follows:
 [Image: structure of method_info]
In the method table structure we can see the access flags of the method, its name index, its descriptor index and its attribute table collection. The code inside the method is compiled and stored in an attribute named "Code"

3.6.2 Example

  1. In Demo.class the method table collection starts at offset 0x00C3. The method counter is 0x0002, so the collection holds two method entries. You may wonder why there are two when Demo.java declares only one method. The reason is that the compiler automatically adds the instance constructor method <init>

  2. method_info methods[0]: offset 0x00C5, content 0x00 0100 0700 0800 0100 0900 0000 1d00 0100 0100 0000 052a b700 01b1 0000 0001 000a 0000 0006 0001 0000 0003.

  • 0x0001: access_flags is ACC_PUBLIC, so the method is public
  • 0x0007: name_index, the method name index. It points to constant 7 in the constant pool, a UTF-8 string whose value is <init>
  • 0x0008: descriptor_index, the method descriptor index. It points to constant 8 in the constant pool, a UTF-8 string whose value is ()V
  • 0x0001: the attribute counter of this method: it has 1 attribute
  • 0x0009: attribute_name_index. It points to constant 9 in the constant pool, the UTF-8 constant Code, which indicates that this attribute is the Code attribute holding the bytecode of the method

 [Image: structure of the Code attribute]

Next we analyse the bytecode content of this instance constructor in turn, following the structure in the table above. The specific meaning of the bytecode instructions will be analysed in a later article

  • 0x0009: as described above, this is the name index of the Code attribute; its value is the string "Code"

  • 0x0000 001d: attribute_length, the attribute is 29 bytes long

  • 0x0001: max_stack, the maximum depth of the operand stack is 1

  • 0x0001: max_locals, the maximum size of the local variable table is 1

  • 0x0000 0005: code_length, the bytecode of the method is 5 bytes long

  • 0x2ab7 0001 b1: these 5 u1 values represent 3 bytecode instructions. 0x2a = aload_0; 0xb7 = invokespecial, whose operand 0x0001 is an index into the constant pool; 0xb1 = return, which returns from the current method

  • 0x0000: exception_table_length = 0, the exception table is empty

  • 0x0001: attributes_count = 1 (the Code attribute itself contains 1 attribute)

  • 0x000a: points to constant 10 in the constant pool: LineNumberTable. The structure of the LineNumberTable attribute is shown in the figure below; its content is 0000 0006 0001 0000 0003
     [Image: structure of the LineNumberTable attribute]

  • 0x0000 0006: attribute_length, the attribute is 6 bytes long

  • 0x0001: line_number_table_length, meaning 1 line_number_info entry follows. A line_number_info entry consists of two u2 items, start_pc and line_number. The former is a bytecode offset, the latter is the Java source line number: start_pc: 00 00, line_number: 00 03

  3. method_info methods[1]: above we analysed the first method, the instance constructor <init>. The second method is analysed in the same way. The method table has a fixed structure and contains some fixed information, including the maximum depth of the operand stack, the maximum size of the local variable table, and the very important Code attribute, which contains the bytecode instructions compiled from the Java method. To browse the contents of the method table collection quickly, you can also use the "javap -verbose Demo.class" command, as shown in the figure below
     [Image: javap -verbose output showing the methods of Demo.class]
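The Code attribute of the <init> method analysed above can also be decoded programmatically. The sketch below is deliberately limited: the class name CodeAttributeSketch is hypothetical, DEMO_INIT_CODE contains only the 33 bytes quoted in the text (starting at attribute_length), and only the three opcodes that appear in this method are recognised. The nested LineNumberTable is left undecoded.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class CodeAttributeSketch {
    // Code attribute of Demo's <init>, from attribute_length onwards, as quoted in the text.
    static final byte[] DEMO_INIT_CODE = {
        0x00, 0x00, 0x00, 0x1D,                    // attribute_length = 29
        0x00, 0x01,                                // max_stack
        0x00, 0x01,                                // max_locals
        0x00, 0x00, 0x00, 0x05,                    // code_length
        0x2A, (byte) 0xB7, 0x00, 0x01, (byte) 0xB1, // the 5 code bytes
        0x00, 0x00,                                // exception_table_length
        0x00, 0x01,                                // attributes_count
        0x00, 0x0A, 0x00, 0x00, 0x00, 0x06,        // LineNumberTable: name index, length
        0x00, 0x01, 0x00, 0x00, 0x00, 0x03         // one entry: start_pc 0, line_number 3
    };

    /** Decodes the fixed items and the instructions of a Code attribute body. */
    static List<String> decode(byte[] b) {
        ByteBuffer in = ByteBuffer.wrap(b);
        List<String> out = new ArrayList<>();
        out.add("attribute_length=" + in.getInt());
        out.add("max_stack=" + (in.getShort() & 0xFFFF));
        out.add("max_locals=" + (in.getShort() & 0xFFFF));
        int codeLength = in.getInt();
        out.add("code_length=" + codeLength);
        int codeStart = in.position();
        while (in.position() < codeStart + codeLength) {
            int pc = in.position() - codeStart; // offset of this instruction within the code
            int opcode = in.get() & 0xFF;
            switch (opcode) {
                case 0x2A:
                    out.add(pc + ": aload_0");
                    break;
                case 0xB7: // invokespecial takes a u2 constant pool index as operand
                    out.add(pc + ": invokespecial #" + (in.getShort() & 0xFFFF));
                    break;
                case 0xB1:
                    out.add(pc + ": return");
                    break;
                default:
                    throw new IllegalArgumentException("opcode not handled in this sketch: " + opcode);
            }
        }
        out.add("exception_table_length=" + (in.getShort() & 0xFFFF));
        out.add("attributes_count=" + (in.getShort() & 0xFFFF));
        return out;
    }
}
```

The decoded instruction offsets 0, 1 and 4 show why 5 code bytes hold only 3 instructions: invokespecial occupies 3 bytes, one for the opcode and two for its operand.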

3.7 Attribute table collection

3.7.1 Concept introduction

  1. The class file, field tables and method tables can each carry their own attribute table collection, which describes information specific to certain scenarios.

  2. The format of an attribute table is relatively fixed. It has three parts:

  • a u2 attribute_name_index, pointing to a UTF-8 string constant in the constant pool that gives the attribute name
  • a u4 attribute_length, giving the length of the attribute value in bytes
  • the attribute value information of that length

The structure is shown in the following figure:

 [Image: structure of attribute_info]

The restrictions on attribute tables are relatively loose. Anyone who implements a compiler can write custom attribute information into the attribute table; the Java virtual machine ignores attributes it does not recognise.

  3. The Java 7 virtual machine specification predefines 21 attributes

3.7.2 Example

  1. In Demo.class the attribute table collection starts at offset 0x011D, and its content is 0x00 0100 0D00 0000 0200 0E

  2. 0x0001 is the counter of this attribute table collection: there is 1 attribute

  3. attribute_info attributes[0]: offset 0x011F, content 0x00 0D00 0000 0200 0E

  • 0x000D: attribute_name_index. It points to constant 13 in the constant pool, the UTF-8 constant SourceFile. The SourceFile attribute records the name of the source file from which this class file was generated. Its structure is shown below:

 [Image: structure of the SourceFile attribute]

  • 0x0000 0002: attribute_length, the attribute is 2 bytes long

  • 0x000E: sourcefile_index. It points to constant 14 in the constant pool: Demo.java
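Because every attribute starts with a name index and a length, a reader can load or skip any attribute without understanding its content. The sketch below shows that idea on the bytes of this example. The class name AttributeSketch and its methods are hypothetical, and the u4 length is read into an int, which is a simplification that would not hold for attributes larger than 2 GB.

```java
import java.nio.ByteBuffer;

final class AttributeSketch {
    final int nameIndex;
    final byte[] info;

    private AttributeSketch(int nameIndex, byte[] info) {
        this.nameIndex = nameIndex;
        this.info = info;
    }

    /** Reads one attribute_info: u2 attribute_name_index, u4 attribute_length, then the value bytes. */
    static AttributeSketch read(ByteBuffer in) {
        int nameIndex = in.getShort() & 0xFFFF;
        int length = in.getInt();
        byte[] info = new byte[length];
        in.get(info); // an unknown attribute can be skipped by simply not interpreting these bytes
        return new AttributeSketch(nameIndex, info);
    }

    /** Interprets the value as a SourceFile attribute, whose content is a single u2 sourcefile_index. */
    int sourceFileIndex() {
        if (info.length != 2) {
            throw new IllegalStateException("a SourceFile attribute must be 2 bytes long");
        }
        return ((info[0] & 0xFF) << 8) | (info[1] & 0xFF);
    }
}
```

Applied to the bytes 000D 0000 0002 000E that follow the attribute counter, read returns name index 13 and a 2-byte value, and sourceFileIndex() returns 14.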

3.8 010 Editor

Analysing the .class file structure is tedious, but if you can read the structure, understand what each item means, and follow how the bytecode instructions in the Code attribute execute, it will noticeably improve your Java skills.

To analyse the .class file structure we can use the "javap -verbose Demo.class" command, or a tool such as 010 Editor, which makes it easy to see the offset and content of each data item. For example, to inspect constant 4 of the constant pool, see the figure below

 [Image: 010 Editor showing constant 4 of the constant pool]

To conclude, this article analysed the .class file structure, which is in essence a binary stream stored in units of 8-bit bytes


Copyright notice
This article was written by [Java programmer Zhou Yu]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/188/202207061657088608.html