JVM in-depth

2022-07-07 06:23:00 leowang5566

  • Summary
  1. How a Java program runs

Compile: .java source files are compiled into .class bytecode files.

Package: the .class bytecode files are packaged into a jar or a war.

Run: a command such as java -jar starts a JVM process to run the program.

Class loading: a class loader loads the .class bytecode files into the JVM.

Execute: the JVM's bytecode execution engine starts executing the main method.

  1. When class loading is triggered

A class is loaded when the code first uses it.

When the JVM starts, it first finds the class that contains the main method, loads it into JVM memory, and then executes main.

While main executes, each class is loaded at the point where it is first used.
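The on-demand behavior can be observed with a small sketch. The class names LazyHelper and LazyLoadDemo are made up for illustration. Strictly speaking the static block marks class initialization rather than loading, but with the default class loaders both typically happen at first use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: its static block runs when the class is first used.
class LazyHelper {
    static {
        LazyLoadDemo.EVENTS.add("LazyHelper initialized");
    }

    static int answer() {
        return 42;
    }
}

class LazyLoadDemo {
    // Records the order in which things happen.
    static final List<String> EVENTS = new ArrayList<>();

    static List<String> run() {
        EVENTS.add("main started");
        int value = LazyHelper.answer(); // first use of LazyHelper triggers its initialization
        EVENTS.add("got " + value);
        return EVENTS;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The "main started" event is recorded before "LazyHelper initialized", which shows that the helper class is not touched until the code actually uses it.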

  1. Class loading process

Loading phase:

  1. Obtain the binary byte stream that defines the class, using the class's fully qualified name.
  2. Convert the static storage structure represented by this byte stream into the runtime data structures of the metaspace.
  3. Generate a java.lang.Class object in memory that represents the class and serves as the access entry to the class's data in the metaspace.

Before a subclass is initialized, its parent class must be loaded and initialized.

The class loader is only responsible for loading; running the class is the job of the execution engine.

Linking phase:

Verification: checks, according to the JVM specification, that the loaded .class file is well-formed, which guards against tampered files.

Preparation: allocates memory for the class's non-final static variables and sets them to their default values.

Resolution: replaces the symbolic references in the constant pool with direct references.

A direct reference is a pointer straight to the target, for example the resolved target of the call System.out.println("haha").

Initialization phase:

Executes the class initializer method <clinit>, which the javac compiler generates by merging static variable assignments and static blocks in source order.

If the class has a parent class, the JVM executes the parent's <clinit> first.

The JVM synchronizes the execution of <clinit> with a lock, so a class is initialized only once.
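A minimal sketch of the initialization order, using made-up class names (InitParent, InitChild, InitOrderDemo). It records when each static block runs and shows that static assignments and static blocks are applied in source order.

```java
import java.util.ArrayList;
import java.util.List;

class InitParent {
    static {
        InitOrderDemo.EVENTS.add("parent <clinit>");
    }
}

class InitChild extends InitParent {
    static int counter = 1;      // first statement merged into <clinit>

    static {
        counter += 10;           // second statement merged into <clinit>
        InitOrderDemo.EVENTS.add("child <clinit>");
    }

    static int value() {
        return counter;
    }
}

class InitOrderDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static List<String> run() {
        InitChild.value();       // first use of InitChild: parent initializes first
        return EVENTS;
    }

    public static void main(String[] args) {
        System.out.println(run() + " counter=" + InitChild.value());
    }
}
```

Calling run() more than once does not add further events, because each class is initialized only once.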

  1. Object creation process

The operand of the new instruction is used to locate the symbolic reference to the class in the constant pool.

If the class behind this symbolic reference has not been loaded yet, the JVM loads, resolves, and initializes it first.

The JVM allocates memory for the object in the heap.

The allocated memory is zero-filled, which is the default initialization.

The object's constructor is called.

The variable in the local variable table is pointed at the object.

  1. Memory layout of an object
  1. Object header

Mark Word

Lock state flag: the lock state of an object is one of unlocked, biased lock, lightweight lock, or heavyweight lock.

Lock-holding thread: the ID of the thread that currently holds the object's lock.

Object hash code

GC generational age: each time the object survives a GC, its age is incremented by 1.

Class pointer: points from the object to its class metadata and is used to determine the object's type.

Array length: recorded only when the object is an array.

  1. Instance data

Instance data is the actual data of the object itself.

It mainly consists of the object's own member variables, together with the member variables inherited from its parent classes.

  1. Alignment padding

The size of an object is a multiple of 8 bytes.

If the header and instance data do not add up to a multiple of 8 bytes, this part pads out the remainder.

  1. How much memory does Object o = new Object() take?

If the JVM has class pointer compression (UseCompressedClassPointers) turned on, which is the default:

new Object() itself occupies 16 bytes:

Mark Word 8 bytes + class pointer 4 bytes + instance data 0 bytes + padding 4 bytes

Object o is a reference. Reference compression (UseCompressedOops) is also on by default, so the reference takes 4 bytes.

In total that is 20 bytes.

If compression is turned off:

new Object() occupies 8 (Mark Word) + 8 (class pointer) + 0 (instance data) + 0 (padding, since 16 is already a multiple of 8) = 16 bytes. The reference is no longer compressed, so it takes 8 bytes instead of 4, for a total of 24 bytes.
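The arithmetic above can be written down as a small helper. This is only a sketch of the calculation with the byte counts from the text passed in as parameters; it does not measure real objects, and the names ObjectSizeSketch, alignUp, and objectSize are made up for illustration.

```java
class ObjectSizeSketch {
    // Rounds size up to the next multiple of alignment.
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Header plus instance data, padded to a multiple of 8 bytes.
    static long objectSize(int markWordBytes, int classPointerBytes, int instanceDataBytes) {
        return alignUp(markWordBytes + classPointerBytes + instanceDataBytes, 8);
    }

    public static void main(String[] args) {
        long compressed = objectSize(8, 4, 0);    // compression on
        long uncompressed = objectSize(8, 8, 0);  // compression off
        System.out.println("compressed:   object " + compressed + " + reference 4 = " + (compressed + 4));
        System.out.println("uncompressed: object " + uncompressed + " + reference 8 = " + (uncompressed + 8));
    }
}
```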

  1. Class loaders

Bootstrap class loader

Bootstrap ClassLoader, written in C++. It loads the core classes under Java's lib directory (rt.jar, resources.jar), which provide the classes the JVM itself needs.

Extension class loader

ExtClassLoader, written in Java and derived from ClassLoader. It loads the classes under the lib\ext directory. Developers can use this loader directly.

Application class loader

AppClassLoader, written in Java and derived from ClassLoader. It loads the classes on the path given by the classpath environment variable. It is the default class loader for application code and can be obtained with ClassLoader.getSystemClassLoader().

Custom class loaders

A class loader customized to specific needs, for example to load class files from a specified path.

Why write a custom class loader? To isolate loaded classes, change how classes are loaded, extend the sources classes can be loaded from, and protect against source code leaks.

  1. Getting a class's loader

clazz.getClassLoader()                         returns the ClassLoader of the given class

Thread.currentThread().getContextClassLoader()   returns the context ClassLoader of the current thread

ClassLoader.getSystemClassLoader()             returns the AppClassLoader

ClassLoader.getSystemClassLoader().getParent()   returns the ExtClassLoader (in JDK 8)
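These calls can be tried out with a short program. The class name ClassLoaderDemo and the helper isBootstrapLoaded are made up for illustration. The printed loader names depend on the JDK version: JDK 8 shows AppClassLoader and ExtClassLoader, while newer JDKs show a platform class loader in place of the extension loader.

```java
class ClassLoaderDemo {
    // A null loader means the class was loaded by the bootstrap class loader.
    static boolean isBootstrapLoaded(Class<?> clazz) {
        return clazz.getClassLoader() == null;
    }

    public static void main(String[] args) {
        System.out.println("String:        " + String.class.getClassLoader());
        System.out.println("this class:    " + ClassLoaderDemo.class.getClassLoader());
        System.out.println("context:       " + Thread.currentThread().getContextClassLoader());
        System.out.println("system:        " + ClassLoader.getSystemClassLoader());
        System.out.println("system parent: " + ClassLoader.getSystemClassLoader().getParent());
    }
}
```

Core classes such as java.lang.String report null because the bootstrap class loader is written in native code and has no Java object representing it.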

  1. Parent delegation mechanism
  1. Why use parent delegation

It avoids loading the same class twice.

It protects the core API from being tampered with.

For example, if you write your own java.lang.String in your program, which one should the class loader load?

With parent delegation, the request is delegated up to the Bootstrap ClassLoader, which loads Java's built-in java.lang.String first.

  1. What parent delegation is

When a class loader receives a class loading request, it does not try to load the class itself first; it delegates the request to its parent loader.

If the parent loader has a parent of its own, the request is delegated further up recursively until it reaches the top-level Bootstrap ClassLoader.

If a parent loader can load the class, it returns the result; if it cannot, the request falls back down to the child loader, which then tries to load the class itself.

Special case: Tomcat breaks the parent delegation mechanism. Each webapp class loader loads the classes of its own application itself and does not delegate them to the parent class loader.
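The delegation order can be observed with a custom loader that records which class names actually reach its own findClass method. This is a sketch built on the default behavior of java.lang.ClassLoader.loadClass; the names DelegationDemo and RecordingLoader are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationDemo {
    // Hypothetical loader: records the names it is asked to find itself.
    static class RecordingLoader extends ClassLoader {
        final List<String> askedToFind = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            askedToFind.add(name);   // reached only after every parent has failed
            throw new ClassNotFoundException(name);
        }
    }

    // The parent chain resolves java.lang.String, so findClass is never called.
    static boolean parentWins() {
        RecordingLoader loader = new RecordingLoader(DelegationDemo.class.getClassLoader());
        try {
            Class<?> loaded = loader.loadClass("java.lang.String");
            return loaded == String.class && loader.askedToFind.isEmpty();
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // No parent can find this name, so the request falls back to the child loader.
    static boolean fallsBackToChild() {
        RecordingLoader loader = new RecordingLoader(DelegationDemo.class.getClassLoader());
        try {
            loader.loadClass("no.such.Type");
            return false;
        } catch (ClassNotFoundException e) {
            return loader.askedToFind.contains("no.such.Type");
        }
    }

    public static void main(String[] args) {
        System.out.println("parent wins: " + parentWins());
        System.out.println("falls back to child: " + fallsBackToChild());
    }
}
```

A loader that wants to break delegation, as described for Tomcat, would override loadClass itself instead of only findClass.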

  1. Tomcat class loaders

Tomcat defines its own Common, Catalina, Shared, and other class loaders, which are used to load Tomcat's own core class libraries.

For every web application deployed in it, Tomcat creates a corresponding WebApp class loader, which loads the classes of that application.

As for JSPs, each JSP gets its own Jsp class loader.

Keep in mind that Tomcat breaks the parent delegation mechanism.

Each WebApp class loader loads the class files of its own web application, that is, all the class files in the war we packaged, and does not pass them up to a higher-level class loader.

  1. JVM memory areas
  1. Why memory is divided into areas

Where in memory does a class go after it is loaded?

Where do the local variables of a running method go?

Where do the objects created in the code go?

The JVM therefore divides memory into different areas to hold these kinds of data.

  1. The method area, where classes are stored

Before JDK 1.8 this was called the method area; from JDK 1.8 on it is called the metaspace.

When the JVM loads a .class file, the class is stored here.

  1. Program counter (PC register)

The PC register stores the address of the next instruction, that is, the instruction about to be executed. The execution engine reads the next instruction from there.

While the bytecode execution engine executes a .class file, it records the position of the bytecode instruction being executed. Each thread has its own program counter, which represents the code position the current thread has reached.

Why is a PC register needed?

Because the CPU switches between threads all the time, a thread that is switched back in needs to know where to resume.

The JVM's bytecode interpreter changes the value of the PC register to determine which bytecode instruction to execute next.

On a context switch the position of the current bytecode instruction is saved, and when the thread resumes it continues from that position.

For example, when the main thread executes the main method, the main thread's program counter holds the position of the bytecode instruction currently being executed.

  1. Virtual machine stack

Each method invocation creates a stack frame, and the frame is released when the method finishes executing.

Each stack frame contains a local variable table, an operand stack, dynamic linking information, and a method return address.

Each thread has its own independent virtual machine stack.

  1. Background

Because Java is designed to be cross-platform, its instructions are stack-based. CPU architectures differ from platform to platform, so the instruction set cannot be register-based.

The advantages are portability, a small instruction set, and a compiler that is easy to implement. The drawback is lower performance, since more instructions are needed to implement the same function.

  1. Stack and heap

The stack is the unit of execution at run time; the heap is the unit of storage.

The stack addresses how the program runs, that is, how the program executes and how it processes data. The heap addresses data storage: how data is stored and where it is put.

  1. What is the Java virtual machine stack?

The Java virtual machine stack (Java Virtual Machine Stack) was also called the Java stack in the early days.

A virtual machine stack is created for each thread when the thread is created. It holds a sequence of stack frames (Stack Frame), each corresponding to one Java method invocation.

  1. Life cycle

Its life cycle is the same as the thread's.

  1. Characteristics (advantages) of the stack

The stack is a fast and efficient way to allocate storage; its access speed is second only to the program counter.

The JVM performs only two direct operations on the Java stack:

When a method is invoked, a frame is pushed onto the stack.

When the method finishes, the frame is popped off the stack.

The stack has no garbage collection problem: it can run out of memory (OOM), but it is never garbage collected, because frames are only pushed and popped.

  1. Possible exceptions in the stack

The Java Virtual Machine Specification allows the Java stack to be either fixed in size or dynamically expandable.

With fixed-size Java virtual machine stacks, the stack capacity of each thread can be chosen independently when the thread is created. If a thread requests more stack capacity than the Java virtual machine stack allows, the Java virtual machine throws a StackOverflowError.

If Java virtual machine stacks can be expanded dynamically and not enough memory can be obtained for an expansion, or there is not enough memory to create the virtual machine stack for a new thread, the Java virtual machine throws an OutOfMemoryError.
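The StackOverflowError case is easy to reproduce with unbounded recursion. The sketch below, with the made-up class name StackDepthDemo, counts how many frames fit before the error is thrown. The exact number depends on the JVM, the frame size, and the stack size setting, so no particular value should be expected.

```java
class StackDepthDemo {
    private static int depth;

    // Each call pushes one more stack frame; there is deliberately no exit condition.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns the recursion depth reached when the stack was exhausted.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The thread's stack is full; the frames were popped while unwinding to here.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before StackOverflowError: " + measureDepth());
    }
}
```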

  1. Internal structure of a stack frame

Each stack frame stores:

Local variable table (Local Variables)

Operand stack (Operand Stack), also called the expression stack

Dynamic linking (Dynamic Linking), a reference to the method in the runtime constant pool

Method return address (Return Address), which defines how the method exits normally or abnormally

Some additional information

  1. Local variable table (Local Variables)

The local variable table is also called the local variable array.

It is defined as a numeric array that mainly stores method parameters and the local variables defined in the method body. The data types include the primitive types, object references (reference), and the returnAddress type.

Because the local variable table is built on the thread's stack, it is private to the thread, so there are no data safety issues.

The size of the local variable table is determined at compile time and stored in the max_locals item of the method's Code attribute. It does not change while the method runs.

  1. Understanding slots

Parameter values are stored starting at index 0 of the local variable array and end at index length - 1. The basic storage unit of the local variable table is the slot (variable slot).

The local variable table stores variables of the primitive types known at compile time (8 kinds), the reference type (reference), and the returnAddress type.

In the local variable table, types of 32 bits or fewer (including returnAddress) occupy one slot, and 64-bit types (long and double) occupy two slots.

byte, short, and char are converted to int before being stored; boolean is also converted to int, with 0 meaning false and non-zero meaning true.

The JVM assigns an access index to every slot in the local variable table; the local variable value stored in a slot is accessed through this index.

When an instance method is called, its parameters and the local variables defined in its body are copied, in order, into the slots of the local variable table.

To access a 64-bit local variable (long or double), use the first of its two indexes.

If the current frame was created by a constructor or an instance method, the reference to the object itself, this, is stored in the slot at index 0, and the remaining parameters follow in parameter-list order.
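As an illustration of these rules, the instance method below is annotated with the slot layout it would get under the rules above, and a small helper counts the slots taken by the receiver and parameters. SlotLayoutDemo and parameterSlots are hypothetical names; the layout of a real class can be checked with javap -v.

```java
class SlotLayoutDemo {
    // Instance method: slot 0 = this, slot 1 = a, slots 2-3 = b, slots 4-5 = c, slots 6-7 = total.
    long sum(int a, long b, double c) {
        long total = a + b + (long) c;
        return total;
    }

    // Counts the slots occupied by the receiver (if any) and the parameters.
    static int parameterSlots(boolean isStatic, Class<?>... parameterTypes) {
        int slots = isStatic ? 0 : 1;   // instance methods reserve slot 0 for this
        for (Class<?> type : parameterTypes) {
            slots += (type == long.class || type == double.class) ? 2 : 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println("sum = " + new SlotLayoutDemo().sum(1, 2L, 3.0));
        System.out.println("slots for this + (int, long, double) = "
                + parameterSlots(false, int.class, long.class, double.class));
    }
}
```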

  1. Slot reuse

Slots in a stack frame's local variable table can be reused. Once a local variable goes out of scope, a new local variable declared after that scope is likely to reuse its slot, which saves resources.

  1. Notes on variables

Variables by type: primitive data types and reference data types.

Variables by where they are declared in a class:

Member variables

Static member variables (class variables): default-initialized in the preparation step of the linking phase, then explicitly assigned in the initialization phase.

Instance member variables: when an object is created, space is allocated for them in the heap and they are default-initialized.

Local variables

They must be explicitly assigned before use; otherwise compilation fails.

  1. Additional notes on the local variable table

Within a stack frame, the part most relevant to performance tuning is the local variable table described above.

When a method executes, the virtual machine uses the local variable table to pass parameters.

Variables in the local variable table are also important garbage collection roots: any object referenced directly or indirectly from the local variable table will not be collected.

  1. Operand stack

During method execution, data is written to or read from the operand stack according to the bytecode instructions, that is, pushed and popped. Its underlying data structure is an array. Some instructions, for example, duplicate, swap, or add the values on it.

The operand stack mainly holds the intermediate results of a computation and also serves as temporary storage for variables during the computation.

The operand stack is not accessed by index; data is accessed only through the standard push and pop operations.

If a called method has a return value, the return value is pushed onto the operand stack of the current stack frame, and the PC register is updated to the next bytecode instruction to execute.

The data types of the elements on the operand stack must strictly match the sequence of bytecode instructions. This is checked by the compiler at compile time and checked again in the data flow analysis step of the verification phase during class loading.

When we say that the Java virtual machine's interpreter is a stack-based execution engine, the stack in question is the operand stack.
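A small example makes the push and pop traffic concrete. The comments list the bytecode javac produces for this static method (it can be inspected with javap -c); the class name OperandStackDemo is made up for illustration.

```java
class OperandStackDemo {
    // Static method, so the parameters start at slot 0: a = slot 0, b = slot 1, c = slot 2.
    static int add(int a, int b) {
        int c = a + b;   // iload_0 (push a), iload_1 (push b), iadd (pop both, push sum), istore_2 (pop into c)
        return c;        // iload_2 (push c), ireturn (pop and return it to the caller)
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
    }
}
```

The addition never names a register: both inputs and the result travel through the operand stack, which is what makes the instruction set stack-based.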

  1. Dynamic linking

A reference to the method in the runtime constant pool

Each stack frame contains a reference to the method it belongs to in the runtime constant pool. This reference exists so that the code of the current method can be dynamically linked (Dynamic Linking), as with the invokedynamic instruction.

When a Java source file is compiled into a bytecode file, all variable and method references are stored as symbolic references (Symbolic Reference) in the class file's constant pool. For example, when one method calls another, the call is expressed as a symbolic reference to the target method in the constant pool. Dynamic linking converts these symbolic references into direct references to the called methods.

Most bytecode instructions access the constant pool when they execute.

At run time the constant pool is stored in the method area as the runtime constant pool.

Calls go through references, so multiple call sites that use the same entry share one address; storing a separate copy for each would be wasteful.

Polymorphism is an example: the code is written against the parent class, but the subclass's method runs.

Why is a constant pool needed?

The constant pool provides symbols and constants that make it easy for instructions to identify what they refer to.

  1. Method calls

In the JVM, how a symbolic reference is converted into a direct reference to the called method depends on the method's binding mechanism, that is, whether the target is determined at compile time or at run time.

Static linking:

When a bytecode file is loaded into the JVM, if the target method is known at compile time and does not change at run time, converting the symbolic reference of the called method into a direct reference is called static linking.

Dynamic linking:

If the called method cannot be determined at compile time, its symbolic reference can only be converted into a direct reference while the program runs. Because this conversion is dynamic, it is called dynamic linking.

The corresponding binding mechanisms are early binding (Early Binding) and late binding (Late Binding). Binding is the process of replacing the symbolic reference to a field, method, or class with a direct reference, and it happens only once.

Early binding:

If the target method is known at compile time and does not change at run time, the method can be bound to the type it belongs to. Since the target of the call is unambiguous, static linking can be used to convert the symbolic reference into a direct reference.

Late binding:

If the called method cannot be determined at compile time, the method can only be bound according to the actual type at run time. This is called late binding.

Early and late binding

With the rise of high-level languages, there are more and more object-oriented programming languages like Java. Although they differ somewhat in syntax, they have one thing in common: they all support object-oriented features such as encapsulation, inheritance, and polymorphism.

Since these languages support polymorphism, they naturally have both kinds of binding, early and late.

In Java, every ordinary method behaves like a virtual function, equivalent to a virtual function in C++ (where the virtual keyword must be used to declare one explicitly). If a Java method should not behave like a virtual function, it can be marked with the final keyword.

A final method cannot be overridden, so its target is fixed at compile time.

Non-virtual methods

If the specific version of a method to call is determined at compile time and cannot change at run time, the method is called a non-virtual method.

Static methods, private methods, final methods, instance constructors, and parent class methods (called via super) are non-virtual methods. All other methods are called virtual methods.

Prerequisites for polymorphism with subclass objects: 1. class inheritance; 2. method overriding.

The virtual machine provides the following method call instructions.

Ordinary call instructions (the first two call non-virtual methods):

invokestatic: calls static methods; the unique method version is determined in the resolution phase

invokespecial: calls <init> methods, private methods, and parent class methods; the unique method version is determined in the resolution phase

invokevirtual: calls all virtual methods

invokeinterface: calls interface methods

Dynamic call instruction:

invokedynamic: dynamically resolves the method to call and then executes it (added in JDK 7)

The logic of the first four instructions is fixed inside the virtual machine, and the programmer cannot intervene in how the method is resolved and executed. The invokedynamic instruction lets the user determine the method version. Methods called by invokestatic and invokespecial are non-virtual methods; the rest, except those marked final, are virtual methods.
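The following sketch shows one call of each kind, with the instruction javac typically emits noted in the comments. All type names (InvokeGreeter, InvokeBase, InvokeDerived, InvokeDemo) are made up for illustration, and the emitted instructions can be confirmed with javap -c for a given compiler version.

```java
import java.util.function.Supplier;

interface InvokeGreeter {
    String greet();
}

class InvokeBase implements InvokeGreeter {
    static String kind() {
        return "static";
    }

    @Override
    public String greet() {
        return "base";
    }
}

class InvokeDerived extends InvokeBase {
    @Override
    public String greet() {
        return "derived+" + super.greet();      // super call: invokespecial
    }
}

class InvokeDemo {
    static String run() {
        String a = InvokeBase.kind();           // invokestatic
        InvokeBase b = new InvokeDerived();     // constructor <init>: invokespecial
        String c = b.greet();                   // invokevirtual, dispatched on the actual type
        InvokeGreeter g = b;
        String d = g.greet();                   // invokeinterface
        Supplier<String> s = () -> "lambda";    // lambda creation: invokedynamic (Java 8+)
        return a + "," + c + "," + d + "," + s.get();
    }

    public static void main(String[] args) {
        System.out.println(run());              // prints static,derived+base,derived+base,lambda
    }
}
```

Both b.greet() and g.greet() end up in InvokeDerived.greet(), because virtual and interface calls are resolved against the actual type of the object at run time.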

About the invokedynamic instruction

The JVM bytecode instruction set had been quite stable until Java 7 added the invokedynamic instruction, an improvement Java made to support dynamically typed languages.

Java 7, however, offered no direct way to generate invokedynamic instructions; a low-level bytecode tool such as ASM was needed to produce them. Only with the arrival of lambda expressions in Java 8 did Java gain a direct way to generate invokedynamic instructions.

The dynamic language support added in Java 7 is essentially a change to the Java Virtual Machine Specification, not to the rules of the Java language. The change is fairly complex and adds a new kind of method call to the virtual machine. Its most direct beneficiaries are the compilers of dynamic languages that run on the Java platform.

Dynamically typed and statically typed languages

The difference between the two is whether types are checked at compile time or at run time: a language that checks at compile time is statically typed, and one that checks at run time is dynamically typed.

Put plainly, a statically typed language checks the type information of the variable itself, while a dynamically typed language checks the type information of the variable's value. In a dynamic language, variables have no type information; only values do. This is an important feature of dynamic languages.

The essence of method overriding

The essence of method overriding in the Java language:

1. Find the actual type of the object referenced by the first element at the top of the operand stack, and call it C.

2. If type C contains a method whose descriptor and simple name match those in the constant, check its access permission. If the check passes, return a direct reference to this method and end the lookup; if it fails, throw java.lang.IllegalAccessError.

3. Otherwise, repeat the search and check of step 2 for each parent class of C, going up the inheritance hierarchy.

4. If no suitable method is ever found, throw java.lang.AbstractMethodError.

About IllegalAccessError:

It is thrown when a program tries to access or modify a field, or call a method, that it does not have permission to access. Normally this causes a compile-time error. If the error occurs at run time, a class has undergone an incompatible change.
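The lookup steps can be imitated with reflection. This is a simplified sketch of the search order only, not the JVM's actual implementation: it skips the access check of step 2, and the names OverrideLookupSketch, Animal, Dog, Puppy, and resolve are made up for illustration.

```java
import java.lang.reflect.Method;

class OverrideLookupSketch {
    static class Animal {
        String sound() { return "..."; }
    }

    static class Dog extends Animal {
        @Override
        String sound() { return "woof"; }
    }

    static class Puppy extends Dog {
        // inherits sound() from Dog
    }

    // Step 1: start at the receiver's actual type C.
    // Steps 2 and 3: look for a matching method in C, then in each parent class in turn.
    // Step 4: give up with AbstractMethodError if nothing matches.
    static Method resolve(Object receiver, String name, Class<?>... parameterTypes) {
        for (Class<?> c = receiver.getClass(); c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredMethod(name, parameterTypes);
            } catch (NoSuchMethodException e) {
                // not declared in this class, continue with its parent
            }
        }
        throw new AbstractMethodError(name);
    }

    public static void main(String[] args) {
        Method m = resolve(new Puppy(), "sound");
        System.out.println("Puppy.sound() resolves to " + m.getDeclaringClass().getSimpleName());
    }
}
```

For a Puppy receiver the search skips Puppy, which declares nothing, and stops at Dog, so the overriding version is chosen before Animal is ever considered.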

Virtual method table

Object-oriented programs use dynamic dispatch frequently. Searching the class's method metadata for the right target on every dispatch could hurt execution efficiency. To improve performance, the JVM builds a virtual method table (virtual method table) for the class in the method area (non-virtual methods do not appear in it) and uses this index table instead of searching.

Every class has a virtual method table, which holds the actual entry point of each method.

When is the virtual method table created?

It is created and initialized during the linking phase of class loading: once the initial values of the class variables have been prepared, the JVM also initializes the class's method table.

  • Method return address

It matters mainly for normal exits.

It stores the value of the caller's PC register.

A method can end in two ways:

Normal completion

An unhandled exception occurs and the method exits abnormally

Either way, after the method exits, control returns to the place where the method was called. On a normal exit, the value of the caller's PC counter is used as the return address, that is, the address of the instruction following the call instruction. On an exit caused by an exception, the return address is determined by the exception table, and this information is generally not saved in the stack frame. Control is then handed back to the execution engine to carry on.

In essence, a method exit pops the current stack frame off the stack. At this point the caller's local variable table and operand stack are restored, the return value (if any) is pushed onto the operand stack of the caller's stack frame, and the PC register is set so that the caller can continue executing.

The difference between normal completion and abnormal completion is this:

A method that exits through an exception does not produce a return value for its caller.

Once a method starts executing, there are only two ways to exit it:

1. The execution engine encounters any bytecode instruction that returns from a method (return), and any return value is passed to the caller. This is called normal completion.

Which return instruction a method uses after a normal call completes depends on the actual data type of its return value.

The bytecode return instructions are ireturn (used when the return value is boolean, byte, char, short, or int), lreturn, freturn, dreturn, and areturn. There is also a return instruction, used by methods declared void, instance initialization methods, and class and interface initialization methods.

2. An exception (Exception) is raised during execution and is not handled inside the method, that is, no matching exception handler is found in the method's exception table. The method then exits; this is called abnormal completion.

The exception handling for exceptions thrown during a method's execution is stored in an exception table, so that the handler code can be found easily when an exception occurs.

  1. Some additional information

A stack frame is also allowed to carry additional information specific to the Java virtual machine implementation, for example information that supports program debugging.

  1. Stack-related interview questions

Give an example of a stack overflow. (StackOverflowError)

The stack size is set with -Xss; once it is exceeded, the stack overflows.

Does increasing the stack size guarantee that no overflow occurs?

No. In theory it only delays the overflow by allowing a deeper stack.

Is a larger stack allocation always better?

No. Larger stacks take up more memory, so fewer threads can run.

Does garbage collection involve the virtual machine stack?

No. Frames in the virtual machine stack are simply popped.

  1. Native method stack

Executing a native method also requires storing stack frame information. Examples are hashCode() and Thread.start0().

  1. Heap area

When a method creates an object with new, the object is stored in the heap, and the local variable table in the method's stack frame holds a reference to it.

  1. Core concepts

A JVM instance has exactly one heap, and the heap is the core area of Java memory management.

The Java heap is created when the JVM starts, and its size is determined at that point. It is the largest memory space the JVM manages.

The size of the heap can be adjusted.

The Java Virtual Machine Specification states that the heap may reside in physically discontiguous memory but must be treated as logically contiguous. (This relates to physical versus virtual memory.)

All threads share the Java heap, but thread-private buffers (Thread Local Allocation Buffer, TLAB) can also be carved out of it.

  1. Setting the heap size

-Xms600m -Xmx600m

The Java heap stores Java object instances. Its size is fixed when the JVM starts and can be set with the "-Xmx" and "-Xms" options.

"-Xms" sets the initial heap size and is equivalent to -XX:InitialHeapSize

"-Xmx" sets the maximum heap size and is equivalent to -XX:MaxHeapSize

Once the memory used in the heap exceeds the maximum specified by "-Xmx", an OutOfMemoryError is thrown.

-Xms and -Xmx are usually set to the same value, so that after the garbage collector cleans up the heap the JVM does not need to resize the heap, which improves performance.

By default, the initial heap size is the machine's physical memory / 64

and the maximum heap size is the machine's physical memory / 4

  1. How to view the parameters

Set -Xms600m -Xmx600m

Method one:

Use jps to find the process.

Use jstat -gc 6700 to view the memory usage of the process (6700 is the process ID).

S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT

25600.0 25600.0  0.0    0.0   153600.0 12288.1   409600.0     0.0     4480.0 770.3  384.0   75.9       0    0.000   0      0.000    0.000

Method two:

Set the JVM parameters: -Xms600m -Xmx600m -XX:+PrintGCDetails

-Xms : 575M

-Xmx : 575M

Heap

 PSYoungGen      total 179200K, used 12288K [0x00000000f3800000, 0x0000000100000000, 0x0000000100000000)

  eden space 153600K, 8% used [0x00000000f3800000,0x00000000f44001b8,0x00000000fce00000)

  from space 25600K, 0% used [0x00000000fe700000,0x00000000fe700000,0x0000000100000000)

  to   space 25600K, 0% used [0x00000000fce00000,0x00000000fce00000,0x00000000fe700000)

 ParOldGen       total 409600K, used 0K [0x00000000da800000, 0x00000000f3800000, 0x00000000f3800000)

  object space 409600K, 0% used [0x00000000da800000,0x00000000da800000,0x00000000f3800000)

 Metaspace       used 3446K, capacity 4496K, committed 4864K, reserved 1056768K

  class space    used 381K, capacity 388K, committed 512K, reserved 1048576K

Example:

public class HeapSizeDemo {
    public static void main(String[] args) {
        // Total heap memory currently reserved by the JVM, in MB
        long initialMemory = Runtime.getRuntime().totalMemory() / 1024 / 1024;
        // Maximum heap memory the JVM will attempt to use, in MB
        long maxMemory = Runtime.getRuntime().maxMemory() / 1024 / 1024;
        System.out.println("-Xms : " + initialMemory + "M");
        System.out.println("-Xmx : " + maxMemory + "M");
    }
}

Result:

-Xms : 575M

-Xmx : 575M

Why is this less than 600M?

Only one of the two Survivor spaces (S0 and S1) can be in use at any time; the other is always empty, so it is not counted: 600M - 25M = 575M.
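The 575M figure can be checked against the capacities in the jstat output above. The sketch below simply adds up Eden, Survivor and old-generation capacity; the class and method names are illustrative and not part of any JVM API.

```java
final class ReportedHeap {
    /** Heap size in MB when Eden, the given Survivor capacity and the old generation are added up. */
    static long reportedHeapMb(long edenKb, long survivorKb, long oldKb) {
        return (edenKb + survivorKb + oldKb) / 1024;
    }

    public static void main(String[] args) {
        // Capacities in KB copied from the jstat -gc output: EC, S0C (same as S1C), OC.
        long edenKb = 153_600;
        long survivorKb = 25_600;
        long oldKb = 409_600;
        // Only one Survivor space counted, which is what the Runtime figures reflect.
        System.out.println(reportedHeapMb(edenKb, survivorKb, oldKb) + "M");
        // Both Survivor spaces counted, matching -Xms600m.
        System.out.println(reportedHeapMb(edenKb, 2 * survivorKb, oldKb) + "M");
    }
}
```

With one Survivor space the sum is 575M; with both it is the configured 600M.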

  1. Viewing with the jvisualvm tool

jvisualvm official website: https://visualvm.github.io

Type jvisualvm in a command prompt to open the tool.

To install plug-ins, first set the plug-in center address to https://visualvm.github.io/uc/8u131/updates.xml.gz

  1. Heap memory subdivision

JDK 8: young generation (Eden, S0, S1) + old generation

-Xms10m -Xmx10m -XX:+PrintGCDetails

Heap

 PSYoungGen      total 6144K, used 2251K [0x00000000ff980000, 0x0000000100000000, 0x0000000100000000)

  eden space 5632K, 39% used [0x00000000ff980000,0x00000000ffbb2d60,0x00000000fff00000)

  from space 512K, 0% used [0x00000000fff80000,0x00000000fff80000,0x0000000100000000)

  to   space 512K, 0% used [0x00000000fff00000,0x00000000fff00000,0x00000000fff80000)

 ParOldGen       total 13824K, used 0K [0x00000000fec00000, 0x00000000ff980000, 0x00000000ff980000)

  object space 13824K, 0% used [0x00000000fec00000,0x00000000fec00000,0x00000000ff980000)

 Metaspace       used 3486K, capacity 4498K, committed 4864K, reserved 1056768K

  class space    used 387K, capacity 390K, committed 512K, reserved 1048576K

  1. How much memory does an object occupy in the Java heap?

One part is the object's own bookkeeping information (the object header).

The other part is the space taken by the object's instance variables.

For example, the object header occupies 16 bytes on a 64-bit Linux system (12 bytes when compressed class pointers are enabled).

An int instance variable occupies 4 bytes; a long occupies 8 bytes.
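As a rough illustration of that arithmetic, the sketch below adds a 16-byte header to the field sizes and rounds up to a multiple of 8 bytes. The 8-byte alignment is HotSpot's default and is an assumption beyond the text; the helper names are hypothetical, and real field layout can add further padding.

```java
final class ObjectSizeEstimate {
    // Header size taken from the text (64-bit JVM without compressed class pointers).
    static final int HEADER_BYTES = 16;
    // Assumption: objects are aligned to 8 bytes, HotSpot's default.
    static final int ALIGNMENT = 8;

    /** Rough size in bytes of an object that has only int and long instance fields. */
    static long estimate(int intFields, int longFields) {
        long raw = HEADER_BYTES + 4L * intFields + 8L * longFields;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT; // round up to the alignment
    }

    public static void main(String[] args) {
        // One int and one long: 16 + 4 + 8 = 28 bytes, padded to 32.
        System.out.println(estimate(1, 1) + " bytes");
    }
}
```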

  1. Off-heap memory

NIO can use a DirectBuffer (for example a direct ByteBuffer) to reference and operate on off-heap memory.
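A minimal sketch of off-heap access through NIO, using the standard ByteBuffer.allocateDirect call. The class name and the 1 KB buffer size are arbitrary choices for illustration.

```java
import java.nio.ByteBuffer;

final class DirectBufferDemo {
    /** Writes a long into off-heap memory and reads it back. */
    static long roundTrip(long value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(1024); // 1 KB allocated outside the Java heap
        buffer.putLong(0, value);
        return buffer.getLong(0);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(1024);
        System.out.println("direct: " + buffer.isDirect());
        System.out.println("value: " + roundTrip(42L));
    }
}
```

Direct buffers are not bounded by -Xmx; their total size is limited separately (by -XX:MaxDirectMemorySize in HotSpot).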

  1. JVM garbage collection mechanism
  1. What happens after a method finishes executing?

When a method finishes, its stack frame is popped off the stack and the data in the frame is released, so the references held in its local variable table no longer point to their objects.

  1. Why garbage collection is needed

Objects created in the Java heap occupy memory, and memory is a limited resource.

  1. How are unneeded objects handled?

Objects that are no longer needed are garbage collected.

When the JVM starts, it launches background threads dedicated to garbage collection.

If no local variable of any method, no static variable of any class, and no constant refers to an instance object, the object is reclaimed.

  1. Is the metadata area garbage collected?

A class can be unloaded only when all three of the following conditions hold:

  1. All instances of the class have been reclaimed from the Java heap
  2. The ClassLoader that loaded the class has been reclaimed
  3. The Class object of the class is no longer referenced anywhere
  1. JVM generational model
  1. Young generation and old generation

The JVM divides the Java heap into two areas: the young generation and the old generation.

Young generation: objects that become unused shortly after they are created.

Old generation: objects that must survive and stay in use for a long time after creation.

  1. Why distinguish the young generation from the old generation?

Young-generation objects are reclaimed quickly, while old-generation objects live for a long time.

The two therefore use different garbage collection algorithms, so the objects need to be stored separately.

  1. Permanent generation

This is the metadata area, which stores class information.

  1. Object memory allocation and flow

Objects are allocated in the young generation first.

When the young generation is full, a Minor GC is triggered to reclaim objects that nothing references.

An object that has survived more than ten garbage collections (15 by default) is moved to the old generation.

When the old generation is full, garbage collection is triggered there as well to clean up unreferenced old-generation objects.

  1. JVM core parameters
  1. Core parameters

-Xss: stack size of each thread

-Xms: initial size of the Java heap

-Xmx: maximum size of the Java heap

-Xmn: size of the young generation in the Java heap; what remains after subtracting it is the old generation

-XX:MetaspaceSize: metadata area (Metaspace) size

-XX:MaxMetaspaceSize: maximum size of the metadata area

-XX:SurvivorRatio=5: the Eden : S0 : S1 ratio is 5:1:1, so for a young generation of size xxx, Eden = xxx * 5/7 and each Survivor area = xxx * 1/7
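The SurvivorRatio conversion can be sketched as follows. This is plain arithmetic under the ratio:1:1 split described above; the class and method names are hypothetical, and the real JVM additionally rounds the areas to its internal alignment.

```java
final class YoungGenLayout {
    /** Returns {edenMb, survivorMb} for a young generation split Eden:S0:S1 = ratio:1:1. */
    static double[] split(double youngMb, int survivorRatio) {
        double part = youngMb / (survivorRatio + 2); // ratio parts for Eden plus one part per Survivor
        return new double[] { part * survivorRatio, part };
    }

    public static void main(String[] args) {
        double[] layout = split(700, 5); // hypothetical 700 MB young generation
        System.out.println("Eden: " + layout[0] + " MB, each Survivor: " + layout[1] + " MB");
    }
}
```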

  1. Setting JVM parameters in an IDE through VM arguments

 -Xss1M -Xms512M -Xmx512M -Xmn256M -XX:MetaspaceSize=128M -XX:MaxMetaspaceSize=128M

  1. Setting JVM parameters when starting with java -jar

java -Xss1M -Xms512M -Xmx512M -Xmn256M -XX:MetaspaceSize=128M -XX:MaxMetaspaceSize=128M -jar a.jar

  1. How to configure JVM parameters in Tomcat

For Tomcat, JVM parameters can be added in catalina.sh in the bin directory.

  1. Case study: JVM parameter settings for a payment system with a million transactions per day
  1. Background of the payment system

  1. Core questions to consider

How many machines does the payment system need?

How much memory does each machine need?

How much heap memory should the JVM on each machine be given?

How much memory must the JVM have to hold all the payment orders created in memory without crashing from insufficient memory?

  1. How many orders are processed per second

With 1 million orders per day concentrated in a few peak hours, that is roughly 100 orders per second.

With 3 machines deployed, each machine handles about 30 orders per second.

  1. Processing time of each payment order

Assume 1 s, so each machine holds about 30 payment order objects during any given second.

  1. Memory required for each payment order

An int takes 4 bytes and a long takes 8 bytes; with about 20 fields an order object is a few hundred bytes. Rounding up generously, call it 500 bytes.

  1. Memory used by payment requests per second

30 payment orders * 500 bytes = 15,000 bytes, about 15 KB

  1. How the payment system runs

Payment orders are created continuously. Once hundreds of thousands have accumulated, occupying hundreds of megabytes, the young generation fills up.

This triggers a Minor GC, which reclaims the garbage in the young generation and frees space.

  1. Estimating the memory usage of the complete payment system

The payment system also handles other requests, so scale the 15 KB per second up by 10 to 20 times, which gives a few hundred KB to 1 MB per second.

  1. How to set the JVM heap memory

Online business systems are commonly deployed on 2-core 4 GB or 4-core 8 GB machines.

With 2 cores and 4 GB, allocating 1 MB per second soon adds up to several hundred MB and fills the young generation, triggering frequent Minor GCs that hurt system stability, so we choose 4 cores and 8 GB.

Give the JVM process at least 4 GB of memory, of which the young generation can get 2 GB.

At 1 MB per second, a Minor GC is then needed only about every 35 minutes, which lowers the GC frequency.

The machine therefore uses 4 cores and 8 GB, with a 3 GB JVM heap: 2 GB young generation and 1 GB old generation.

Scaling out horizontally to 5 or 10 machines means each machine handles fewer requests and puts less pressure on its JVM.
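The estimate for the payment system can be reproduced with a few lines of arithmetic. The sketch below uses the figures from the text (30 orders per second, 500 bytes per order, an allocation rate rounded up to 1 MB per second, a 2 GB young generation); the method names are illustrative.

```java
final class PaymentHeapEstimate {
    /** Bytes allocated per second for order objects, scaled by an overhead factor. */
    static long bytesPerSecond(int ordersPerSecond, int bytesPerOrder, int overheadFactor) {
        return (long) ordersPerSecond * bytesPerOrder * overheadFactor;
    }

    /** Seconds until a young generation of the given size fills at the given allocation rate. */
    static long secondsToFill(long youngGenBytes, long allocBytesPerSecond) {
        return youngGenBytes / allocBytesPerSecond;
    }

    public static void main(String[] args) {
        System.out.println("orders only: " + bytesPerSecond(30, 500, 1) + " bytes/s");
        long oneMb = 1024L * 1024;           // allocation rate rounded up to 1 MB/s, as in the text
        long youngGen = 2L * 1024 * oneMb;   // 2 GB young generation
        long seconds = secondsToFill(youngGen, oneMb);
        System.out.println("minor GC roughly every " + seconds / 60 + " minutes");
    }
}
```

2 GB at 1 MB per second is 2048 seconds, about 34 minutes, which is where the "about 35 minutes" figure comes from.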

  1. Setting the metadata area

When a system has just launched there is little data to refer to; a few hundred MB is usually enough.

  1. Interview questions from big tech companies
  1. Under what circumstances is an object reclaimed?

A Young GC is triggered when the young generation is full and a Full GC when the old generation is full; objects are reclaimed at those times.

Objects unreachable from the GC Roots are reclaimed.

Softly referenced objects are reclaimed when memory is running short, typically during a Full GC.

Weakly referenced objects are reclaimed at any GC.
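The reference types mentioned above live in java.lang.ref. The sketch below is only a demonstration under the assumption that System.gc() is honored, which the JVM does not guarantee, so the last two lines of output may vary between runs. The one thing that always holds is that an object is never cleared while a strong reference to it is still in use.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

final class ReferenceDemo {
    /** A weakly referenced object is never cleared while a strong reference is still in use. */
    static boolean survivesWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // only a request; the JVM may ignore it
        return weak.get() == strong; // 'strong' is still used here, so the object stays reachable
    }

    public static void main(String[] args) {
        WeakReference<byte[]> weak = new WeakReference<>(new byte[1024]);
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]);
        System.gc();
        System.out.println("strongly held survives: " + survivesWhileStronglyHeld());
        System.out.println("weak cleared: " + (weak.get() == null));
        System.out.println("soft cleared: " + (soft.get() == null));
    }
}
```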

  1. Objects referenced by which variables cannot be reclaimed?

Objects referenced from GC Roots are not reclaimed. GC Roots include local variables and static variables of classes.

  1. What is the copying algorithm?

The copying algorithm suits the young generation, where a large number of objects need to be reclaimed.

Memory is split into two blocks, and surviving objects are moved from one block to the other.

The advantage is that it avoids memory fragmentation.

The disadvantage is that it wastes memory.

  1. Improving the copying algorithm

Because very few objects ultimately survive, perhaps only 1%, the two areas used for copying back and forth are made small, and the larger remaining area becomes Eden.

Eden takes 80% of the space and each of the two Survivor areas takes 10%. Overall utilization is 90%, with only 10% held back to store surviving objects.

On each collection, all live objects in Eden and in the Survivor area in use are copied to the other Survivor area.

After the collection, 90% of the space is free again and can continue to be used.

  1. When does the young generation trigger a Minor GC?

When Eden in the young generation runs out of space.

  1. When do young-generation objects enter the old generation?
  1. After surviving 15 Minor GCs

Set with the JVM parameter “-XX:MaxTenuringThreshold”. The default is 15, which is also the maximum, because the age field in the object header is 4 bits.

  1. Dynamic object age determination

If the total size of objects of age 1 through age n exceeds 50% of a Survivor area, all objects of age n and above enter the old generation.
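The dynamic age rule can be sketched as a small function. This is a simplified model of the rule as stated above, not the JVM's actual code; the array layout and names are assumptions made for the example.

```java
final class TenuringThreshold {
    /**
     * sizeByAge[i] is the total size of surviving objects whose age is i + 1.
     * Returns the age at or above which objects are promoted to the old generation.
     */
    static int compute(long[] sizeByAge, long survivorCapacity, int maxTenuringThreshold) {
        long desired = survivorCapacity / 2; // 50% of one Survivor area
        long total = 0;
        for (int age = 1; age <= sizeByAge.length && age <= maxTenuringThreshold; age++) {
            total += sizeByAge[age - 1];
            if (total > desired) {
                return age; // ages 1..age together exceed 50%, so this age and above are promoted
            }
        }
        return maxTenuringThreshold; // the rule never fired; fall back to the configured maximum
    }

    public static void main(String[] args) {
        long[] sizeByAgeMb = { 30, 30, 20 }; // hypothetical sizes for ages 1, 2 and 3
        System.out.println("promote objects of age >= " + compute(sizeByAgeMb, 100, 15));
    }
}
```

In the example, ages 1 and 2 together take 60 MB of a 100 MB Survivor area, so objects of age 2 and above are promoted even though the configured threshold is 15.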

  1. Large objects go straight to the old generation

Set with the JVM parameter "-XX:PretenureSizeThreshold". The default is 0, which means every object, whatever its size, is allocated in the young generation first.

  1. Surviving objects do not fit in the Survivor area after a Minor GC

  1. Old-generation space allocation guarantee

If the average size of objects surviving past Minor GCs is less than the contiguous free space in the old generation, only a Minor GC is run; otherwise a Full GC is run.

  1. New objects still do not fit after a Full GC

An OOM (out of memory) error is raised.

  1. When does the old generation trigger a Full GC?

Before a Minor GC:

With the space guarantee enabled, a Full GC is triggered if the contiguous free space in the old generation is less than the average size of objects promoted by past Minor GCs.

With the space guarantee disabled, a Full GC is triggered if the contiguous free space in the old generation is less than the total size of all young-generation objects.
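The check made before a Minor GC can be written as a small decision function. This is a sketch of the rule as described above, with hypothetical names and sizes in arbitrary units; it is not the JVM's internal implementation.

```java
final class PromotionGuarantee {
    enum Decision { MINOR_GC, FULL_GC }

    static Decision beforeMinorGc(long oldFree, long youngUsed, long avgPromoted, boolean guaranteeEnabled) {
        if (oldFree >= youngUsed) {
            return Decision.MINOR_GC; // even if every young object survives, the old generation can hold it
        }
        if (guaranteeEnabled && oldFree >= avgPromoted) {
            return Decision.MINOR_GC; // a calculated risk based on past promotion sizes
        }
        return Decision.FULL_GC;
    }

    public static void main(String[] args) {
        System.out.println(beforeMinorGc(500, 300, 100, false)); // plenty of room in the old generation
        System.out.println(beforeMinorGc(200, 300, 100, true));  // relies on the guarantee
        System.out.println(beforeMinorGc(50, 300, 100, true));   // guarantee cannot be met
    }
}
```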

 

After a Minor GC:

A Full GC is triggered if the contiguous free space in the old generation is less than the size of the surviving young-generation objects.

Under CMS, a Full GC is triggered when old-generation occupancy exceeds the configured ratio (92% by default).

  1. Which garbage collection algorithm does the old generation use?

Mark-compact: first mark the surviving objects, then move them to one end of memory, and finally clear the garbage. It is more than 10 times slower than a Minor GC.

  1. Goals of JVM tuning

Allocate and reclaim objects in the young generation as far as possible, and keep too many objects from frequently entering the old generation.

Avoid frequent garbage collection of the old generation.

Give the system enough memory to avoid frequent young-generation collections as well.

  1. Why the copying algorithm does not suit the old generation

The old generation holds long-lived objects, so each collection would have to move perhaps 90% of them as live objects, which is a poor fit.

  1. Why is old-generation garbage collection slow?

Few objects survive in the young generation, and they are simply copied into a Survivor area. Many objects survive in the old generation, and all of them have to be marked before the garbage is cleared.

  1. The young-generation ParNew garbage collector

Multiple garbage collection threads run in parallel; the collection is stop-the-world and uses the copying algorithm.

JVM parameter settings:

Enable: “-XX:+UseParNewGC”

Thread count: “-XX:ParallelGCThreads”. Usually left unset; the default matches the number of CPU cores.

  1. The old-generation CMS garbage collector

Multiple garbage collection threads run at the same time. Most of the work runs concurrently with the application, with stop-the-world pauses only in some phases. It uses the mark-sweep algorithm.

Liveness is determined from the GC Roots; objects that are not alive are marked as garbage and then reclaimed.

  1. Default number of threads

Equal to (number of CPU cores + 3) / 4

With 2 cores, that is 1 thread.
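The formula uses integer division, which is easy to miss. A minimal sketch with a hypothetical helper name:

```java
final class CmsThreads {
    /** Default number of CMS concurrent threads according to the formula in the text. */
    static int concurrentThreads(int cpuCores) {
        return (cpuCores + 3) / 4; // integer division
    }

    public static void main(String[] args) {
        for (int cores : new int[] { 2, 4, 8, 16 }) {
            System.out.println(cores + " cores -> " + concurrentThreads(cores) + " thread(s)");
        }
    }
}
```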

  1. Initial mark

Enter the STW state and mark objects starting from the GC Roots. Fast.

  1. Concurrent mark

Leave the STW state. The application keeps running and may create new objects or turn existing ones into garbage; the collector traces as many objects as it can.

  1. Remark

Enter the STW state again and re-mark the objects the application changed during the concurrent mark phase. Fast.

  1. Concurrent sweep

Leave the STW state; garbage objects are cleaned up concurrently with the application.

  1. Advantages

Phases 1 and 3 are simple marking and are fast.

During phases 2 and 4 the application keeps running, so their duration has little impact on it.

  1. Disadvantages
  1. CPU consumption

Although the application can run during phases 2 and 4, part of the CPU is occupied by garbage collection threads, which is costly in CPU resources.

Phase 2, concurrent mark, has to trace a large number of objects and takes a long time.

Phase 4, concurrent sweep, has to clean up a large amount of garbage and takes a long time.

  1. Concurrent collection failure

CMS starts collecting only when old-generation occupancy reaches a threshold, so part of the memory must be kept in reserve for the application to use during concurrent collection.

-XX:CMSInitiatingOccupancyFraction sets the old-generation occupancy at which CMS collection starts. The JDK 1.6 default is 92%, leaving only 8% for the application.

During the concurrent sweep phase that reserved space receives the objects newly promoted to the old generation. What if it is not enough?

The JVM falls back to Serial Old in place of CMS: it enters STW, re-traces all objects, re-marks them, cleans up, and only then releases STW.

  1. Memory fragmentation

Sweeping leaves a lot of memory fragmentation. Fixing it requires compacting memory, which needs STW.

-XX:+UseCMSCompactAtFullCollection is on by default and moves live objects together.

-XX:CMSFullGCsBeforeCompaction defaults to 0, meaning memory is compacted after every Full GC.

  1. Case study: e-commerce system
  1. Background

Normal days: 5 million daily active users and 500,000 orders, concentrated in 4 hours, an average of dozens of orders per second.

Big promotions: 1,000 orders per second.

3 machines with 4 cores and 8 GB each, each handling about 300 order requests per second.

  1. Memory usage model estimation

At 1 KB per order, 300 orders are 300 KB.

Each order carries additional objects (order line items, inventory, promotions, coupons), so scale up by 10 to 20 times.

Other operations such as order queries scale that up by another 10 times.

Memory allocated per second: 300 KB * 20 * 10 = 60,000 KB, about 60 MB

  1. Memory allocation

On a 4-core 8 GB machine, give 4 GB to the JVM.

Heap 3 GB: young generation 1.5 GB, old generation 1.5 GB.

Each thread stack is 1 MB; with a few hundred threads the virtual machine stacks take a few hundred MB.

Metadata area: a few hundred MB.

  1. JVM parameter configuration

-Xms3072M -Xmx3072M -Xmn1536M -Xss1M -XX:PermSize=256M -XX:MaxPermSize=256M -XX:SurvivorRatio=8

-XX:PermSize and -XX:MaxPermSize apply to JDK 7 and earlier; on JDK 8 and later the equivalents are -XX:MetaspaceSize and -XX:MaxMetaspaceSize.

  1. Program execution

At 60 MB per second, the 1.5 GB young generation fills up in about 25 s and a Minor GC runs. About 100 MB survives, namely the objects of the requests still being processed during that last second.

These 100 MB are placed in S1.

After roughly another 20 s, the surviving objects may well amount to 150 MB. Meanwhile, 100 MB already exceeds 50% of a Survivor area, so those objects are likely to be moved to the old generation.
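The allocation rate and Minor GC interval for this case can be checked with a short calculation. The sketch below uses the figures from the text and, like the text, treats 1 MB as 1,000 KB; the method names are illustrative.

```java
final class OrderMemoryModel {
    /** KB allocated per second: orders * size, scaled for related objects and for other requests. */
    static long kbPerSecond(int ordersPerSecond, int kbPerOrder, int relatedFactor, int otherFactor) {
        return (long) ordersPerSecond * kbPerOrder * relatedFactor * otherFactor;
    }

    /** Seconds until a young generation of the given size fills at the given allocation rate. */
    static double secondsToFill(double youngGenMb, double mbPerSecond) {
        return youngGenMb / mbPerSecond;
    }

    public static void main(String[] args) {
        long kb = kbPerSecond(300, 1, 20, 10);
        double mb = kb / 1000.0; // the text rounds 60,000 KB to 60 MB
        System.out.println(kb + " KB/s, about " + mb + " MB/s");
        System.out.println("1.5 GB young generation fills in about " + secondsToFill(1536, mb) + " s");
    }
}
```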

  1. Young-generation GC optimization, part one: Survivor space is too small

Two problems show that the Survivor area is clearly too small:

100 MB exceeds 50% of a Survivor area, so the objects must enter the old generation.

150 MB fills the whole Survivor area, so the objects must enter the old generation.

The next step is to adjust the sizes of the young and old generations.

In an ordinary business system like this, most objects are obviously short-lived. They should not enter the old generation frequently, and there is no need to reserve much memory for the old generation. The first priority is to keep objects in the young generation as far as possible.

So consider raising the young generation to 2 GB and lowering the old generation to 1 GB. Eden then becomes 1.6 GB and each Survivor area 200 MB.

With larger Survivor areas, it becomes far less likely that the objects surviving a young-generation GC do not fit in a Survivor area or exceed 50% of it.

This greatly lowers the probability of young-generation objects entering the old generation.

The JVM parameters are now:

“-Xms3072M -Xmx3072M -Xmn2048M -Xss1M -XX:PermSize=256M -XX:MaxPermSize=256M -XX:SurvivorRatio=8”

For any system, the first thing to optimize is exactly this: estimate the memory usage model, allocate memory sensibly, and try to keep the objects surviving each Minor GC in the Survivor area rather than letting them enter the old generation.

  1. How many collections should an object survive before entering the old generation?

-XX:MaxTenuringThreshold must be set according to the system's memory model. In general, the objects that should enter the old generation are those that need to exist for a long time,

such as beans annotated with @Service. These add up to a few dozen MB per system. Lower the threshold to 5 so that an object that has survived 5 collections moves to the old generation promptly.

The JVM parameters are now:

“-Xms3072M -Xmx3072M -Xmn2048M -Xss1M -XX:PermSize=256M -XX:MaxPermSize=256M -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=5”

  1. How large must an object be to go directly into the old generation?

Large objects can go directly into the old generation, because large objects tend to live and be used for a long time.

For example, the JVM process may cache some data. Whether to set this depends on whether your system creates large objects.

In general, 1 MB is enough, because objects larger than 1 MB are rare. If one exists, it is probably a large array or large List allocated up front to hold cached data.

The JVM parameters are now:

“-Xms3072M -Xmx3072M -Xmn2048M -Xss1M -XX:PermSize=256M -XX:MaxPermSize=256M -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=5 -XX:PretenureSizeThreshold=1M”

  1. Specify the garbage collectors

Do not forget to specify the garbage collectors: ParNew for the young generation and CMS for the old generation, with the following JVM parameters:

“-Xms3072M -Xmx3072M -Xmn2048M -Xss1M -XX:PermSize=256M -XX:MaxPermSize=256M -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=5 -XX:PretenureSizeThreshold=1M -XX:+UseParNewGC -XX:+UseConcMarkSweepGC”

  1. Steps for estimating the system's runtime model

How much memory is allocated per second?

How often is a Minor GC triggered?

How many objects typically survive a Minor GC?

Do they fit in a Survivor area?

Do objects frequently enter the old generation because the Survivor area cannot hold them?

Do objects enter the old generation because of the dynamic age rule?

  1. Old-generation optimization

Compact memory after every Full GC and leave the other settings at their defaults. A concurrent collection failure (Concurrent Mode Failure) is a low-probability event and can be ignored.

“-Xms3072M -Xmx3072M -Xmn2048M -Xss1M -XX:PermSize=256M -XX:MaxPermSize=256M -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=5 -XX:PretenureSizeThreshold=1M -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=92 -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0”

  1. The G1 garbage collector
  1. Pain points of ParNew + CMS

Both young-generation and old-generation collections involve STW pauses that suspend the system.

  1. G1 design ideas

The generations are no longer physically separate: a single garbage collector handles both the young and the old generation.

The heap is split into many regions of equal size, n of them young-generation regions and m old-generation regions.

You can set an expected pause time for garbage collection (for example, no more than 1 minute of STW within 1 hour).

  1. How garbage collection time is made controllable

G1 tracks the collection value of each region: how much garbage it holds and how long reclaiming it would take.

When collecting, it reclaims as much garbage as possible while staying within the expected pause target.

  1. A region can be assigned to either the young or the old generation

A free region can be assigned to the young generation or the old generation as needed.

  1. How to set G1's memory size

The questions: how many regions are there, and how large is each?

-XX:+UseG1GC selects G1. Just set the heap size for G1 (“-Xms”, “-Xmx”) and let G1 manage the rest.

Default number of regions: 2048

Region size = heap size / 2048
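The region-size rule can be sketched as below. Beyond the division by 2048 stated above, the sketch assumes the result is rounded down to a power of two and kept between 1 MB and 32 MB, which matches the documented range of -XX:G1HeapRegionSize in JDK 8 but is not spelled out in the text. The class and method names are hypothetical.

```java
final class G1RegionSize {
    static final long MB = 1024L * 1024;

    /** Approximate default G1 region size in bytes for a given heap size. */
    static long regionSize(long heapBytes) {
        long size = Math.max(heapBytes / 2048, MB); // aim for about 2048 regions, at least 1 MB each
        size = Long.highestOneBit(size);            // round down to a power of two
        return Math.min(size, 32 * MB);             // assumed upper bound of 32 MB
    }

    public static void main(String[] args) {
        System.out.println("4 GB heap -> " + regionSize(4096 * MB) / MB + " MB regions");
    }
}
```

For a 4 GB heap this gives 2 MB regions.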

Parameters:

-XX:G1HeapRegionSize: sets the region size

-XX:G1NewSizePercent: initial share of the heap for the young generation, default 5%

-XX:G1MaxNewSizePercent: maximum share of the heap for the young generation, default 60%

  1. Do Eden and Survivor still exist?

Young-generation regions are still divided into Eden and Survivor.

Different regions can serve as Eden, S1 or S2.

For example, 1000 regions may belong to Eden, 100 to S1 and 100 to S2.

  1. Controlling the GC pause time

-XX:MaxGCPauseMillis, default 200 ms

  1. When does an object enter the old generation?

On reaching age 15 (-XX:MaxTenuringThreshold)

Dynamic age determination: if objects up to age n exceed 50% of the Survivor capacity, objects of age n and above enter the old generation.

  1. What about large objects?

If an object is larger than half a region, it goes into a dedicated large-object region.

An empty region belongs to neither the young nor the old generation and can be used for large objects.

An object that is too large can span multiple regions.

Large objects are also reclaimed during young-generation and old-generation collections.
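The large-object rule amounts to two small checks, sketched below with hypothetical helper names: whether an object counts as large for a given region size, and how many contiguous regions it would span.

```java
final class LargeObjectCheck {
    static final long MB = 1024L * 1024;

    /** An object is treated as large when it exceeds half of a region. */
    static boolean isLarge(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    /** Number of regions needed to hold the object, rounded up. */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long region = 2 * MB; // region size for a 4 GB heap with 2048 regions
        System.out.println("1.5 MB object is large: " + isLarge(3 * MB / 2, region));
        System.out.println("5 MB object spans " + regionsNeeded(5 * MB, region) + " regions");
    }
}
```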

  1. When is a mixed collection of the young and old generations triggered?

-XX:InitiatingHeapOccupancyPercent, default 45%

When old-generation regions occupy 45% of the heap, a mixed collection is triggered.

  1. The G1 garbage collection process

Initial mark: STW; marks objects according to whether they are reachable from the GC Roots. Fast.

Concurrent mark: STW is released and the application keeps running while all objects are traced. This is the time-consuming phase.

Final mark: STW; decides whether the objects changed during the concurrent mark phase are garbage and marks them.

Mixed collection: STW; reclaims a subset of regions that fits within the pause target.

The mixed collection stage can be carried out in several batches. The number of batches is controlled by -XX:G1MixedGCCountTarget, default 8.

  1. Why collect in several batches?

Collecting everything at once takes too long.

Collecting in several batches keeps each STW pause short while still reaching the goal.

  1. Collection by the copying algorithm

Mixed collection is based on the copying algorithm. For both young and old regions, surviving objects are copied to a new region and the old region is then emptied. The advantage of the copying algorithm is that it leaves no memory fragmentation.

  1. Which regions are not collected?

A region in which live objects exceed 85% is not collected, because copying that many objects is too expensive.

“-XX:G1MixedGCLiveThresholdPercent”, default 85%

  1. When to stop collecting

Mixed collection stops once the space that could still be reclaimed falls below 5% of the heap, the default threshold.

-XX:G1HeapWastePercent, default 5%

  1. Collection failure: Full GC

If, while copying live objects during a collection, G1 finds no free region to hold them, the collection fails.

On failure, G1 immediately stops the application, then marks, sweeps and compacts with a single thread to free up a batch of regions. This process is extremely slow.

  1. Case study: education system with millions of users
  1. Background

Millions of users a day, concentrated in the 2 to 3 hours each evening when students attend class, plus weekends.

High-frequency interactive games.

  1. Load on the system

200,000 users per hour, each performing one operation per minute, that is 60 operations per hour.

In total, 200,000 users perform 12 million operations in 1 hour, about 3,000 operations per second, that is 3,000 requests per second.

Allocate 5 machines with 4 cores and 8 GB each; each handles 600 requests per second, 3,000 in total.

Each request creates a few objects taking a few KB of memory. Assuming 5 KB, 600 requests use about 3 MB.
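The load figures can be reproduced with a short calculation. The sketch uses the numbers from the text; the exact quotient is about 3,333 requests per second, which the text rounds down to 3,000. The method names are illustrative.

```java
final class LoadEstimate {
    /** Requests per second given hourly users and operations per user per hour. */
    static long requestsPerSecond(long usersPerHour, int opsPerUserPerHour) {
        return usersPerHour * opsPerUserPerHour / 3600;
    }

    /** KB allocated per second on one machine. */
    static long kbPerSecond(long requestsPerSecond, int kbPerRequest) {
        return requestsPerSecond * kbPerRequest;
    }

    public static void main(String[] args) {
        System.out.println("total: " + requestsPerSecond(200_000, 60) + " requests/s");
        // The text rounds to 3,000 requests/s, that is 600 per machine across 5 machines.
        System.out.println("per machine: " + kbPerSecond(600, 5) + " KB/s, about 3 MB/s");
    }
}
```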

  1. G1 memory layout

Give 4 GB to the heap; the young generation's default initial share is 5% and its maximum share is 60%.

Each Java thread's stack is 1 MB.

The metadata area (permanent generation) is 256 MB.

The JVM parameters are now:

“-Xms4096M -Xmx4096M -Xss1M -XX:PermSize=256M -XX:MaxPermSize=256M -XX:+UseG1GC”

“-XX:G1NewSizePercent” sets the initial young-generation share; leave it at the default of 5%.

“-XX:G1MaxNewSizePercent” sets the maximum young-generation share; leave it at the default of 60%.

A 4 GB heap divided into 2048 regions gives a region size of 2 MB.

  1. How to set the GC pause time?

-XX:MaxGCPauseMillis, default 200 ms, which means the target STW time is 200 ms.

  1. How often is a young-generation GC triggered?

Objects are allocated in Eden at 3 MB per second.

How many regions G1 gives the young generation, how often a young-generation GC runs and how long each one takes are all uncertain. They cannot be predicted in advance, so use tools to observe the system's actual behavior.

G1 assigns regions to the young generation according to your preset pause time, triggers a GC once usage reaches a certain level, and keeps the GC time within the preset range, avoiding collecting so many regions at once that the pause exceeds the target.

  1. How to optimize young-generation GC?

Garbage collectors keep getting smarter, so there is less for us to tune.

If the -XX:MaxGCPauseMillis pause target is set low, GCs are more frequent; if it is set high, GCs are less frequent but each pause is longer.

Choosing the value requires load-testing tools, GC logs and memory analysis tools, which are covered later. The aim is GCs that are not too frequent and pauses that are not too long.

  1. How to optimize mixed GC?

A mixed GC is triggered when the old generation occupies more than 45% of the heap.

Objects move from the young to the old generation when they do not fit in a Survivor area, when they reach the age threshold, or through dynamic age determination.

The core idea is the same as before: keep objects from entering the old generation too quickly.

The key lever is the -XX:MaxGCPauseMillis pause time.

Tune it to avoid overly frequent young-generation GCs, estimate how many objects survive each young-generation GC, and keep mixed GCs of the old generation from being triggered frequently.

How to tune this parameter ultimately depends on hands-on work with the tools covered later.

  1. Interview questions from big tech companies, part 2
  1. Minor GC / Young GC

They mean the same thing: a young-generation GC.

  1. Old GC

An old-generation GC.

  1. Full GC

Garbage collection of all memory areas: young generation, old generation and metadata area.

  1. Major GC

The term Major GC is used less often and is easily confused.

Some people use it to mean Old GC and others Full GC, so confirm with the speaker which one is meant.

  1. Mixed GC

Specific to G1; triggered when the old generation reaches 45% of the heap.

It collects both the young generation and the old generation.

  1. When do Young GC and Full GC happen?
  1. When Young GC is triggered

When Eden is full, a Young GC is triggered.

  1. When Old GC and Full GC are triggered

Before a Young GC: the old-generation space guarantee fails, meaning there is not enough contiguous memory for the average size promoted by previous collections. An Old GC is triggered.

After a Young GC: there is not enough contiguous memory for the surviving objects. An Old GC is triggered.

Under CMS: old-generation occupancy exceeds 92%. An Old GC is triggered.

In short, an Old GC is triggered whenever the old generation is short of space.

  1. Why is an Old GC often accompanied by a Young GC?

An Old GC may be triggered just before a Young GC, or right after one.

That is why an Old GC usually comes with a Young GC.

  1. How JVMs implement Full GC

In many JVM versions, once the conditions for an Old GC are met a Full GC is triggered, which collects the old generation, the young generation and the metadata area together.

  1. What happens when the metadata area is full?

The metadata area stores constants and class information and generally does not need collecting. If it fills up and nothing more fits, an OOM (out of memory) error is reported.

  1. Case study: BI system with 100,000 concurrent requests
  1. Business background

Collect merchants' business data, analyze it, and present reports that guide their operations.

Data is collected, analyzed with Spark, Flink and Hadoop, and stored in MySQL, ES and HBase.

The BI system is written in Java; it reads the stored data and displays the reports.

It was first deployed on 4-core 8 GB machines, with Eden at 1 GB and S1 and S2 at 100 MB each.

  1. Technical pain points

Merchants call the BI system every few seconds to fetch real-time data and refresh their reports.

With tens of thousands of merchants and thousands online at once, estimate 500 requests per second.

Each request loads a lot of data, processes it in memory and returns it to the front end.

Estimating 100 KB of memory per request, 500 requests are 50 MB of data.

  1. Early frequent Young GCs

At 50 MB per second, the 1 GB Eden fills in about 20 s, so Young GCs are frequent.

A Young GC takes a few dozen ms and leaves only a few MB to a few dozen MB of surviving objects.

What you observe is one Young GC every 20 s with a pause of a few dozen ms, which has little effect on users.

  1. Business growth to 100,000 concurrent requests

The BI system is memory-hungry, so use 16-core 32 GB machines. Each can handle several thousand concurrent requests, so 20 to 30 machines suffice.

With 32 GB of memory, give the young generation at least 20 GB: Eden 16 GB, S1 and S2 1 GB each.

With several thousand requests, hundreds of MB are allocated per second, so Eden fills in tens of seconds and triggers a Young GC.

But each Young GC has a lot of memory to reclaim and may stop the system for hundreds of ms or even 1 s, so pauses grow longer.

With long STW pauses, front-end requests easily time out.

  1. Using G1 to optimize large heaps

G1 lets you set a pause target such as 100 ms. It manages the time itself, choosing how many regions each Young GC collects. Young GCs become more frequent, but the impact on the system is small.

  1. Case study: a real-time analysis engine over 10 billion records with frequent Full GC
  1. Business background

Each task extracts about 10,000 records and computes over them in memory; one computation takes about 10 seconds on average. Each machine is a 4-core, 8 GB box, the JVM is given 4 GB, and the young and old generations get 1.5 GB each.

Each record is about 1 KB, so 10,000 records are about 10 MB.

The young generation is split 8:1:1 between Eden and the two Survivor spaces, so Eden is roughly 1.2 GB and each Survivor space is on the order of 100 MB.

Each computation task allocates about 10 MB of objects in Eden, and about 100 tasks run per minute, so after roughly one minute Eden is almost full.

  1. How many objects enter the old generation when a Minor GC triggers

When the ygc runs, 80 tasks have finished and 20 are still executing, so about 200 MB cannot be reclaimed. The Survivor space cannot hold that much, so it goes straight to the old generation.

  1. How long does the system run before the old generation fills up?

200 MB enters the old generation every minute and the old generation is 1.5 GB, so after about 7 minutes it is almost full. At the next ygc the pre-GC guarantee check finds only about 100 MB free against an expected 200 MB of promotion, which triggers a Full GC.

So this system triggers a Full GC every 7 or 8 minutes.
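The reasoning of this case can be written down as a small model. This is only a sketch under the assumptions stated in the text (20 tasks in flight at ygc time, 10 MB per task, a 100 MB Survivor space, a 1.5 GB old generation); the names are made up for illustration, and the "all or nothing" promotion is a simplification of what a real collector does.

```java
// Simplified model of the case: how much is promoted per young GC,
// and how many young GCs the old generation can absorb.
final class OldGenFillEstimate {

    /** Data still referenced at young-GC time: tasks in flight x data per task. */
    static long liveAtYoungGcMb(int tasksInFlight, long mbPerTask) {
        return tasksInFlight * mbPerTask;
    }

    /** Simplification: survivors that do not fit into one Survivor space are all promoted. */
    static long promotedMb(long liveMb, long survivorMb) {
        return liveMb > survivorMb ? liveMb : 0;
    }

    /** Number of young GCs until the old generation is full. */
    static double youngGcsUntilOldFull(long oldGenMb, long promotedPerGcMb) {
        return (double) oldGenMb / promotedPerGcMb;
    }

    public static void main(String[] args) {
        long live = liveAtYoungGcMb(20, 10);         // 20 running tasks x 10 MB
        long promoted = promotedMb(live, 100);       // Survivor space of about 100 MB
        double gcs = youngGcsUntilOldFull(1536, promoted);   // 1.5 GB old generation
        System.out.printf("promoted per ygc: %d MB, old generation full after %.1f ygcs%n",
                promoted, gcs);
    }
}
```

With one ygc per minute, the result lines up with the 7 to 8 minutes between Full GCs described above.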

  1. Optimizing this case

First, make sure the Survivor space can hold the surviving objects.

With a 3 GB heap, give 1 GB to Eden and 200 MB to each Survivor space, so the objects that survive each ygc just fit.

This reduces the Full GC frequency from once every few minutes to once every few hours.

  1. The load grows another 10 times

Memory is now consumed at 100 MB per second. A 1.6 GB young generation triggers a ygc every 10-odd seconds, and since a batch of data takes 10 s to process, about 1 GB of objects cannot be reclaimed at that point.

So every 10-odd seconds about 1 GB of data enters the old generation, and Full GC may trigger several times a minute.

  1. Optimize with large-memory machines

Upgrade from 4-core, 8 GB machines to 16-core, 32 GB machines.

With 10 times the memory, Eden gets 16 GB and each Survivor space 2 GB. At 100 MB per second, a ygc happens every two to three minutes.

Each ygc leaves less than 1 GB of surviving objects, which fit in the Survivor space, so the Full GC frequency drops as well.

This system is not user-facing, so G1 is not needed; even a 1 s pause per GC has no real impact.

  1. Hands-on: simulating a ygc
  1. Parameter configuration

JDK 1.8

-XX:NewSize=5M -XX:MaxNewSize=5M -XX:InitialHeapSize=10M -XX:MaxHeapSize=10M -XX:SurvivorRatio=8 -XX:PretenureSizeThreshold=10M -XX:+UseParNewGC -XX:+UseConcMarkSweepGC

Above, “-XX:InitialHeapSize” and “-XX:MaxHeapSize” are the initial and maximum heap size, “-XX:NewSize” and “-XX:MaxNewSize” are the initial and maximum young-generation size, and “-XX:PretenureSizeThreshold=10485760” sets the large-object threshold to 10 MB.

The heap gets 10 MB and the young generation 5 MB, of which Eden takes 4 MB and each Survivor space 0.5 MB. Only objects larger than 10 MB go directly into the old generation. The young generation uses the ParNew collector and the old generation uses the CMS collector.

Printing the JVM's GC log

-XX:+PrintGCDetails: print a detailed GC log

-XX:+PrintGCTimeStamps: print the time at which each GC happened

-Xloggc:gc.log: write the GC log to a file on disk

The full JVM parameters, with sizes written in bytes

-XX:NewSize=5242880 -XX:MaxNewSize=5242880 -XX:InitialHeapSize=10485760 -XX:MaxHeapSize=10485760 -XX:SurvivorRatio=8 -XX:PretenureSizeThreshold=10485760 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log

  1. Code combat

public class YGCDemo {

    public static void main(String[] args) {

        byte[] arr = new byte[1024 * 1024];

        arr = new byte[1024 * 1024];

        arr = new byte[1024 * 1024];

        arr = null;

        byte[] arr2 = new byte[2*1024 * 1024];

    }

}

  1. Object allocation analysis

Eden 4 MB, S1 0.5 MB, S2 0.5 MB, old generation 5 MB

byte[] arr = new byte[1024 * 1024];    allocates 1 MB in Eden

arr = new byte[1024 * 1024];         allocates another 1 MB in Eden

arr = new byte[1024 * 1024];     allocates another 1 MB in Eden

arr = null;                         all three arrays are now garbage

byte[] arr2 = new byte[2 * 1024 * 1024];   Eden has only about 1 MB left at this point, so the 2 MB array does not fit and a ygc is triggered
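Besides reading the GC log, the allocation sequence above can be checked from inside the program. The sketch below repeats the same allocations and reads collection counts through the standard `java.lang.management` API. The class name is made up for illustration, and whether a collection is actually reported depends on the heap flags used to start the JVM: with a default-sized heap these few megabytes are unlikely to trigger one.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: count collections around the demo allocations (hypothetical class name).
final class GcCountProbe {

    /** Sum of collection counts over all collectors; undefined counts (-1) are skipped. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = bean.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalGcCount();

        // Same allocation sequence as in the demo class.
        byte[] arr = new byte[1024 * 1024];
        arr = new byte[1024 * 1024];
        arr = new byte[1024 * 1024];
        arr = null;
        byte[] arr2 = new byte[2 * 1024 * 1024];

        long after = totalGcCount();
        System.out.println("collections during the allocations: " + (after - before));
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(bean.getName() + ": " + bean.getCollectionCount());
        }
    }
}
```

Run it with the same heap flags as the demo to see the per-collector counts next to the log output.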

  1. GC log analysis

javac YGCDemo.java

java -XX:NewSize=5242880 -XX:MaxNewSize=5242880 -XX:InitialHeapSize=10485760 -XX:MaxHeapSize=10485760 -XX:SurvivorRatio=8 -XX:PretenureSizeThreshold=10485760 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log YGCDemo

Original log :

0.100: [GC (Allocation Failure)

0.101: [ParNew: 4087K->512K(4608K), 0.0012076 secs]

4087K->665K(9728K), 0.0013630 secs]

 [Times: user=0.00 sys=0.00, real=0.00 secs]

Heap

 par new generation   total 4608K, used 1577K [0x00000000ff600000, 0x00000000ffb00000, 0x00000000ffb00000)

  eden space 4096K,  26% used [0x00000000ff600000, 0x00000000ff70a558, 0x00000000ffa00000)

  from space 512K, 100% used [0x00000000ffa80000, 0x00000000ffb00000, 0x00000000ffb00000)

  to   space 512K,   0% used [0x00000000ffa00000, 0x00000000ffa00000, 0x00000000ffa80000)

 concurrent mark-sweep generation total 5120K, used 153K [0x00000000ffb00000, 0x0000000100000000, 0x0000000100000000)

 Metaspace       used 2696K, capacity 4486K, committed 4864K, reserved 1056768K

  class space    used 297K, capacity 386K, committed 512K, reserved 1048576K

Log analysis:

CommandLine flags

When this line is present at the top of the log, it lists the JVM parameters in effect.

0.100: [GC (Allocation Failure)

A GC happened 0.100 s after the JVM started; the cause was a failed memory allocation.

0.101: [ParNew: 4087K->512K(4608K), 0.0012076 secs]

A young-generation GC by ParNew that took 0.0012076 s. 4608K is 4.5 MB, the usable space of the young generation (Eden plus one Survivor space). 4087K was in use before the GC, and 512K survived it.

4087K->665K(9728K), 0.0013630 secs]

The state of the whole heap: 4087K in use before the GC, 665K after, 9728K usable heap in total; the GC took about 1.4 ms overall.

 [Times: user=0.00 sys=0.00, real=0.00 secs]

The time this GC took. It is in the millisecond range, so when shown in seconds it rounds to almost 0.
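The fields explained above can also be pulled out of a log line mechanically. The following is a minimal sketch that parses only the `ParNew: before->after(capacity), time secs` fragment shown in this log; the class name and the returned array layout are assumptions made for the example, and other log formats (for instance a line with `promotion failed`) are deliberately not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: extract the numbers from a ParNew log fragment (hypothetical helper).
final class ParNewLineParser {

    // Matches e.g. "ParNew: 4087K->512K(4608K), 0.0012076 secs"
    private static final Pattern PAR_NEW = Pattern.compile(
            "ParNew: (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs");

    /** Returns {usedBeforeKb, usedAfterKb, capacityKb, pauseMicros}, or null if nothing matches. */
    static long[] parse(String line) {
        Matcher m = PAR_NEW.matcher(line);
        if (!m.find()) {
            return null;
        }
        long pauseMicros = Math.round(Double.parseDouble(m.group(4)) * 1_000_000);
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3)),
            pauseMicros
        };
    }

    public static void main(String[] args) {
        long[] r = parse("0.101: [ParNew: 4087K->512K(4608K), 0.0012076 secs]");
        System.out.println("before=" + r[0] + "K after=" + r[1] + "K capacity=" + r[2]
                + "K pause=" + r[3] + "us");
    }
}
```

The difference `before - after` is the amount reclaimed from the young generation by this ygc.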

  1. GC execution process

Step 1

In principle 3 MB is 3072K, so why does the log show 4087K?

When the JVM stores an array it attaches extra information (the object header), so each array actually takes slightly more than 1 MB.

There are also objects in Eden that we did not create ourselves and cannot see in our code.

So before the GC, the three arrays plus those other objects together occupy 4087K.

Step 2

The program then allocates the 2 MB array. Eden has no room, which triggers the GC with cause Allocation Failure.

This line appears in the GC log:

0.101: [ParNew: 4087K->512K(4608K), 0.0012076 secs]

After the GC, 512K of live objects remain; they are moved from Eden into the Survivor From space.

  1. Memory state after the GC

Heap

 par new generation   total 4608K, used 1577K [0x00000000ff600000, 0x00000000ffb00000, 0x00000000ffb00000)

  eden space 4096K,  26% used [0x00000000ff600000, 0x00000000ff70a558, 0x00000000ffa00000)

  from space 512K, 100% used [0x00000000ffa80000, 0x00000000ffb00000, 0x00000000ffb00000)

  to   space 512K,   0% used [0x00000000ffa00000, 0x00000000ffa00000, 0x00000000ffa80000)

 concurrent mark-sweep generation total 5120K, used 153K [0x00000000ffb00000, 0x0000000100000000, 0x0000000100000000)

 Metaspace       used 2696K, capacity 4486K, committed 4864K, reserved 1056768K

  class space    used 297K, capacity 386K, committed 512K, reserved 1048576K

The JVM prints this heap usage summary when it exits.

par new generation   total 4608K, used 1577K

The young generation managed by the ParNew collector has 4608K (4.5 MB) usable in total, of which 1577K is in use.

Why is 1577K in use?

After the ygc the Survivor space holds 512K, and Eden holds about 1 MB of newly allocated data, about 1.5 MB in total.

A few extra objects take up some tens of KB on top of that.

 eden space 4096K,  26% used

26% of the 4 MB Eden is in use.

from space 512K, 100%

The From space is 100% used.

 concurrent mark-sweep generation total 5120K, used 153K

The old generation managed by CMS is 5 MB, of which 153K is in use.

 Metaspace       used 2696K, capacity 4486K, committed 4864K, reserved 1056768K

 class space    used 297K, capacity 386K, committed 512K, reserved 1048576K

Metaspace and class space store class information, constant pools and so on; these lines show their capacity and current usage.

  1. Hands-on: simulating objects entering the old generation (dynamic age rule)
  1. When objects enter the old generation

An object that survives 15 GCs reaches age 15 and is promoted to the old generation.

Dynamic age rule: if the total size of the objects of age 1 + age 2 + age 3 + ... + age n in the Survivor space exceeds 50% of the Survivor space, objects of age n and above are promoted to the old generation without waiting for age 15.

If too many objects survive a Young GC to fit into the Survivor space, they go directly into the old generation.

Large objects go directly into the old generation.
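The dynamic age rule is the least obvious of the four, so here is a sketch of it as described above. It is a simplified model, not HotSpot's code: the class and parameter names are invented, sizes are passed in as a per-age array, and the 50% figure is taken as a parameter (it is assumed here to correspond to the `-XX:TargetSurvivorRatio` setting).

```java
// Simplified model of the dynamic age rule (hypothetical names).
final class DynamicAgeRule {

    /**
     * @param bytesByAge bytesByAge[a] = total size of Survivor objects of age a (index 0 unused)
     * @return the age from which objects are promoted at the next young GC
     */
    static int tenuringThreshold(long[] bytesByAge, long survivorBytes,
                                 int targetSurvivorPercent, int maxTenuringThreshold) {
        long limit = survivorBytes * targetSurvivorPercent / 100;
        long cumulative = 0;
        for (int age = 1; age < bytesByAge.length && age < maxTenuringThreshold; age++) {
            cumulative += bytesByAge[age];
            if (cumulative > limit) {
                return age;   // ages 1..age together exceed the limit
            }
        }
        return maxTenuringThreshold;   // limit never exceeded: fall back to the fixed age
    }

    public static void main(String[] args) {
        long kb = 1024L;
        // 700 KB of age-1 objects in a 1 MB Survivor space: more than 50%.
        long[] sizes = {0, 700 * kb};
        System.out.println(tenuringThreshold(sizes, 1024 * kb, 50, 15));
    }
}
```

A returned threshold of 1 means that every object already sitting in the Survivor space is promoted at the next young GC, which is the situation the following experiment sets up.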

  1. Example JVM Parameters

-XX:NewSize=10485760 -XX:MaxNewSize=10485760 -XX:InitialHeapSize=20971520 -XX:MaxHeapSize=20971520 -XX:SurvivorRatio=8  -XX:MaxTenuringThreshold=15 -XX:PretenureSizeThreshold=10485760 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log

Eden 8 MB, S1 1 MB, S2 1 MB, old generation 10 MB, total heap 20 MB; objects larger than 10 MB count as large objects and go straight to the old generation

-XX:MaxTenuringThreshold=15: objects are promoted to the old generation at age 15

  1. Sample code for the dynamic age rule

public class DynamicAgeToOldRegion {

    public static void main(String[] args) {

        byte[] arr = new byte[2 * 1024 * 1024];

        arr = new byte[2 * 1024 * 1024];

        arr = new byte[2 * 1024 * 1024];

        arr = null;

        byte[] arr2 = new byte[128 * 1024];

        byte[] arr3 = new byte[2 * 1024 * 1024];

    }

}

  1. GC log analysis

CommandLine flags: -XX:InitialHeapSize=20971520 -XX:MaxHeapSize=20971520 -XX:MaxNewSize=10485760 -XX:MaxTenuringThreshold=15 -XX:NewSize=10485760 -XX:OldPLABSize=16 -XX:PretenureSizeThreshold=10485760 -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:SurvivorRatio=8 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:-UseLargePagesIndividualAllocation -XX:+UseParNewGC

0.136: [GC (Allocation Failure)

0.136: [ParNew: 6257K->709K(9216K), 0.0014081 secs] 6257K->2759K(19456K), 0.0017086 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

Heap

 par new generation   total 9216K, used 5170K [0x00000000fec00000, 0x00000000ff600000, 0x00000000ff600000)

  eden space 8192K,  54% used [0x00000000fec00000, 0x00000000ff05b5c8, 0x00000000ff400000)

  from space 1024K,  69% used [0x00000000ff500000, 0x00000000ff5b14f0, 0x00000000ff600000)

  to   space 1024K,   0% used [0x00000000ff400000, 0x00000000ff400000, 0x00000000ff500000)

 concurrent mark-sweep generation total 10240K, used 2050K [0x00000000ff600000, 0x0000000100000000, 0x0000000100000000)

 Metaspace       used 3484K, capacity 4498K, committed 4864K, reserved 1056768K

  class space    used 387K, capacity 390K, committed 512K, reserved 1048576K

Eden 8 MB, S1 1 MB, S2 1 MB, old generation 10 MB, total heap 20 MB; objects larger than 10 MB go straight to the old generation

byte[] arr = new byte[2 * 1024 * 1024];    allocates 2 MB

arr = new byte[2 * 1024 * 1024];   allocates 2 MB

arr = new byte[2 * 1024 * 1024];         allocates 2 MB

arr = null;                           all 6 MB become garbage

byte[] arr2 = new byte[128 * 1024];       allocates 128 KB

byte[] arr3 = new byte[2 * 1024 * 1024];

This line asks for another 2 MB. Eden is 8 MB in total and 6 MB + 128 KB is already in use, so the remaining space is not enough and a GC must happen.

0.136: [GC (Allocation Failure)

The ParNew GC information follows: the young generation has 9216K usable, 6257K was in use, and 709K remains after collection (unknown objects + the 128 KB array).

The heap as a whole had 6257K in use and 2759K after collection.

0.136: [ParNew: 6257K->709K(9216K), 0.0014081 secs] 6257K->2759K(19456K), 0.0017086 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

The From space is 69% used: all objects that survived this one GC were moved there.

from space 1024K,  69% used [0x00000000ff500000, 0x00000000ff5b14f0, 0x00000000ff600000)

Eden is 54% used: the newly allocated 2 MB of arr3 plus unknown objects.

eden space 8192K,  54% used [0x00000000fec00000, 0x00000000ff05b5c8, 0x00000000ff400000)

How old are the roughly 700 KB of objects now in the Survivor From space? Age 1.

Each GC an object survives increases its age by 1. The Survivor space is 1 MB in total and already holds about 700 KB of live objects, which is clearly more than 50%.

  1. Extending the code to push objects into the old generation

byte[] arr = new byte[2 * 1024 * 1024];

arr = new byte[2 * 1024 * 1024];

arr = new byte[2 * 1024 * 1024];

arr = null;

byte[] arr2 = new byte[128 * 1024];

byte[] arr3 = new byte[2 * 1024 * 1024]; // Trigger ygc  

arr3 = new byte[2 * 1024 * 1024];

arr3 = new byte[2 * 1024 * 1024];

arr3 = new byte[128 * 1024];

arr3 = null;

byte[] arr4 = new byte[2 * 1024 * 1024];

We need to trigger a second Young GC and then see whether the dynamic age rule takes effect in the Survivor space.

byte[] arr3 = new byte[2 * 1024 * 1024];   // triggers the first ygc, 2 MB

arr3 = new byte[2 * 1024 * 1024];        // 2 MB

arr3 = new byte[2 * 1024 * 1024];        // 2 MB

arr3 = new byte[128 * 1024];            // 128 KB

arr3 = null;                           // everything arr3 pointed to is now garbage

byte[] arr4 = new byte[2 * 1024 * 1024];   // triggers the second ygc

There is no room left for this 2 MB array, so a ygc is triggered.

  1. GC log analysis for the second version

Compile with javac, then run:

java -XX:NewSize=10485760 -XX:MaxNewSize=10485760 -XX:InitialHeapSize=20971520 -XX:MaxHeapSize=20971520 -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=15 -XX:PretenureSizeThreshold=10485760 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log DynamicAgeToOldRegion

0.098: [GC (Allocation Failure) 0.098: [ParNew: 7419K->812K(9216K), 0.0013814 secs] 7419K->812K(19456K), 0.0018869 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

0.100: [GC (Allocation Failure) 0.100: [ParNew: 7116K->0K(9216K), 0.0021751 secs] 7116K->790K(19456K), 0.0022129 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

Heap

 par new generation   total 9216K, used 2212K [0x00000000fec00000, 0x00000000ff600000, 0x00000000ff600000)

  eden space 8192K,  27% used [0x00000000fec00000, 0x00000000fee290e0, 0x00000000ff400000)

  from space 1024K,   0% used [0x00000000ff400000, 0x00000000ff400000, 0x00000000ff500000)

  to   space 1024K,   0% used [0x00000000ff500000, 0x00000000ff500000, 0x00000000ff600000)

 concurrent mark-sweep generation total 10240K, used 790K [0x00000000ff600000, 0x0000000100000000, 0x0000000100000000)

 Metaspace       used 2696K, capacity 4486K, committed 4864K, reserved 1056768K

  class space    used 297K, capacity 386K, committed 512K, reserved 1048576K

At the second GC the dynamic age rule applies, and the surviving objects are all promoted to the old generation: 790K in total.

  1. Hands-on: simulating objects entering the old generation (Survivor space too small)
  1. Code

byte[] arr = new byte[2 * 1024 * 1024];

arr = new byte[2 * 1024 * 1024];

arr = new byte[2 * 1024 * 1024];

byte[] arr2 = new byte[128 * 1024];

arr2=null;

byte[] arr3 = new byte[2 * 1024 * 1024];

Compile with javac, then run:

java -XX:NewSize=10485760 -XX:MaxNewSize=10485760 -XX:InitialHeapSize=20971520 -XX:MaxHeapSize=20971520 -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=15 -XX:PretenureSizeThreshold=10485760 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log DynamicAgeToOldRegion

  1. GC log

0.095: [GC (Allocation Failure) 0.095: [ParNew: 7419K->688K(9216K), 0.0011852 secs] 7419K->2738K(19456K), 0.0014580 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

Heap

 par new generation   total 9216K, used 2819K [0x00000000fec00000, 0x00000000ff600000, 0x00000000ff600000)

  eden space 8192K,  26% used [0x00000000fec00000, 0x00000000fee14930, 0x00000000ff400000)

  from space 1024K,  67% used [0x00000000ff500000, 0x00000000ff5ac2e0, 0x00000000ff600000)

  to   space 1024K,   0% used [0x00000000ff400000, 0x00000000ff400000, 0x00000000ff500000)

 concurrent mark-sweep generation total 10240K, used 2050K [0x00000000ff600000, 0x0000000100000000, 0x0000000100000000)

 Metaspace       used 2696K, capacity 4486K, committed 4864K, reserved 1056768K

  class space    used 297K, capacity 386K, committed 512K, reserved 1048576K

  1. Code and log analysis

byte[] arr = new byte[2 * 1024 * 1024];  // allocates 2 MB in Eden; garbage so far: 0 MB

arr = new byte[2 * 1024 * 1024];       // allocates 2 MB in Eden; garbage so far: 2 MB

arr = new byte[2 * 1024 * 1024];       // allocates 2 MB in Eden; garbage so far: 4 MB

byte[] arr2 = new byte[128 * 1024];     // allocates 128 KB in Eden; garbage so far: 4 MB

arr2=null;                          // garbage is now 4 MB + 128 KB, live data 2 MB, unknown objects about 500 KB; Eden is 8 MB in total

The next line asks for another 2 MB. There is not enough space, so a GC is triggered.

byte[] arr3 = new byte[2 * 1024 * 1024];

After the GC the survivors are the 2 MB array plus about 500 KB of unknown objects, which the Survivor space cannot hold in full.

The new 2 MB array goes into Eden, the surviving 2 MB array goes into the old generation, and the roughly 500 KB of unknown objects go into the Survivor space.

par new generation   total 9216K, used 2819K

concurrent mark-sweep generation total 10240K, used 2050K

from space 1024K,  67% used

So when the Survivor space is too small, not all surviving objects go to the old generation: the unknown objects stayed in the Survivor space, while our program's surviving array was promoted to the old generation.

  1. Hands-on: the JVM's Full GC log
  1. Code

public class FullGCDemo {

    public static void main(String[] args) {

        byte[] arr1 = new byte[4 * 1024 * 1024];

        arr1 = null;

        byte[] arr2 = new byte[2 * 1024 * 1024];

        byte[] arr3 = new byte[2 * 1024 * 1024];

        byte[] arr4 = new byte[2 * 1024 * 1024];

        byte[] arr5 = new byte[128 * 1024];

        byte[] arr6 = new byte[2 * 1024 * 1024];

    }

}

  1. JVM parameters used

java -XX:NewSize=10485760 -XX:MaxNewSize=10485760 -XX:InitialHeapSize=20971520 -XX:MaxHeapSize=20971520 -XX:SurvivorRatio=8  -XX:MaxTenuringThreshold=15 -XX:PretenureSizeThreshold=3145728 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log FullGCDemo

“-XX:PretenureSizeThreshold=3145728” is the key parameter: it sets the large-object threshold, so objects larger than 3 MB go directly into the old generation.

  1. GC log

0.099: [GC (Allocation Failure)

0.099: [ParNew (promotion failed): 7419K->8240K(9216K), 0.0024836 secs]

0.101: [CMS: 8194K->6915K(10240K), 0.0025028 secs] 11515K->6915K(19456K),

[Metaspace: 2689K->2689K(1056768K)], 0.0054504 secs] [Times: user=0.14 sys=0.02, real=0.01 secs]

Heap

 par new generation   total 9216K, used 2130K [0x00000000fec00000, 0x00000000ff600000, 0x00000000ff600000)

  eden space 8192K,  26% used [0x00000000fec00000, 0x00000000fee14930, 0x00000000ff400000)

  from space 1024K,   0% used [0x00000000ff500000, 0x00000000ff500000, 0x00000000ff600000)

  to   space 1024K,   0% used [0x00000000ff400000, 0x00000000ff400000, 0x00000000ff500000)

 concurrent mark-sweep generation total 10240K, used 6915K [0x00000000ff600000, 0x0000000100000000, 0x0000000100000000)

 Metaspace       used 2696K, capacity 4486K, committed 4864K, reserved 1056768K

  class space    used 297K, capacity 386K, committed 512K, reserved 1048576K

  1. GC log analysis

Currently Eden is 8 MB, each Survivor space 1 MB, and the old generation 10 MB.

byte[] arr1 = new byte[4 * 1024 * 1024];  // larger than 3 MB, goes straight to the old generation

arr1 = null;                          // the old generation now holds 4 MB of garbage

byte[] arr2 = new byte[2 * 1024 * 1024];  // 2 MB into Eden

byte[] arr3 = new byte[2 * 1024 * 1024];  // 2 MB into Eden

byte[] arr4 = new byte[2 * 1024 * 1024];  // 2 MB into Eden

byte[] arr5 = new byte[128 * 1024];   // 128 KB into Eden

byte[] arr6 = new byte[2 * 1024 * 1024];  // 2 MB does not fit into Eden, triggers a ygc

First comes the old-generation guarantee check: the old generation has 6 MB free and the average promotion size of previous ygcs is 0, so no old-generation GC is triggered yet.

Then the ygc starts. It finds that the surviving objects (the log shows 8240K still in the young generation) do not fit into the Survivor space, so they have to be promoted to the old generation, but the 6 MB free there is not enough either.

0.099: [ParNew (promotion failed): 7419K->8240K(9216K), 0.0024836 secs]

So an old-generation GC is triggered: usage goes from about 8 MB to just under 7 MB.

0.101: [CMS: 8194K->6915K(10240K), 0.0025028 secs] 11515K->6915K(19456K),

The 6 MB + 128 KB of live objects from Eden, which the Survivor space cannot hold, are copied into the old generation, Eden is cleared, and the new 2 MB array is then placed in Eden.

  1. Two other Full GC scenarios

One: before a Young GC is triggered, the free space in the old generation may be smaller than the average size of the objects promoted by previous Young GCs; in that case a Full GC is triggered ahead of the Young GC.

The other: old-generation usage reaches the 92% threshold, which also triggers a Full GC.
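The two extra trigger conditions are simple comparisons, sketched below. This models the rules exactly as stated in the text and nothing more; it is not the collector's real decision code, and the class and method names are made up for the example.

```java
// Simplified model of the two additional Full GC triggers described above.
final class FullGcTriggerRules {

    /** Rule 1: the old generation has less free space than the average promotion of past young GCs. */
    static boolean guaranteeFails(long oldFreeKb, long avgPromotedKb) {
        return oldFreeKb < avgPromotedKb;
    }

    /** Rule 2: old-generation occupancy has reached the threshold (92% in the text). */
    static boolean occupancyReached(long oldUsedKb, long oldCapacityKb, int thresholdPercent) {
        return oldUsedKb * 100 >= oldCapacityKb * thresholdPercent;
    }

    public static void main(String[] args) {
        // Earlier case study: about 100 MB free, about 200 MB promoted per young GC.
        System.out.println(guaranteeFails(100 * 1024, 200 * 1024));
        // The demo above after the Full GC: 6915K used of 10240K.
        System.out.println(occupancyReached(6915, 10240, 92));
    }
}
```

Integer arithmetic is used for the percentage check so that the comparison does not depend on floating-point rounding.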

  1. Using the jstat tool

jstat shows the overall running state of the JVM.

  1. Command

jstat -gc -h5 PID <interval in milliseconds> <number of samples>

PID is the process ID.

Use jps to look up the process ID.

For example, jstat -gc 21664 10000 prints one line of data every 10 s.

-h5 prints the column header again every 5 rows.

  1. Column descriptions

S0C: size of the From Survivor space

S1C: size of the To Survivor space

S0U: memory currently used in the From Survivor space

S1U: memory currently used in the To Survivor space

EC: size of Eden

EU: memory currently used in Eden

OC: size of the old generation

OU: memory currently used in the old generation

MC: size of the method area (permanent generation, Metaspace)

MU: memory currently used in the method area (permanent generation, Metaspace)

YGC: number of Young GCs since the system started

YGCT: total time spent in Young GC

FGC: number of Full GCs since the system started

FGCT: total time spent in Full GC

GCT: total time spent in all GCs

  1. Case study

With the following JVM parameters

-Xmx1024m -Xms1024m -Xmn512m -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=15 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log

the memory is laid out as follows

heap    1024m

young  512m

eden   410m

s0      51.2m

s1      51.2m

old     512m

S0C     S1C     S0U   S1U       EC         EU      OC       OU      MC      MU      CCSC   CCSU     YGC   YGCT    FGC   FGCT    GCT

52416   52416   0.0   34068.4   419456.0   59943   524288   45651   82316   80714   9972   9644.4   17    0.298   4     0.080   0.378

Reading the values

C stands for capacity and U for used; sizes are in KB

S0C   S0 capacity, about 51 MB

S1C   S1 capacity, about 51 MB

S0U   S0 currently used, 0

S1U   S1 used, about 33 MB

EC    Eden capacity, about 410 MB

EU    Eden used, about 59 MB

OC    old generation capacity, 512 MB

OU    old generation used, about 45 MB

MC     method area capacity, about 80 MB

MU     method area used, about 79 MB

CCSC    compressed class space capacity, about 9.7 MB

CCSU    compressed class space used, about 9.4 MB

YGC     17 Young GCs since the system started

YGCT    total Young GC time since the system started, 0.298 s

FGC     4 Full GCs since the system started

FGCT    total Full GC time since the system started, 0.08 s

GCT     total time of all GCs, 0.378 s
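Reading these columns by eye gets tedious when sampling repeatedly. The sketch below pairs a `jstat -gc` header line with a data line and converts KB to MB. The class name is invented for the example, and it assumes the header and the row have the same number of whitespace-separated columns, as in the sample above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: turn one jstat -gc header/row pair into a name -> value map (hypothetical helper).
final class JstatGcRow {

    /** Pairs each header column with the value in the same position of the data row. */
    static Map<String, Double> parse(String header, String row) {
        String[] names = header.trim().split("\\s+");
        String[] values = row.trim().split("\\s+");
        if (names.length != values.length) {
            throw new IllegalArgumentException("column count mismatch");
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            result.put(names[i], Double.parseDouble(values[i]));
        }
        return result;
    }

    /** jstat reports sizes in KB. */
    static double toMb(double kb) {
        return kb / 1024.0;
    }

    public static void main(String[] args) {
        String header = "S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT";
        String row = "52416 52416 0.0 34068.4 419456.0 59943 524288 45651 82316 80714 "
                + "9972 9644.4 17 0.298 4 0.080 0.378";
        Map<String, Double> gc = parse(header, row);
        System.out.printf("Eden: %.1f of %.1f MB used%n", toMb(gc.get("EU")), toMb(gc.get("EC")));
        System.out.printf("Old:  %.1f of %.1f MB used%n", toMb(gc.get("OU")), toMb(gc.get("OC")));
    }
}
```

Keeping the values in a map keyed by column name means the same code still works if a JDK version adds or reorders columns.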

  1. Other jstat commands

jstat -gccapacity PID: heap memory capacity analysis

jstat -gcnew PID: young-generation GC analysis; the TT and MTT columns show the current tenuring threshold and the maximum tenuring threshold for objects in the young generation

jstat -gcnewcapacity PID: young-generation memory analysis

jstat -gcold PID: old-generation GC analysis

jstat -gcoldcapacity PID: old-generation memory analysis

jstat -gcmetacapacity PID: Metaspace memory analysis

  1. Growth rate of young-generation objects

Run jstat and watch the EU column over time.

From it you can infer how much Eden grows per second, per minute, and per hour.

With these statistics you know the growth rate.

You can also monitor peak and off-peak periods separately.

Say Eden has 800 MB of memory and you find that 5 MB of objects are added every second at peak time.

Then at peak a Young GC triggers about every 3 minutes.

In normal periods 0.5 MB of objects are added every second.

Then in normal periods a Young GC triggers about every half hour.
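The growth-rate inference from two EU samples can be sketched as follows. The sample readings in `main` are hypothetical values chosen to reproduce the 5 MB per second of the example; the method names are illustrative, and the calculation only holds if no young GC happened between the two readings.

```java
// Sketch: derive the Eden growth rate from two jstat EU readings (hypothetical names).
final class EdenGrowthRate {

    /** Growth rate in KB/s from two EU readings taken intervalSeconds apart. */
    static double kbPerSecond(double euBeforeKb, double euAfterKb, double intervalSeconds) {
        return (euAfterKb - euBeforeKb) / intervalSeconds;
    }

    /** Approximate seconds between young GCs for a given Eden capacity and growth rate. */
    static double secondsBetweenYoungGcs(double edenCapacityKb, double kbPerSecond) {
        return edenCapacityKb / kbPerSecond;
    }

    public static void main(String[] args) {
        // Hypothetical readings: EU went from 100 MB to 150 MB in 10 s.
        double rate = kbPerSecond(100 * 1024, 150 * 1024, 10);
        double peak = secondsBetweenYoungGcs(800 * 1024, rate);          // 800 MB Eden
        double offPeak = secondsBetweenYoungGcs(800 * 1024, 0.5 * 1024); // 0.5 MB/s
        System.out.printf("peak: one ygc every %.0f s, off-peak: every %.0f min%n",
                peak, offPeak / 60);
    }
}
```

Taking several pairs of samples during peak and off-peak hours gives the two rates the text asks for.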

  1. Average time of each Young GC

jstat already tells you how many Young GCs the system has run and their total time.

For example, after 24 hours of running the system has done 260 Young GCs

with a total time of 20 s.

Then each Young GC takes several tens of milliseconds on average (20 s / 260 is about 77 ms).

  1. How many objects survive each Young GC and enter the old generation

Suppose we have inferred one ygc every 3 minutes.

Then watch how Eden, Survivor, and the old generation change across the Young GC that happens every three minutes.

Eden

After being almost full, its usage drops sharply, for example to a few tens of MB out of 800 MB.

Survivor

After each YGC the data in the Survivor spaces moves, which tells you how much survived.

Growth rate of old-generation objects

Say a Young GC happens every 3 minutes and each one promotes 50 MB of objects to the old generation. That is the growth rate of the old generation: 50 MB every 3 minutes.

Normally the old generation is unlikely to keep growing quickly, because ordinary systems do not have that many long-lived objects.

If you find that the old generation grows by tens of MB after every Young GC, then most likely too many objects survive each Young GC.

Too many surviving objects may trigger the dynamic age rule in the Survivor space and get promoted, or the Survivor space may simply be unable to hold them, so most of the survivors enter the old generation.

This is the most common case. If the old generation gains only a few hundred KB or a few MB after each Young GC, that is reasonable; if old-generation objects grow quickly, something is abnormal.

  1. Full GC trigger timing and duration

Once you know the growth rate of the old generation, the Full GC timing is clear. For example, if the old generation has 800 MB and gains 50 MB every 3 minutes, a Full GC triggers about once an hour.

You can then look at the number and total time of Full GCs that jstat reports since startup. For example, 10 Full GCs with a total time of 30 s means each Full GC takes about 3 s.
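The averages in this part come straight from the jstat counters: YGCT divided by YGC, FGCT divided by FGC, and old-generation size divided by the promotion rate. A small sketch with the figures from the text follows; the class and method names are illustrative only.

```java
// Sketch: average pause and Full GC interval from jstat counters (hypothetical names).
final class GcAverages {

    /** Average pause in milliseconds from a total time in seconds (YGCT or FGCT) and a count. */
    static double averagePauseMillis(double totalSeconds, long count) {
        return count == 0 ? 0.0 : totalSeconds * 1000.0 / count;
    }

    /** Minutes between Full GCs if every young GC promotes the same amount. */
    static double minutesBetweenFullGcs(double oldGenMb, double promotedMbPerYoungGc,
                                        double minutesBetweenYoungGcs) {
        return oldGenMb / promotedMbPerYoungGc * minutesBetweenYoungGcs;
    }

    public static void main(String[] args) {
        System.out.printf("avg Young GC: %.0f ms%n", averagePauseMillis(20, 260));
        System.out.printf("avg Full GC:  %.0f ms%n", averagePauseMillis(30, 10));
        System.out.printf("Full GC about every %.0f min%n", minutesBetweenFullGcs(800, 50, 3));
    }
}
```

The Full GC interval assumes an empty old generation at the start and a constant promotion rate, so treat it as a rough upper bound.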

  1. jstat -gcutil 13614 2000 10

Shows usage as percentages

 S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT   

  0.68   0.00  98.02  14.48  70.23    804    6.117     0    0.000    6.117

  0.00   0.83  34.27  14.48  70.23    805    6.124     0    0.000    6.124

  0.00   0.83  34.27  14.48  70.23    805    6.124     0    0.000    6.124

Column  Description

S0  percentage of S0 used

S1  percentage of S1 used

E   percentage of Eden used

O   percentage of the old generation used

P   percentage of the permanent generation used

YGC number of young-generation GCs

YGCT     total young-generation GC time

FGC number of Full GCs

FGCT    total Full GC time

GCT total garbage collection time

  1. GC facts you need to know about your system

Growth rate of young-generation objects

Trigger frequency of Young GC

Duration of each Young GC

How many objects survive each Young GC

How many objects enter the old generation after each Young GC

Growth rate of old-generation objects

Trigger frequency of Full GC

Duration of each Full GC

  1. Hands-on experiment: using jmap and jhat to find the object distribution of an online system
  1. Use jmap to see the memory areas of a running system

Goal: find out which objects occupy so much memory

jmap -heap PID

For example, with the following parameters

-Xmx1024m

-Xms1024m

-Xmn512m

-XX:MetaspaceSize=256m

-XX:MaxMetaspaceSize=256m

-XX:SurvivorRatio=8

-XX:MaxTenuringThreshold=15

-XX:+UseParNewGC

-XX:+UseConcMarkSweepGC

-XX:+PrintGCDetails

-XX:+PrintGCTimeStamps

-Xloggc:gc.log

The output looks like this:

Heap Configuration:

   MinHeapFreeRatio         = 40

   MaxHeapFreeRatio         = 70

   MaxHeapSize              = 1073741824 (1024.0MB)

   NewSize                  = 536870912 (512.0MB)

   MaxNewSize               = 536870912 (512.0MB)

   OldSize                  = 536870912 (512.0MB)

   NewRatio                 = 2

   SurvivorRatio            = 8

   MetaspaceSize            = 268435456 (256.0MB)

   CompressedClassSpaceSize = 1073741824 (1024.0MB)

   MaxMetaspaceSize         = 268435456 (256.0MB)

   G1HeapRegionSize         = 0 (0.0MB)

Heap Usage:

New Generation (Eden + 1 Survivor Space):

   capacity = 483196928 (460.8125MB)

   used     = 384191992 (366.39403533935547MB)

   free     = 99004936 (94.41846466064453MB)

   79.51043761602723% used

Eden Space:

   capacity = 429522944 (409.625MB)

   used     = 358805512 (342.18360137939453MB)

   free     = 70717432 (67.44139862060547MB)

   83.53581968370938% used

From Space:

   capacity = 53673984 (51.1875MB)

   used     = 25386480 (24.210433959960938MB)

   free     = 28287504 (26.977066040039062MB)

   47.29755108173077% used

To Space:

   capacity = 53673984 (51.1875MB)

   used     = 0 (0.0MB)

   free     = 53673984 (51.1875MB)

   0.0% used

concurrent mark-sweep generation:

   capacity = 536870912 (512.0MB)

   used     = 60859816 (58.040443420410156MB)

   free     = 476011096 (453.95955657958984MB)

   11.336024105548859% used

34803 interned Strings occupying 3962392 bytes.

  1. Use jmap to see the distribution of objects at runtime

A more useful way to use jmap is the following:

jmap -histo PID

The output looks like this:

 num     #instances         #bytes  class name

----------------------------------------------

   1:        329688       38130208  [C

   2:         58167       35292392  [B

   3:         11450       31126360  [I

   4:        260352        6248448  java.lang.String

   5:         25155        3273232  [Ljava.lang.Object;

   6:         26177        2303576  java.lang.reflect.Method

   7:         69254        2216128  java.util.HashMap$Node

   8:         62254        1992128  java.util.concurrent.ConcurrentHashMap$Node

   9:         15449        1721448  java.lang.Class

  10:          7493        1095728  [Ljava.util.HashMap$Node;

  11:         37330         895920  org.apache.catalina.startup.ContextConfig$JavaClassCacheEntry

  12:           820         838912  [Ljava.util.concurrent.ConcurrentHashMap$Node;
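The class names in this histogram use the JVM's internal array notation: `[C` is `char[]`, `[B` is `byte[]`, `[I` is `int[]`, and `[Ljava.lang.Object;` is `java.lang.Object[]`. The sketch below converts such names into source-style notation. The class name is made up for the example, and only array encodings are translated; ordinary class names are returned unchanged.

```java
// Sketch: make the class names in jmap -histo output readable (hypothetical helper).
final class HistoClassNames {

    /** Turns a JVM-internal name such as "[C" or "[Ljava.lang.Object;" into Java source notation. */
    static String readable(String name) {
        int dims = 0;
        while (dims < name.length() && name.charAt(dims) == '[') {
            dims++;
        }
        if (dims == 0 || dims == name.length()) {
            return name;   // ordinary class name, or nothing after the brackets
        }
        String element = name.substring(dims);
        String base;
        switch (element.charAt(0)) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'Z': base = "boolean"; break;
            case 'L': base = element.substring(1, element.length() - 1); break; // strip 'L' and ';'
            default: return name;   // unknown encoding: leave untouched
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (String n : new String[] {"[C", "[B", "[I", "[Ljava.lang.Object;", "java.lang.String"}) {
            System.out.println(n + " -> " + readable(n));
        }
    }
}
```

Read this way, the top three rows of the histogram are character, byte, and int arrays, which usually back strings and buffers.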

  1. Use jmap to generate a heap dump snapshot

jmap -dump:live,format=b,file=dump PID

This command writes a file named dump to the current directory. It is in a binary format, so you cannot open it directly. It contains a snapshot of the objects in the JVM heap at that moment (with the live option, only live objects), for later analysis.

  1. Use jhat to analyze the heap dump snapshot in a browser

jhat analyzes heap snapshots. It has a built-in web server, so you can browse the heap dump graphically in a browser.

jhat dump

The default port is 7000; you can choose another one with -port, for example -port 8000.

Scroll to the bottom of the page, find the histogram link, and click it.

  1. From testing to launch: how to analyze JVM behavior and tune it sensibly
  1. Predictive optimization

Estimate how many requests the system handles per second, how many objects each request creates and how much memory they take, what machine configuration to choose, how much memory to give the young generation, how often Young GC will trigger, how fast objects enter the old generation, how much memory to give the old generation, and how often Full GC will trigger.

All of this can be estimated roughly but reasonably from your own code.

The optimization idea fits in a few lines:

Try to keep the objects surviving each Young GC below 50% of a Survivor space

Keep short-lived objects in the young generation

Try not to let objects enter the old generation

Try to reduce the frequency of Full GC
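The 50% figure comes from HotSpot's dynamic age rule. The sketch below is a simplified model of that rule, not JVM source code: it sums survivor bytes from the youngest age upward and returns the first age at which the running total exceeds the target share of a Survivor space. The class name, method name, and sample sizes are assumptions made for illustration.

```java
/** Simplified model of HotSpot's dynamic tenuring-age rule (an illustration, not JVM source). */
final class DynamicAgeSketch {

    /**
     * @param bytesByAge           bytesByAge[a] = bytes of surviving objects with age a (index 0 unused)
     * @param survivorCapacity     capacity of one Survivor space in bytes
     * @param targetSurvivorRatio  percentage of a Survivor space allowed to fill (HotSpot default: 50)
     * @param maxTenuringThreshold upper bound on object age (15 in this article's examples)
     * @return the age at or above which objects are promoted at the next Young GC
     */
    static int tenuringThreshold(long[] bytesByAge, long survivorCapacity,
                                 int targetSurvivorRatio, int maxTenuringThreshold) {
        long desired = survivorCapacity * targetSurvivorRatio / 100;
        long total = 0;
        for (int age = 1; age < maxTenuringThreshold; age++) {
            if (age < bytesByAge.length) {
                total += bytesByAge[age];
            }
            if (total > desired) {
                return age; // cumulative size crossed the target: this age and older get promoted
            }
        }
        return maxTenuringThreshold;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] small = {0, 20 * mb, 10 * mb}; // 30 MB of survivors in a 100 MB Survivor space
        long[] large = {0, 40 * mb, 20 * mb}; // 60 MB of survivors in a 100 MB Survivor space
        System.out.println(tenuringThreshold(small, 100 * mb, 50, 15)); // 15: nothing promoted early
        System.out.println(tenuringThreshold(large, 100 * mb, 50, 15)); // 2: objects aged 2+ are promoted
    }
}
```

Keeping survivors under half of a Survivor space is what keeps the threshold at its maximum, so objects are not promoted earlier than necessary.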

  1. JVM optimization during stress testing

Use a tool to simulate 1000 users sending 500 requests per second

Use jstat to analyze the following data:

The growth rate of young-generation objects

How often Young GC (ygc) triggers

How long each ygc takes

How many objects survive each ygc

How many objects enter the old generation after each ygc

The growth rate of objects in the old generation

How often Full GC (fullgc) triggers

How long each fullgc takes

Real optimization means observing your own system and then adjusting the memory layout accordingly. There is no fixed JVM tuning template.

  1. JVM monitoring for online systems
  1. The low-tech method

Run jstat, jmap, and jhat periodically and check the data by hand

  1. Zabbix, OpenFalcon, Ganglia

These provide online, visual monitoring: curves for the usage of each memory area, the growth rate of Eden, and the count and duration of ygc and fullgc

They also support alerting, for example sending an email and SMS when there are more than 5 fullgc in 10 minutes
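An alert rule such as "more than 5 Full GCs in 10 minutes" can be modeled as a sliding window over event timestamps. The sketch below is a hypothetical illustration of that rule only; it is not how Zabbix, OpenFalcon, or Ganglia implement alerting, and the class and method names are invented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical alert rule: fire when more than maxEvents Full GCs fall inside a sliding window. */
final class FullGcAlertSketch {
    private final Deque<Long> timestamps = new ArrayDeque<>();
    private final long windowMillis;
    private final int maxEvents;

    FullGcAlertSketch(long windowMillis, int maxEvents) {
        this.windowMillis = windowMillis;
        this.maxEvents = maxEvents;
    }

    /** Records one Full GC at the given time and reports whether the alert should fire. */
    boolean recordFullGc(long nowMillis) {
        timestamps.addLast(nowMillis);
        // Drop events that have slid out of the window.
        while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() >= windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size() > maxEvents;
    }

    public static void main(String[] args) {
        FullGcAlertSketch alert = new FullGcAlertSketch(10 * 60 * 1000L, 5);
        boolean fired = false;
        for (int i = 0; i < 6; i++) {
            fired = alert.recordFullGc(i * 60_000L); // one Full GC per minute
        }
        System.out.println(fired); // true: 6 Full GCs within 10 minutes
    }
}
```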

  1. Case study: a BI system with 100,000 concurrent requests per second, locating and resolving a ygc problem
  1. Business background

Collect merchants' business data, analyze it, and present it as reports that guide their operations

Data is collected, analyzed with Spark, Flink, and Hadoop, and stored in MySQL, Elasticsearch, and HBase

The BI system is written in Java; it reads the stored data and presents the reports

It was first deployed on 4-core, 8 GB machines, with a 1 GB Eden and 100 MB for each of the two Survivor spaces (s1, s2)

  1. Technical pain point

Merchants' pages call the BI system every few seconds to fetch real-time data and refresh the reports

With tens of thousands of merchants and thousands online at any moment, the estimate is 500 requests per second.

Each request loads a lot of data, computes on it in memory, and returns the result to the front end.

Estimating 100 KB of memory per request, 500 requests amount to about 50 MB of data per second

  1. Frequent ygc early on

At 50 MB per second, a 1 GB Eden fills up in about 20 s, so ygc is frequent

Each ygc takes a few tens of ms, and the surviving objects amount to a few tens of MB or even just a few MB

What you observe is one ygc every 20 s or so with a pause of a few tens of ms, which has little impact on users.
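As a quick check of the arithmetic, the sketch below computes how long Eden takes to fill from the figures estimated in this case (1 GB Eden, 500 requests per second, 100 KB per request). The class and method names are illustrative, and the result is only as good as those estimates.

```java
/** Back-of-the-envelope Young GC interval estimate; the inputs are this case's own estimates. */
final class EdenFillEstimate {

    /** Seconds until Eden fills, given its size and the allocation rate. */
    static double secondsToFillEden(double edenMb, int requestsPerSecond, double kbPerRequest) {
        double mbPerSecond = requestsPerSecond * kbPerRequest / 1024.0;
        return edenMb / mbPerSecond;
    }

    public static void main(String[] args) {
        // 1 GB Eden, 500 requests/s, 100 KB per request
        double seconds = secondsToFillEden(1024, 500, 100);
        System.out.printf("Eden fills in about %.0f s%n", seconds);
    }
}
```

With these inputs the allocation rate is a little under 50 MB per second, so Eden fills in roughly 20 seconds.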

  1. JVM parameters for the simulated scenario

Heap 200 MB: young generation 100 MB, old generation 100 MB, Eden 80 MB, each Survivor space 10 MB

-Xmx200m -Xms200m -Xmn100m -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=15 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log
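The Eden and Survivor sizes follow from -Xmn and -XX:SurvivorRatio: the young generation is split ratio:1:1 between Eden and the two Survivor spaces. The sketch below shows that arithmetic; the class name is illustrative, and a real JVM may round the sizes slightly for alignment.

```java
/** Young-generation layout implied by -Xmn and -XX:SurvivorRatio (before any JVM alignment). */
final class YoungGenLayout {

    /** Eden size in MB when the young generation is split ratio:1:1. */
    static long edenMb(long youngMb, int survivorRatio) {
        return youngMb * survivorRatio / (survivorRatio + 2);
    }

    /** Size in MB of each of the two Survivor spaces. */
    static long survivorMb(long youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        // -Xmn100m -XX:SurvivorRatio=8
        System.out.println("Eden: " + edenMb(100, 8) + " MB");         // Eden: 80 MB
        System.out.println("Survivor: " + survivorMb(100, 8) + " MB"); // Survivor: 10 MB
    }
}
```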

  1. Simulation code

public class BIDemo {

    public static void main(String[] args) throws InterruptedException {

        Thread.sleep(30000);

        while (true) {

            loadData();

        }

    }

    private static void loadData() throws InterruptedException {

        byte[] data = null;

        for (int i = 0; i < 50; i++) {

            data = new byte[100 * 1024];

        }

        data = null;

        Thread.sleep(1000);

    }

}

  1. Walking through the simulation code

The program first sleeps for 30 s, which leaves time to find the JVM process PID with jps and start monitoring it with jstat

Each call then allocates a 100 KB array 50 times, about 5 MB in total, drops every reference, and sleeps for 1 s

This simulates an allocation rate of 5 MB per second

  byte[] data = null;

  for (int i = 0; i < 50; i++) {

      data = new byte[100 * 1024];

  }

  data = null;

  1. Monitor with jstat

Compile and run the program

javac BIDemo.java

java -Xmx200m -Xms200m -Xmn100m -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=15 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log BIDemo

Sample once per second, 1000 samples in total

jstat -gc 51464 1000 1000

Initial value

 S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT

10240.0 10240.0  0.0    0.0   81920.0   3276.8   102400.0     0.0     4480.0 770.3  384.0   75.9       0    0.000   0      0.000    0.000

Eden initially has about 3 MB in use; subsequent samples show 7 MB, 27, 32, 37, 42, 47, 52, 57

That is roughly 5 MB of new objects per second

When Eden reaches about 80 MB a ygc triggers, after which Eden usage drops to a little over 1 MB. A ygc happens about every 17 s and takes 0.001 s

In other words, 1 ms is enough to reclaim an 80 MB Eden, so by rough extrapolation 10 ms would cover about 800 MB

After the ygc about 675 KB of objects survive, which fits easily in a 10 MB Survivor space

This shows the system is healthy: ygc does not push objects into the old generation, the old generation's growth rate is effectively 0, Full GC almost never happens, and no optimization is needed

Looking back over the jstat output, you can read off the following:

The growth rate of objects in Eden

How often ygc happens and how long it takes

How many objects survive a ygc

Whether the survivors fit in a Survivor space

How many objects enter the old generation per ygc

The growth rate of the old generation

How often fullgc triggers

How long each fullgc takes
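Reading these numbers off by eye gets tedious. The sketch below pairs a jstat -gc header with one data row and derives the Eden growth rate from two samples. It assumes the 17-column layout shown above, with sizes in KB; the class and method names are invented for illustration, and other JDK versions may print different columns.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Pairs a jstat -gc header with one data row; assumes both have the same number of columns. */
final class JstatGcLine {

    static Map<String, Double> parse(String header, String row) {
        String[] names = header.trim().split("\\s+");
        String[] values = row.trim().split("\\s+");
        if (names.length != values.length) {
            throw new IllegalArgumentException("header and row have different column counts");
        }
        Map<String, Double> columns = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            columns.put(names[i], Double.parseDouble(values[i]));
        }
        return columns;
    }

    /** Eden growth in MB/s between two samples, valid only if no Young GC ran in between. */
    static double edenGrowthMbPerSecond(Map<String, Double> earlier, Map<String, Double> later,
                                        double intervalSeconds) {
        return (later.get("EU") - earlier.get("EU")) / 1024.0 / intervalSeconds;
    }

    public static void main(String[] args) {
        String header = "S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT";
        String row = "10240.0 10240.0 0.0 0.0 81920.0 3276.8 102400.0 0.0 "
                + "4480.0 770.3 384.0 75.9 0 0.000 0 0.000 0.000";
        Map<String, Double> sample = parse(header, row);
        System.out.println("Eden used (KB): " + sample.get("EU"));
        System.out.println("Old capacity (KB): " + sample.get("OC"));
    }
}
```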

  1. Case study: a real-time analysis engine handling 10 billion records per day, locating and resolving a fullgc problem
  1. Business background

Each computation pulls about 10,000 records into memory and takes about 10 seconds on average. Each machine has 4 cores and 8 GB, and the JVM is given 4 GB, with 1.5 GB each for the young and old generations.

Each record is about 1 KB, so 10,000 records come to about 10 MB

The young generation is split 8:1:1 between Eden and the two Survivor spaces, so Eden is about 1.2 GB and each Survivor space about 100 MB.

Each computation task allocates about 10 MB of objects in Eden, and about 100 tasks run per minute, so after roughly one minute Eden is almost full

  1. How many objects enter the old generation when a Minor GC triggers

When the ygc runs, 80 tasks have finished and 20 are still executing, so about 200 MB cannot be reclaimed. That does not fit in a Survivor space, so it goes straight to the old generation.

  1. How long until the old generation fills up?

With 200 MB entering the old generation every minute and a 1.5 GB old generation, it is almost full after about 7 minutes. At the next ygc the space allocation guarantee check finds only about 100 MB free against the 200 MB needed, which triggers a Full GC

So this system triggers a Full GC every 7 to 8 minutes
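The estimate above can be written down as two small functions. This is a deliberately simplified model: it assumes that survivors which do not fit in a Survivor space are all promoted, and that the promotion rate stays constant. The names are illustrative.

```java
/** Simplified promotion model used to estimate how fast the old generation fills. */
final class OldGenFillEstimate {

    /** MB promoted by one Young GC: survivors that do not fit in a Survivor space all go to old gen. */
    static long promotedMb(long survivingMb, long survivorSpaceMb) {
        return survivingMb > survivorSpaceMb ? survivingMb : 0;
    }

    /** Minutes until the old generation is full at a constant promotion rate. */
    static double minutesUntilFull(long oldGenMb, long promotedMbPerMinute) {
        return (double) oldGenMb / promotedMbPerMinute;
    }

    public static void main(String[] args) {
        long promoted = promotedMb(200, 100); // 200 MB survive, Survivor space is 100 MB
        System.out.printf("Old generation full after about %.1f minutes%n",
                minutesUntilFull(1536, promoted)); // 1.5 GB old generation, one Young GC per minute
    }
}
```

With the case's numbers this gives between 7 and 8 minutes, which matches the observed Full GC interval.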

  1. Optimizing this case

First make sure the Survivor spaces can hold the surviving objects

With a 3 GB heap, give Eden 1 GB and each Survivor space 200 MB, which is just enough to hold the survivors of each ygc.

This lowers the Full GC frequency from once every few minutes to once every few hours

  1. The load grows another 10 times

At 100 MB per second with a 1.6 GB young generation, a ygc happens every 10-odd seconds. A batch takes 10 s to process, so roughly 1 GB of objects cannot be reclaimed

So every 10-odd seconds about 1 GB of data enters the old generation, which can trigger several Full GCs per minute

  1. Optimizing with large-memory machines

Upgrade from 4 cores and 8 GB to 16 cores and 32 GB

With a tenfold upgrade, Eden grows to 16 GB and each Survivor space to 2 GB. At 100 MB per second, a ygc happens every 2 minutes or so

Fewer than 1 GB of objects remain after each ygc, which also lowers the Full GC frequency

Because this is not a user-facing system, there is no need for G1: a 1 s pause per collection has no impact on the system.

  1. JVM parameters for the simulated scenario

Heap 200 MB: young generation 100 MB, old generation 100 MB, Eden 80 MB, each Survivor space 10 MB

-Xmx200m -Xms200m -Xmn100m -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=15 -XX:PretenureSizeThreshold=20m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log

  1. Simulation code

public class BigDataDemo {

    public static void main(String[] args) throws InterruptedException {

        Thread.sleep(30000);

        while (true) {

            loadData();

        }

    }

    private static void loadData() throws InterruptedException {

        byte[] data = null;

        for (int i = 0; i < 4; i++) {

            data = new byte[10 * 1024 * 1024];

        }

        data = null;

        byte[]  data1 = new byte[10 * 1024 * 1024];

        byte[]  data2 = new byte[10 * 1024 * 1024];

        byte[]  data3 = new byte[10 * 1024 * 1024];

        data3 = new byte[10 * 1024 * 1024];

        Thread.sleep(1000);

    }

}

  1. Walking through the simulation code

byte[] data = null;

for (int i = 0; i < 4; i++) {

    data = new byte[10 * 1024 * 1024];

}

data = null;

byte[]  data1 = new byte[10 * 1024 * 1024];

byte[]  data2 = new byte[10 * 1024 * 1024];

byte[]  data3 = new byte[10 * 1024 * 1024];

data3 = new byte[10 * 1024 * 1024];

Thread.sleep(1000);

The loop allocates four 10 MB objects that immediately become garbage. Then 30 MB of objects are allocated (data1, data2, data3), and reassigning data3 turns one more 10 MB object into garbage. By that point Eden does not have enough room, so a ygc triggers.

After the ygc about 20 MB survives, and 10 MB is still waiting to be allocated

So a ygc triggers about once per second

  1. Monitor with jstat

Compile and run the program

javac BigDataDemo.java

java -Xmx200m -Xms200m -Xmn100m -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=15 -XX:PretenureSizeThreshold=20m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log BigDataDemo

Sample once per second, 1000 samples in total

jstat -gc 51464 1000 1000

S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT

10240.0 10240.0  0.0    0.0   81920.0   3276.8   102400.0     0.0     4480.0 770.3  384.0   75.9       0    0.000   0      0.000    0.000

Eden initially has about 3 MB in use

After that, Eden triggers a ygc about once per second

After a ygc, the Survivor space holds about 600 KB of unidentified objects

After a ygc about 30 MB of objects enter the old generation, clearly because the Survivor space cannot hold them

You can see 20-30 MB of objects entering the old generation every second

When the old generation reaches about 60 MB a fullgc happens, after which it drops back to about 30 MB

In short: about 80 MB of new objects per second, one ygc per second, and 20-30 MB surviving each ygc, which the Survivor space cannot hold

The old generation therefore grows by 20-30 MB per second, with a fullgc about every 3 seconds.

ygc takes about 14 ms

fullgc takes about 1 ms

Too many objects survive each ygc and are promoted to the old generation, which causes the frequent fullgc

  1. Tuning the JVM

Compile and run the program

javac BigDataDemo.java

java -Xmx300m -Xms300m -Xmn200m -XX:SurvivorRatio=2 -XX:MaxTenuringThreshold=15 -XX:PretenureSizeThreshold=20m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log BigDataDemo

The heap totals 300 MB: old generation 100 MB, young generation 200 MB, Eden 100 MB, each Survivor space 50 MB

Now the 20 MB that survives each ygc goes into a Survivor space, usually without crossing the 50% threshold of the dynamic age rule

After each ygc very few objects enter the old generation; it ends up holding only about 600 KB, so it is practically empty

ygc takes about 12 ms

  1. Case study: how a social app with 100,000 QPS tripled its GC performance
  1. Business background

A social app with a peak of 100,000 QPS

Visiting a personal home page loads about 5 MB of data

  1. Optimization plan

The first step is to add machines to reduce the QPS pressure on each one

By default the CMS collector uses the mark-sweep algorithm, which leaves a lot of memory fragmentation.

To get compaction (mark-compact behavior), add the parameters below

They enable memory compaction and compact the heap once every 5 Full GCs

-XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=5

Compacting only after every 5 Full GCs means contiguous memory can keep shrinking in between.

So change it to compact after every Full GC; after each compaction the remaining free space is contiguous

-XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0

  1. Case study: deep Full GC optimization for a vertical e-commerce app's backend
  1. Business background

A small-to-medium vertical e-commerce app with a few hundred thousand daily active users and a peak of a few hundred QPS

With default JVM parameters it ran several Full GCs per hour

  1. Optimization plan

Define a basic JVM parameter template so that most systems get acceptable JVM performance, instead of running on defaults or on parameters nobody knows how to set.

-Xms4096M -Xmx4096M -Xmn3072M -Xss1M -XX:PermSize=256M -XX:MaxPermSize=256M -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=92 -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0

4-core, 8 GB machine

Total heap 4 GB

Young generation 3 GB: Eden 2.4 GB, each Survivor space 300 MB

Old generation 1 GB

Metadata area (the permanent generation on JDK 7, hence -XX:PermSize) 256 MB

  1. How to optimize the performance of each Full GC?

-XX:+CMSParallelInitialMarkEnabled

This parameter makes the CMS collector's initial-mark phase run with multiple threads in parallel, reducing the stop-the-world (STW) time

-XX:+CMSScavengeBeforeRemark

This parameter makes CMS attempt a Young GC before the remark phase, so that remark scans fewer objects and takes less time.

  1. The final JVM template

-Xms4096M -Xmx4096M -Xmn3072M -Xss1M -XX:MetaspaceSize=256M -XX:MaxMetaspaceSize=256M -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=92 -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 -XX:+CMSParallelInitialMarkEnabled -XX:+CMSScavengeBeforeRemark

  1. Case study: a new engineer sets JVM parameters unreasonably, causing frequent Full GC
  1. The copied parameter

A new engineer copied the parameter -XX:SoftRefLRUPolicyMSPerMB=0 from the internet

The code uses reflection heavily

Method method = XXX.class.getDeclaredMethod(xx,xx);

method.invoke(target,params);

Reflection makes the JVM generate some classes at run time. Their Class objects are held through soft references, and the classes are created dynamically and loaded into the metadata area

Soft references are reclaimed according to a formula. A soft reference is kept as long as:

clock - timestamp <= freespace * SoftRefLRUPolicyMSPerMB

"clock - timestamp" is how long the soft-referenced object has gone without being accessed

freespace is the free memory in the JVM, in MB, and SoftRefLRUPolicyMSPerMB is how many milliseconds a SoftReference may survive per MB of free memory. The default is 1000 ms, so each free MB buys 1 s.

Suppose the JVM has 3 GB free: 3000 MB times the default 1 s is 3000 s, or 50 minutes. A soft-referenced object becomes eligible for collection only after going 50 minutes without being accessed.

In general the JVM has free memory to spare, so soft references are normally reclaimed only when memory is about to run out (close to OOM).
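The formula is easy to check with concrete numbers. The sketch below simply evaluates the inequality quoted above; it is not the JVM's implementation, and the class and method names are illustrative.

```java
/** Evaluates the soft-reference retention formula quoted above (illustration only). */
final class SoftRefPolicySketch {

    /** The reference is kept while its idle time is at most freeSpaceMb * msPerMb milliseconds. */
    static boolean isKept(long clockMs, long timestampMs, long freeSpaceMb, long msPerMb) {
        return clockMs - timestampMs <= freeSpaceMb * msPerMb;
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        // 3000 MB free with the default 1000 ms per MB: the limit is 50 minutes of idle time.
        System.out.println(isKept(49 * minute, 0, 3000, 1000)); // true: idle for 49 minutes
        System.out.println(isKept(51 * minute, 0, 3000, 1000)); // false: idle for 51 minutes
        // With SoftRefLRUPolicyMSPerMB=0 any idle time at all makes the reference collectable.
        System.out.println(isKept(1, 0, 3000, 0));              // false
    }
}
```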

  1. Locating the problem

-XX:+TraceClassLoading -XX:+TraceClassUnloading

With these parameters, Tomcat's catalina.out log file shows a stream of entries like:

[Loaded sun.reflect.GeneratedSerializationConstructorAccessor from __JVM_Defined_Class]

  1. The resulting problem

-XX:SoftRefLRUPolicyMSPerMB=0

This setting means that whenever a GC happens, these soft-referenced objects are reclaimed to free memory.

Because of how the JVM handles this internally, more and more of these odd generated classes get created, until the metadata area runs out of space and triggers a Full GC. The internals are complex; the conclusion is enough here.

  1. Solution

Set this parameter to a larger value rather than 0, as the new engineer did. 1000, 2000, 3000, or 5000 milliseconds are all fine.

  1. Case study: an online system stutters because of dozens of Full GCs a day
  1. Background

Machine configuration: 2 cores, 4 GB

JVM heap size: 2 GB

System uptime: 6 days

Full GCs within those 6 days: 250, taking a little over 70 seconds in total

Young GCs within those 6 days: 26,000, taking 1400 seconds in total

Overall, that is roughly 40 Full GCs a day, about 2 per hour, each taking around 300 milliseconds;

  1. The JVM parameters before optimization

-Xms1536M -Xmx1536M -Xmn512M -Xss256K -XX:SurvivorRatio=5 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=68 -XX:+CMSParallelRemarkEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC

Heap size 1.5 GB

Young generation 512 MB: Eden 365 MB, each Survivor space 73 MB

Old generation 1 GB

-XX:CMSInitiatingOccupancyFraction is set to 68

This means a Full GC triggers once the old generation is 68% full, that is, at about 680 MB of objects

  1. Deriving the runtime memory model from the online GC data

According to the online data there are 3 ygc per minute, so Eden fills up every 20 s, triggering a ygc after about 365 MB of new objects.

That works out to 15-20 MB of objects per second.

There are also 2 fgc per hour, one every 30 minutes, each presumably triggered when the old generation reaches 68%, about 680 MB of objects.

So the system puts 600-odd MB of objects into the old generation every 30 minutes, and each fgc takes about 300 ms

The objects may land there because a Survivor space cannot hold them, because they have survived long enough, or because the dynamic age rule kicks in.

Either way objects accumulate in the old generation and trigger fgc

  1. Why are there so many objects in the old generation

Visual monitoring and inference alone cannot take the analysis any further; we still don't know why the old generation holds so many objects

So we ran jstat against production

and observed the following

After each ygc very few objects enter the old generation

About 15-20 MB survives each ygc and a Survivor space is only 70-odd MB, so the dynamic age rule is triggered often

Even so, the jstat data shows that only occasionally does a ygc send a few tens of MB into the old generation

So under normal conditions the old generation should not reach 68% occupancy within 30 minutes.

jstat also showed that while the system runs, old-generation usage suddenly jumps by several hundred MB for no obvious reason, leaving five or six hundred MB of it occupied

Because five or six hundred MB is already in use, a few more objects promoted by a ygc are enough to trigger an fgc

  1. Locating the system's large objects

The analysis suggests that large objects are going straight into the old generation, so we export a heap dump with jmap.

Then analyze it with jhat or VisualVM

The analysis pinpoints the large objects of several hundred MB: a few Map-like data structures.

A look at the code shows they hold data queried from the database and loaded into memory

It is precisely because these several-hundred-MB objects go straight into the old generation that the occasional ygc promoting a few tens of MB is enough to trigger an fgc

  1. Solution

Heap size 1.5 GB

Young generation 1 GB: Eden about 700 MB, each Survivor space about 150 MB

Old generation 512 MB

With this layout the Survivor spaces are large enough, and objects do not easily enter the old generation

The old-generation collection now starts only at 92% occupancy, so fgc is less frequent

Set the metadata area explicitly to 256 MB rather than relying on the small default, so that classes loaded dynamically through reflection do not cause fgc

After optimization there is about 1 ygc per minute at a few tens of ms each, and one fgc roughly every 10 days at a few hundred ms, so the frequency drops sharply

-Xms1536M -Xmx1536M -Xmn1024M -Xss256K -XX:SurvivorRatio=5 -XX:PermSize=256M -XX:MaxPermSize=256M  -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=92 -XX:+CMSParallelRemarkEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC

For comparison, the parameters before optimization

-Xms1536M -Xmx1536M -Xmn512M -Xss256K -XX:SurvivorRatio=5 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=68 -XX:+CMSParallelRemarkEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC

Heap size 1.5 GB

Young generation 512 MB: Eden 365 MB, each Survivor space 73 MB

Old generation 1 GB

  1. Case study: during an e-commerce promotion, severe Full GC freezes the system
  1. Background

During a big promotion the system froze outright and could not process any request; restarting it had no effect.

To check whether JVM GC was the problem we looked at jstat: an fgc ran every second, each taking hundreds of milliseconds

The strange part: memory usage in every area looked normal

So why was fgc so frequent?

The immediate inference is that the code calls System.gc() somewhere, so every execution of that code path produces an fgc

The engineer who wrote it wanted to trigger a GC to reclaim memory. With normal low traffic that causes no problem, but during the promotion traffic surged, GC triggered constantly, and the system froze.

The fix is to add the following parameter, which disables explicit GC calls: -XX:+DisableExplicitGC
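One way to see the effect of an explicit GC call from inside the process is to compare collection counts before and after it. The sketch below uses the standard java.lang.management API; the class name is illustrative. System.gc() is only a request, so the printed difference depends on the JVM and its flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Counts collections around an explicit System.gc() call (illustration only). */
final class ExplicitGcProbe {

    /** Sums the collection counts of all collectors; -1 means "undefined" and is skipped. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalGcCount();
        System.gc(); // a request, not a guarantee; ignored under -XX:+DisableExplicitGC
        long after = totalGcCount();
        System.out.println("Collections around System.gc(): " + (after - before));
    }
}
```

Running it once with default flags and once with -XX:+DisableExplicitGC shows whether the flag is taking effect.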

  1. Case study: memory leak and Full GC optimization
  1. Background

The system sends SMS, email, and app push messages to guide users into marketing campaigns.

By design the database, cache, and machine resources are ample, and normally there is no problem

During this big promotion, however, CPU usage in production surged, the system froze, and it could not process any request

Restarting the system helped, but the problem soon came back.

  1. Troubleshooting

There are two common scenarios when CPU load is too high

The system has created a large number of threads that run at the same time with a heavy workload, driving up CPU load

The JVM performs GC frequently, and GC is very CPU-intensive

Check the monitoring platform first: if GC is not frequent, the cause is the threads.

In this case the JVM was observed running an fgc about once a minute, each taking hundreds of milliseconds.

  1. Initial investigation of the frequent Full GC

There are three possible causes of frequent fgc:

Unreasonable memory allocation, so objects enter the old generation frequently and cause fgc

A memory leak: a large number of objects fill the old generation, so even a few more promoted objects cause fgc

Too many classes in the metadata area, triggering fgc

jstat showed no unreasonable memory allocation, and the metadata area was normal too, which leaves only a memory leak

Besides jmap and jhat there is a more powerful memory analysis tool: MAT

Export the dump with jmap

First use the jmap command to export a memory snapshot of the online system

jmap -dump:format=b,file=<file name> <service process ID>

  1. MAT analysis

Before analyzing with MAT, set the JVM parameters of MAT itself appropriately, for example a 1 GB heap

Download address: https://www.eclipse.org/mat/

Leak Suspects is MAT's memory leak analysis

MAT then analyzes your memory snapshot and tries to find the batch of objects causing the leak.

The analysis shows that objects created by the system itself take up too much space: there are as many as 100,000 instances of one class, occupying more than half of the old generation.

This is a typical memory leak: the system creates a large number of objects that occupy memory, many of which are no longer needed, yet they cannot be reclaimed

The root cause: the system has a local in-JVM cache that loads a lot of data into memory, but with no limit on size or lifetime and no LRU-style eviction of cached data.

The solution is simple: use a caching framework such as EHCache, which can cap the maximum number of cached objects
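To show what capping the number of cached objects means, here is a minimal bounded LRU cache built on the JDK's LinkedHashMap. It is a stand-in for illustration, not EHCache and not the system's actual fix. It is also not thread-safe, so a real service would need synchronization or a proper caching library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache with a fixed entry limit (not thread-safe; illustration only). */
final class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedLruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: the eldest entry is the least recently used
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict as soon as the limit is exceeded
    }

    public static void main(String[] args) {
        BoundedLruCache<String, Integer> cache = new BoundedLruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a" so that "b" becomes the least recently used entry
        cache.put("c", 3); // exceeds the limit and evicts "b"
        System.out.println(cache.keySet()); // [a, c]
    }
}
```

Because the map never holds more than maxEntries entries, the cache cannot grow until it fills the old generation.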

  1. Case study: frequent Full GC caused by mishandling millions of records
  1. The incident

An online system was upgraded, and within half an hour a flood of feedback arrived from operations and customer service: the system's front-end pages could not be accessed, and users saw only a blank page and an error message.

At the same time a flood of alerts came in from the monitoring platform. The CPU load on the machines running the system was very high and still climbing, to the point of taking machines down. Naturally the front-end pages showed nothing.

  1. Analyzing the cause of the high CPU load

The previous case already summarized the causes of high CPU load, so straight to the conclusion: monitoring and jstat showed very frequent Full GC, roughly once every two minutes, and each Full GC took a long time, about 10 seconds!

So we went straight to finding the cause of the Full GC.

  1. Analyzing the cause of the frequent Full GC

Observing online with jstat

The JVM heap is 20 GB: 10 GB for the young generation and 10 GB for the old generation

Eden fills up in about 1 minute

and triggers a Young GC

After the Young GC several GB of objects are still alive and enter the old generation

This shows the system generates a huge number of objects and processes them extremely slowly; after the ygc each minute many objects survive and are promoted to the old generation.

As a result an fgc triggers every two minutes on average, and because the old generation is so large, one fgc takes about 10 s

A 10 s stop-the-world pause every 2 minutes is unacceptable for users. CPU is frequently saturated and overloaded, so users often cannot access the system and see blank pages

  1. Can the usual GC optimization strategies work?

The earlier approach, enlarging the young generation and resizing the Survivor spaces so objects stay out of the old generation, clearly does not work here.

Even if you give the young generation more space, even 2 or 3 GB per Survivor space,

the system is so slow that several GB of objects survive each ygc, and the Survivor spaces still cannot hold them.

So plain JVM tuning is no longer enough; the code itself has to change. First, too much data is loaded; second, processing is so slow that several GB of data is still live in memory even after 1 minute.

The core fix is to avoid loading so much data

  1. A worked example of analysis with MAT

Prepare a piece of code

import java.util.ArrayList;

import java.util.List;

public class MatDemo {

    public static void main(String[] args) throws InterruptedException {

        // Keep 10,000 Data instances reachable from a local variable of the main thread

        List<Data> list = new ArrayList<Data>();

        for (int i = 0; i < 10000; i++) {

            list.add(new Data());

        }

        // Sleep for an hour so there is time to take a heap dump

        Thread.sleep(1 * 60 * 60 * 1000);

    }

    static class Data {

    }

}

Export the dump file

jmap -dump:live,format=b,file=dump.hprof 1177

Analyze it with MAT. Note that you should edit MAT's configuration file, MemoryAnalyzer.ini, to raise its startup heap size to 8 GB

Then choose "Open a Heap Dump" and open the dump snapshot you just exported

If you use a recent version of MAT, opening the dump prompts you to run a memory leak analysis, the Leak Suspects Report; you would normally accept.

  1. Analyzing the data

Problem Suspect 1 is the first thing to look at

It shows

java.lang.Thread main, that is, the main thread, references through its local variables objects that occupy 24.97% of memory

It also tells you the memory is held by a java.lang.Object[] array, which takes up a lot of memory.

What exactly is in this array?

The last line in the "Problem Suspect 1" box is a "Details" hyperlink; click it for the detailed description.

In that description, especially under "Accumulated Objects in Dominator Tree", we can see that the java.lang.Thread main thread references a java.util.ArrayList, which holds a java.lang.Object[] array whose elements are all instances of MatDemo$Data.

At this point it is clear which objects take up too much memory. This small example shows that MAT makes it convenient to find the oversized objects in your system.

  1. Trace the thread's execution stack to find the problem code

Once you find that a thread created a large number of objects, you can try to find out which code the thread was executing when it created them

Click "See stacktrace" on the page to view the call stack of the code the thread was executing

Tracing the thread's stack in the online system this way finally showed that the thread created the objects while executing String.split()

  1. Why can String.split() cause this memory problem?

The online system ran on JDK 1.7 at the time

In that version each substring produced by a split gets its own new character array; for example, the string "Hello" is backed by a new array [H,e,l,l,o].

The system's processing logic loaded a large amount of data, sometimes hundreds of thousands of records at once, most of it strings

It then split these strings, each into N smaller strings.

That instantly multiplies the number of strings several times or even dozens of times, which is the fundamental reason the system kept producing so many objects!

Before this upgrade the code had no String.split() call, so the system ran basically normally. It still loaded hundreds of thousands of records at a time and had a few Full GCs per hour, but that was tolerable.

Only after the upgrade added the String.split() call did memory usage jump N times, producing the Young GC every minute and the Full GC every two minutes described above. This one line of code is the root cause.
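The multiplication is easy to see on a single record: split() returns an array plus one new String per field, while a plain scan can answer simple questions without creating any of them. The sketch below uses a hypothetical record layout and helper name; it illustrates the allocation difference and is not the system's actual code.

```java
/** Contrasts String.split() with an allocation-free scan (illustration only). */
final class FieldCounter {

    /** Counts delimiter-separated fields by scanning, without creating substring objects. */
    static int countFields(String line, char delimiter) {
        int fields = 1;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == delimiter) {
                fields++;
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "id,name,city,amount"; // hypothetical record
        // split() allocates one array and one String per field: 5 new objects for this record.
        System.out.println(line.split(",").length); // 4
        // The scan reaches the same count without allocating anything.
        System.out.println(countFields(line, ','));  // 4
    }
}
```

Multiplied by hundreds of thousands of records per load, those extra objects per record are what filled Eden every minute.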

  1. Solution

The String.split() call turned out not to be needed, so it was removed.

In addition, this data processing system loads too much data into memory at once, so the code needs optimizing.

Use a thread pool to process the data concurrently and speed processing up, so that fewer objects are still alive when a ygc runs
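A minimal sketch of the thread-pool idea, assuming the work can be split into independent batches: each batch is submitted to a fixed-size pool and the partial results are combined. Summing integers stands in for the real per-record work, and the batch size, thread count, and names are arbitrary choices for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Processes records in batches on a fixed-size thread pool (illustration only). */
final class BatchProcessingSketch {

    static long sumInBatches(List<Integer> records, int batchSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int start = 0; start < records.size(); start += batchSize) {
                final List<Integer> batch =
                        records.subList(start, Math.min(start + batchSize, records.size()));
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int value : batch) {
                        sum += value; // stand-in for the real per-record work
                    }
                    return sum;
                };
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // wait for each batch and combine the partial results
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for batches", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a batch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            records.add(i);
        }
        System.out.println(sumInBatches(records, 100, 4)); // 500500
    }
}
```

Finishing each batch sooner shortens the time its objects stay reachable, so fewer of them are still alive when the next Young GC runs.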

  1. Common causes of frequent Full GC

Summarizing the cases above, frequent Full GC usually comes down to the following causes:

1. The system handles highly concurrent requests or processes too much data, so Young GC is very frequent and too many objects survive each one. With poorly sized memory and a Survivor area that is too small, objects are promoted to the old generation frequently, which triggers frequent Full GC.

2. The system loads too much data into memory at once, producing many large objects. Large objects keep entering the old generation, which inevitably triggers frequent Full GC.

3. The system has a memory leak: it creates a large number of objects that can never be reclaimed and that permanently occupy the old generation, which inevitably triggers frequent Full GC.

4. Metaspace (formerly the permanent generation) triggers Full GC because too many classes are loaded.

5. System.gc() is called by mistake and triggers Full GC.

Frequent Full GC almost always comes from one of these, so when handling it online, analyze from these angles. The core tool is jstat.

If jstat shows the first cause, size the memory sensibly and enlarge the Survivor area.

If jstat points to the second or third cause, that is, the old generation holds a large number of objects that cannot be reclaimed while few objects are being promoted from the young generation, dump a memory snapshot and analyze it with MAT.

The analysis shows which objects take up too much memory. Then use the object references and the thread stacks to find the code that creates so many objects, and optimize that code.

If jstat shows that memory usage is low but Full GC is still triggered frequently, the cause must be the fourth or fifth, and the fix should target that cause.
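jstat reads these counters from outside the process. As a complementary sketch, the same kind of GC counts can be read from inside the JVM through the standard java.lang.management API. The class and method names below are made up for illustration, and the collector names printed depend on which garbage collector the JVM is running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcCounters {
    // Sums the collection counts of all collectors registered in this JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Sampling these values periodically and alerting on a sudden rise is one way to notice frequent Full GC before it becomes an outage.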

  1. OOM: out of memory
  1. Causes

OOM means OutOfMemory. Memory is limited; if you keep stuffing things into it, it eventually cannot hold any more and overflows.

  1. Where it occurs

Metaspace: a large number of classes are loaded.

Stack: method calls nest too deeply, creating a large number of stack frames.

Heap: a large number of objects are created.

  1. OOM triggered by Metaspace

Parameter settings:

-XX:MetaspaceSize=512m -XX:MaxMetaspaceSize=512m

Metaspace grows dynamically. -XX:MetaspaceSize sets the threshold at which Metaspace expansion triggers a Full GC, and -XX:MaxMetaspaceSize caps how far it can grow.

When Metaspace fills up, a Full GC runs, which also collects the old and young generations along the way.

If few classes can be unloaded and the program keeps loading more, the result is an OOM and the system crashes.

Causes:

  1. MetaspaceSize is not set. The default is only a few tens of MB, which is not enough for a moderately large system that has many classes of its own and depends on many classes from external jar packages.
  2. Techniques such as cglib generate classes dynamically. If the code does not keep this under control, too many classes are generated and Metaspace fills up.

Solution:

  1. Set it to 512m, which is generally enough: -XX:MetaspaceSize=512m -XX:MaxMetaspaceSize=512m

Case study :

Enhancer enhancer=new Enhancer();

enhancer.setSuperclass(Car.class);

enhancer.setUseCache(false);

methodProxy.invokeSuper(o, objects); // calls the run() method of the parent class Car

-XX:MetaspaceSize=10m -XX:MaxMetaspaceSize=10m

Create dynamic subclasses of Car like this in a loop.

After a little over 200 of them had been created, the following error was reported:

Exception in thread "main" java.lang.IllegalStateException: Unable to load cache item

at net.sf.cglib.core.internal.LoadingCache.createEntry(LoadingCache.java:79)

at com.limao.demo.jvm.Demo1$Car$$EnhancerByCGLIB$$7e5aa3a5_264.run(<generated>)

at com.limao.demo.jvm.Demo1.main(Demo1.java:30)

Caused by: java.lang.OutOfMemoryError: Metaspace

at java.lang.Class.forName0(Native Method)

This OutOfMemoryError is the classic memory overflow problem, and it states clearly that the overflow happened in the Metaspace area.

Here the OOM was thrown on the main thread, so the JVM process crashed and the program exited.

How to locate:

Analyze the dump file with MAT.

  1. Stack overflow

Stack frames and the local variables of methods occupy memory.

A large number of stack frames uses up the stack memory and causes a stack overflow.

The stack size determines how deep method calls, and therefore recursion, can go.

Causes:

A stack overflow is generally caused by a bug in the code, such as infinite recursion.

As a rule of thumb, give Metaspace 512m.

Give each thread a 1m stack. Tomcat's several hundred threads plus the threads you create yourself add up to just under 1,000 threads, or about 1g of memory.

Give the heap half of the system memory.

Case simulation:

-Xss1m sets the stack memory of each JVM thread to 1MB (the equivalent -XX:ThreadStackSize flag takes its value in kilobytes, so it would be -XX:ThreadStackSize=1024).

Recursing without end then reports java.lang.StackOverflowError.
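The simulation can be sketched as follows. The class and method names are made up for this example, and the depth reached depends on the configured stack size and on how large each frame is, so no particular number should be expected.

```java
class StackDepthDemo {
    private static int depth = 0;

    private static void recurse() {
        depth++;
        recurse(); // no termination condition, so the stack eventually overflows
    }

    // Recurses until the thread's stack is exhausted and returns the depth reached.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // Expected: the stack frames have been unwound by the time we get here.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Stack overflowed after " + measureDepth() + " frames");
    }
}
```

Running it with a smaller -Xss value should lower the reported depth, which shows how the stack size bounds recursion depth.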

How to locate:

Write exceptions to the log. The stack overflow shows up there, with the same method appearing in the stack trace a large number of times.

  1. Heap overflow

If the old generation still cannot hold the surviving objects after a Full GC, the result is an OOM.

Causes:

  1. High concurrency leaves a large number of objects alive; they enter the old generation, fill it up, and trigger an OOM.
  2. A memory leak: objects stay alive because their references are never released, so even a GC cannot reclaim them and memory runs out.

Case simulation :

-Xms10m -Xmx10m

Create an array and keep adding objects to it.

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

How to locate:

Analyze the dump file with MAT.

  1. Does the process exit after an OOM?

The JVM exits only when no foreground (non-daemon) threads remain.

An OOM is itself just an error thrown in a thread. When it occurs, the thread that hit the OOM terminates and its resources are reclaimed.

One thread being reclaimed does not affect the execution of other threads.
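A small sketch of this behavior: a worker thread dies with an OutOfMemoryError while the calling thread carries on. The error here is thrown by hand to keep the demo cheap and deterministic; it is a simulation, not a real memory exhaustion, and the class and method names are made up for the example.

```java
class OomThreadDemo {
    // Starts a worker that dies with a simulated OutOfMemoryError and reports
    // whether the calling thread is still running after the worker is gone.
    static boolean callerSurvivesWorkerOom() {
        Thread worker = new Thread(() -> {
            throw new OutOfMemoryError("simulated for illustration");
        }, "worker");
        // Swallow the uncaught error so the demo does not print a stack trace.
        worker.setUncaughtExceptionHandler((thread, error) -> { });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("main still running after worker OOM: " + callerSurvivesWorkerOom());
    }
}
```

In a real heap exhaustion other threads are likely to hit the same error soon afterwards, so a surviving process is not necessarily a healthy one.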

  1. Case study: OOM in a large data processing system
  1. Business background

A compute engine that processes a huge volume of data, at PB scale.

It pushes its computed results to other systems and uses Kafka for decoupling.

The design at the time: if Kafka is down, retry again and again.

This hides a risk. If Kafka really fails, all the data from that computation stays resident in memory and cannot be released while the system waits for Kafka to recover.

Meanwhile the computation continues, producing more results waiting to be pushed to Kafka, all of which stay in memory too.

As this cycle repeats, more and more memory is used and none can be freed, which eventually leads to an OOM.

  1. Solution

Short term: drop the retry on Kafka failure. Once Kafka fails, discard the local computation results so the memory can be freed.

Later: refine this so that once Kafka fails, the results are written to local disk and the in-memory data can be reclaimed.
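The follow-up idea can be sketched as a buffer that keeps only a bounded number of pending results on the heap and appends everything beyond that to a local file. This is an illustration under assumptions, not the original system's code: the class name, the string-typed results, and the temp-file spill location are all hypothetical, and a real implementation would also need to replay the spilled data once Kafka recovers.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;

class SpillingBuffer {
    private final int maxInMemory;
    private final Path spillFile;
    private final Deque<String> pending = new ArrayDeque<>();

    SpillingBuffer(int maxInMemory, Path spillFile) {
        this.maxInMemory = maxInMemory;
        this.spillFile = spillFile;
    }

    // Keeps at most maxInMemory results on the heap; the rest go to the spill file.
    void add(String result) {
        if (pending.size() < maxInMemory) {
            pending.addLast(result);
            return;
        }
        try {
            Files.write(spillFile, Collections.singletonList(result), StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo helper: adds 'total' results and returns how many lines were spilled to disk.
    static int demoSpilledLines(int total, int maxInMemory) {
        try {
            Path file = Files.createTempFile("spill", ".txt");
            try {
                SpillingBuffer buffer = new SpillingBuffer(maxInMemory, file);
                for (int i = 0; i < total; i++) {
                    buffer.add("result-" + i);
                }
                return Files.readAllLines(file, StandardCharsets.UTF_8).size();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The key property is that heap usage is capped by maxInMemory no matter how long the downstream outage lasts.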

  1. Case study: how two novice engineers wrote code that caused an OOM

If an exception occurs while a node in the call chain is writing its log, that node's exception must also be written to the ES cluster.

try {
    // a lot of business logic
    log();
} catch (Exception e) {
    log();
}

public void log() {
    try {
        // write the log to the ES cluster
    } catch (Exception e) {
        log();
    }
}

When ES fails, log() keeps calling itself recursively, which overflows the stack.
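One way to avoid the unbounded recursion is to retry in a loop with a fixed attempt limit and give up afterwards. The sketch below is not the original team's fix; the LogSink interface stands in for the ES client and is hypothetical, as are the class name and the fallback of printing to standard error.

```java
class SafeLogger {
    // Hypothetical stand-in for the component that writes to the ES cluster.
    interface LogSink {
        void write(String message) throws Exception;
    }

    // Tries the sink a bounded number of times in a loop; never calls itself.
    static boolean log(LogSink sink, String message, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sink.write(message);
                return true;
            } catch (Exception e) {
                // Fall through and try again until the attempts run out.
            }
        }
        // Last resort: report locally instead of retrying forever.
        System.err.println("log dropped after " + maxAttempts + " attempts: " + message);
        return false;
    }

    public static void main(String[] args) {
        boolean written = log(m -> { throw new Exception("ES is down"); }, "order failed", 3);
        System.out.println("written=" + written);
    }
}
```

Because the retries happen in a loop, the stack depth stays constant however long the ES outage lasts.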

  1. Monitoring and alerting on OOM exceptions in production

The best OOM monitoring setup:

Use a monitoring platform such as Zabbix or Open-Falcon.

Configure it so that as soon as the system hits an OOM exception, an alert goes to the responsible developers by email, text message, or an IM tool such as DingTalk.

In general, online systems are monitored at the following levels:

Machine resource load (CPU, disk, memory, network).

JVM GC frequency and memory usage.

The system's own business metrics.

System exceptions and errors.

  1. Automatically dumping a memory snapshot when the JVM runs out of memory

With a memory snapshot in hand, a tool such as MAT (introduced earlier) can quickly show which objects are too numerous.

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/app/oom

The optimized JVM parameter template:

-Xms4096M -Xmx4096M -Xmn3072M -Xss1M -XX:MetaspaceSize=256M -XX:MaxMetaspaceSize=256M -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=92 -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 -XX:+CMSParallelInitialMarkEnabled -XX:+CMSScavengeBeforeRemark -XX:+DisableExplicitGC -XX:+PrintGCDetails -Xloggc:gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/app/oom

  1. Case study: why a system with only a few hundred requests per second crashed with an OOM
  1. Case background

The first thing to do is log in to the online machine and read the logs, before anything else.

The machine's log file contained a line similar to the following:

Exception in thread "http-nio-8080-exec-1089" java.lang.OutOfMemoryError: Java heap space

Anyone familiar with how Tomcat works underneath will recognize "http-nio-8080-exec-1089" as a Tomcat worker thread.

As for "java.lang.OutOfMemoryError: Java heap space", it indicates a heap memory overflow, as covered earlier.

In other words, a Tomcat worker thread needed to allocate an object on the heap while processing a request, found the heap full with nothing left to reclaim, could not fit any more objects, and threw this exception.

  1. How Tomcat works

Tomcat itself is a JVM process; the system we write is just code that runs inside it.

That code is a set of classes, which Tomcat loads into memory and then executes.

Tomcat's worker threads are the ones that call Spring MVC and the Controller, Service, DAO and other code we wrote, so it is a worker thread that runs out of heap memory and throws the heap overflow exception.

  1. What to do when the online system throws a memory overflow exception

The first step is always to read the logs.

Look for two things in particular:

1. Is it a heap overflow, a stack overflow, or a Metaspace overflow? Determine the specific type first.

2. Which thread ran out of memory while executing code? Besides Tomcat's own worker threads, the code we write may also create threads.

The overflow may have happened on one of the threads we started ourselves, so check these two points first.

After that, remember that every system going online must set this parameter:

-XX:+HeapDumpOnOutOfMemoryError

When the system runs out of memory, this parameter exports a memory snapshot to the specified location.

Locating a memory overflow then relies mainly on that automatically exported snapshot.

  1. Analyzing the memory snapshot

Use MAT to find the objects that occupy the most memory.

Analysis:

The first finding is a large number of byte[] arrays, taking up about 8G of memory in total. The JVM heap allocated to Tomcat on our online machine was also about 8G.

That gives the first conclusion:

Tomcat's worker threads created so many byte[] arrays, about 8G in total, that they filled the JVM heap. When new requests kept arriving, no new objects could be allocated on the heap, so memory overflowed.

Further analysis:

The snapshot contained a pile of byte[] arrays similar to these:

byte[10008192] @ 0x7aa800000 GET /order/v2 HTTP/1.0-forward...

byte[10008192] @ 0x7aa800000 GET /order/v2 HTTP/1.0-forward...

There were about 800 arrays like this, which accounts for the 8G of space.

Talking to the developers:

The engineer responsible for the code insisted that it did not create these arrays. Checking in MAT who references them shows that they are referenced by a Tomcat class, similar to: org.apache.tomcat.util.threads.TaskThread

This is Tomcat's own thread class, so Tomcat's threads created the byte[] arrays that occupy 8G of memory.

Analyzing again:

There were roughly 400 Tomcat worker threads, so each worker thread had created 2 byte[] arrays of about 10MB each.

In the end, 400 Tomcat worker threads were processing requests at the same time and created 8G worth of byte[] arrays, which caused the memory overflow.

Yet the system's QPS was only 100.

All 400 worker threads were busy while only 100 requests arrived per second, so each request must have taken about 4s to process.

Why does a Tomcat worker thread create 2 arrays of 10MB each when processing a request?

Searching Tomcat's configuration file turned up this setting: max-http-header-size: 10000000

It makes each Tomcat worker thread create 2 arrays, each of the configured size of 10MB.

Overall analysis:

With 100 requests per second and 4 seconds per request, 400 requests are in flight on 400 threads within any 4-second window. Each thread creates 2 arrays of 10MB as configured, which fills the 8G of memory.
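The arithmetic behind this conclusion can be written out as a short calculation: requests in flight equal arrival rate times time per request, and buffer memory equals threads times buffers per thread times buffer size. The class and method names are made up for this sketch; the numbers are the ones from the case.

```java
class ThreadMemoryEstimate {
    // Requests in flight = arrival rate x time each request spends in the system.
    static long concurrentRequests(long requestsPerSecond, long secondsPerRequest) {
        return requestsPerSecond * secondsPerRequest;
    }

    // Total bytes held in per-thread buffers.
    static long bufferBytes(long threads, long buffersPerThread, long bytesPerBuffer) {
        return threads * buffersPerThread * bytesPerBuffer;
    }

    public static void main(String[] args) {
        long threads = concurrentRequests(100, 4);          // 100 QPS, 4 s each
        long bytes = bufferBytes(threads, 2, 10_000_000L);  // 2 buffers of max-http-header-size
        System.out.println(threads + " busy threads hold " + bytes + " bytes of buffers");
    }
}
```

With a 1-second timeout the same formula gives 100 busy threads and 2,000,000,000 bytes, which shows why shortening the timeout alone already relieves most of the pressure.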

Why does processing a request take 4 seconds?

The system's engineers said a request normally takes only a few hundred milliseconds.

So the only option was to go back to the logs from the time of the incident. Besides the OOM, they contained a large number of service request timeout exceptions, similar to the following:

Timeout Exception....

A large number of requests were suddenly timing out when our system called other systems over RPC. Checking the RPC call timeout configuration showed, surprisingly, that the engineer responsible had set it to exactly 4 seconds.

In other words, the remote service itself had failed at that point, so our system's RPC calls to it could not get through and threw an exception only after the configured 4-second timeout. For those 4 seconds the worker thread was stuck on a network call that could not succeed.

This is the root cause of each request taking 4 seconds. At 100 requests per second, 400 requests backed up within 4 seconds and were processed at the same time; the 400 worker threads created 800 arrays of 10MB each, exhausted the 8G of memory, and memory overflowed.

  1. Optimization plan

Reduce the RPC timeout to 1s.

Reduce the max-http-header-size parameter that controls the arrays Tomcat creates per request, so that they do not take up so much memory.

  1. Case study: how the Jetty server's NIO mechanism caused an off-heap memory overflow
  1. Background

While using Jetty as the web server, we ran into a very rare case of off-heap memory overflow.

One day an online alert arrived: a service deployed on one machine had suddenly become inaccessible.

The first reaction was to log on to the machine and read the logs. A service going down is often a crash caused by an OOM, although other causes are possible.

The machine's log contained the following:

nio handle failed java.lang.OutOfMemoryError: Direct buffer memory

at org.eclipse.jetty.io.nio.xxxx

at org.eclipse.jetty.io.nio.xxxx

at org.eclipse.jetty.io.nio.xxxx

The key point in this log is that an OOM exception occurred.

It involves an area we have not seen before, Direct buffer memory, and below it is a long call stack of Jetty methods.

  1. How off-heap memory is requested and released
  1. Requesting

To request a block of off-heap memory in Java code, use the DirectByteBuffer class. You create a DirectByteBuffer object, and the object itself lives in the JVM heap.

Creating the object also reserves a block of memory outside the heap and associates it with the object.

  1. Releasing

Once a DirectByteBuffer object is no longer referenced and becomes garbage, some Young GC or Full GC will reclaim it.

Reclaiming a DirectByteBuffer object releases the off-heap memory associated with it.
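In application code a DirectByteBuffer is obtained through the public ByteBuffer.allocateDirect factory rather than constructed directly. The sketch below allocates one and reads the JVM's "direct" buffer pool through the standard BufferPoolMXBean to show that the memory is counted outside the heap. The class name, method name, and the 16 MB size are choices made for this example.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class DirectBufferDemo {
    // Returns the bytes currently used by the "direct" buffer pool, or -1 if it is not found.
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024); // 16 MB off-heap
        long after = directMemoryUsed();
        System.out.println("isDirect=" + buffer.isDirect()
                + ", direct pool grew by " + (after - before) + " bytes");
    }
}
```

The pool shrinks again only after the buffer object itself has been garbage collected, which is exactly the dependency this case turns on.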

  1. Why off-heap memory overflows

If you create many DirectByteBuffer objects that occupy a lot of off-heap memory, and the GC threads have not yet reclaimed them, the off-heap memory is not released.

When most of the off-heap memory is tied up by DirectByteBuffer objects and you ask for more, a memory overflow is reported.

When would a large number of DirectByteBuffer objects stay alive and keep so much off-heap memory from being released?

One possibility is that the system is under very high concurrency and heavy load: a burst of requests creates too many DirectByteBuffer objects that occupy a lot of off-heap memory, and any further request for off-heap memory overflows.

But was that the case for this system?

Clearly not. Its load was nowhere near that high, and there were no sudden bursts of requests.

  1. The real cause of the off-heap memory overflow

The JVM parameters were unreasonable: Eden 100m, Survivor 10m, old generation 1g.

As a result, DirectByteBuffer objects were not reclaimed in the young generation and kept flowing into the old generation, where no Full GC had run yet.

Eventually a request for off-heap memory failed for lack of space and caused the OOM, even though the DirectByteBuffer objects sitting in the old generation could have been reclaimed long before.

  1. Didn't Java NIO consider this problem?

It did. When an off-heap allocation finds memory insufficient, NIO calls System.gc() to reclaim off-heap memory and free space.

But our JVM parameters included -XX:+DisableExplicitGC, which turns that System.gc() call into a no-op.

  1. The final optimization

Remove -XX:+DisableExplicitGC.

Size the young generation so that DirectByteBuffer objects are reclaimed there rather than piling up in the old generation.

With the -XX:+DisableExplicitGC restriction lifted, when Java NIO finds off-heap memory insufficient it calls System.gc() to prompt the JVM to collect garbage, which reclaims some DirectByteBuffer objects and frees some off-heap memory.

  1. Case study: troubleshooting an OOM caused by an RPC call in a microservice architecture
  1. Background

The system's RPC framework is Thrift. Service A called service B and brought service B down.

Service B's machine log showed:

java.lang.OutOfMemoryError: Java heap space

  1. Troubleshooting

java.lang.OutOfMemoryError: Java heap space

xx.xx.xx.rpc.xx.XXXClass.read()

xx.xx.xx.rpc.xx.XXXClass.xxMethod()

xx.xx.xx.rpc.xx.XXXClass.xxMethod()

This points to the in-house RPC framework raising the OOM while receiving a request.

  1. Analysis

The memory snapshot was analyzed with MAT.

The largest memory consumer was a single oversized byte[] array.

The machine's heap was only 4g, and the byte[] array took up 4g.

When the RPC framework receives data, it buffers it in a byte[], deserializes it into an object, hands the object to the business code, and clears the byte[] buffer after processing.

Service A's request object had been modified, for example by adding 5 fields, while service B's request class still had only 10 fields, so deserialization failed.

The code at the time handled a deserialization failure by allocating a 4g byte[] to hold the raw data, which led straight to the OOM.

The engineer who wrote it did not know how large the data might be when deserialization failed, so he simply allocated a very large array.

Solution:

Change the 4g to 4m. Requests generally do not exceed 4m, so there is no need for such a large array.

Also keep the request object definitions in service A and service B consistent.
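A related defensive pattern is to check a size limit before allocating any receive buffer. The sketch below assumes a length-prefixed wire format (a 4-byte length followed by the payload), which is an assumption for illustration and not necessarily how the framework in this case frames its messages; the class name, method names, and constant are also made up.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class FrameReader {
    static final int MAX_FRAME_BYTES = 4 * 1024 * 1024; // 4 MB cap, matching the fix above

    // Reads one length-prefixed frame, rejecting out-of-range lengths before allocating.
    static byte[] readFrame(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0 || length > MAX_FRAME_BYTES) {
            throw new IOException("frame length out of range: " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload);
        return payload;
    }

    // Convenience wrapper for the demo: true if the bytes form one acceptable frame.
    static boolean accepts(byte[] wire) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            readFrame(in);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

Rejecting an oversized request costs one failed call, whereas allocating for it can take down the whole service.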

  1. Case study: troubleshooting an OOM caused by an SQL statement without a WHERE condition
  1. Background

An SQL statement was written without a WHERE condition, so millions of rows were loaded into memory at once, which caused an OOM.
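Besides adding the missing WHERE condition, a common safeguard is to read large result sets page by page so that only one page is on the heap at a time. The sketch below shows the loop shape only. The page fetcher is a hypothetical function taking (offset, limit); in real code it would run a bounded query, and the class and method names are made up for the example.

```java
import java.util.List;
import java.util.function.BiFunction;

class PagedReader {
    // Pulls rows page by page; pageFetcher takes (offset, limit) and returns at most 'limit' rows.
    static <T> long processInPages(BiFunction<Integer, Integer, List<T>> pageFetcher, int pageSize) {
        long processed = 0;
        int offset = 0;
        while (true) {
            List<T> page = pageFetcher.apply(offset, pageSize);
            processed += page.size(); // real code would handle each row here
            if (page.size() < pageSize) {
                break; // a short page means there is no more data
            }
            offset += pageSize;
        }
        return processed;
    }
}
```

Each page becomes garbage as soon as the loop moves on, so memory use is bounded by the page size rather than by the table size.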

  1. Troubleshooting

The log contained the OOM exception: java.lang.OutOfMemoryError: Java heap space.

  1. Analysis
  1. Using MAT to locate the problem code quickly

Use MAT's Histogram view to see which objects occupy the most memory.

This view shows at a glance what is taking up too much memory. Here it is clearly the inner class Demo1$Data.

  1. Further analysis: a closer look at the objects taking up too much memory

Open the dominator_tree view, which lists all the threads in the JVM.

It shows clearly which threads have created too many objects.

The main thread ranks first.

Expanding the main thread reveals a java.util.ArrayList @ 0x5c00206a8.

So the thread created a huge ArrayList. Expanding the ArrayList shows a java.lang.Object[] array inside it, and expanding that shows a large number of Demo1$Data objects.

In the same way you might find that a Tomcat worker thread has created a large number of java.util.HashMap objects.

  1. Analyzing again: which line of code creates so many objects

Use thread_overview.

It shows all the threads in the JVM, each thread's method call stack at that moment, and the objects created in each method.

  1. Case study: an OOM in a log analysis system handling 1 billion records per day
  1. Background

A log cleaning system handles 1 billion records per day. It consumes log data from Kafka, normalizes the log format, and masks sensitive information.

The cleaned data is delivered to other systems, such as the recommendation, advertising, and analytics systems.

  1. The scene of the incident

An online alert arrived: the log cleaning system had hit an OOM exception.

Logging in to the online machine showed the classic java.lang.OutOfMemoryError: Java heap space.

Start with the exception log to find out what caused it. The log contained something similar to the following:

java.lang.OutOfMemoryError: Java heap space

xx.xx.xx.log.clean.XXClass.process()

xx.xx.xx.log.clean.XXClass.xx()

xx.xx.xx.log.clean.XXClass.xx()

The same method, XXClass.process(), appeared to be repeated many times before the heap finally overflowed.

Experienced readers will have spotted the issue: some place in the code recurses heavily, and it is after calling the same method over and over that the heap overflows.

  1. MAT analysis

XXClass.process() had been executed recursively many times, and each XXClass.process() call created a large number of char[] arrays.

Because XXClass.process() was called recursively so many times, the resulting char[] arrays used up the memory.

Yet all the char[] arrays created by the recursive calls added up to at most 1G.

The GC log records the complete JVM parameters used at startup. The core ones were:

-Xmx1024m -Xms1024m -XX:+PrintGCDetails -XX:+PrintGC -XX:+HeapDumpOnOutOfMemoryError -Xloggc:/opt/logs/gc.log -XX:HeapDumpPath=/opt/logs/dump

The machine had 4 cores and 8G of memory.

Next, look at what gc.log recorded at the time:

[Full GC (Allocation Failure)  866M->654M(1024M)]

[Full GC (Allocation Failure)  843M->633M(1024M)]

[Full GC (Allocation Failure)  855M->621M(1024M)]

[Full GC (Allocation Failure)  878M->612M(1024M)]

Many Full GCs were triggered by Allocation Failure, meaning the heap could not allocate a new object and a GC was triggered.

Each Full GC reclaimed only a small number of objects, and the heap stayed almost full.

The log shows a Full GC running every second.

This makes the picture fairly clear: while the system ran, the recursion in XXClass.process() created so many char[] arrays that the heap was almost full.

For a sustained period a Full GC was therefore triggered every second. Because memory, and the old generation in particular, was almost full, the check before each Young GC may have found too little free space in the old generation and triggered a Full GC in advance.

Or too many objects survived a Young GC to fit in Survivor and had to go to the old generation, which could not hold them either, so a Full GC had to run.

But each Full GC reclaimed only a few objects, until finally a Full GC could reclaim nothing, new objects no longer fit in the heap, and the OOM exception was thrown.

  1. Examining the JVM's runtime memory usage model

Restart the system and the application, then sample it with jstat once per second.

S0    S1     E     O     YGC   FGC
0     100    57    69    36    0
0     100    57    69    36    0
0     100    65    69    37    0
0     100    0     99    37    0
0     100    0     87    37    1

At first the young generation's Eden usage keeps rising. Then YGC goes from 36 to 37, meaning one Young GC ran, and the old generation's usage jumps straight from 69% to 99%.

This shows that too many objects survived the Young GC to fit in Survivor, so they went directly into the old generation.

With the old generation at 99%, a Full GC was triggered immediately, but it only brought old generation usage from 99% down to 87%, reclaiming a small number of objects.

This process repeats several times: young generation objects keep entering the old generation and keep triggering Full GCs that reclaim little. After a few rounds the old generation is full, a Full GC frees almost nothing, a large number of new objects have nowhere to go, and an OOM is triggered.

  1. Optimization
  1. Increase the heap size

On the 4-core 8G machine, give the heap more space: allocate 5G to it.

Running the system again and observing with jstat shows that the objects surviving each Young GC now fit in the Survivor area and no longer spill into the old generation. With such a large heap, the OOM basically stops occurring even after the system has run for a while.

  1. Rewrite the code

Make the code use less memory. It was recursive because a single log entry may contain information about many users; one entry may merge the data of a dozen or several dozen users.

The code recursed a dozen to several dozen times to process such an entry, and every level of recursion cut the log again and produced a large number of char[] arrays.

None of this was necessary. If a log entry contains multiple users, it is enough to cut the entry once and process the pieces. There is no need for recursive calls that each cut the log and generate a pile of char[] arrays.

After this step was optimized, the online system's memory usage dropped by more than 10 times.
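The "cut once, then iterate" approach can be sketched as a single pass over the merged log entry. The actual log format is not given in the case, so the '|' separator, the class name, and the method name are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class LogSplitter {
    // Walks the merged entry once and returns one piece per user, with no recursion.
    static List<String> splitUsers(String mergedLog, char separator) {
        List<String> entries = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= mergedLog.length(); i++) {
            boolean atEnd = (i == mergedLog.length());
            if (atEnd || mergedLog.charAt(i) == separator) {
                if (i > start) {
                    entries.add(mergedLog.substring(start, i));
                }
                start = i + 1;
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        for (String user : splitUsers("user1|user2|user3", '|')) {
            System.out.println("processing " + user);
        }
    }
}
```

Each user's data is extracted exactly once, so the number of temporary arrays grows linearly with the number of users instead of being multiplied at every recursion level.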

  1. Case actual combat : It is caused by too many service class loaders at one time OOM problem
  1. background

web System ,tomcat start-up , For a while , Get feedback , Service instability , Often fake death , Receive feedback from upstream services .

At the same time, you can't visit for a period of time , It's better for a while , This is the phenomenon

  1. screening

Two situations are generally considered when pretending to die , First, the machine load is high , Second, frequent gc

top command It depends on the machine

4 nucleus 8G Standard online virtual machine

Consume cpu1% Resources for , Memory consumption 50% above

For this kind of machine, we usually provide deployed Services JVM There is always 5G~6G, Planed out Metaspace Areas and so on , Heap memory will probably be given 4G~5G The appearance of , After all, you have to leave some memory for creating a large number of threads .

JVM There are three main types of memory used , Stack memory 、 Heap memory and Metaspace Area , Now I usually give Metaspace Area 512MB Above space , The heap memory is assumed to have 4G, Then stack memory ? Each thread usually gives 1MB Of memory , So if you JVM There are hundreds of thousands of threads in the process , It's going to be close 1G Memory consumption of .

So the whole jvm Memory is about 6g, Heap memory 4g

Now I see jvm The memory consumption of the process is as fast as 50%, It means that the memory allocated to it is almost exhausted

And the most important thing is , He keeps the consumption of memory resources at 50% above , Even higher , Explain him GC I didn't recycle the memory when I was !

  1. What happens when memory usage is so high ?

Since the process of this service has such high memory utilization , There are only three possible problems .

The first is the high memory utilization , Lead to frequent full gc,gc It brings stop the world The problem affects the service .

The second is excessive memory utilization , Lead to JVM Happen to yourself OOM.

The third is the high memory utilization , Sometimes it may cause this process because the requested memory is insufficient , Directly killed by the operating system !

adopt jstat observation , Indeed, memory usage is high , It does happen often gc, however gc It takes only a few hundred milliseconds

In addition, if oom, Inevitably, the service cannot be accessed , The upstream service will definitely feed back that the service is dead

There is nothing in the log oom abnormal

Speculation is the third , The process is linux kill , Then monitor script monitoring , Once a process is killed , The script will automatically restart the process , So the service can be accessed later

  1. Who is taking up so much memory?

Following this line of reasoning, to solve the problem we have to find out which objects occupy so much memory that the process keeps requesting more until it is finally killed.

That is straightforward: export a memory snapshot directly from the production system.

After the system had run in production for a while, we watched it with the top and jstat commands. Once the JVM had consumed more than 50% of memory, we quickly exported a memory snapshot for analysis.

Analyzing the snapshot with MAT, we found thousands of ClassLoader instances, that is, class loaders, and what they had loaded was a large number of byte[] arrays. Together these occupied more than 50% of memory. This looked like the culprit.

Where did those ClassLoader instances come from, and why did they load so many byte[] arrays?

It turned out that the engineer who wrote the system had implemented a custom class loader, and the code created these custom class loaders without any limit in order to load large amounts of data repeatedly. As a result, memory was often exhausted all at once and the process was killed.

  1. Solution

The fix is simple: change the code so that it no longer creates thousands of custom class loaders and no longer loads large amounts of data into memory repeatedly.
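The case does not show the actual fix, so the following is only a minimal sketch of one way to stop creating class loaders without limit: keep one loader per key and reuse it. The class name `LoaderRegistry`, the `moduleName` key, and the use of `URLClassLoader` in place of the system's custom loader are all assumptions made for illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical registry that hands out one shared class loader per module name. */
final class LoaderRegistry {
    private final ConcurrentMap<String, ClassLoader> loaders = new ConcurrentHashMap<>();

    /**
     * Returns the loader already registered for moduleName, creating it only on first use.
     * URLClassLoader stands in for the custom class loader from the case.
     */
    ClassLoader loaderFor(String moduleName, URL[] classpath) {
        return loaders.computeIfAbsent(
                moduleName,
                name -> new URLClassLoader(classpath, LoaderRegistry.class.getClassLoader()));
    }

    /** Number of distinct loaders created so far. */
    int size() {
        return loaders.size();
    }
}
```

With a registry like this, the number of live class loaders is bounded by the number of distinct keys rather than by the number of load requests.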

  1. Case study: a data synchronization system with frequent OOMs
  1. Background

A data synchronization system pulls data from an MQ and writes it to storage.

After every restart, it ran for a while and then hit an OOM.

An OOM usually means either that high concurrency created too many objects that are still referenced and cannot be released, or that there is a memory leak and the objects cannot be reclaimed.

  1. Troubleshooting

jstat analysis: after each young GC the old generation grew considerably. Once old generation usage reached 100%, a full GC was triggered, but it could not reclaim the old-generation objects, so usage did not drop and an OOM followed.

MAT analysis: a queue data structure directly referenced a large amount of data. It was this queue that filled the memory.

Data consumed from Kafka is first written to this queue, then gradually taken out of the queue, processed, and written to storage.

At the time, the code pulled a batch of records from Kafka, wrapped the batch in a List, and put that List into the queue.

The result was a queue of 1,000 elements, each of them a List holding hundreds of records.

So hundreds of thousands, or even millions, of records piled up in the in-memory queue, and memory eventually overflowed.

Putting the data consumed from Kafka into the queue is fast, but taking data out of the queue, processing it, and writing it to storage is slow. Data therefore backs up quickly in the in-memory queue and causes the overflow.

Making each queue element a List also greatly inflates the amount of data the in-memory queue can hold.

  1. Solution

The final fix was again simple: change how the in-memory queue is used and make it a fixed-length blocking queue.

For example, cap it at 1,024 elements, and write each record consumed from Kafka into the queue individually instead of putting a whole List in as a single element. Memory then holds at most 1,024 records. Once the queue is full, the Kafka consumer thread blocks on the queue and stops working, so data can no longer pile up in the in-memory queue.
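The fixed-length blocking queue described above can be sketched with `java.util.concurrent.ArrayBlockingQueue`. This is a minimal illustration, not the system's actual code: the class name `BoundedRecordBuffer`, the `String` record type, and the method names are assumptions; only the capacity of 1,024 comes from the text.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical buffer between a Kafka consumer thread and a slower storage writer. */
final class BoundedRecordBuffer {
    // Hard cap on buffered records; 1024 is the example figure from the text.
    static final int CAPACITY = 1024;

    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(CAPACITY);

    /** Consumer side: enqueue records one by one instead of enqueuing the whole List. */
    void enqueueBatch(List<String> batch) throws InterruptedException {
        for (String record : batch) {
            queue.put(record); // blocks while the queue is full
        }
    }

    /** Writer side: blocks until a record is available. */
    String next() throws InterruptedException {
        return queue.take();
    }

    int size() {
        return queue.size();
    }

    int remainingCapacity() {
        return queue.remainingCapacity();
    }
}
```

Because `put` blocks when the queue is full, the consumer thread is throttled to the speed of the writer, which is the back-pressure effect the fix relies on.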

  1. Overall review: JVM parameter tuning, GC troubleshooting, and OOM analysis for production systems

  1. The fundamental principles of how the JVM runs the systems we write

How memory is divided into regions

How the memory regions cooperate while code executes

How objects are allocated

How GC is triggered

How GC works

Which core parameters are commonly used to control JVM behavior

  1. JVM parameter tuning

How to set reasonable JVM parameters for a newly written system

How to tune those parameters sensibly during testing

How to monitor the JVM of a system deployed in production, and how to tune the JVM when GC performance problems appear there

In addition, a large number of real production cases illustrate all sorts of unusual JVM GC problems, along with the ideas and methods for solving them.

  1. Analyzing, locating, and resolving OOMs

When a production system hits an OOM, how should we analyze, locate, and resolve it?

Likewise, a large number of first-hand production cases illustrate all sorts of unusual OOM scenarios, together with the approach to analyzing, locating, and resolving them.

  1. Personal summary
  1. When a young GC (ygc) happens

When the eden space is full, a young GC is triggered.

  1. When objects move to the old generation
  1. The object has reached age 15, that is, it has survived 15 young GCs
  2. Dynamic age rule: if the total size of the objects aged 1 through n exceeds 50% of the survivor space, objects aged n and above are moved to the old generation
  3. After a young GC, the surviving objects do not fit in the survivor space
  4. Large objects
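The dynamic age rule in item 2 can be made concrete with a small calculation. The following is a simplified sketch, not HotSpot's implementation: the class name `TenuringThreshold` and the array layout (index = object age, index 0 unused) are assumptions, while the 50% target and the maximum age of 15 are the figures from the list above.

```java
/** Simplified model of the dynamic age rule for promotion to the old generation. */
final class TenuringThreshold {
    /**
     * bytesByAge[a] is the total size of surviving objects of age a (index 0 is unused).
     * Returns the age n at which the running total for ages 1..n first exceeds
     * 50% of the survivor space; objects aged n and above are then promoted.
     * If the total never exceeds the target, the maximum age applies.
     */
    static int compute(long[] bytesByAge, long survivorCapacity, int maxThreshold) {
        long target = survivorCapacity / 2; // the 50% rule
        long total = 0;
        for (int age = 1; age < bytesByAge.length && age <= maxThreshold; age++) {
            total += bytesByAge[age];
            if (total > target) {
                return age;
            }
        }
        return maxThreshold;
    }
}
```

For example, with a 100 MB survivor space and 10, 20, and 30 MB of objects at ages 1, 2, and 3, the running total passes 50 MB at age 3, so objects aged 3 and above would be promoted long before they reach age 15.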
  1. When a full GC (fullgc) happens

Heap:

  1. Objects do not fit in the old generation

Before a young GC: the average size of the objects promoted by previous young GCs exceeds the free space in the old generation

After a young GC: the surviving objects do not fit in the old generation, which triggers a full GC

  1. Old generation usage exceeds the threshold

With CMS, a collection of the old generation is triggered once its usage exceeds 92%

Typical scenarios:

  1. High concurrency: many objects are created in a short time, the survivor space cannot hold the survivors, and they move to the old generation
  2. Too much data is loaded into memory at once, so a large amount of it enters the old generation
  3. Memory leak: objects cannot be reclaimed for a long time

Metadata area:

The metadata area has loaded too much class and method information

Manual trigger:

An explicit System.gc() call in the code triggers a full GC

  1. JVM indicators to observe in the system's runtime model
  1. How fast the eden space fills up
  2. Young GC frequency and duration
  3. Size of the objects surviving each young GC and whether the survivor space can hold them; preferably no more than half of the survivor space
  4. How fast the old generation grows
  5. Full GC frequency and duration
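Besides jstat, the GC frequency and duration indicators above can also be read from inside the process through the standard `java.lang.management` API. The sketch below is only an illustration; the class name `GcStats` is an assumption, and the collector names returned depend on which collectors the JVM is running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

/** Reads cumulative GC counts and times for every collector in the current JVM. */
final class GcStats {
    /** Returns collector name -> {collection count, accumulated collection time in ms}. */
    static Map<String, long[]> snapshot() {
        Map<String, long[]> stats = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            stats.put(gc.getName(), new long[] {gc.getCollectionCount(), gc.getCollectionTime()});
        }
        return stats;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, long[]> entry : snapshot().entrySet()) {
            System.out.printf("%s: count=%d, timeMs=%d%n",
                    entry.getKey(), entry.getValue()[0], entry.getValue()[1]);
        }
    }
}
```

Taking two snapshots a fixed interval apart and subtracting them gives the GC frequency and the average duration per collection for that interval.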

  1. Causes of OOM

Metadata area: too many classes are loaded

Stack: the method call chain is too deep, producing too many stack frames, and the stack overflows

Heap: too many objects are created and the old generation cannot hold them

  1. CMS

The GC process

Initial mark: enters the STW state and marks the objects directly reachable from the GC Roots; fast

Concurrent mark: leaves the STW state; the application keeps creating objects and turning objects into garbage while the collector traces as much of the object graph as it can

Remark: enters the STW state and re-marks the objects the application changed during concurrent marking; fast

Concurrent sweep: leaves the STW state and cleans up garbage objects while the application keeps running

During an old-generation GC, if the remaining 8% of space cannot hold the objects promoted in the meantime, the collector falls back to Serial Old: single-threaded, STW, re-mark and sweep

CMS uses the mark-sweep algorithm; memory compaction after sweeping can be enabled with:

-XX:+UseCMSCompactAtFullCollection

-XX:CMSFullGCsBeforeCompaction=0

  1. G1

Advantages: adaptive collection, a controllable pause-time target, low latency, well suited to large heaps

The heap is divided into regions, 2048 by default. You only specify the heap size and G1 manages the heap automatically. By default the young generation takes at least 5% and at most 60% of the heap

GC uses a copying algorithm: live objects are copied to a new region and the old region is then cleared

A mixed GC is triggered when the old generation occupies 45% of the heap. Regions in which more than 85% of the objects are still alive are not collected. By default the garbage is reclaimed in 8 batches to shorten each STW pause

Mixed GC stops once the space that can still be reclaimed falls below 5% of the heap

If a mixed GC fails, it falls back to a full GC: single-threaded mark, sweep, and compact

The mixed GC process

Initial mark: marks the objects directly reachable from the GC Roots; STW, fast

Concurrent mark: leaves STW so the application can run; traces all objects, which takes longer

Final mark: STW; decides whether the objects whose references changed during concurrent marking are garbage, and marks them

Mixed collection: STW; collects only as many regions as fit within the pause-time target

  • Operations configuration
  1. JVM template reference

4 cores, 8 GB

-Xms4096M -Xmx4096M -Xmn3072M -Xss256K -XX:MetaspaceSize=256M -XX:MaxMetaspaceSize=256M -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=92 -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 -XX:+CMSParallelInitialMarkEnabled -XX:+CMSScavengeBeforeRemark -XX:+PrintGCDetails -verbose:gc -Xloggc:/opt/xx/gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/oom/xxx -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=30m

-XX:SoftRefLRUPolicyMSPerMB=1000 -XX:-OmitStackTraceInFastThrow

-XX:+DisableExplicitGC: add it if the situation calls for it

Heap: 4 GB

Young generation: 3 GB (eden 2457 MB, each survivor space 307.2 MB)

Old generation: 1 GB

Metadata area: 256 MB

Stack: about 1 MB per thread at the default stack size, so 1,000 threads need about 1 GB
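The eden and survivor figures above follow from the young generation size and the eden-to-survivor ratio. The sketch below reproduces the arithmetic under the assumption that the ratio is 8 (HotSpot's default `SurvivorRatio`, which the template does not set explicitly); the class name `YoungGenLayout` is made up for illustration.

```java
/** Splits a heap into eden, survivor, and old generation sizes (all values in MB). */
final class YoungGenLayout {
    /**
     * The young generation consists of eden plus two survivor spaces,
     * with eden = survivorRatio * survivor.
     * Returns {edenMb, survivorMb, oldMb}.
     */
    static double[] split(double heapMb, double youngMb, int survivorRatio) {
        double survivorMb = youngMb / (survivorRatio + 2);
        double edenMb = survivorMb * survivorRatio;
        double oldMb = heapMb - youngMb;
        return new double[] {edenMb, survivorMb, oldMb};
    }

    public static void main(String[] args) {
        // -Xmx4096M -Xmn3072M with an assumed survivor ratio of 8
        double[] layout = split(4096, 3072, 8);
        System.out.printf("eden=%.1f MB, survivor=%.1f MB, old=%.1f MB%n",
                layout[0], layout[1], layout[2]);
    }
}
```

For `-Xmx4096M -Xmn3072M` this gives roughly 2457.6 MB of eden, 307.2 MB per survivor space, and 1024 MB of old generation, in line with the figures listed above.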

Collectors: ParNew + CMS

Trigger an old-generation GC when old generation usage reaches 92%: CMSInitiatingOccupancyFraction=92

Enable memory compaction for the old generation: UseCMSCompactAtFullCollection

Compact memory after every old-generation GC: CMSFullGCsBeforeCompaction=0

Run the initial mark of an old-generation GC with multiple threads: CMSParallelInitialMarkEnabled

Trigger a young GC before the remark of an old-generation GC to speed up the remark: CMSScavengeBeforeRemark

Disable explicit GC calls from code: DisableExplicitGC

Log GC details: PrintGCDetails

GC log location: -Xloggc:gc.log

Dump the heap on OOM: HeapDumpOnOutOfMemoryError

Heap dump path: HeapDumpPath=/usr/local/app/oom

  1. 4-core, 8 GB configuration example

-Xmx6144m

-Xms3686m

-Xss256k

-XX:MetaspaceSize=128m

-XX:MaxMetaspaceSize=256m

-XX:MaxGCPauseMillis=200

-XX:+UseG1GC

-XX:ParallelGCThreads=4

-XX:ConcGCThreads=4

-XX:-OmitStackTraceInFastThrow

-XX:MinHeapFreeRatio=30

-XX:MaxHeapFreeRatio=50

-XX:CICompilerCount=3

-XX:+PreserveFramePointer

-XX:+PrintGC

-XX:+PrintGCDetails

-XX:+PrintGCDateStamps

-XX:+UseGCLogFileRotation

-XX:NumberOfGCLogFiles=5

-XX:GCLogFileSize=32M

-XX:+HeapDumpOnOutOfMemoryError

-XX:HeapDumpPath=/opt/logs/1000

-Dport.http.server=8080 -Dlog.server=/opt/logs/1000

-Dport.shutdown.server=8081

-Ddocbase.server=/opt/app

-Dvdir.server=/order

-Djava.security.egd=file:/dev/./urandom

-Dcom.sun.management.jmxremote.authenticate=false

-Dcom.sun.management.jmxremote.ssl=false

-Djava.rmi.server.hostname=10.10.10.1

-Dcom.sun.management.jmxremote

-Dcom.sun.management.jmxremote.port=8780

-Dcom.sun.management.jmxremote.rmi.port=8780

-Dcom.sun.management.jmxremote.local.only=false -Xloggc:/opt/logs/1000/gc.log

-DAPPLOGDIR=/opt/logs/100019393/applog

-Djava.util.concurrent.ForkJoinPool.common.parallelism=4

-Djava.util.concurrent.ForkJoinPool.common.threadFactory=com.mycomplny.forkjoinworkerthreadfactory.XXXForkJoinWorkerThreadFactory

-Djdk.tls.ephemeralDHKeySize=2048

-Djava.protocol.handler.pkgs=org.apache.catalina.webresources

-Dorg.apache.catalina.security.SecurityListener.UMASK=0027

-Dignore.endorsed.dirs=

-Djava.util.logging.config.file=/opt/tomcat/conf/logging.properties

-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager

-Dcatalina.base=/opt/tomcat

-Dcatalina.home=/opt/tomcat

-Djava.io.tmpdir=/opt/tomcat/temp

  1. Docker (memory greater than 6 GB):

JDK 1.8, G1

(Memory reserved for the system: 25% of the available system memory)

(JVM heap size: available system memory minus the memory reserved for the system)

(ParallelGCThreads, ConcGCThreads, -Djava.util.concurrent.ForkJoinPool.common.parallelism, and -XX:CICompilerCount are set automatically from the number of CPUs in the container flavor)

(Xms, XX:MinHeapFreeRatio, and XX:MaxHeapFreeRatio are adjusted dynamically based on historical resource utilization)

Note: Xmn is not set

-Xms{ Dynamic adjustment }

-Xmx{JVM Heap size }

-Xss256k

-XX:MetaspaceSize=128m

-XX:MaxMetaspaceSize=256M

-XX:CICompilerCount={ Dynamic adjustment }

-XX:+PrintGC

-XX:+PrintGCDateStamps

-XX:+PrintGCDetails

-XX:+HeapDumpOnOutOfMemoryError

-XX:+UseG1GC

-XX:ParallelGCThreads={ Dynamic adjustment }

-XX:ConcGCThreads={ Dynamic adjustment }

-XX:MaxGCPauseMillis=200

-XX:ParallelGCThreads=1

-XX:-OmitStackTraceInFastThrow

-XX:HeapDumpPath={PATH}

-XX:+UseGCLogFileRotation

-XX:NumberOfGCLogFiles=5

-XX:GCLogFileSize=32M

-XX:MinHeapFreeRatio={ Dynamic adjustment }

-XX:MaxHeapFreeRatio={ Dynamic adjustment }

-Djava.util.concurrent.ForkJoinPool.common.parallelism={ Dynamic adjustment }

-Dport.http.server=${SERVER_HTTP_PORT}

-Dlog.server=${SERVER_LOG}

-Dport.shutdown.server=${SERVER_SHUTDOWN_PORT}

-Ddocbase.server=${DOC_BASE}

-Dvdir.server=${SERVER_VDIR}

-Djava.security.egd=file:/dev/./urandom

-Djava.rmi.server.hostname=${SERVER_IP}

-Dcom.sun.management.jmxremote

-Dcom.sun.management.jmxremote.port=${SERVER_JMX_PORT}

-Dcom.sun.management.jmxremote.rmi.port=${SERVER_JMX_PORT}

-Dcom.sun.management.jmxremote.authenticate=false

-Dcom.sun.management.jmxremote.ssl=false

-DAPPLOGDIR=${APPLOGDIR}

  1. Docker (memory between 2 GB and 4 GB, inclusive):

JDK 1.8, G1

(Memory reserved for the system: 800 MB)

(JVM heap size: available system memory minus the memory reserved for the system)

(ParallelGCThreads, ConcGCThreads, -Djava.util.concurrent.ForkJoinPool.common.parallelism, and -XX:CICompilerCount are set automatically from the number of CPUs in the container flavor)

(Xms, XX:MinHeapFreeRatio, and XX:MaxHeapFreeRatio are adjusted dynamically based on historical resource utilization)

Note: Xmn is not set

-Xms{ Dynamic adjustment }

-Xmx{JVM Heap size }

-Xss256k

-XX:MetaspaceSize=128m

-XX:MaxMetaspaceSize=256M

-XX:CICompilerCount={ Dynamic adjustment }

-XX:+PrintGC

-XX:+PrintGCDateStamps

-XX:+PrintGCDetails

-XX:+HeapDumpOnOutOfMemoryError

-XX:+UseG1GC

-XX:ParallelGCThreads={ Dynamic adjustment }

-XX:ConcGCThreads={ Dynamic adjustment }

-XX:MaxGCPauseMillis=200

-XX:ParallelGCThreads=1

-XX:-OmitStackTraceInFastThrow

-XX:HeapDumpPath={PATH}

-XX:+UseGCLogFileRotation

-XX:NumberOfGCLogFiles=5

-XX:GCLogFileSize=32M

-XX:MinHeapFreeRatio={ Dynamic adjustment }

-XX:MaxHeapFreeRatio={ Dynamic adjustment }

-Djava.util.concurrent.ForkJoinPool.common.parallelism={ Dynamic adjustment }

-Dport.http.server=${SERVER_HTTP_PORT}

-Dlog.server=${SERVER_LOG}

-Dport.shutdown.server=${SERVER_SHUTDOWN_PORT}

-Ddocbase.server=${DOC_BASE}

-Dvdir.server=${SERVER_VDIR}

-Djava.security.egd=file:/dev/./urandom

-Djava.rmi.server.hostname=${SERVER_IP}

-Dcom.sun.management.jmxremote

-Dcom.sun.management.jmxremote.port=${SERVER_JMX_PORT}

-Dcom.sun.management.jmxremote.rmi.port=${SERVER_JMX_PORT}

-Dcom.sun.management.jmxremote.authenticate=false

-Dcom.sun.management.jmxremote.ssl=false

-DAPPLOGDIR=${APPLOGDIR}

  1. Docker (memory 1 GB):

JDK 1.8, G1

(ParallelGCThreads, ConcGCThreads, -Djava.util.concurrent.ForkJoinPool.common.parallelism, and -XX:CICompilerCount are set automatically from the number of CPUs in the container flavor)

(Xms, XX:MinHeapFreeRatio, and XX:MaxHeapFreeRatio are adjusted dynamically based on historical resource utilization)

Note: Xmn is not set

-Xms{ Dynamic adjustment }

-Xmx650m

-Xss256k

-XX:MetaspaceSize=128m

-XX:MaxMetaspaceSize=128m

-XX:CICompilerCount={ Dynamic adjustment }

-XX:+PrintGC

-XX:+PrintGCDateStamps

-XX:+PrintGCDetails

-XX:+HeapDumpOnOutOfMemoryError

-XX:+UseG1GC

-XX:ParallelGCThreads={ Dynamic adjustment }

-XX:ConcGCThreads={ Dynamic adjustment }

-XX:MaxGCPauseMillis=200

-XX:-OmitStackTraceInFastThrow

-XX:HeapDumpPath={PATH}

-XX:+UseGCLogFileRotation

-XX:NumberOfGCLogFiles=5

-XX:GCLogFileSize=32M

-XX:MinHeapFreeRatio={ Dynamic adjustment }

-XX:MaxHeapFreeRatio={ Dynamic adjustment }

-Djava.util.concurrent.ForkJoinPool.common.parallelism={ Dynamic adjustment }

-Dport.http.server=${SERVER_HTTP_PORT}

-Dlog.server=${SERVER_LOG}

-Dport.shutdown.server=${SERVER_SHUTDOWN_PORT}

-Ddocbase.server=${DOC_BASE}

-Dvdir.server=${SERVER_VDIR}

-Djava.security.egd=file:/dev/./urandom

-Djava.rmi.server.hostname=${SERVER_IP}

-Dcom.sun.management.jmxremote

-Dcom.sun.management.jmxremote.port=${SERVER_JMX_PORT}

-Dcom.sun.management.jmxremote.rmi.port=${SERVER_JMX_PORT}

-Dcom.sun.management.jmxremote.authenticate=false

-Dcom.sun.management.jmxremote.ssl=false

-DAPPLOGDIR=${APPLOGDIR}

  1. Static heap mode

Assume the container memory requested by the user is M. Then xmx and xms are sized as follows:

if M == 1GB:
    xmx = (1GB - 374MB)
    xms = 0.6 * xmx
elif M <= 4GB && M >= 2GB:
    xmx = (M - 800MB)
    xms = 0.6 * xmx
else:
    xmx = M * 0.75
    xms = 0.6 * xmx
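The sizing rule above can be written as a small Java method. This is a direct transcription under two assumptions: sizes are expressed in whole megabytes with 1 GB = 1024 MB, and fractions are rounded down. The class name `StaticHeapSizing` is made up for illustration.

```java
/** Computes -Xmx and -Xms (in MB) from the container memory, following the static heap rule. */
final class StaticHeapSizing {
    /** Returns {xmxMb, xmsMb} for a container with containerMb megabytes of memory. */
    static long[] size(long containerMb) {
        long xmxMb;
        if (containerMb == 1024) {
            xmxMb = 1024 - 374;            // 1 GB container: fixed 374 MB reserve
        } else if (containerMb >= 2048 && containerMb <= 4096) {
            xmxMb = containerMb - 800;     // 2 GB to 4 GB: reserve 800 MB for the system
        } else {
            xmxMb = containerMb * 3 / 4;   // otherwise: heap is 75% of container memory
        }
        long xmsMb = xmxMb * 6 / 10;       // initial heap is 60% of the maximum, rounded down
        return new long[] {xmxMb, xmsMb};
    }

    public static void main(String[] args) {
        long[] heap = size(8192);
        System.out.println("-Xmx" + heap[0] + "m -Xms" + heap[1] + "m");
    }
}
```

For a 1 GB container the method yields 650 MB, which matches the -Xmx650m in the 1 GB Docker template, and for an 8 GB container it yields 6144 MB and 3686 MB, which matches the -Xmx6144m and -Xms3686m in the 4-core, 8 GB example.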

Original site

Copyright notice
This article was written by [leowang5566]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/188/202207070143405917.html