当前位置:网站首页>Address mapping of virtual memory paging mechanism
Address mapping of virtual memory paging mechanism
2022-06-24 21:12:00 【The smell of tobacco】
summary
Previous articles Virtual memory The paging mechanism is briefly introduced . There is another question , That is how to map the logical address in the virtual memory to the physical address ? Let's make a brief analysis today .
For a paged address , It usually contains two elements :
- Page number : Page
- Offset : The number of bytes of the current page
The following addr_virtual(p, o) Represents a logical address , With addr_real(p, o) Represents a physical address ( The physical address is also paged ).
A page table
The first step is to think about , If you want to find the corresponding physical address according to a logical address , Then the corresponding relationship must be stored somewhere , Because mapping is irregular . What data structure should be used to store ?
Because in paging , page Is a minimum unit , Therefore, we only need the mapping relationship of page numbers , Logical and physical addresses have the same page size , The offset is exactly the same .
according to key seek value, This is not a map Well . Look at this again map Of key, Page numbers are all numbers , And it's sequential . This is just an array . beautiful , It's an array .
in other words , This page table is indexed by logical page numbers , One dimensional array with physical page number as value . The process of address translation is as follows :

The elements in the page table do not just store physical page numbers , Some additional flag bits are also stored , Used to identify the properties of the current page , A few simple examples :
- Existential bit : Whether it is currently loaded into memory . If it is not loaded, the operating system is required to load
- Modify bit : Whether the current page has been modified in memory . If modified , When reclaiming physical memory, you need to write it to the hard disk
- Kernel permissions : Whether the current page needs kernel mode to access
- Whether to read bits
- Whether bits can be written
- Is it executable
- wait
Because each process has its own virtual memory , So each process needs its own page table .
In order to improve operation efficiency , This translation process is completed by hardware , Both CPU Memory management unit in mmu To complete .
Is it easy to see ? good , Introduction completed , This concludes the article .
Problems and solutions
ha-ha , To make fun of . How could it be so easy to end . Now let's briefly analyze the problems of this simple model . According to the experience of the algorithm , Most algorithms are implemented , Or the time complexity is too high , Or the space complexity is too high .
Time issues
Imagine the steps to access a memory :
- Look up the page table to find the corresponding physical address
- Access physical address
The operation of looking up the page table is also a memory access . in other words , CPU Each memory access requires an additional memory access . The execution time is doubled directly .
Solution
The solution is that we have now used rotten : cache . Memory to CPU There are already L1L2 cache , stay mmu There is also a cache of page tables in TLB. The steps for each address translation are as follows ( Ignore missing pages ):
- see
TLBWhether there is a cache in , If there is direct access - if
TLBDoes not exist in the , Get from the memory page table and putTLB
TLB The premise of existence , Is that the access of the program is local . finally , It is the locality of the program that saves us .
Space problem
Let's simply calculate how much space is needed to store this page table .
stay 32 position CPU in , The accessible logical address space is 4G. Suppose the page size is : 4kb, Then the total number of pages is :
4G / 4kb = (2^32) / (2^12) = 2^20 = 1mb
suppose , Each element of the page table needs 4 Bytes , Then the total space required is : 4mb. Each process just needs to store the page table 4mb. And it's still 32 position , If it is 64 Who? ? You can calculate it , The result is exaggerated .
Solution
Learn from the idea of memory paging , We divide memory into n A page , You can load it on demand . Again , You can also divide a large page table into n A small page table , Then you can partially load , Both Multi level page table
Explain with the simplest two-level page table , The virtual memory is roughly divided as follows :

The structure of the page table is roughly as follows :

Be careful , At this time, the page number in the logical address stores two contents :
- Index of the first level page table
- Index of secondary page table
Why does multi-level page table solve the problem of space ? Again, according to the local principle of the program , Most of the corresponding values in the first level page table are empty , Most of the secondary page tables are not loaded into memory .
Now let's calculate again , still 32 position CPU, The size of the page is still 4kb, The element size in the page is still 4 byte . At this point, assume that each element of the first level page table is responsible for 4mb Space . Then the total number of pages occupied by the first level page table is : 4G / 4mb = (2^32) / (2^22) = 2^10. The space occupied by the first level page table is : (2^10) * (2^2)=4kb
The total number of pages in each secondary page table is : 4mb / 4kb = (2^22) / 2(12) = 2^10 = 1024, Occupancy space : (2^10) * (2^2) = 4kb
Only the first level page table is resident in memory , The second level page table only needs to load part of it . The space drops directly .
however , It brings a new problem , Now get a physical address , Need to access memory twice , This is slower than before ? Don't forget what just happened TLB, With this layer of cache , Most of the visits are in mmu It's going on inside . Again , The locality principle of the program saved us .
Multi level page table , Expand the secondary page table further , You can get a multi-level page table , I won't repeat .
The locality of the program
Know how the address is mapped , How does it help us to write programs in ordinary times ?
Page conversion is based on the locality of the program , So when we write code , Try to ensure that what you write is local , for instance :
int main() {
int i, j;
int arr[1024][1024];
// The first way
for(i = 0; i < 1024; i++){
for(j = 0; j < 1024; j++){
global_arr[i][j] = 0;
}
}
// The second way
for(j = 0; j < 1024; j++){
for(i = 0; i < 1024; i++){
global_arr[i][j] = 0;
}
}
}
The purpose of the above code is very simple , Give me a 1024*1024 Initializes a two-dimensional array of . Can you see the difference between the two methods ?
Traversal methods are different , Mode one Is a line by line traversal , Mode two Is a column by column traversal .
We know , Two dimensional arrays are stored sequentially in memory . in other words , A two-dimensional array : [[1, 2, 3], [4, 5, 6]], The storage order in memory is : 1, 2, 3, 4, 5, 6.
And our array , Each row 1024 individual int Elements , Is precisely 4kb The size of a page .
therefore , Mode one The order in which pages are accessed is : page1, page1 ... page1024, page1024, Per page access 1024 Next time , Switch to the next page , Co occurring 1024 Next page switching
and , Mode two The order in which pages are accessed is : page1, page2...page1024 ... page1, page2...page1024, Visit each page in turn , Per page access 1024 Time , Co occurring 1024*1024 Next page switching
High performance, lower judgment , Mode one More in line with the principle of locality , Mode two Your visit is too jumping .
Of course , Now when the memory is large , Everything is loaded into memory , meanwhile TLB Cached all page mappings , There is no difference between the two methods . however :
- if
TLBCapacity is insufficient , The new cache will eliminate the old cache , Frequent access to different pages will cause more cache invalidations - If memory capacity is insufficient , Writing a new page will weed out the old page , Frequent access to different pages will result in more memory swapping in and out .
Words alone are no proof
Of course , Words alone are no proof , In order to have an intuitive feeling about the switching mechanism of the above page , We go through getrusage Function to get the page switching information of the program . The code is as follows :
#include <stdio.h>
#include <sys/resource.h>
const int M = 1024;
// Increase the size of the column , To make the effect obvious . 10mb
const int N = 1024*10;
// Because the stack size is limited , Therefore, the variable is promoted to global , Put it in the pile
int global_arr[1024][1024*10];
int main() {
int i, j;
// The first way
for(i = 0; i < M; i++){
for(j = 0; j < N; j++){
global_arr[i][j] = 0;
}
}
// The second way
// for(j = 0; j < N; j++){
// for(i = 0; i < M; i++){
// global_arr[i][j] = 0;
// }
// }
struct rusage usage;
getrusage(RUSAGE_SELF, &usage);
printf(" Page recycle times : %ld\n", usage.ru_minflt);
printf(" Number of page breaks : %ld\n", usage.ru_majflt);
}
It is still relatively simple for computers to run such a small program , It won't make any difference , Therefore, the memory of the process should be limited . I am through restriction docker Available memory :
docker run -it -m 6m --memory-swap -1 debian bash
good , Everything is available. , Let's see the results :
Mode one

Mode two

You can see , Mode one Want to compare Mode two Is much better .
so , Procedures with high performance requirements , When you have no optimization direction , Locality may help you .
Link to the original text : https://hujingnb.com/archives/698
边栏推荐
- Typescript syntax
- opds sql组件能不能将流程参数通过上下文传给下一个组件
- After idea installs these plug-ins, the code can be written to heaven. My little sister also has to arrange it
- maptalks:数据归一化处理与分层设色图层加载
- 伯克利、MIT、剑桥、DeepMind等业内大佬线上讲座:迈向安全可靠可控的AI
- 传统的IO存在什么问题?为什么引入零拷贝的?
- Procedural life: a few things you should know when entering the workplace
- More than ten years' work experience is recommended at the bottom of the box: how much does it cost to find a job? See here! Brothers and sisters are recommended to collect and pay attention
- Analysis of errors in JSON conversion using objectmapper
- Sleep revolution - find the right length of rest
猜你喜欢

Variable setting in postman

Agency mode -- Jiangnan leather shoes factory

Common data model (updating)

Curl command

OSI notes sorting

Postman assertion

Combination mode -- stock speculation has been cut into leeks? Come and try this investment strategy!

Mapstacks: data normalization and layered color layer loading

Power apps Guide

More than ten years' work experience is recommended at the bottom of the box: how much does it cost to find a job? See here! Brothers and sisters are recommended to collect and pay attention
随机推荐
Web automation: summary of special scenario processing methods
Common self realization functions in C language development
go_ keyword
Vant component used in wechat applet
Berkeley, MIT, Cambridge, deepmind and other industry leaders' online lectures: towards safe, reliable and controllable AI
[multi thread performance tuning] multi thread lock optimization (Part 1): optimization method of synchronized synchronization lock
Golang daily question
Agency mode -- Jiangnan leather shoes factory
科创人·味多美CIO胡博:数字化是不流血的革命,正确答案藏在业务的田间地头
Adding subscribers to a list using mailchimp's API V3
Basic database syntax learning
Builder mode -- Master asked me to refine pills
Mapstacks: data normalization and layered color layer loading
Format method and parse method of dateformat class
Hongxiang Yunteng is compatible with dragon lizard operating system, and the product runs stably
Learn to use a new technology quickly
VIM usage
JMeter parameterization
Undo log and redo log must be clear this time
Intermediary model -- collaboration among departments