Use GCC's PGO (profile guided optimization) to optimize the entire system
2022-06-12 17:08:00 【Tianya road Linux】
Inspiration
The idea comes from an experiment Microsoft ran on its own servers. Without further ado, here are the charts:
Microsoft's approach
As you can see, the PGO-optimized kernel does show a measurable performance improvement.
Zero. Prerequisites
To compile with PGO, your gcc needs to have PGO support enabled. On Gentoo, set the pgo USE flag and rebuild gcc:
sudo vim /etc/portage/make.conf:
USE="pgo"
sudo emerge gcc
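Before going further, it can be worth confirming that the compiler really emits and consumes profile data. The following is a minimal sketch, not part of the original procedure: the function name check_pgo_toolchain and the use of the CC environment variable to pick the compiler are my own assumptions. It builds a trivial program with -fprofile-generate, runs it, checks that a .gcda file appeared, and rebuilds with -fprofile-use.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if "${CC:-gcc}" can write and read
# profile data for a trivial program, non-zero otherwise.
check_pgo_toolchain() {
    cc=${CC:-gcc}
    tmp=$(mktemp -d) || return 1
    printf 'int main(void) { return 0; }\n' > "$tmp/t.c"
    (
        cd "$tmp" || exit 1
        # Instrumented build, then one run to produce the profile.
        "$cc" -O2 -fprofile-generate -o t t.c || exit 1
        ./t || exit 1
        # The run should have left at least one .gcda file behind.
        ls ./*.gcda >/dev/null 2>&1 || exit 1
        # Optimized rebuild that consumes the profile.
        "$cc" -O2 -fprofile-use -o t t.c
    )
    status=$?
    rm -rf "$tmp"
    return "$status"
}
```

After sourcing the file, running check_pgo_toolchain && echo "PGO works" gives a quick yes/no answer for the default gcc.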
One. Optimizing the kernel
cd /usr/src/linux
sudo make clean
sudo make menuconfig:
CONFIG_DEBUG_FS=y
CONFIG_GCOV_KERNEL=y
CONFIG_GCOV_PROFILE_ALL=y
sudo make KCFLAGS="-fprofile-dir=/kernel-pgo/"
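It is easy to leave menuconfig without one of the three options actually enabled, so a check of the resulting .config before starting the long build may save a wasted cycle. This is only a sketch: the function name check_gcov_config is hypothetical, and the default path assumes the kernel tree lives in /usr/src/linux as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: print every gcov-related option that is not set
# to "y" in the given kernel .config. Returns 0 only if all are enabled.
check_gcov_config() {
    config=${1:-/usr/src/linux/.config}
    missing=0
    for opt in CONFIG_DEBUG_FS CONFIG_GCOV_KERNEL CONFIG_GCOV_PROFILE_ALL; do
        if ! grep -q "^${opt}=y\$" "$config"; then
            echo "not enabled: $opt"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_gcov_config with no argument inspects /usr/src/linux/.config; any other path can be passed as the first argument.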
Finally, install the kernel and update GRUB
sudo make install
sudo grub-mkconfig -o /boot/grub/grub.cfg
Restart the system
sudo reboot
Then use the system on the new kernel for a while. Open all kinds of software: browser, mpv, cmus, office suite, compilers, downloads, games, Steam, and so on (I run through every program and scenario from my daily use), so that the kernel collects enough profile data (gcov data).
# Note: a kernel built with CONFIG_DEBUG_FS=y and CONFIG_GCOV_KERNEL=y is noticeably slower, but this is required to collect the profile data used for PGO.
The kernel's PGO profile data is stored under /sys/kernel/debug/gcov/kernel-pgo/ as many small files with names like "#usr#src#linux#arch#x86#crypto#aesni-intel_glue.gcda". Copy it to the root directory:
sudo cp -r /sys/kernel/debug/gcov/kernel-pgo/ /
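If the plain cp leaves you with empty files, a stream-based copy is an alternative. Files under debugfs are pseudo-files and may not report a meaningful size, which is why the kernel's own gcov documentation gathers them with cat instead of cp. The sketch below follows that idea. The function name gather_gcda and its defaults are my own assumptions, and the destination must be an absolute path.

```shell
#!/bin/sh
# Hypothetical helper: copy every .gcda file under SRC into DEST,
# keeping the directory layout. DEST must be an absolute path.
# Files are streamed with cat because pseudo-files may report size 0.
gather_gcda() {
    src=${1:-/sys/kernel/debug/gcov/kernel-pgo}
    dest=${2:-/kernel-pgo}
    (
        cd "$src" || exit 1
        # Recreate the directory tree first (this also creates DEST itself).
        find . -type d -exec sh -c 'mkdir -p "$2/$1"' sh {} "$dest" \;
        # Then stream each profile file into place.
        find . -name '*.gcda' -exec sh -c 'cat < "$1" > "$2/$1"' sh {} "$dest" \;
    )
}
```

Run as root, gather_gcda with no arguments would collect from /sys/kernel/debug/gcov/kernel-pgo into /kernel-pgo, matching the paths used above.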
cd /usr/src/linux
sudo make clean
sudo make menuconfig:
CONFIG_DEBUG_FS=n
CONFIG_GCOV_KERNEL=n
CONFIG_GCOV_PROFILE_ALL=n
sudo make KCFLAGS="-fprofile-use -fprofile-dir=/kernel-pgo/ -fprofile-correction -Wno-coverage-mismatch -Wno-error=coverage-mismatch"
Finally, install the kernel and update GRUB
sudo make install
sudo grub-mkconfig -o /boot/grub/grub.cfg
Restart the system
sudo reboot
Now you can see how the PGO-optimized kernel performs. Launch a game and check the FPS: is it higher than with the original kernel?
Addendum: building the kernel with Clang LTO
Since kernel 5.12, the kernel can be built with LTO, but only with the clang + LLVM toolchain; gcc is not supported. So first install clang, llvm, and the lld linker. The steps are simple:
sudo make LLVM=1 LLVM_IAS=1 menuconfig:
CONFIG_DEBUG_FS=n
CONFIG_GCOV_KERNEL=n
CONFIG_GCOV_PROFILE_ALL=n
CONFIG_LTO_CLANG_FULL=y
then
sudo make LLVM=1 LLVM_IAS=1
That's all.
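Since LLVM=1 switches the whole build over to the LLVM tools, a missing binary only shows up as a build failure partway through. A small pre-flight check can list what is absent from PATH. This is a sketch under assumptions: the function name missing_llvm_tools is hypothetical, and the tool list is a subset I chose for illustration, not the complete set the kernel build uses.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each LLVM tool that cannot be
# found in PATH. Prints nothing when all listed tools are available.
missing_llvm_tools() {
    for tool in clang ld.lld llvm-ar llvm-nm llvm-objcopy; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

An empty result from missing_llvm_tools means the listed tools were all found; anything it prints should be installed before running make LLVM=1.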
# Note: kernel builds with clang currently do not support gcov-based optimization, so gcc PGO and clang LTO are mutually exclusive. You can only pick one of the two.
If the stock kernel doesn't satisfy you, you can also try building the XanMod kernel yourself. Compared with the stock kernel it carries many updates and optimizations (for example, -O3 compilation):
https://www.xanmod.org/
# Click "tarball" on the web page to download the XanMod kernel source package.
Two. PGO for all software on the system
First, turn off two security features of the Portage build system
sudo vim /etc/portage/make.conf:
FEATURES="-sandbox -usersandbox"
Then add the following gcc compile flags
sudo vim /etc/portage/make.conf:
COMMON_FLAGS="<your existing optimization flags> -fprofile-generate -fprofile-dir=/portage-pgo/"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
FCFLAGS="${COMMON_FLAGS}"
FFLAGS="${COMMON_FLAGS}"
LDFLAGS="${COMMON_FLAGS} -Wl,-O3 -Wl,--as-needed -Wl,--hash-style=gnu -Wl,--sort-common -Wl,--strip-all -ljemalloc -Wl,-ljemalloc"
EMERGE_DEFAULT_OPTS="--with-bdeps=y --ask --deep --verbose=y --load-average --keep-going"
sudo mkdir /portage-pgo/
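One thing to watch: instrumented programs write their .gcda files when they exit, as whichever user ran them. A directory created by root with default permissions would then be unwritable for ordinary users. The sketch below assumes that is a problem on your system and makes the directory world-writable with the sticky bit, the same mode /tmp uses. The function name make_profile_dir is hypothetical, and whether mode 1777 is acceptable is a security trade-off you should judge yourself.

```shell
#!/bin/sh
# Hypothetical helper: create the profile directory and make it writable
# by every user, with the sticky bit set (mode 1777, like /tmp).
# Assumption: programs run by non-root users must store profiles here.
make_profile_dir() {
    dir=${1:-/portage-pgo}
    mkdir -p "$dir" && chmod 1777 "$dir"
}
```

Run as root, make_profile_dir with no argument applies this to /portage-pgo.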
Now start compiling the entire system
sudo emerge -e @world
Restart the system after compilation
sudo reboot
Then use the newly built system for a while. Open all kinds of software: browser, mpv, cmus, office suite, compilers, downloads, games, Steam, and so on (I run through every program and scenario from my daily use; 1-2 days is recommended), so that enough comprehensive profile data (gcov data) is collected. Each program's PGO profile data is stored under /portage-pgo/.
# Note: programs built with -fprofile-generate are noticeably slower, but this is required to collect the profile data used for PGO.
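Before committing to the second full rebuild, it may help to confirm that profiles were actually written. The sketch below simply counts .gcda files under the profile directory and fails if there are none. The function name profile_summary is hypothetical, and a non-zero count says nothing about how representative the profiles are.

```shell
#!/bin/sh
# Hypothetical helper: report how many .gcda files exist under DIR.
# Returns non-zero when no profile data was found at all.
profile_summary() {
    dir=${1:-/portage-pgo}
    count=$(find "$dir" -type f -name '*.gcda' | wc -l | tr -d ' ')
    echo "gcda files: $count"
    [ "$count" -gt 0 ]
}
```

If profile_summary reports zero files after a day or two of use, the instrumented programs are probably unable to write to the profile directory.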
Restart the system
sudo reboot
Then change the gcc compile flags
sudo vim /etc/portage/make.conf:
COMMON_FLAGS="<your existing optimization flags> -fprofile-use -fprofile-dir=/portage-pgo/ -fprofile-correction -Wno-error=missing-profile"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
FCFLAGS="${COMMON_FLAGS}"
FFLAGS="${COMMON_FLAGS}"
LDFLAGS="${COMMON_FLAGS} -Wl,-O3 -Wl,--as-needed -Wl,--hash-style=gnu -Wl,--sort-common -Wl,--strip-all -ljemalloc -Wl,-ljemalloc"
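The two make.conf variants differ only in the PGO-specific part of COMMON_FLAGS, so it can be handy to keep that part in one place. The sketch below prints the flags for either phase, using exactly the flags from this article. The function name pgo_flags and the phase names generate and use are my own. As far as I know make.conf does not support command substitution, so the output is meant to be pasted in by hand.

```shell
#!/bin/sh
# Hypothetical helper: print the PGO flags for one of the two phases.
#   pgo_flags generate [dir]   flags for the instrumented build
#   pgo_flags use [dir]        flags for the profile-guided rebuild
pgo_flags() {
    phase=$1
    dir=${2:-/portage-pgo/}
    case $phase in
        generate)
            echo "-fprofile-generate -fprofile-dir=$dir"
            ;;
        use)
            echo "-fprofile-use -fprofile-dir=$dir -fprofile-correction -Wno-error=missing-profile"
            ;;
        *)
            echo "usage: pgo_flags generate|use [dir]" >&2
            return 2
            ;;
    esac
}
```

For example, pgo_flags use prints the flags to append to your existing optimization flags in COMMON_FLAGS for the second rebuild.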
Rebuild the entire system, this time applying the PGO profiles (with PGO in effect the build is much faster, so it won't take as long as last time)
sudo emerge -e @world
Restart the system after compilation
sudo reboot
Now you can enjoy a Gentoo system fully compiled with PGO. Does it deliver peak performance?
Source: Use GCC's PGO (Profile-guided Optimization) to optimize the whole system - Zhihu