A simple perf workflow for profiling multithreaded programs
2022-07-04 04:04:00 【BanFS】
Background knowledge
Perf is a tool for software performance analysis. Through perf, applications can use the PMU, tracepoints, and special counters in the kernel to collect performance statistics. Perf can analyze performance problems in applications (per thread) as well as in the kernel, and it handles all performance-related events: hardware events that occur while the program runs, such as instructions retired and processor clock cycles, and software events, such as page faults and process switches.
Perf works by sampling the monitored target. The simplest case is sampling on the tick interrupt: each tick interrupt triggers a sampling point, and perf records the program's current context at that point. If a program spends 90% of its time in func1(), then about 90% of the sampling points should fall within func1(). The longer the sampling runs, the more reliable this inference becomes. Using perf typically requires administrator privileges.
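The sampling argument above can be sketched with a small deterministic model. This is only an illustration, not how perf is implemented: the 10-tick cycle, the 7-tick sampling period, and the names tick_is_in_func1, func1_share, and print_estimates are all assumptions made up for this sketch.

```c
#include <stdio.h>

/* Hypothetical model: the program repeats a 10-tick cycle in which
 * ticks 0..8 run inside func1() and tick 9 runs elsewhere (90% / 10%). */
static int tick_is_in_func1(long tick)
{
    return tick % 10 != 9;
}

/* Take one sample every `period` ticks over `total_ticks` ticks and
 * return the fraction of samples that landed in func1(). */
double func1_share(long total_ticks, long period)
{
    long samples = 0, hits = 0;
    for (long t = 0; t < total_ticks; t += period) {
        ++samples;
        if (tick_is_in_func1(t))
            ++hits;
    }
    return samples ? (double)hits / (double)samples : 0.0;
}

/* Print the estimate for a short and a long sampling window. */
void print_estimates(void)
{
    printf("short window: %.3f\n", func1_share(21, 7));
    printf("long window:  %.3f\n", func1_share(7000, 7));
}
```

Under this model, the 21-tick window takes only three samples and all of them land in func1(), so the estimate is 100%. The 7000-tick window takes 1000 samples and arrives at the true 90%.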
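Profiling a multithreaded program with perf
Preparing a multithreaded program
This program creates two threads. thread1 calls func1() directly and thread2 calls it through func2(), so the two threads run func1() a different number of times. Build it with: gcc main.c -lpthread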
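#include <pthread.h>
#include <stdio.h>
#include <string.h>

pthread_t thread[2];

/* Busy loop: counts to 10000. */
void func1(void) {
    int i = 0;
    while (i < 10000)
        ++i;
}

/* Short doubling loop, then a call to func1(). */
void func2(void) {
    int i = 1;  /* must start at 1: doubling 0 would never reach 10000 */
    while (i < 10000)
        i = i * 2;
    func1();
}

/* Thread 1: calls func1() forever. */
void *thread1(void *arg)
{
    (void)arg;
    for (;;)
        func1();
    return NULL;  /* not reached */
}

/* Thread 2: calls func2() forever, which in turn calls func1(). */
void *thread2(void *arg)
{
    (void)arg;
    for (;;)
        func2();
    return NULL;  /* not reached */
}

/* Create both worker threads and report whether creation succeeded. */
void thread_create(void)
{
    memset(&thread, 0, sizeof(thread));
    if (pthread_create(&thread[0], NULL, thread1, NULL) != 0)
        printf("Failed to create thread 1!\n");
    else
        printf("Thread 1 created\n");
    if (pthread_create(&thread[1], NULL, thread2, NULL) != 0)
        printf("Failed to create thread 2\n");
    else
        printf("Thread 2 created\n");
}

int main(void)
{
    thread_create();
    /* The workers never return, so these joins block until the process is killed. */
    pthread_join(thread[0], NULL);
    printf("Thread 1 joined\n");
    pthread_join(thread[1], NULL);
    printf("Thread 2 joined\n");
    return 0;
}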
Sampling the program with perf
For a program that has not been started yet
/home/banfushen/perf_cpu/multi_thread# perf record -h
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
...
-F, --freq <n> profile at this frequency
-g enables call-graph recording
-p, --pid <pid> record events on existing process id
-t, --tid <tid> record events on existing thread id
...
Run perf record -g -F 99 ./a.out to sample the multithreaded program at a frequency of 99 Hz. Quoting the perf Examples reference: "-F 99: sample at 99 Hertz (samples per second). I'll sometimes sample faster than this (up to 999 Hertz), but that also costs overhead. 99 Hertz should be negligible. Also, the value '99' and not '100' is to avoid lockstep sampling, which can produce skewed results." Because the sample program never exits, stop the recording with Ctrl-C. perf then writes a perf.data file to the current directory. Turning it into a flame graph requires an additional tool.
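The lockstep effect mentioned in the -F 99 note can be illustrated with a toy model. Everything here is an assumption for the sake of the sketch: a periodic task that fires 100 times per second, a handler that takes about 10% of each period, and the names handler_share and print_lockstep_demo. It does not reproduce perf's actual timer behavior.

```c
#include <stdio.h>

/* Hypothetical workload: a periodic task fires 100 times per second.
 * Time is counted in units of 1/9900 s, so one task period is 99 units,
 * and the task's handler runs during the first 10 units of each period. */
enum { TASK_PERIOD = 99, HANDLER_UNITS = 10 };

/* Sample every `sample_period` units, `samples` times, and return the
 * fraction of samples that landed inside the periodic handler. */
double handler_share(long sample_period, long samples)
{
    long hits = 0;
    for (long k = 0; k < samples; ++k) {
        long phase = (k * sample_period) % TASK_PERIOD;
        if (phase < HANDLER_UNITS)
            ++hits;
    }
    return samples ? (double)hits / (double)samples : 0.0;
}

void print_lockstep_demo(void)
{
    /* 100 Hz sampling: period 99 units, same as the task (lockstep). */
    printf("100 Hz: %.3f\n", handler_share(99, 990));
    /* 99 Hz sampling: period 100 units, drifts across the task period. */
    printf(" 99 Hz: %.3f\n", handler_share(100, 990));
}
```

In this model the handler really occupies 10/99 of the time, about 10.1%. Sampling at 100 Hz always hits the same phase of the task and reports 100%, while sampling at 99 Hz drifts through every phase and reports about 10.1%.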
For a program that is already running
For a program that is already running, find its PID, then run perf record -g -F 99 -p <pid>.
Downloading the FlameGraph tool
git clone https://github.com/brendangregg/FlameGraph.git
~/perf_cpu/FlameGraph$ pwd
/home/banfushen/perf_cpu/FlameGraph
Generating the flame graph
Generate a flame graph from perf.data (recorded as above, it covers the whole process):
perf script | /home/banfushen/perf_cpu/FlameGraph/stackcollapse-perf.pl | /home/banfushen/perf_cpu/FlameGraph/flamegraph.pl > output.svg
![Flame graph for the whole process](/img/9a/4b8b620164dfd1697b1bbf03044e51.jpg)
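The middle stage of that pipeline turns each sampled call chain into one "folded" line (frames joined by semicolons from outermost to innermost, followed by a sample count), which is the format flamegraph.pl consumes. The sketch below shows that transformation for a single stack. It is not the stackcollapse-perf.pl implementation: fold_stack and print_folded_example are hypothetical names, and the frames and count are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Build one "folded" line: comm;outermost;...;innermost count
 * `frames_leaf_first` lists frames innermost first, the order in which
 * perf script prints a call chain. Returns 0 on success, -1 if `out`
 * is too small. */
int fold_stack(const char *comm, const char *const frames_leaf_first[],
               size_t nframes, unsigned long count,
               char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%s", comm);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    size_t used = strlen(out);

    /* Walk from the outermost frame (last entry) to the innermost. */
    for (size_t i = nframes; i-- > 0; ) {
        n = snprintf(out + used, out_size - used, ";%s", frames_leaf_first[i]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, out_size - used, " %lu", count);
    if (n < 0 || (size_t)n >= out_size - used)
        return -1;
    return 0;
}

void print_folded_example(void)
{
    /* Illustrative frames only; real stacks also contain libc frames. */
    const char *const frames[] = { "func1", "func2", "thread2" };
    char line[128];
    if (fold_stack("a.out", frames, 3, 42, line, sizeof line) == 0)
        puts(line);   /* a.out;thread2;func2;func1 42 */
}
```

Identical folded lines are summed, and the width of each box in the flame graph is proportional to the resulting count.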
Generating a flame graph for a single thread
This requires knowing the thread ID (TID).
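One way to learn the thread IDs is to have each thread print its own. The following is a minimal, Linux-only sketch. current_tid and log_thread_ids are hypothetical helper names, not part of the original program, and the idea is to call log_thread_ids at the top of thread1() and thread2().

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the kernel thread ID of the calling thread (Linux only).
 * This is the kind of ID that perf's --tid option takes. */
pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Hypothetical helper: call it first thing in thread1() / thread2(). */
void log_thread_ids(const char *label)
{
    printf("%s: pid=%ld tid=%ld\n", label, (long)getpid(), (long)current_tid());
    fflush(stdout);
}
```

In the main thread the TID equals the PID. Each thread created with pthread_create gets its own TID, and that is the value to pass to --tid.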
/home/banfushen/perf_cpu/multi_thread# perf script -h
Usage: perf script [<options>]
or: perf script [<options>] record <script> [<record-options>] <command>
or: perf script [<options>] report <script> [script-args]
or: perf script [<options>] <script> [<record-options>] <command>
or: perf script [<options>] <top-script> [script-args]
...
-v, --verbose be more verbose (show symbol address, etc)
--pid <pid[,pid...]>
--tid <tid[,tid...]>
only consider symbols in these tids
Use perf script -v --tid <tid> to select a specific thread:
perf script -v --tid 2283471 | /home/banfushen/perf_cpu/FlameGraph/stackcollapse-perf.pl | /home/banfushen/perf_cpu/FlameGraph/flamegraph.pl > output1.svg
![Flame graph for a single thread](/img/3a/efd6efcab3be126d9b2bea028c1f43.jpg)
Generating a flame graph for multiple threads
Use perf script -v --tid <tid[,tid...]> to select several threads:
perf script -v --tid 2283472,2283471 | /home/banfushen/perf_cpu/FlameGraph/stackcollapse-perf.pl | /home/banfushen/perf_cpu/FlameGraph/flamegraph.pl > output3.svg
![Flame graph for multiple threads](/img/e6/c3c01067801a09ddc443ddb161ab0d.jpg)
References:
perf Examples
perf performance analysis
A brief analysis of the performance analysis tool perf
Using perf to analyze Linux applications
An introduction to Perf, the Linux performance analysis tool