Performance evaluation report of YoMo codec - Y3
2020-11-09 19:12:00 【Cella】
Introduction to YoMo
YoMo is an open-source real-time edge computing gateway, development framework, and microservice platform. Its communication layer is based on the QUIC protocol (updated to Draft-31 on 2020-09-25), which makes better use of next-generation low-latency networks such as 5G. yomo-codec, a codec designed for streaming computing, can greatly improve the throughput of computing services. With its plugin-based development model, you can bring a real-time edge computing system for the Internet of Things online in 5 minutes. YoMo is currently deployed in the industrial Internet field.
Official website: https://yomo.run
Introduction to YoMo Codec
yomo-codec-golang is a Go implementation of the YoMo Codec SPEC. It encodes and decodes TLV structures and basic data types, and gives YoMo the encoding and decoding tools it needs for message processing. It can be extended to handle more data types, and it can also be applied to other frameworks that need encoding and decoding.
Project introduction: README.md
Why YoMo-Codec?
As is well known, JSON is the usual message codec over HTTP: its format is simple, it is easy to read and write, and it is supported in many languages, so it is very popular in Internet applications. Why, then, develop YoMo Codec to support YoMo applications?
- YoMo processes messages as a stream and extracts only the monitored key-value pairs for the business logic. With a JSON codec, the complete packet must be received before it can be deserialized into an object, and only then can the corresponding key-value pair be extracted. YoMo Codec instead describes object data as a set of TLV structures. While decoding a packet, it learns early whether the current T (tag) is the monitored key, and if it is not, it jumps straight to the next TLV structure. Packets that are not monitored are never decoded unnecessarily, which improves decoding efficiency.
- JSON decoding usually relies heavily on reflection, which hurts its performance. YoMo Codec decodes only the key-value pairs that are actually monitored, so it uses far less reflection.
- In the industrial Internet, and in network applications with tight computing resources, the same codec operation should consume as little CPU as possible so that the limited computing resources can be used more fully.
This performance test verifies that YoMo Codec decodes data faster than JSON and consumes fewer resources, and therefore gives YoMo more real-time, efficient, low-overhead message processing.
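The skip-ahead behaviour described above can be sketched in a few lines of Go. This is only an illustration under stated assumptions: the layout used here (one tag byte, one length byte, then the value) is a simplified TLV invented for the example, not the Y3 wire format, and `encodeTLV` and `takeValue` are hypothetical helpers rather than functions from yomo-codec-golang.

```go
package main

import (
	"errors"
	"fmt"
)

// encodeTLV appends one tag-length-value entry to buf.
// Assumed layout for this sketch only: 1 byte tag, 1 byte length, value bytes.
func encodeTLV(buf []byte, tag byte, value []byte) []byte {
	buf = append(buf, tag, byte(len(value)))
	return append(buf, value...)
}

// takeValue walks the packet entry by entry and returns the value of the
// first entry whose tag equals want. Entries with any other tag are skipped
// by jumping over their length, so their values are never decoded.
func takeValue(want byte, packet []byte) ([]byte, error) {
	for i := 0; i < len(packet); {
		if i+2 > len(packet) {
			return nil, errors.New("truncated header")
		}
		tag, length := packet[i], int(packet[i+1])
		start, end := i+2, i+2+length
		if end > len(packet) {
			return nil, errors.New("truncated value")
		}
		if tag == want {
			return packet[start:end], nil
		}
		i = end // not the monitored key: skip the whole entry
	}
	return nil, errors.New("tag not found")
}

func main() {
	// Build a packet with 63 entries, tags 0x01..0x3f.
	var packet []byte
	for k := 1; k <= 63; k++ {
		packet = encodeTLV(packet, byte(k), []byte(fmt.Sprintf("v%d", k)))
	}
	v, err := takeValue(0x20, packet)
	fmt.Println(string(v), err) // prints "v32 <nil>"
}
```

The decoder only reads two header bytes per skipped entry, which is the property the text credits for the performance gain; a real codec would also need multi-byte lengths and nested structures.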
Test description
1. Test method
- The tests use Go benchmarks in two modes, serial and parallel. The parallel mode shows performance when CPU resources are fully used.
- The test packets are generated by a program, which guarantees that the Codec and JSON test data contain exactly the same key-value pairs.
- The test data is grouped by the number of key-value pairs it contains (3, 16, 32, and 63 pairs) to observe how the pair count affects decoding performance. The monitored key is the middle key of each group; for example, K08 means that the 8th key is monitored. This gives the following dimensions, which are used in the result charts:

| Symbol | Number of key-value pairs | Monitored key position |
| --- | --- | --- |
| C63-K32 | 63 pairs | Extract the value of the 32nd key |
| C32-K16 | 32 pairs | Extract the value of the 16th key |
| C16-K08 | 16 pairs | Extract the value of the 8th key |
| C03-K02 | 3 pairs | Extract the value of the 2nd key |

- The test results include:
  - A performance comparison of decoding a packet and extracting the value of the monitored key.
  - A comparison of the CPU time used in the same decode-and-extract scenario.
2. Data structure
- Y3 test data:
    0x80 0x01 value .... 0x3f value
- Structure of the JSON test data:
    { "k1": value, ... "k63": value }
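For comparison, the JSON side of the test can be pictured with the standard library alone. The sketch below builds an object with keys `k1` to `kN` and extracts one key by unmarshalling the whole object first. The names `genJSON` and `takeFromJSON` and the placeholder values are assumptions made for this example; the project's own generator and `decoder.TakeValueFromJson` may work differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// genJSON builds a JSON object with keys k1..kn.
// The values are placeholders for this sketch, not the project's test data.
func genJSON(n int) ([]byte, error) {
	m := make(map[string]string, n)
	for i := 1; i <= n; i++ {
		m[fmt.Sprintf("k%d", i)] = fmt.Sprintf("v%d", i)
	}
	return json.Marshal(m)
}

// takeFromJSON decodes the entire object and only then looks up one key.
// Every key-value pair is parsed even though a single value is wanted.
func takeFromJSON(key string, data []byte) (string, bool) {
	var m map[string]string
	if err := json.Unmarshal(data, &m); err != nil {
		return "", false
	}
	v, ok := m[key]
	return v, ok
}

func main() {
	data, err := genJSON(63)
	if err != nil {
		panic(err)
	}
	v, ok := takeFromJSON("k32", data)
	fmt.Println(v, ok) // prints "v32 true"
}
```

The cost of this approach grows with the number of pairs in the object, regardless of which key is monitored.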
3. Data processing logic

4. Test project
- The code for this test report is available in the yomo-y3-stress-testing project.
- Main code structure (only the files directly related to this test are listed):

5. Test environment
- Hardware:
  - CPU: 2.6 GHz 6-core Intel Core i7, GOMAXPROCS=12
  - Memory: 32 GB
  - Disk: SSD
- Software:
  - macOS Catalina
  - go version go1.14.1 darwin/amd64
  - yomo-y3-stress-testing
Benchmark test
1. Serial test process
- Code under test: ./internal/decoder/report_serial/report_benchmark_test.go, for example:

    // Benchmark for YoMo Codec Y3
    func Benchmark_Codec_C63_K32(b *testing.B) {
        var key byte = 0x20
        data := generator.NewCodecTestData().GenDataBy(63)
        b.ResetTimer() // exclude data generation from the measurement
        for i := 0; i < b.N; i++ {
            if decoder.TakeValueFromCodec(key, data) == nil {
                panic(errors.New("take is failure"))
            }
        }
    }

    // Benchmark for JSON
    func Benchmark_Json_C63_K32(b *testing.B) {
        key := "k32"
        data := generator.NewJsonTestData().GenDataBy(63)
        data = append(data, decoder.TokenEnd)
        b.ResetTimer() // exclude data generation from the measurement
        for i := 0; i < b.N; i++ {
            if decoder.TakeValueFromJson(key, data) == nil {
                panic(errors.New("take is failure"))
            }
        }
    }

  - Benchmark_Codec_C63_K32: a serial benchmark that extracts the value of the 32nd key from a data set of 63 key-value pairs.
  - Default: GOMAXPROCS=12
- Test script: ./internal/decoder/report_serial/report_benchmark_test.sh

    temp_file="../../../docs/temp.out"
    report_file="../../../docs/report.out"
    go test -bench=. -benchtime=3s -benchmem -run=none | grep Benchmark > ${temp_file} \
      && echo 'finished bench' \
      && cat ${temp_file} \
      && cat ${temp_file} | awk '{print $1,$3}' | awk -F "_" '{print $2,$3"-"substr($4,1,3),substr($4,7)}' | awk -v OFS=, '{print $1,$2,$3}' > ${report_file} \
      && echo 'finished analyse' \
      && cat ${report_file}

  The script runs the benchmarks in report_benchmark_test.go, generates the result set, and saves it to ./docs/report.out.
- Generate the result chart: ./docs/report_graphics.ipynb

    python --version  # Python version > 3.2.x
    pip install runipy
    bar_ylim=70000 barh_xlim=20 runipy ./report_graphics.ipynb
2. Parallel test process
To maximize CPU utilization and observe the decoder's performance in a multi-core environment, parallel test items were added.
- Code under test: ./internal/decoder/report_parallel/report_benchmark_test.go, for example:

    func Benchmark_Codec_C63_K32(b *testing.B) {
        var key byte = 0x20
        data := generator.NewCodecTestData().GenDataBy(63)
        b.ResetTimer()
        // RunParallel distributes the b.N iterations across goroutines
        b.RunParallel(func(pb *testing.PB) {
            for pb.Next() {
                if decoder.TakeValueFromCodec(key, data) == nil {
                    panic(errors.New("take is failure"))
                }
            }
        })
    }

  - The body is the same as in the serial test; the difference is that RunParallel is used to run it in parallel.
  - Default: GOMAXPROCS=12
- Test script: ./internal/decoder/report_parallel/report_benchmark_test.sh
  The script generates the result set and saves it to ./docs/report.out.
- Generate the result chart:

    bar_ylim=18000 barh_xlim=25 runipy ./report_graphics.ipynb
3. Test results
- Serial benchmark results:
  - Time per decode-and-extract operation: Figure 3.1
  - How many times longer JSON takes than Y3: Figure 3.2
  - Chart notes:
    - Axis labels in Figure 3.1: C63-K32 indicates that the packet contains 63 key-value pairs and that the value of the 32nd key is monitored and extracted.
    - Y axis of Figure 3.1: the number of nanoseconds taken by a single operation.
    - X axis of Figure 3.2: the ratio (JSON decoding time / Y3 decoding time). For example: 43010/2077≈20.7
- Parallel benchmark results:
  - Time per decode-and-extract operation: Figure 3.3
  - How many times longer JSON takes than Y3: Figure 3.4
4. Analysis
The test results above show that:
- Y3 decodes much faster than JSON, and the more key-value pairs the packet contains, the larger the gain. On average Y3 is about 10 times faster: (20.7+15.8+6.2+3.3)/4=11.5
- Decoding in parallel on multiple cores also improves ns/op considerably. The parallel runs are roughly 3 times faster than the serial ones:

| | C63-K32 | C32-K16 | C16-K08 | C03-K02 |
| --- | --- | --- | --- | --- |
| Serial test (ns/op) | 2077 | 1361 | 1667 | 610 |
| Parallel test (ns/op) | 706 | 505 | 515 | 175 |
| Serial time / parallel time (approx.) | 290% | 260% | 320% | 350% |
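The averages and speedups quoted in this analysis can be re-derived from the published numbers. The following Go sketch does the arithmetic; `speedup` and `mean` are helper names introduced here for illustration, and the inputs are the ns/op values and ratios reported above.

```go
package main

import "fmt"

// speedup returns how many times faster the parallel run is,
// given the serial and parallel ns/op values.
func speedup(serialNs, parallelNs float64) float64 {
	return serialNs / parallelNs
}

// mean returns the arithmetic mean of xs (0 for an empty slice).
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

func main() {
	labels := []string{"C63-K32", "C32-K16", "C16-K08", "C03-K02"}
	serial := []float64{2077, 1361, 1667, 610}
	parallel := []float64{706, 505, 515, 175}

	// Parallel versus serial speedup for each scenario.
	for i, label := range labels {
		fmt.Printf("%s: %.2fx\n", label, speedup(serial[i], parallel[i]))
	}

	// Average of the JSON/Y3 time ratios reported for the serial run.
	fmt.Printf("mean JSON/Y3 ratio: %.1f\n", mean([]float64{20.7, 15.8, 6.2, 3.3}))
}
```

The speedups come out between roughly 2.7x and 3.5x, which matches the "about 3 times" summary, and the mean ratio is 11.5.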
CPU resource analysis
1. Test process
- Code under test: ./cpu/cpu_pprof.go

    func main() {
        dataCodec := generator.NewCodecTestData().GenDataBy(63)
        dataJson := generator.NewJsonTestData().GenDataBy(63)
        dataJson = append(dataJson, decoder.TokenEnd)

        // pprof
        fmt.Printf("start pprof\n")
        go pprof.Run()
        time.Sleep(5 * time.Second)

        fmt.Printf("start testing...\n")
        for {
            if decoder.TakeValueFromCodec(0x20, dataCodec) == nil {
                panic(errors.New("take is failure"))
            }
            if decoder.TakeValueFromJson("k32", dataJson) == nil {
                panic(errors.New("take is failure"))
            }
        }
    }

  - pprof.Run(): starts pprof
- The program decodes Y3 and JSON in an endless loop. The sampling graph of the CPU profile then shows the share of CPU resources that each decoder consumes.
- Run the test:

    # Run the observed code; pprof listens on port 6060 by default
    go run ./cpu_pprof.go
    # Take samples and view the analysis graph on port 8081
    go tool pprof -http=":8081" http://localhost:6060/debug/cpu/profile
2. Test results

3. Analysis
As the figure above shows, decoding with YoMo Codec Y3 uses far less CPU than decoding with JSON, by a factor of more than 10 (0.73/0.07=10.4). This agrees with the benchmark results: CPU usage is lower and decoding is faster at the same time.
Test conclusion
Y3 improves decoding performance by an order of magnitude over JSON, and the more keys the packet contains, the more pronounced the gain. Y3 also reduces CPU usage by an order of magnitude. This performance test verifies that YoMo Codec Y3 can provide real-time, efficient, low-overhead message processing for YoMo and for other scenarios that need high-performance decoding.
Copyright notice
This article was written by [Cella]. Please include a link to the original when reposting. Thank you.