Performance evaluation report of YoMo codec - Y3
2020-11-09 19:12:00 【Cella】
Introduction to YoMo
YoMo is an open-source real-time edge computing gateway, development framework, and microservice platform. Its communication layer is built on the QUIC protocol (updated to Draft-31 on 2020-09-25), which helps unlock the value of 5G and other next-generation low-latency networks. yomo-codec, a codec designed for streaming computing, can greatly improve the throughput of computing services. With its plugin-based development model, you can bring a real-time edge computing system for the Internet of Things online in about 5 minutes. YoMo is already deployed in the industrial Internet field.
Official website : https://yomo.run
Introduction to YoMo Codec
yomo-codec-golang is a Go implementation of the YoMo Codec SPEC. It provides encoding and decoding for TLV structures and basic data types, and gives YoMo the codec tooling that backs its message processing. It can be extended to handle more data types, and even applied to other frameworks that need encoding and decoding.
Project introduction: README.md
Why YoMo-Codec?
In HTTP communication, JSON is the usual message codec: its format is simple, it is easy to read and write, and it is supported in many languages, so it is very popular in Internet applications. Why, then, develop a dedicated YoMo Codec for YoMo applications?
- YoMo processes messages as a stream and runs business logic on the key-value pairs it monitors. With a JSON codec, the complete packet must be received before it can be deserialized into an object and the relevant key-value pair extracted. YoMo Codec instead describes object data as a set of TLV structures. While decoding a packet, it learns early from the current T (tag) whether this is a monitored key and, if not, jumps straight to the next TLV structure. Non-monitored data is never decoded unnecessarily, which improves decoding efficiency.
- JSON decoding usually relies heavily on reflection, which hurts performance. YoMo Codec decodes only the key-value pairs that are actually monitored, so it uses far less reflection.
- In the industrial Internet and other network applications with tight computing budgets, the same decoding work must cost fewer CPU resources so that limited computing capacity can be used more fully.
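The skip-ahead behaviour described in the first point can be sketched in Go. This is a minimal illustration and not the Y3 wire format: it assumes a simplified layout of a 1-byte tag, a 1-byte length, and then the value bytes, and the names `encode` and `takeValue` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// encode appends one simplified TLV record: tag, length, value.
// Assumption: 1-byte tag and 1-byte length; the real Y3 encoding differs.
func encode(buf []byte, tag byte, value []byte) []byte {
	buf = append(buf, tag, byte(len(value)))
	return append(buf, value...)
}

// takeValue scans records in order and returns the value of the first
// record whose tag matches. Non-matching records are skipped by length,
// so their values are never decoded.
func takeValue(want byte, data []byte) ([]byte, error) {
	for pos := 0; pos+2 <= len(data); {
		tag, length := data[pos], int(data[pos+1])
		start := pos + 2
		end := start + length
		if end > len(data) {
			return nil, errors.New("truncated record")
		}
		if tag == want {
			return data[start:end], nil
		}
		pos = end // jump straight to the next record
	}
	return nil, errors.New("tag not found")
}

func main() {
	var data []byte
	for i := 1; i <= 63; i++ {
		data = encode(data, byte(i), []byte(fmt.Sprintf("v%d", i)))
	}
	v, err := takeValue(0x20, data)
	fmt.Println(string(v), err)
}
```

With 63 records, looking up tag 0x20 touches only the two header bytes of each of the first 31 records before returning the 32nd value.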
This performance test aims to verify that YoMo Codec decodes data faster than JSON and uses fewer resources, and so gives YoMo more real-time, efficient, low-overhead message processing.
Test description
1. Test method
- The tests use Go benchmarks in two modes, serial and parallel. The parallel mode measures performance when CPU resources are fully used.
- The test packets are generated by a program, which guarantees that the Codec and JSON tests use data with exactly the same key-value pairs.
- The test data comes in groups of 3, 16, 32, and 63 key-value pairs, to observe how the number of pairs affects decoding performance. The monitored key sits at about the middle of each group; for example, K08 means the 8th key is monitored. This gives the following dimensions, which appear in the result charts:

  Symbol    Number of key-value pairs   Monitored key position
  C63-K32   63 pairs                    extract the value of the 32nd key
  C32-K16   32 pairs                    extract the value of the 16th key
  C16-K08   16 pairs                    extract the value of the 8th key
  C03-K02   3 pairs                     extract the value of the 2nd key

- The test results include:
  - A performance comparison of decoding a packet and extracting the value of the monitored key.
  - A comparison of the CPU time used in the same decode-and-extract scenario.
2. Data structure
- Y3 test data:
  0x80 0x01 value .... 0x3f value
- JSON test data:
  { "k1": value, ... "k63": value }
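A generator for the JSON side of the test data might look like the sketch below. The function names `genJSON` and `countKeys` and the string values are illustrative assumptions; the project's own generator package may build its data differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// genJSON builds {"k1":"v1",...,"kN":"vN"} with keys in numeric order.
// It writes the object by hand because encoding/json sorts map keys
// lexically (k1, k10, k11, ...), which would change the key positions.
func genJSON(n int) []byte {
	var sb strings.Builder
	sb.WriteByte('{')
	for i := 1; i <= n; i++ {
		if i > 1 {
			sb.WriteByte(',')
		}
		fmt.Fprintf(&sb, "%q:%q", fmt.Sprintf("k%d", i), fmt.Sprintf("v%d", i))
	}
	sb.WriteByte('}')
	return []byte(sb.String())
}

// countKeys decodes the payload and reports how many keys it holds.
func countKeys(data []byte) (int, error) {
	var m map[string]string
	if err := json.Unmarshal(data, &m); err != nil {
		return 0, err
	}
	return len(m), nil
}

func main() {
	data := genJSON(3)
	n, err := countKeys(data)
	fmt.Println(string(data), n, err)
}
```

Keeping the keys in numeric order matters here because the test dimensions (for example C63-K32) refer to the position of the monitored key inside the packet.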
3. Data processing logic
4. Test project
- The code for this test report is available in the yomo-y3-stress-testing project.
- Main code structure (only files directly related to this test are listed):
5. Test environment
- Hardware environment :
- CPU: 2.6 GHz 6-core Intel Core i7, GOMAXPROCS=12
- Memory :32GB
- Hard disk :SSD
- Software environment :
- macOS Catalina
- go version go1.14.1 darwin/amd64
- yomo-y3-stress-testing
Benchmark test
1. Serial test process
- The code under test: ./internal/decoder/report_serial/report_benchmark_test.go, for example:

    // Benchmark for YoMo Codec Y3
    func Benchmark_Codec_C63_K32(b *testing.B) {
        var key byte = 0x20
        data := generator.NewCodecTestData().GenDataBy(63)
        b.ResetTimer() // exclude data generation from the measurement
        for i := 0; i < b.N; i++ {
            if decoder.TakeValueFromCodec(key, data) == nil {
                panic(errors.New("take is failure"))
            }
        }
    }

    // Benchmark for JSON
    func Benchmark_Json_C63_K32(b *testing.B) {
        key := "k32"
        data := generator.NewJsonTestData().GenDataBy(63)
        data = append(data, decoder.TokenEnd)
        b.ResetTimer() // exclude data generation from the measurement
        for i := 0; i < b.N; i++ {
            if decoder.TakeValueFromJson(key, data) == nil {
                panic(errors.New("take is failure"))
            }
        }
    }

  - Benchmark_Codec_C63_K32: a serial benchmark that extracts the value of the 32nd key from a data set of 63 key-value pairs.
  - Default: GOMAXPROCS=12
- Start the test script: ./internal/decoder/report_serial/report_benchmark_test.sh

    temp_file="../../../docs/temp.out"
    report_file="../../../docs/report.out"
    go test -bench=. -benchtime=3s -benchmem -run=none | grep Benchmark > ${temp_file} \
      && echo 'finished bench' \
      && cat ${temp_file} \
      && cat ${temp_file} | awk '{print $1,$3}' | awk -F "_" '{print $2,$3"-"substr($4,1,3),substr($4,7)}' | awk -v OFS=, '{print $1,$2,$3}' > ${report_file} \
      && echo 'finished analyse' \
      && cat ${report_file}

  The script runs the benchmarks in report_benchmark_test.go and saves the result set to ./docs/report.out.
- Generate the result chart: ./docs/report_graphics.ipynb

    python --version   # Python version > 3.2.x
    pip install runipy
    bar_ylim=70000 barh_xlim=20 runipy ./report_graphics.ipynb
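The awk pipeline in the serial test script turns each benchmark output line into a `codec,scenario,ns/op` record. The Go sketch below performs the same transformation on one line. The function name `toRecord` and the iteration count in the sample line are illustrative; only the 2077 ns/op figure comes from this report.

```go
package main

import (
	"fmt"
	"strings"
)

// toRecord converts a line such as
//
//	Benchmark_Codec_C63_K32-12   1700000   2077 ns/op
//
// into "Codec,C63-K32,2077". It returns false for lines that do not
// have the expected shape.
func toRecord(line string) (string, bool) {
	fields := strings.Fields(line)
	if len(fields) < 3 {
		return "", false
	}
	parts := strings.Split(fields[0], "_")
	if len(parts) != 4 || parts[0] != "Benchmark" {
		return "", false
	}
	// Drop the "-12" GOMAXPROCS suffix that go test appends to the name.
	key := parts[3]
	if i := strings.Index(key, "-"); i >= 0 {
		key = key[:i]
	}
	return parts[1] + "," + parts[2] + "-" + key + "," + fields[2], true
}

func main() {
	rec, ok := toRecord("Benchmark_Codec_C63_K32-12   1700000   2077 ns/op")
	fmt.Println(rec, ok)
}
```

Unlike the awk version, which relies on fixed character offsets (`substr($4,1,3)`, `substr($4,7)`), this sketch splits on the `-` suffix, so it does not depend on the key name being exactly three characters long.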
2. Parallel testing process
To maximize CPU utilization and observe decoder performance in a multi-core environment, parallel test cases were added.
- The code under test: ./internal/decoder/report_parallel/report_benchmark_test.go, for example:

    func Benchmark_Codec_C63_K32(b *testing.B) {
        var key byte = 0x20
        data := generator.NewCodecTestData().GenDataBy(63)
        b.ResetTimer()
        // Spread the b.N iterations across goroutines
        b.RunParallel(func(pb *testing.PB) {
            for pb.Next() {
                if decoder.TakeValueFromCodec(key, data) == nil {
                    panic(errors.New("take is failure"))
                }
            }
        })
    }

  - The body is the same as in the serial test; the difference is that RunParallel is used to run it in parallel.
  - Default: GOMAXPROCS=12
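Conceptually, `b.RunParallel` starts a group of goroutines (GOMAXPROCS of them by default) and distributes the benchmark's `b.N` iterations among them. The sketch below imitates that with a shared atomic counter. It is a simplification for illustration, not the testing package's implementation, and `runParallel` is a hypothetical name.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// runParallel runs body n times in total, spread over `workers` goroutines.
// Each goroutine claims the next iteration from a shared counter until
// all n iterations are taken. It returns the number of calls made.
func runParallel(n int64, workers int, body func()) int64 {
	var next, done int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for atomic.AddInt64(&next, 1) <= n {
				body()
				atomic.AddInt64(&done, 1)
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&done)
}

func main() {
	calls := runParallel(1000, runtime.GOMAXPROCS(0), func() {})
	fmt.Println(calls)
}
```

Because the total number of iterations is fixed and the work is shared, the reported ns/op in a parallel benchmark reflects wall-clock time divided by all iterations, which is why it drops when more cores are busy.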
- Start the test script: ./internal/decoder/report_parallel/report_benchmark_test.sh
  The result set is saved to ./docs/report.out.
- Generate the result chart:

    bar_ylim=18000 barh_xlim=25 runipy ./report_graphics.ipynb
3. Test results
- Serial benchmark results:
  - Time per decode-and-extract operation: Figure 3.1
  - Ratio of JSON to Y3 decoding time: Figure 3.2
  - Chart notes:
    - Figure 3.1, X axis: C63-K32 means the packet contains 63 key-value pairs and the value of the 32nd key is monitored and extracted.
    - Figure 3.1, Y axis: nanoseconds per operation.
    - Figure 3.2, X axis: the ratio (JSON decoding time / Y3 decoding time), for example 43010/2077=20.7.
- Parallel benchmark results:
  - Time per decode-and-extract operation: Figure 3.3
  - Ratio of JSON to Y3 decoding time: Figure 3.4
4. Analysis
The test results above show that:
- Y3 decodes much faster than JSON, and the more key-value pairs a packet contains, the larger the gain. The average speedup is more than 10x: (20.7+15.8+6.2+3.3)/4=11.5.
- Parallel decoding on multiple cores also improves ns/op substantially, by roughly 3x over serial decoding:

                            C63-K32   C32-K16   C16-K08   C03-K02
  Serial test (ns/op)       2077      1361      1667      610
  Parallel test (ns/op)     706       505       515       175
  Ratio (serial/parallel)   290%      260%      320%      350%
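The serial-to-parallel ratios and the average speedup quoted in this analysis can be rechecked with a few lines of Go. The figures are copied from this report; the helper names `ratio` and `mean` are illustrative.

```go
package main

import "fmt"

// ratio returns how many times larger a is than b.
func ratio(a, b float64) float64 { return a / b }

// mean returns the arithmetic mean of xs.
func mean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

func main() {
	names := []string{"C63-K32", "C32-K16", "C16-K08", "C03-K02"}
	serial := []float64{2077, 1361, 1667, 610} // ns/op
	parallel := []float64{706, 505, 515, 175}  // ns/op
	for i, name := range names {
		fmt.Printf("%s serial/parallel = %.2f\n", name, ratio(serial[i], parallel[i]))
	}
	// JSON/Y3 ratios reported for the serial benchmarks.
	fmt.Printf("mean JSON/Y3 ratio = %.1f\n", mean([]float64{20.7, 15.8, 6.2, 3.3}))
}
```

The serial/parallel ratios come out between about 2.7 and 3.5, in line with the rounded percentages reported for the four scenarios.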
CPU resource analysis
1. Test process
- The code under test: ./cpu/cpu_pprof.go

    func main() {
        dataCodec := generator.NewCodecTestData().GenDataBy(63)
        dataJson := generator.NewJsonTestData().GenDataBy(63)
        dataJson = append(dataJson, decoder.TokenEnd)

        // pprof
        fmt.Printf("start pprof\n")
        go pprof.Run()
        time.Sleep(5 * time.Second)

        fmt.Printf("start testing...\n")
        // Decode with both codecs in an endless loop so the profiler
        // can sample their relative CPU cost
        for {
            if decoder.TakeValueFromCodec(0x20, dataCodec) == nil {
                panic(errors.New("take is failure"))
            }
            if decoder.TakeValueFromJson("k32", dataJson) == nil {
                panic(errors.New("take is failure"))
            }
        }
    }

  - pprof.Run(): starts pprof.
  - The program runs Y3 and JSON decoding in an endless loop; the sampled CPU profile shows the share of CPU resources each one uses.
- Run the test:

    # Run the code under observation; pprof listens on port 6060 by default
    go run ./cpu_pprof.go
    # Collect samples and view the analysis graph on port 8081
    go tool pprof -http=":8081" http://localhost:6060/debug/cpu/profile
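The report does not show what `pprof.Run()` does. One plausible shape, using Go's standard `net/http/pprof` package, is sketched below. This is an assumption for illustration only: the standard handlers live under `/debug/pprof/`, so the path may not match the URL used in the command above, and `newPprofMux` and `probe` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newPprofMux registers the standard profiling handlers on a fresh mux.
func newPprofMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	return mux
}

// probe sends a GET request to the handler in memory and returns the
// HTTP status code, without opening a network port.
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newPprofMux()
	fmt.Println(probe(mux, "/debug/pprof/"))
	// To serve profiles for `go tool pprof`, a program would run, e.g.:
	//   go http.ListenAndServe("localhost:6060", mux)
}
```

Registering the handlers on a dedicated mux, rather than importing `net/http/pprof` for its side effect on the default mux, keeps the profiling endpoints separate from any application routes.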
2. Test results
3. Analysis
As the figure above shows, decoding with YoMo Codec Y3 uses far less CPU than JSON, a difference of more than 10x (0.73/0.07=10.4). This matches the benchmark results: CPU usage is lower and decoding is faster at the same time.
Test conclusion
Y3 improves decoding performance over JSON by about an order of magnitude, and the gain grows with the number of keys in a packet. Y3 also reduces CPU usage by about an order of magnitude. This performance test verifies that YoMo Codec Y3 can provide real-time, efficient, low-overhead message processing for YoMo and for other scenarios that need high-performance decoding.
Copyright notice
This article was written by [Cella]. Please include a link to the original when reposting. Thank you.