Drill down to protobuf - Introduction
2022-06-13 08:59:00 【xiongpursuit88】
Previously, the technologies I most often used for network communication, general data exchange, and similar scenarios were JSON and XML. In recent development work, I came into contact with Google's ProtoBuf.
After studying ProtoBuf through the relevant materials and reading its source code, I found that its efficiency and compatibility are excellent. In future technology selection, especially for scenarios such as network communication and general data exchange, ProtoBuf should be the preferred choice.
While learning ProtoBuf I translated the main official documents: first, to learn ProtoBuf itself; second, to practice reading English documentation; and third, because Google's documents? They don't exist!
After reading these documents you should have a considerable understanding of ProtoBuf.
For the translated documents, see [Index] Article index, under Translation - Technology - ProtoBuf official documents.
However, the official documents serve more as an authoritative reference; reading them does not mean you will immediately understand the underlying principles.
This article and the next few will introduce the principles of ProtoBuf encoding, serialization, deserialization, and reflection in detail, while trying to explain them in an accessible way.
What is ProtoBuf
Let's first look at the definition and description given in the official documentation:
Protocol buffers are a language-neutral, platform-neutral, extensible way of serializing structured data for use in communication protocols, data storage, and more.
Protocol Buffers are a flexible, efficient, automated mechanism for serializing structured data: think XML, but smaller (3 to 10 times), faster (20 to 100 times), and simpler.
You define the structure of your data once, then use specially generated source code to easily write and read that structured data to and from a variety of data streams, in a variety of languages. You can even update the data structure without breaking deployed programs that were compiled against the old structure.
Simply put, ProtoBuf is a method for serializing [1] structured data that can loosely be compared to XML [2], with the following characteristics:
- Language-neutral and platform-neutral: ProtoBuf supports multiple languages, such as Java, C++, and Python, and multiple platforms
- Efficient: smaller (3 to 10 times), faster (20 to 100 times), and simpler than XML
- Extensible, with good compatibility: you can update the data structure without affecting or breaking existing programs
Serialization [1]: converting structured data or an object into a format that can be stored and transmitted (for example, over a network), while ensuring that the serialized result can later (possibly in another computing environment) be reconstructed into the original structured data or object.
For a more detailed introduction, see Wikipedia.
Analogy to XML [2]: this refers mainly to serialization in data communication and data storage scenarios; I believe that XML, as an extensible markup language, is still essentially different from ProtoBuf.
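To make the definition of serialization in [1] more concrete, here is a minimal hand-rolled sketch in C++. It is not how ProtoBuf encodes data: the Greeting struct, the serialize and deserialize functions, and the fixed little-endian layout are all assumptions made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical structured data, used only for this illustration.
struct Greeting {
    std::uint32_t id;
    std::string text;
};

// Serialize: 4 bytes of id (little-endian), 4 bytes of text length,
// then the raw text bytes.
std::string serialize(const Greeting& g) {
    assert(g.text.size() <= UINT32_MAX);  // length must fit in 4 bytes
    std::string out;
    auto put_u32 = [&out](std::uint32_t v) {
        for (int i = 0; i < 4; ++i) {
            out.push_back(static_cast<char>((v >> (8 * i)) & 0xFF));
        }
    };
    put_u32(g.id);
    put_u32(static_cast<std::uint32_t>(g.text.size()));
    out += g.text;
    return out;
}

// Deserialize: rebuild the original struct from the byte string.
// Throws if the input is too short to hold what it claims to hold.
Greeting deserialize(const std::string& bytes) {
    auto get_u32 = [&bytes](std::size_t pos) {
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i) {
            unsigned char b = static_cast<unsigned char>(bytes.at(pos + i));
            v |= static_cast<std::uint32_t>(b) << (8 * i);
        }
        return v;
    };
    Greeting g;
    g.id = get_u32(0);
    std::uint32_t len = get_u32(4);
    if (bytes.size() < 8 + static_cast<std::size_t>(len)) {
        throw std::runtime_error("truncated input");
    }
    g.text = bytes.substr(8, len);
    return g;
}
```

Calling deserialize(serialize(g)) for a Greeting g gives back a struct with the same id and text, which is the round-trip property the definition asks for.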
Using ProtoBuf
Now that we have a basic understanding of ProtoBuf's concepts, let's see how to use it.
Step 1: create a .proto file and define the data structure, as shown in Example 1:
// Example 1: the Example1 message, defined in xxx.proto
syntax = "proto2";

message Example1 {
    optional string stringVal = 1;
    optional bytes bytesVal = 2;
    message EmbeddedMessage {
        // proto2 requires a field rule (here: optional) on every field
        optional int32 int32Val = 1;
        optional string stringVal = 2;
    }
    optional EmbeddedMessage embeddedExample1 = 3;
    repeated int32 repeatedInt32Val = 4;
    repeated string repeatedStringVal = 5;
}
In the example above we defined a message named Example1. The syntax is simple: the message keyword followed by the message name:
message xxx {
}
We then defined the message's fields, in the following form:
message xxx {
    // Field rule: required -> the field must appear exactly once
    // Field rule: optional -> the field may appear 0 or 1 times
    // Field rule: repeated -> the field may appear any number of times (including 0)
    // Type: int32, int64, sint32, sint64, string, 32-bit ....
    // Field number: 1 ~ 536870911 (excluding 19000 through 19999)
    field_rule type name = field_number;
}
In the example above, we defined:
- An optional field named stringVal of type string, with field number 1; it may appear 0 or 1 times
- An optional field named bytesVal of type bytes, with field number 2; it may appear 0 or 1 times
- An optional field named embeddedExample1 of type EmbeddedMessage (a custom embedded message type), with field number 3; it may appear 0 or 1 times
- A repeated field named repeatedInt32Val of type int32, with field number 4; it may appear any number of times (including 0)
- A repeated field named repeatedStringVal of type string, with field number 5; it may appear any number of times (including 0)
For more syntax details on defining messages in proto2, such as the supported types, field number assignment, import definitions, and reserved fields, see [Translation] ProtoBuf Official Documents (2) - Language Guide (proto2).
For conventions to follow when writing definitions, see [Translation] ProtoBuf Official Documents (4) - Style Guide.
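The field number rule from the syntax description can be sketched as a small check. This assumes that usable field numbers run from 1 to 536870911 (2^29 - 1) and that 19000 through 19999 are reserved; the constant names and the is_usable_field_number function are made up for this sketch and are not part of any ProtoBuf API.

```cpp
#include <cstdint>

// Limits assumed for this sketch (see the lead-in); names are hypothetical.
const std::int64_t kMinFieldNumber = 1;
const std::int64_t kMaxFieldNumber = 536870911;  // 2^29 - 1
const std::int64_t kReservedFirst = 19000;
const std::int64_t kReservedLast = 19999;

// Returns true if `number` may be used as a field number in a .proto file.
bool is_usable_field_number(std::int64_t number) {
    if (number < kMinFieldNumber || number > kMaxFieldNumber) {
        return false;  // outside the overall range
    }
    // Inside the range, only the reserved block is excluded.
    return number < kReservedFirst || number > kReservedLast;
}
```

For the fields of Example1 (numbers 1 through 5) the check returns true, while 0, 19000, and 536870912 are rejected.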
Step 2: compile the .proto file with protoc to generate the read/write interface
The data structures defined in the .proto file are meant for developers and business logic, not for storage or transmission.
When the data needs to be stored or transmitted, these structures have to be serialized, deserialized, read, and written. How is that done? Don't worry: ProtoBuf provides the corresponding interface code for us, through the protoc compiler.
The interface code can be generated with the following command:
// $SRC_DIR: the source directory containing the .proto file
// --cpp_out: generate C++ code
// $DST_DIR: the target directory for the generated code
// xxx.proto: the .proto file to generate interface code for
protoc -I=$SRC_DIR --cpp_out=$DST_DIR $SRC_DIR/xxx.proto
The resulting code will provide an interface similar to the following :
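As a rough illustration of the shape of that interface, here is a hand-written stand-in for two of the fields of Example1. It is not real protoc output: the class name Example1Sketch and its internals are assumptions for this sketch. The real generated class derives from google::protobuf::Message and also provides serialization methods such as SerializeToOstream(). Only the accessor naming pattern (lowercased field names with has_, set_, clear_, add_ prefixes and a _size suffix) is meant to mirror the generated code.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in; NOT generated by protoc.
class Example1Sketch {
public:
    // optional string stringVal = 1;
    bool has_stringval() const { return has_stringval_; }
    const std::string& stringval() const { return stringval_; }
    void set_stringval(const std::string& value) {
        stringval_ = value;
        has_stringval_ = true;
    }
    void clear_stringval() {
        stringval_.clear();
        has_stringval_ = false;
    }

    // repeated int32 repeatedInt32Val = 4;
    int repeatedint32val_size() const {
        return static_cast<int>(repeatedint32val_.size());
    }
    std::int32_t repeatedint32val(int index) const {
        return repeatedint32val_.at(static_cast<std::size_t>(index));
    }
    void add_repeatedint32val(std::int32_t value) {
        repeatedint32val_.push_back(value);
    }
    void clear_repeatedint32val() { repeatedint32val_.clear(); }

private:
    bool has_stringval_ = false;  // tracks whether the optional field is set
    std::string stringval_;
    std::vector<std::int32_t> repeatedint32val_;
};
```

An optional field gets a presence check plus getter, setter, and clear; a repeated field gets a size, an indexed getter, and an add method. These are the calls used in the test code of step 3.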


Step 3: call the interface to serialize, deserialize, read, and write
For the message defined in Example 1 in step 1, we can call the interface generated in step 2. The test code is as follows:
//
// Created by yue on 18-7-21.
//
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include "single_length_delimited_all.pb.h"

int main() {
    Example1 example1;
    example1.set_stringval("hello,world");
    example1.set_bytesval("are you ok?");

    // set_allocated_* hands ownership of the heap-allocated message to example1.
    Example1_EmbeddedMessage *embeddedExample2 = new Example1_EmbeddedMessage();
    embeddedExample2->set_int32val(1);
    embeddedExample2->set_stringval("embeddedInfo");
    example1.set_allocated_embeddedexample1(embeddedExample2);

    example1.add_repeatedint32val(2);
    example1.add_repeatedint32val(3);
    example1.add_repeatedstringval("repeated1");
    example1.add_repeatedstringval("repeated2");

    // Serialize the message to a binary file.
    std::string filename = "single_length_delimited_all_example1_val_result";
    std::fstream output(filename, std::ios::out | std::ios::trunc | std::ios::binary);
    if (!example1.SerializeToOstream(&output)) {
        std::cerr << "Failed to write example1." << std::endl;
        std::exit(-1);
    }
    return 0;
}
For more on using protoc and calling the generated interface, see [Translation] ProtoBuf Official Documents (9) - (C++ Development) Tutorial.
For the complete code of Example 1, see Source code: protobuf Example 1; the single_length_delimited_all.* files there are the code and files related to this example.
Because this series focuses on the principles behind ProtoBuf's encoding, serialization, reflection, and so on, ProtoBuf's syntax and usage are only briefly introduced here. For more details, please refer to the official documents I translated.
Some thoughts on ProtoBuf
The official documents and many articles online compare ProtoBuf to XML or JSON.
So is ProtoBuf equivalent to XML and JSON? Do they have exactly the same application scenarios?
Personally, I think that to compare ProtoBuf, XML, and JSON we should distinguish two dimensions: data structuring and data serialization. Data structuring here is oriented mainly toward the development or business level, while data serialization is oriented toward the communication or storage level. Of course, data serialization also needs "structure" and "format", so the difference between the two lies mainly in the domains and scenarios they target, and their requirements and emphases differ accordingly. Data structuring focuses on human readability and sometimes even on semantic expressiveness, whereas data serialization focuses on efficiency and compression.
From these two dimensions, we can make the following observations.
XML, as an extensible markup language, and JSON, as a data format that originated in JS, both have data structuring capability.
For example, XML can give rise to HTML (HTML predates XML, but conceptually HTML is just XML with predefined tags). HTML marks up and expresses the structure of resources on the World Wide Web so that browsers can display them better, while staying as human-readable as possible so that developers can edit it. This is data structuring oriented toward business or development.
Likewise, XML can give rise to RDF/RDFS, which further express the relationships and semantics of resources in the Semantic Web; this also emphasizes data structuring and human readability.
The concepts of RDF/RDFS and the Semantic Web can be looked up in the relevant materials, or see 2-Answer series - Ontology building module (1) and 3-Answer series - Ontology building module (2), which give brief introductions.
The same goes for JSON: in many cases it mostly shows its data structuring capability, for example when it describes the data structure of an interaction interface. MongoDB's use of JSON for query statements also draws on this capability.
Of course, JSON and XML can also be used directly for data serialization, and in practice they often are, for example when JSON or XML is sent directly over the network. In that case JSON or XML becomes a serialization format and plays the role of data serialization. But being used this way often does not make it reasonable. Using JSON or XML directly for data serialization is usually not the best choice, because they are not optimal in speed, efficiency, or space. In other words, they are better suited to data structuring than to data serialization.
Setting XML and JSON aside, let's look at ProtoBuf. ProtoBuf also has data structuring capability, namely the message definitions introduced above. In a .proto file we can structure data with syntax such as message, import, and embedded messages. But it is easy to see that ProtoBuf differs greatly from XML and JSON in data structuring: its human readability is poor, and it is not suited to the XML and JSON application scenarios mentioned above.
From the perspective of data serialization, however, ProtoBuf has clear advantages: it leads almost across the board in efficiency, speed, and space. After reading the later articles on ProtoBuf encoding you will better appreciate how ProtoBuf squeezes out every bit of space and performance. The encoding principle is the key to ProtoBuf; the expressive power of message is not its most critical point. So ProtoBuf focuses on data serialization rather than data structuring.
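The encoding itself is covered in the later articles. As a small taste of where the space savings come from, here is a sketch of the base-128 varint scheme that ProtoBuf uses for integer types such as int32. The function name encode_varint is made up for this sketch and is not part of the ProtoBuf API.

```cpp
#include <cstdint>
#include <string>

// Encode an unsigned integer as a base-128 varint: 7 payload bits per byte,
// least-significant group first, with the high bit set on every byte
// except the last one.
std::string encode_varint(std::uint64_t value) {
    std::string out;
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));  // final byte, high bit clear
    return out;
}
```

With this scheme the value 1 occupies a single byte and 300 occupies two (0xAC 0x02), whereas a fixed 32-bit integer always takes four bytes and the text "300" in JSON or XML takes three characters before any field name or markup is counted.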
Finally, a short summary of these personal thoughts:
- XML, JSON, and ProtoBuf all have both data structuring and data serialization capability
- XML and JSON put more weight on data structuring, with a focus on human readability and semantic expressiveness. ProtoBuf puts more weight on data serialization, with a focus on efficiency, space, and speed; its human readability is poor and its semantic expressiveness is limited (to maximize efficiency, some meta information is discarded)
- ProtoBuf's application scenarios are more specific, while those of XML and JSON are broader.