Drill down to protobuf - Introduction
2022-06-13 08:59:00 【xiongpursuit88】
Until now, the technologies I used most often for network communication and general data exchange were JSON and XML. In recent development work I came into contact with Google's ProtoBuf.
After consulting the relevant materials and studying its source code, I found that it offers excellent efficiency and compatibility. In future technology selection, especially for scenarios such as network communication and general data exchange, ProtoBuf should be considered first.
While learning ProtoBuf I translated the main official documents: first, to learn ProtoBuf; second, to practice reading English documentation; and third, because Google's documents? They are simply not available!
After reading these documents you should have a fairly solid understanding of ProtoBuf.
For the translated documents, see [Index] Article index, under Translation - Technology - ProtoBuf Official Documents.
The official documents, however, serve mainly as an authoritative reference; reading them does not mean you will immediately understand the underlying principles.
This article and the next few will introduce the principles of ProtoBuf encoding, serialization, deserialization, and reflection in detail, and will try to explain them in an accessible way.
What is ProtoBuf
Let's first look at the definition and description given in the official documents:
Protocol buffers are a language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols, data storage, and more.
Protocol Buffers are a flexible, efficient, automated mechanism for serializing structured data. Think XML, but smaller (3 ~ 10 times), faster (20 ~ 100 times), and simpler.
You define how you want your data to be structured, then use specially generated source code to easily write and read that structured data to and from a variety of data streams, in a variety of languages. You can even update the data structure without breaking deployed programs that were compiled against the old structure.
In short, ProtoBuf is a method for serializing [1] structured data that can loosely be compared to XML [2]. It has the following characteristics:
- Language-independent and platform-independent. ProtoBuf supports multiple languages such as Java, C++, and Python, and runs on multiple platforms
- Efficient. Compared with XML it is smaller (3 ~ 10 times), faster (20 ~ 100 times), and simpler
- Good extensibility and compatibility. You can update the data structure without affecting or breaking existing programs
Serialization [1]: converting structured data or an object into a format that can be stored and transmitted (for example, over a network), while ensuring that the serialized result can later be reconstructed (possibly in another computing environment) into the original structured data or object.
For a more detailed introduction, see Wikipedia.
Analogy to XML [2]: this refers mainly to serialization in data communication and data storage scenarios. As an extensible markup language, however, I think XML still differs fundamentally from ProtoBuf.
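To make the definition of serialization in note [1] concrete, here is a minimal hand-rolled sketch in C++. It is not ProtoBuf and does not use ProtoBuf's format: the Record struct, the fixed 4-byte little-endian layout, and the function names are all assumptions chosen only to show a value being turned into bytes and reconstructed from them.

```cpp
#include <cstdint>
#include <string>

// Hypothetical record used only for this illustration.
struct Record {
    std::uint32_t id;
    std::string name;
};

// Append a 32-bit value as 4 little-endian bytes.
static void put_u32(std::string &out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<char>((v >> (8 * i)) & 0xFF));
    }
}

// Read 4 little-endian bytes starting at pos; the caller checks bounds.
static std::uint32_t get_u32(const std::string &in, std::size_t pos) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        v |= static_cast<std::uint32_t>(static_cast<unsigned char>(in[pos + i])) << (8 * i);
    }
    return v;
}

// Serialize: id, then the length of name, then the bytes of name.
std::string serialize(const Record &r) {
    std::string out;
    put_u32(out, r.id);
    put_u32(out, static_cast<std::uint32_t>(r.name.size()));
    out += r.name;
    return out;
}

// Deserialize: returns false if the buffer is truncated or malformed.
bool deserialize(const std::string &bytes, Record &out) {
    if (bytes.size() < 8) return false;
    std::uint32_t len = get_u32(bytes, 4);
    if (bytes.size() - 8 != len) return false;
    out.id = get_u32(bytes, 0);
    out.name = bytes.substr(8, len);
    return true;
}
```

Calling serialize() on a Record and passing the result to deserialize() yields an equal Record, possibly on another machine. A format like this breaks as soon as a field is added or reordered, which is the kind of compatibility problem ProtoBuf's field numbers are meant to address.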
Using ProtoBuf
Now that we have a basic understanding of ProtoBuf's concepts, let's see how to use it.
Step 1: create a .proto file and define the data structure, as shown in example 1:
// Example 1: the Example1 message defined in xxx.proto
message Example1 {
    optional string stringVal = 1;
    optional bytes bytesVal = 2;
    message EmbeddedMessage {
        int32 int32Val = 1;
        string stringVal = 2;
    }
    optional EmbeddedMessage embeddedExample1 = 3;
    repeated int32 repeatedInt32Val = 4;
    repeated string repeatedStringVal = 5;
}
In the example above we defined a message named Example1. The syntax is simple: the message keyword followed by the message name:
message xxx {
}
We then defined the message's fields, in the following form:
message xxx {
    // Field rule: required -> the field must appear exactly 1 time
    // Field rule: optional -> the field can appear 0 or 1 times
    // Field rule: repeated -> the field can appear any number of times (including 0)
    // Type: int32, int64, sint32, sint64, string, 32-bit ....
    // Field number: 1 ~ 536870911 (excluding the numbers 19000 to 19999)
    field_rule type name = field_number;
}
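The field number rule can be written as a small check. This is only a sketch: the helper name and constants are hypothetical, and it assumes the smallest usable field number is 1 (0 is not a valid field number), with 536870911 (2^29 - 1) as the maximum and 19000 to 19999 excluded.

```cpp
#include <cstdint>

// Upper bound on field numbers: 2^29 - 1.
constexpr std::int64_t kMaxFieldNumber = 536870911;
// Range reserved for the ProtoBuf implementation itself.
constexpr std::int64_t kReservedFirst = 19000;
constexpr std::int64_t kReservedLast = 19999;

// Hypothetical helper: true when n may be used as a field number.
bool is_usable_field_number(std::int64_t n) {
    if (n < 1 || n > kMaxFieldNumber) return false;
    if (n >= kReservedFirst && n <= kReservedLast) return false;
    return true;
}
```

In practice protoc performs this validation when it compiles the .proto file, so a check like this is useful only for understanding the rule, not as something to run in application code.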
In the example above, we defined:
- an optional field named stringVal of type string, with field number 1; it can appear 0 or 1 times
- an optional field named bytesVal of type bytes, with field number 2; it can appear 0 or 1 times
- an optional field named embeddedExample1 of type EmbeddedMessage (a custom embedded message type), with field number 3; it can appear 0 or 1 times
- a repeated field named repeatedInt32Val of type int32, with field number 4; it can appear any number of times (including 0)
- a repeated field named repeatedStringVal of type string, with field number 5; it can appear any number of times (including 0)
For more details on the proto2 syntax for defining messages, such as the supported types, field number assignment, importing definitions with import, and reserved fields, see [Translation] ProtoBuf Official Documents (2) - Grammar Guide (proto2).
For conventions on writing definitions, see [Translation] ProtoBuf Official Documents (4) - Specification Guidelines.
Step 2: compile the .proto file with protoc to generate the read/write interface
The data structures we define in the .proto file are meant for developers and business logic, not for storage or transmission.
When the data needs to be stored or transmitted, these structures must be serialized, deserialized, read, and written. How is that done? No need to worry: ProtoBuf provides the corresponding interface code for us, and it does so through the protoc compiler.
The interface code can be generated with a command of the following form:
// $SRC_DIR: the directory containing the .proto source
// --cpp_out: generate C++ code
// $DST_DIR: the target directory for the generated code
// xxx.proto: the .proto file to generate interface code for
protoc -I=$SRC_DIR --cpp_out=$DST_DIR $SRC_DIR/xxx.proto
The generated code exposes accessors for each field, such as set_stringval() and add_repeatedint32val(), along with serialization methods such as SerializeToOstream(), all of which are used in step 3.
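As a rough picture of the shape of that interface, the class below is a hand-written stand-in, not protoc output. The class name Example1Sketch is hypothetical, and only two fields of example 1 are imitated; the accessor names follow the pattern used by the test code in step 3 (lowercased field name with set_, has_, clear_, or add_ prefixes). Real generated classes contain much more, including the serialization methods.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hand-written stand-in, NOT protoc output: it only imitates the naming
// pattern of the accessors generated for two fields of Example1.
class Example1Sketch {
public:
    // optional string stringVal = 1;
    bool has_stringval() const { return has_stringval_; }
    const std::string &stringval() const { return stringval_; }
    void set_stringval(const std::string &value) {
        stringval_ = value;
        has_stringval_ = true;
    }
    void clear_stringval() {
        stringval_.clear();
        has_stringval_ = false;
    }

    // repeated int32 repeatedInt32Val = 4;
    int repeatedint32val_size() const {
        return static_cast<int>(repeatedint32val_.size());
    }
    std::int32_t repeatedint32val(int index) const {
        return repeatedint32val_[index];
    }
    void add_repeatedint32val(std::int32_t value) {
        repeatedint32val_.push_back(value);
    }

private:
    bool has_stringval_ = false;  // tracks whether the optional field is set
    std::string stringval_;
    std::vector<std::int32_t> repeatedint32val_;
};
```

The point of the sketch is the division of labor: the .proto file states what the data looks like, and the generated class gives typed, per-field access so that application code never touches the serialized bytes directly.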


Step 3: call the interface to serialize, deserialize, read, and write
For the message defined in example 1 of step 1, we can call the interface generated in step 2. The test code is as follows:
//
// Created by yue on 18-7-21.
//
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include "single_length_delimited_all.pb.h"

int main() {
    Example1 example1;
    example1.set_stringval("hello,world");
    example1.set_bytesval("are you ok?");

    // set_allocated_* hands ownership of the embedded message to example1.
    Example1_EmbeddedMessage *embeddedExample2 = new Example1_EmbeddedMessage();
    embeddedExample2->set_int32val(1);
    embeddedExample2->set_stringval("embeddedInfo");
    example1.set_allocated_embeddedexample1(embeddedExample2);

    example1.add_repeatedint32val(2);
    example1.add_repeatedint32val(3);
    example1.add_repeatedstringval("repeated1");
    example1.add_repeatedstringval("repeated2");

    // Serialize the message to a binary file.
    std::string filename = "single_length_delimited_all_example1_val_result";
    std::fstream output(filename, std::ios::out | std::ios::trunc | std::ios::binary);
    if (!example1.SerializeToOstream(&output)) {
        std::cerr << "Failed to write example1." << std::endl;
        exit(-1);
    }
    return 0;
}
For more on using protoc and calling the generated interface, see [Translation] ProtoBuf Official Documents (9) - (C++ Development) Tutorial.
For the complete code of example 1, see Source code: protobuf example 1. The single_length_delimited_all.* files are the code and files related to this example.
Because this series focuses on the principles behind ProtoBuf's encoding, serialization, and reflection, the syntax and usage of ProtoBuf are only briefly introduced here. For more details, please refer to the series of official documents I translated.
Some thoughts on ProtoBuf
The official documents and many articles online compare ProtoBuf to XML or JSON.
So is ProtoBuf equivalent to XML and JSON? Do they share exactly the same application scenarios?
Personally, I think that to compare ProtoBuf, XML, and JSON, two dimensions should be distinguished: data structuring and data serialization. Data structuring is oriented mainly toward the development or business level, while data serialization is oriented toward the communication or storage level. Of course, data serialization also needs "structure" and "format", so the difference between the two lies mainly in the domains and scenarios they serve, and their requirements and emphases differ accordingly. Data structuring emphasizes human readability and sometimes even semantic expressiveness, whereas data serialization emphasizes efficiency and compression.
From these two dimensions, we can reason as follows.
XML, as an extensible markup language, and JSON, as a data format that originated in JavaScript, both have data structuring capability.
For example, HTML can be derived from XML (HTML predates XML, but conceptually HTML is just XML with predefined tags). HTML marks up and expresses the structure of resources on the World Wide Web so that browsers can display them better, while remaining as human-readable as possible so that developers can edit it. This is business- or development-oriented data structuring.
Likewise, RDF/RDFS can be derived from XML to further express the relationships and semantics of resources in the Semantic Web, which again emphasizes data structuring and human readability.
To learn about RDF/RDFS and the Semantic Web, consult the relevant materials or refer to 2-Answer series - Ontology building module (1) and 3-Answer series - Ontology building module (2), which contain brief introductions.
The same goes for JSON: in many settings it mainly demonstrates its data structuring capability, for example when it describes the data structures of an interaction interface. MongoDB's use of JSON as its query language also exploits this capability.
Of course, JSON and XML can also be used directly for data serialization, and in practice they often are: when JSON or XML is sent directly over the network, it becomes a serialization format. But common usage does not make it the right choice. JSON and XML are usually not the best option for data serialization because they are not optimal in speed, efficiency, or space. In other words, they are better suited to data structuring than to data serialization.
Stepping away from XML and JSON, let's look at ProtoBuf. It also has data structuring capability, namely the message definitions introduced above. In a .proto file we can structure data through syntax such as message, import, and embedded messages. But it is easy to see that ProtoBuf differs greatly from XML and JSON in this respect: it is less human-readable and not suited to the XML and JSON application scenarios mentioned above.
From the perspective of data serialization, however, ProtoBuf has clear advantages: it leads almost across the board in efficiency, speed, and space. After reading the later articles on ProtoBuf encoding, you will better understand how ProtoBuf squeezes out every bit of space and performance. The encoding principle is the key to ProtoBuf; the expressive power of message is not its most important aspect. ProtoBuf therefore focuses on data serialization rather than data structuring.
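As a small preview of where the space savings come from, the sketch below encodes a single string field by hand. It reflects my understanding of the ProtoBuf wire format (a base-128 varint tag built from the field number and wire type 2 for length-delimited data, then a varint length, then the raw bytes); the function names are hypothetical, and the later articles in this series remain the reference for the details.

```cpp
#include <cstdint>
#include <string>

// Base-128 varint: 7 payload bits per byte, with the high bit set on
// every byte except the last.
void append_varint(std::string &out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
}

// Length-delimited field (wire type 2): tag, byte length, then raw bytes.
std::string encode_string_field(std::uint32_t field_number, const std::string &value) {
    std::string out;
    append_varint(out, (static_cast<std::uint64_t>(field_number) << 3) | 2);
    append_varint(out, value.size());
    out += value;
    return out;
}
```

With this sketch, encode_string_field(1, "hello,world") produces 13 bytes: one tag byte, one length byte, and the 11 bytes of text. The equivalent JSON text {"stringVal":"hello,world"} takes 27 characters, because the field name travels with every value, whereas ProtoBuf sends only the field number and leaves the name in the .proto file.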
Finally, a short summary of these personal thoughts:
- XML, JSON, and ProtoBuf all have both data structuring and data serialization capability
- XML and JSON focus more on data structuring, emphasizing human readability and semantic expressiveness. ProtoBuf focuses more on data serialization, emphasizing efficiency, space, and speed; it is less human-readable and weaker in semantic expression (to maximize efficiency, it discards some meta information)
- ProtoBuf's application scenarios are more specific, while those of XML and JSON are broader.