
A Deep Dive into ProtoBuf - Introduction

2022-06-13 08:59:00 xiongpursuit88

In the past, JSON or XML was the usual choice for network communication, general data exchange, and similar scenarios. In recent development work, I came into contact with Google's ProtoBuf.

After studying ProtoBuf through the relevant materials and reading its source code, I found it excellent in efficiency, compatibility, and other respects. In future technology selection, especially for scenarios such as network communication and general data exchange, ProtoBuf should be the preferred choice.

While learning ProtoBuf, I translated the main official documents. First, of course, to learn ProtoBuf; second, to practice reading English documentation; and third, because, well, Google's documents? They are simply not there!

After reading these documents, you should have a considerable understanding of ProtoBuf.

For the translated documents, see [Index] Article Index, under Translation - Technology - ProtoBuf Official Documents.

However, the official documents serve mainly as an authoritative reference. Reading them does not mean you will immediately understand the underlying principles.

This article and the next few will introduce ProtoBuf's encoding, serialization, deserialization, and reflection principles in detail, while trying to explain them in an easy-to-understand way.

What is ProtoBuf

Let's take a look at the definitions and descriptions given in the official documents:

Protocol buffers are a language-neutral, platform-neutral, extensible way of serializing structured data, for use in communication protocols, data storage, and more.

Protocol buffers are a flexible, efficient, automated mechanism for serializing structured data: think XML, but smaller (3 to 10 times), faster (20 to 100 times), and simpler.

You define how you want your data to be structured, then use specially generated source code to easily write and read that structured data to and from a variety of data streams, in a variety of languages. You can even update the data structure without breaking deployed programs that were compiled against the old one.

Simply put, ProtoBuf is a method for serializing[1] structured data, which can loosely be compared to XML[2]. It has the following characteristics:

  • Language-neutral and platform-neutral. ProtoBuf supports multiple languages such as Java, C++, and Python, and multiple platforms
  • Efficient. It is smaller (3 to 10 times), faster (20 to 100 times), and simpler than XML
  • Extensible and compatible. You can update the data structure without affecting or breaking existing programs

Serialization[1]: converting structured data or an object into a format that can be stored and transmitted (for example, over a network), while ensuring that the serialized result can later (possibly in another computing environment) be reconstructed into the original structured data or object.
For a more detailed introduction, see Wikipedia.
Comparison to XML[2]: this refers mainly to serialization in data communication and data storage scenarios. As an extensible markup language, however, I think XML is still essentially different from ProtoBuf.
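To make the definition of serialization concrete, here is a minimal hand-written sketch in C++. It is not how ProtoBuf encodes data: the Record struct, the byte layout, and the function names are all assumptions made up for this illustration. It only shows the two halves of the idea: turn a structure into bytes, then rebuild the same structure from those bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical in-memory structure, used only for this illustration.
struct Record {
    std::int32_t id;
    std::string name;
};

// Serialize: 4 bytes of id (little-endian), 4 bytes of name length, then the name bytes.
std::string serialize(const Record &r) {
    std::string out;
    auto put_u32 = [&out](std::uint32_t v) {
        for (int i = 0; i < 4; ++i) {
            out.push_back(static_cast<char>((v >> (8 * i)) & 0xFF));
        }
    };
    put_u32(static_cast<std::uint32_t>(r.id));
    put_u32(static_cast<std::uint32_t>(r.name.size()));
    out += r.name;
    return out;
}

// Deserialize: returns false if the buffer is truncated or has trailing bytes.
bool deserialize(const std::string &in, Record *r) {
    auto get_u32 = [&in](std::size_t pos) {
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i) {
            v |= static_cast<std::uint32_t>(static_cast<unsigned char>(in[pos + i])) << (8 * i);
        }
        return v;
    };
    if (in.size() < 8) return false;
    const std::uint32_t len = get_u32(4);
    if (in.size() - 8 != len) return false;
    r->id = static_cast<std::int32_t>(get_u32(0));
    r->name = in.substr(8, len);
    return true;
}

// Self-check: a record survives the round trip through bytes unchanged.
void round_trip_check() {
    const Record original{42, "hello"};
    Record rebuilt{};
    const bool ok = deserialize(serialize(original), &rebuilt);
    (void)ok;
    assert(ok);
    assert(rebuilt.id == original.id && rebuilt.name == original.name);
}
```

Writing and maintaining this kind of code by hand for every structure is exactly the work that ProtoBuf's generated code is meant to take over.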

Using ProtoBuf

Now that we have some understanding of ProtoBuf's basic concepts, let's see how to use it.
Step 1: create a .proto file and define the data structure, as shown in Example 1:

// Example 1: define the Example1 message in xxx.proto
syntax = "proto2";

message Example1 {
    optional string stringVal = 1;
    optional bytes bytesVal = 2;
    message EmbeddedMessage {
        // proto2 requires a field rule on every field
        optional int32 int32Val = 1;
        optional string stringVal = 2;
    }
    optional EmbeddedMessage embeddedExample1 = 3;
    repeated int32 repeatedInt32Val = 4;
    repeated string repeatedStringVal = 5;
}

In the example above we defined a message named Example1. The syntax is simple: the message keyword followed by the message name:

message xxx {

}

Then we defined the message's fields, in the following form:

message xxx {
  // Field rule: required -> the field must appear exactly 1 time
  // Field rule: optional -> the field can appear 0 or 1 time
  // Field rule: repeated -> the field can appear any number of times (including 0)
  // Type: int32, int64, sint32, sint64, string, 32-bit (fixed32, sfixed32, float) ....
  // Field number: 1 ~ 536870911 (excluding the numbers 19000 through 19999)
  field_rule type name = field_number;
}

In the example above, we defined:

  • an optional field named stringVal, of type string, with field number 1; it can appear 0 or 1 time
  • an optional field named bytesVal, of type bytes, with field number 2; it can appear 0 or 1 time
  • an optional field named embeddedExample1, of type EmbeddedMessage (a custom embedded message type), with field number 3; it can appear 0 or 1 time
  • a repeated field named repeatedInt32Val, of type int32, with field number 4; it can appear any number of times (including 0)
  • a repeated field named repeatedStringVal, of type string, with field number 5; it can appear any number of times (including 0)
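The field-number rule is easy to get wrong, so here is a small C++ sketch of it. It assumes the limits given in the official language guide: field numbers start at 1, the largest is 2^29 - 1 (536870911), and 19000 through 19999 are reserved for the ProtoBuf implementation. The function and constant names are hypothetical; protoc performs this check itself when compiling a .proto file.

```cpp
#include <cassert>
#include <cstdint>

// Limits as described in the language guide; names are made up for this sketch.
constexpr std::int64_t kMinFieldNumber = 1;
constexpr std::int64_t kMaxFieldNumber = (std::int64_t{1} << 29) - 1;  // 536870911
constexpr std::int64_t kReservedFirst = 19000;
constexpr std::int64_t kReservedLast = 19999;

// Returns true if n may be used as a field number in a message definition.
constexpr bool is_valid_field_number(std::int64_t n) {
    return n >= kMinFieldNumber && n <= kMaxFieldNumber &&
           !(n >= kReservedFirst && n <= kReservedLast);
}

static_assert(kMaxFieldNumber == 536870911, "largest field number is 2^29 - 1");

// Self-check against the field numbers used in Example 1 and the boundary cases.
void field_number_check() {
    for (std::int64_t n = 1; n <= 5; ++n) {
        assert(is_valid_field_number(n));  // numbers used by Example1
    }
    assert(!is_valid_field_number(0));      // below the minimum
    assert(!is_valid_field_number(19500));  // inside the reserved range
}
```

The upper bound is 2^29 - 1 rather than 2^32 - 1 because, on the wire, the field number shares one 32-bit value with a 3-bit type marker, a detail that belongs to the later articles on encoding.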

For more details of the proto2 syntax for defining messages, such as which types are supported, how field numbers are assigned, import definitions, and reserved fields, see [Translation] ProtoBuf Official Documents (2) - Language Guide (proto2).

For conventions to follow when writing definitions, see [Translation] ProtoBuf Official Documents (4) - Style Guide.

Step 2: compile the .proto file with protoc to generate the read/write interfaces

The data structures we define in the .proto file are meant for developers and business code, not for storage and transmission.

When the data needs to be stored or transmitted, these structures must be serialized, deserialized, and read and written. How is that done? No need to worry: ProtoBuf provides the corresponding interface code for us. How? Through the protoc compiler.

The interface code can be generated with the following command:

// $SRC_DIR: the directory containing the .proto source
// --cpp_out: generate C++ code
// $DST_DIR: the target directory for the generated code
// xxx.proto: the proto file to generate interface code for
protoc -I=$SRC_DIR --cpp_out=$DST_DIR $SRC_DIR/xxx.proto

The generated code provides interfaces similar to the following:

[Figure: Example - serialization and parsing interfaces]
[Figure: Example - interfaces generated by protoc]

Step 3: call the interfaces to serialize, deserialize, and read/write
For the message defined in Example 1 of Step 1, we can call the interfaces generated in Step 2. The test code is as follows:

//
// Created by yue on 18-7-21.
//
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include "single_length_delimited_all.pb.h"

int main() {
    Example1 example1;
    example1.set_stringval("hello,world");
    example1.set_bytesval("are you ok?");

    // set_allocated_* hands ownership of the embedded message to example1.
    Example1_EmbeddedMessage *embeddedExample2 = new Example1_EmbeddedMessage();

    embeddedExample2->set_int32val(1);
    embeddedExample2->set_stringval("embeddedInfo");
    example1.set_allocated_embeddedexample1(embeddedExample2);

    // Each add_* call appends one element to a repeated field.
    example1.add_repeatedint32val(2);
    example1.add_repeatedint32val(3);
    example1.add_repeatedstringval("repeated1");
    example1.add_repeatedstringval("repeated2");

    // Serialize the message to a binary file.
    std::string filename = "single_length_delimited_all_example1_val_result";
    std::fstream output(filename, std::ios::out | std::ios::trunc | std::ios::binary);
    if (!example1.SerializeToOstream(&output)) {
        std::cerr << "Failed to write example1." << std::endl;
        exit(-1);
    }

    return 0;
}

For more on using protoc and calling the generated interfaces, see [Translation] ProtoBuf Official Documents (9) - (C++ Development) Tutorial.

For the complete code of Example 1, see Source code: protobuf Example 1. The single_length_delimited_all.* files are the code and files related to this example.

Because this series focuses on the principles behind ProtoBuf's encoding, serialization, and reflection, ProtoBuf's syntax and usage are only briefly introduced here. For more details, refer to the series of official documents I translated.

Some Thoughts on ProtoBuf

The official documents and many articles online compare ProtoBuf to XML or JSON.

So is ProtoBuf equivalent to XML and JSON? Do they have exactly the same application scenarios?

Personally, I think that to compare ProtoBuf, XML, and JSON, two dimensions should be distinguished: data structuring and data serialization. Data structuring is oriented mainly toward the development or business level, while data serialization is oriented toward the communication or storage level. Of course, data serialization also needs "structure" and "format", so the difference between the two lies mainly in the domain and scenario they serve, and their requirements and emphasis differ accordingly. Data structuring focuses on human readability and sometimes even on semantic expressiveness, while data serialization focuses on efficiency and compression.

From these two dimensions, we can reflect as follows.

XML, as an extensible markup language, and JSON, as a data format originating from JS, both have the ability to structure data.

For example, XML can give rise to HTML (although HTML predates XML, conceptually HTML is just XML with predefined tags). HTML marks up and expresses the structure of resources on the World Wide Web so that browsers can display them better, while staying as human readable as possible so that developers can edit it. This is business-oriented or development-oriented data structuring.

Likewise, XML can give rise to RDF/RDFS, which further express the relationships and semantics of resources in the Semantic Web. Here too the emphasis is on data structuring and human readability.

To learn about RDF/RDFS and the Semantic Web, consult the relevant materials, or refer to 2-Answer series - Ontology building module (1) and 3-Answer series - Ontology building module (2), which give brief introductions.

The same goes for JSON. In many cases it shows more of its data structuring ability, for example when expressing the data structures of an interaction interface. MongoDB uses JSON for its query statements, which again draws on its data structuring ability.

Of course, JSON and XML can also be used directly for data serialization, and in practice they often are: for example, JSON or XML is sent directly over the network. In that case JSON or XML becomes a serialization format and exercises its data serialization ability. But being commonly used this way does not make it reasonable. Using JSON or XML directly for data serialization is usually not the best choice, because they are not optimal in speed, efficiency, or space. In other words, they are better suited to data structuring than to data serialization.

Setting XML and JSON aside, let's look at ProtoBuf. ProtoBuf also has data structuring ability, namely the message definitions introduced above. In a .proto file we can structure data through syntax such as message, import, and embedded messages. But it is easy to see that, in data structuring, ProtoBuf differs greatly from XML and JSON: its human readability is poor, and it is not suited to some of the XML and JSON application scenarios mentioned above.

From the perspective of data serialization, however, ProtoBuf has clear advantages: it leads almost completely in efficiency, speed, and space. After reading the later articles on ProtoBuf encoding, you will better understand how ProtoBuf squeezes out every bit of space and performance. The encoding principle is the key to ProtoBuf; the expressive power of message is not its most critical point. So ProtoBuf focuses on data serialization rather than data structuring.
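As a rough illustration of the space argument, the following C++ sketch encodes a single string field by hand in two ways: once following ProtoBuf's documented wire format for a length-delimited field (a tag byte built from the field number and wire type 2, a length, then the raw bytes), and once as JSON text. This is a simplified sketch, not the code protoc generates; the function names are hypothetical, and the JSON helper does no escaping.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Append v as a base-128 varint: 7 bits per byte, high bit set on all but the last byte.
void append_varint(std::string &out, std::uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<char>(v));
}

// Encode one length-delimited field (wire type 2): tag, byte length, then the bytes.
std::string encode_string_field(std::uint32_t field_number, const std::string &value) {
    std::string out;
    append_varint(out, (static_cast<std::uint64_t>(field_number) << 3) | 2);
    append_varint(out, value.size());
    out += value;
    return out;
}

// Hand-written JSON text for the same single field, for size comparison only.
std::string encode_json_field(const std::string &name, const std::string &value) {
    return "{\"" + name + "\":\"" + value + "\"}";
}

// Self-check: the binary form carries a field number instead of the field name.
void size_check() {
    const std::string bin = encode_string_field(1, "hello,world");
    const std::string json = encode_json_field("stringVal", "hello,world");
    assert(bin.size() < json.size());
    (void)bin;
    (void)json;
}
```

For stringVal = "hello,world" in field 1, the binary form is 13 bytes (1 tag byte, 1 length byte, 11 bytes of text), while the JSON text is 27 bytes, mostly because JSON repeats the field name and quoting in every message. A single field is not a benchmark and says nothing about the 3 to 10 times figure quoted earlier; it only shows where the savings come from, and why the result is no longer human readable.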

Finally, a short summary of these personal thoughts:

  1. XML, JSON, and ProtoBuf all have both data structuring and data serialization ability
  2. XML and JSON pay more attention to data structuring, focusing on human readability and semantic expressiveness. ProtoBuf pays more attention to data serialization, focusing on efficiency, space, and speed; its human readability is poor and it lacks semantic expressiveness (to ensure maximum efficiency, some meta information is discarded)
  3. ProtoBuf's application scenarios are more specific, while those of XML and JSON are richer.

Original site

Copyright notice
This article was written by [xiongpursuit88]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/02/202202270535255702.html