Discussion on openGauss Parallel Decoding
2022-06-11 15:48:00 [Gauss Squirrel Club]
Today, with the rapid development of information technology, new kinds of databases keep emerging. Because it supports data synchronization between heterogeneous databases, logical replication is becoming more and more important. At present the serial decoding of openGauss logical replication averages 3~5 MBps, which is hard to reconcile with real-time synchronization in scenarios with high business pressure: logs accumulate and the production cluster's workload is affected. We therefore designed the parallel decoding feature, which lets multiple threads decode in parallel and raises decoding performance to about 100 MBps in the basic scenario.
Design thinking —— why parallel decoding?
In the original serial decoding logic, reading logs, decoding them, and assembling and sending the results are all handled by a single thread. The main steps and the approximate proportion of time each one takes are shown in the figure below:
As the figure shows, decoding is the dominant step in the process and is what multi-threaded decoding optimizes; the sending step also takes a noticeable share of the time and can be further optimized with batch sending.
Workflow —— Parallel decoding message sequence diagram
As shown below, in parallel decoding there are three types of worker threads on an openGauss data node (Data Node, DN):
- Sender/Collector thread: receives decoding requests from the client, assembles the results from the decoder threads and sends them to the client; only one such thread is created per decoding request.
- Reader/Dispatcher thread: reads the WAL logs and distributes them to the decoder threads; only one such thread is created per decoding request.
- Decoder threads: decode the logs the Reader/Dispatcher thread sends to them (while a decoder is busy, the incoming logs are temporarily queued in its read change queue) and pass the decoded results (while the corresponding commit log has not yet been decoded, the results are temporarily queued in its decode change queue) to the Sender/Collector thread; several decoder threads can be created per decoding request.

The message sequence diagram is described as follows:
1. The client sends a logical replication request to a DN, which can be either a primary or a standby node. A parameter in the logical replication options can restrict connections to standby nodes only, to avoid putting too much pressure on the primary.
2. Besides the Sender thread that receives the client request, the DN also creates a Reader/Dispatcher thread and several Decoder threads.
3. The Reader reads the xlog and preprocesses it. If a log record involves TOAST columns, the TOAST pieces are assembled in this step.
4. The Dispatcher distributes the preprocessed logs to the Decoder threads.
5. Each Decoder thread decodes independently. The decoding format (json, text or bin) can be selected through configuration options.
6. The Decoders send their results to the Collector.
7. The Collector assembles the decoded results by transaction.
8. To reduce the number of sends and the impact of network I/O on decoding performance, when batch sending is enabled (that is, the sending-batch parameter is set to 1), the Sender returns decoded results to the client in batches after accumulating a certain amount of data (the threshold is 1MB).
9. When the client wants to stop logical replication, it disconnects the logical replication connection to the DN.
10. The Sender sends an exit signal to the Reader/Dispatcher thread and the Decoder threads.
11. After receiving the exit signal, each thread releases the resources it holds, cleans up its environment and exits.
Technical details 1 —— visibility transformation
In logical decoding we parse historical logs, so judging the visibility of the tuples recorded in those logs is essential. The original serial decoding logic judges visibility with the active transaction list mechanism. For parallel decoding, however, having every decoder thread maintain its own active transaction list is expensive and would hurt decoding performance, so we reworked visibility to use the CSN (Commit Sequence Number) instead. For each transaction id (xid) the judgment proceeds as follows:
- Obtain from the xid the CSN used for the visibility judgment; a CSN can be obtained for any xid. If the xid is abnormal, a CSN representing a special status is returned, and such a CSN can still be used for the visibility judgment.
- If the CSN is committed, compare it with the CSN in the snapshot: if the transaction's CSN is smaller than the snapshot's CSN, the transaction is visible to the snapshot, otherwise it is not.
- If the CSN is not committed, the transaction is not visible.
Based on this, parallel decoding judges the snapshot visibility of a tuple by checking, in turn, the visibility of its Xmin (the xid that inserted it) and Xmax (the xid that deleted or updated it). The overall idea is: if Xmin is invisible or uncommitted, or Xmax is visible, the tuple is invisible; if Xmin is visible and Xmax is invisible or uncommitted, the tuple is visible. Every flag bit in the tuple keeps its original meaning and still participates in the visibility judgment.
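To make the two-level check above concrete, the following is a minimal sketch of the logic in Java. The real implementation is openGauss C code; every type and helper name used here (Csn, Snapshot, Tuple, csnLookup) is hypothetical and exists only to make the sketch self-contained.

```java
import java.util.function.LongFunction;

// Minimal sketch of the CSN-based visibility check described above.
// The real logic is openGauss C code; every type and helper here
// (Csn, Snapshot, Tuple, csnLookup) is hypothetical.
final class VisibilitySketch {

    record Csn(long value, boolean committed) {}   // CSN plus commit status for an xid
    record Snapshot(long csn) {}                   // the CSN at which the snapshot was taken
    record Tuple(long xmin, long xmax) {}          // xmin: inserting xid; xmax: deleting/updating xid (0 = none)

    // Per-xid judgment: fetch the CSN for the xid, then compare it with the snapshot CSN.
    static boolean xidVisible(long xid, Snapshot snapshot, LongFunction<Csn> csnLookup) {
        Csn csn = csnLookup.apply(xid);            // a CSN is returned even for abnormal xids
        if (!csn.committed()) {
            return false;                          // not committed => not visible
        }
        return csn.value() < snapshot.csn();       // committed before the snapshot => visible
    }

    // Tuple-level judgment: Xmin must be visible and Xmax must not be.
    static boolean tupleVisible(Tuple t, Snapshot snapshot, LongFunction<Csn> csnLookup) {
        if (!xidVisible(t.xmin(), snapshot, csnLookup)) {
            return false;                          // insertion invisible or uncommitted
        }
        if (t.xmax() != 0 && xidVisible(t.xmax(), snapshot, csnLookup)) {
            return false;                          // deletion/update visible => tuple no longer visible
        }
        return true;
    }
}
```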
Technical details 2 —— batch sending
With parallel decoding, the time spent on decoding itself drops significantly, and the thread that sends the decoded results becomes the new bottleneck: running a complete send for every single decoded result is too expensive. We therefore send in batches, buffering the decoded results and sending them to the client once a threshold is exceeded. When sending in batches, the length of each decoded result is recorded additionally, together with an agreed separator, so that the consumer of the parallel decoding feature can split the logs that arrive in one batch.
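As an illustration only, the sketch below shows how a client could split such a batch, assuming each decoded record is preceded by a 4-byte big-endian length field; the actual field width, byte order and separator are defined by openGauss and should be taken from the product documentation, not from this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: split a batched buffer of decoded results, assuming each
// record is prefixed with a 4-byte big-endian length. The real field width,
// byte order and separator are defined by openGauss, not by this sketch.
final class BatchSplitter {

    static List<String> split(ByteBuffer batch) {
        List<String> records = new ArrayList<>();
        while (batch.remaining() >= 4) {
            int len = batch.getInt();                   // assumed 4-byte length prefix
            byte[] payload = new byte[len];
            batch.get(payload);                         // one decoded result
            records.add(new String(payload, StandardCharsets.UTF_8));
        }
        return records;
    }
}
```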
Usage
We added the following optional configuration items for parallel decoding:
- Decoding thread concurrency
The parallel-decode-num option specifies the number of Decoder threads used for parallel decoding. It is an int in the range 1~20; the value 1 means decoding follows the original serial logic and does not enter the code path of this feature. The default value is 1. When this option is set to 1, configuring the decoding format option decode-style described below is forbidden.
- Decoding whitelist
The white-table-list option specifies the tables to decode. Its value is a text string containing the whitelisted table names, with ',' separating different tables. Example: select * from pg_logical_slot_peek_changes('slot1', NULL, 4096, 'white-table-list', 'public.t1,public.t2');.
- Restrict decoding to standby nodes
The standby-connection option specifies whether decoding is restricted to standby nodes. Its value is a bool: true means only standby nodes may be connected, and connecting to the primary for decoding reports an error and exits; false means there is no restriction. The default value is false.
- Decoding format
The decode-style option specifies the decoding format. Its value is a char, 'j', 't' or 'b', representing json format, text format and binary format respectively. The default value is 'b', i.e. binary decoding.
- Batch sending
The sending-batch option specifies whether decoded results are sent in batches. Its value is an int, 0 or 1: 0 means batch sending is disabled, and 1 means the accumulated decoded results are sent in a batch once their size reaches or just exceeds 1MB. The default value is 0, i.e. batch sending is disabled by default.
Taking parallel decoding over JDBC as an example, the following configuration is required when establishing the connection:
```java
PGReplicationStream stream = conn
        .getReplicationAPI()
        .replicationStream()
        .logical()
        .withSlotName(replSlotName)
        .withSlotOption("include-xids", true)
        .withSlotOption("skip-empty-xacts", true)
        .withSlotOption("parallel-decode-num", 10)                  // 10 decoder threads
        .withSlotOption("white-table-list", "public.t1,public.t2")  // decode only these tables
        .withSlotOption("standby-connection", true)                 // standby-only decoding
        .withSlotOption("decode-style", "t")                        // text format
        .withSlotOption("sending-batch", 1)                         // enable batch sending
        .start();
```
The five withSlotOption lines from parallel-decode-num to sending-batch are the logic added for parallel decoding: 10 concurrent decoder threads, decoding only the tables public.t1 and public.t2, a standby-only connection, text decoding format, and batch sending enabled. When a configuration parameter is out of range, an error is reported together with the allowed range of that parameter.
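For completeness, here is a sketch of how the decoded results could then be consumed. It assumes the openGauss JDBC driver exposes the same PGReplicationStream read/feedback API as the PostgreSQL JDBC driver it derives from, and it omits the SQLException handling that the snippet above also leaves out; treat it as an illustration rather than verified openGauss sample code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Continuation of the snippet above: consume decoded changes from the stream.
while (true) {
    ByteBuffer msg = stream.read();          // blocks until the next decoded chunk arrives
    String change = StandardCharsets.UTF_8.decode(msg).toString();
    System.out.println(change);              // hand the change to the downstream consumer

    // Report progress so the server can advance the replication slot and recycle WAL.
    stream.setAppliedLSN(stream.getLastReceiveLSN());
    stream.setFlushedLSN(stream.getLastReceiveLSN());
}
```

When sending-batch is set to 1, one such buffer may contain several decoded records, which is where the length-based splitting described in the batch sending section comes in.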
Auxiliary feature —— monitoring function
In parallel decoding, to make it easy to locate the decoding performance bottleneck when decoding is slow, we added the gs_get_parallel_decode_status() function. It shows, for the decoding threads on the current DN, the length of the read change queue that stores logs not yet decoded and of the decode change queue that stores decoded results not yet sent.
The function takes no arguments and returns four columns: slot_name, parallel_decode_num, read_change_queue_length and decode_change_queue_length.
slot_name is the replication slot name, of type text; parallel_decode_num is the number of parallel decoder threads, of type int; read_change_queue_length, of type text, records the read change queue length of each decoder thread; decode_change_queue_length, of type text, records the decode change queue length of each decoder thread. Usage is as follows:
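The original screenshot of the output is not reproduced here; as a hedged illustration, the function can be queried over an ordinary JDBC connection to the decoding DN (assumed below to be conn) like this:

```java
import java.sql.ResultSet;
import java.sql.Statement;

// Query the per-thread queue lengths on the decoding DN.
try (Statement stmt = conn.createStatement();
     ResultSet rs = stmt.executeQuery("SELECT * FROM gs_get_parallel_decode_status()")) {
    while (rs.next()) {
        System.out.printf("slot=%s decoders=%d read_q=%s decode_q=%s%n",
                rs.getString("slot_name"),
                rs.getInt("parallel_decode_num"),
                rs.getString("read_change_queue_length"),
                rs.getString("decode_change_queue_length"));
    }
}
```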
For a decoding stall scenario, execute this function on the decoding DN and first look at read_change_queue_length in the result, which records the length of the read-log queue of each decoder thread. If that value is very small, reading the logs is the blocking point and you need to investigate further, for example whether disk I/O is insufficient. Then look at decode_change_queue_length, which records the length of the decode queue of each decoder thread. If that value is very small, decoding itself is too slow and the number of decoder threads can be increased appropriately. If both read_change_queue_length and decode_change_queue_length are fairly large, sending the decoded logs is the blocking point, and you need to check how fast the consumer of parallel decoding replays the logs into the target database. Generally speaking, decoding stalls come down to insufficient CPU, I/O or memory resources; decoding on a standby node to guarantee sufficient resources usually avoids them.
Conclusion
Parallel decoding greatly improves the decoding performance of logical replication while adding little extra pressure on the decoding instance. As a key technology for data replication between heterogeneous databases, its importance to openGauss is self-evident.