Introduction to the asynchronous task capability of Function Compute: task trigger deduplication
2022-07-04 14:47:00 【Alibaba cloud native】
author : Gradually meaning
Preface
In both big data processing and message processing, a task system needs one critical capability: task trigger deduplication. This capability is essential in scenarios that demand high accuracy, such as finance. As a serverless task processing platform, Serverless Task must provide the same guarantee, with exactly-once trigger semantics at two levels: the user application and the platform itself. This article focuses on message processing reliability. It describes the technical details behind the asynchronous task feature of Function Compute and shows how to use this capability in real applications to make task execution more reliable.
About task deduplication
Any discussion of asynchronous message processing systems has to cover the basic semantics of message processing. In an asynchronous message processing system (a task system), the simplified flow of a message is shown in the following figure:
Figure 1
The user submits a task, the task enters the queue, the task processing unit listens for and fetches the message, and the message is scheduled to an actual worker for execution.
As a task message flows through the system, a failure such as downtime in any component (link) can cause the message to be delivered incorrectly. Task systems generally offer up to three levels of message processing semantics:
At-Most-Once: Ensures that the message is delivered at most once. Messages may be lost when a network partition occurs or a system component goes down;
At-Least-Once: Ensures that the message is delivered at least once. The messaging link supports retry on error, and retransmission guarantees that the downstream receives the upstream message. During downtime or a network partition, however, the same message may be delivered more than once;
Exactly-Once: Ensures that the message takes effect exactly once. Exactly once does not mean that nothing is retransmitted during downtime or a network partition. It means that retransmission does not change the state of the receiver, so the result is the same as for a single transmission. In production, Exactly-Once is usually achieved by combining a retransmission mechanism with deduplication (idempotence) at the receiver.
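The combination of retransmission and receiver-side deduplication can be illustrated with a minimal sketch. It is not how Function Compute is implemented; the `Message` and `Receiver` types are hypothetical and only show why a redelivered message leaves the receiver's state unchanged.

```go
package main

import "fmt"

// Message is a hypothetical task message; ID identifies the logical task.
type Message struct {
	ID     string
	Amount int
}

// Receiver applies each message at most once by remembering processed IDs.
type Receiver struct {
	seen    map[string]bool
	Balance int
}

func NewReceiver() *Receiver {
	return &Receiver{seen: make(map[string]bool)}
}

// Handle returns true if the message changed state, false if it was a duplicate.
func (r *Receiver) Handle(m Message) bool {
	if r.seen[m.ID] {
		return false // redelivery: state is left untouched
	}
	r.seen[m.ID] = true
	r.Balance += m.Amount
	return true
}

func main() {
	r := NewReceiver()
	// At-least-once delivery: the sender retransmits "task-1" after a timeout.
	deliveries := []Message{
		{ID: "task-1", Amount: 100},
		{ID: "task-1", Amount: 100},
		{ID: "task-2", Amount: 50},
	}
	for _, m := range deliveries {
		fmt.Println(m.ID, "applied:", r.Handle(m))
	}
	fmt.Println("balance:", r.Balance)
}
```

The sender is free to retransmit as often as it needs to; correctness depends only on the receiver keying its state change on the message ID.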
Function Compute provides Exactly-Once semantics for task distribution: in every case, repeated tasks are recognized by the system as the same trigger, so the task is distributed only once.
Referring to Figure 1, deduplicating tasks requires support on at least two levels:
System-side support: a failover of the task scheduling system itself does not affect the correctness or uniqueness of message delivery;
A user-facing mechanism: users can combine it with their business scenario to deduplicate both triggering and execution across the whole business logic.
Below, we use a simplified Serverless Task system architecture to explain how Function Compute achieves these capabilities.
Background: how Function Compute implements asynchronous task trigger deduplication
The task system architecture of Function Compute is shown in the following figure:
Figure 2
First, the user calls the Function Compute API to submit a task (step 1), which reaches the system's API-Server. After validation, the API-Server sends the message to an internal queue (step 2.1). An asynchronous module in the background monitors the internal queue in real time (step 2.2) and then calls the resource management module to obtain runtime resources (steps 2.2-2.3). Once the runtime resources are available, the scheduling module sends the task data to the VM-level client (step 3.1), and the client forwards the task to the user's actual runtime resources (step 3.2). To provide the two levels of guarantee described above, we need support at the following points:
System-side support: in steps 2.1 - 3.1, a failover of any intermediate process must trigger step 3.2 only once, that is, the task is scheduled to run on the user instance only once;
User-side, application-level deduplication: users may execute step 1 repeatedly, but step 3.2 is triggered only once.
System side: deduplicated task distribution during graceful upgrade and failover
Once a user's message has entered the Function Compute system (that is, step 2.1 is complete), the request receives a response with HTTP status code 202, and the user can consider the task submitted successfully. From the moment the task message enters the MQ, its life cycle is maintained by the Scheduler, so the stability of the Scheduler and of the MQ directly affects the Exactly-Once implementation.
Most open-source messaging systems (such as MQ and Kafka) provide multi-replica message storage and unique-consumption semantics. The message queue used by Function Compute (built on RocketMQ) is no exception: the underlying storage keeps three replicas, so we do not need to worry about the stability of message storage. In addition, the message queue used by Function Compute has the following features:
Unique consumption: once a message in a queue is consumed, it enters "invisible mode". In this mode, the message is not available to other consumers;
The actual consumer of each message must keep extending the message's invisibility period in real time. When the consumer has finished, it deletes the message explicitly.
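The two queue features above can be sketched with a toy in-memory queue. This is an illustration only, not the RocketMQ API: the `Queue` type, its methods, and the logical clock (`Tick`) are all hypothetical names introduced for the example.

```go
package main

import "fmt"

// entry is one queued message plus the logical time at which it becomes visible again.
type entry struct {
	body      string
	visibleAt int
}

// Queue is a toy in-memory queue with a visibility timeout, driven by a logical clock.
type Queue struct {
	now     int
	entries map[string]*entry
}

func NewQueue() *Queue { return &Queue{entries: make(map[string]*entry)} }

// Send enqueues a message that is visible immediately.
func (q *Queue) Send(id, body string) { q.entries[id] = &entry{body: body, visibleAt: q.now} }

// Receive returns the ID of a visible message and hides it for timeout ticks.
func (q *Queue) Receive(timeout int) (string, bool) {
	for id, e := range q.entries {
		if e.visibleAt <= q.now {
			e.visibleAt = q.now + timeout
			return id, true
		}
	}
	return "", false
}

// Extend pushes the invisibility deadline out while the consumer is still working.
func (q *Queue) Extend(id string, timeout int) {
	if e, ok := q.entries[id]; ok {
		e.visibleAt = q.now + timeout
	}
}

// Delete removes the message explicitly once it has been fully processed.
func (q *Queue) Delete(id string) { delete(q.entries, id) }

// Tick advances the logical clock by n.
func (q *Queue) Tick(n int) { q.now += n }

func main() {
	q := NewQueue()
	q.Send("task-1", "payload")

	id, _ := q.Receive(10) // the first consumer takes the message
	_, ok := q.Receive(10)
	fmt.Println("visible to a second consumer:", ok)

	q.Tick(8)
	q.Extend(id, 10) // the consumer is alive and renews the deadline
	q.Tick(8)
	_, ok = q.Receive(10)
	fmt.Println("visible while being renewed:", ok)

	q.Tick(5) // the consumer stops renewing, for example because it crashed
	_, ok = q.Receive(10)
	fmt.Println("visible after the deadline passes:", ok)

	q.Delete(id) // normal completion removes the message for good
}
```

A message therefore has only two ways to leave the hidden state: the consumer deletes it after finishing, or the consumer stops renewing and the message becomes available to another consumer.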
The whole life cycle of a message in the queue is therefore as shown in the following figure:
Figure 3
The Scheduler is mainly responsible for message processing. Its work consists of the following parts:
Listen to the queues assigned to it by the scheduling strategy of the Function Compute load balancing module;
When a message appears in a queue, pull it and maintain state for it in memory: until consumption of the message is complete (the user instance returns the result of the function execution), keep extending the message's invisibility period so that the message does not reappear in the queue;
When the task is complete, delete the message explicitly.
In the queue scheduling model, Function Compute uses a "single queue" management mode for ordinary users: all asynchronous execution requests of a user go into one dedicated queue that is isolated from other users, and one fixed Scheduler is responsible for that queue. This load mapping is managed by the load balancing service of Function Compute, as shown in the figure below (we will cover this part in more detail in a later article):
Figure 4
When Scheduler 1 goes down or is upgraded, a task can be in one of two execution states:
If the message has not yet been delivered to the user's execution instance (steps 3.1 ~ 3.2 in Figure 2), then after another Scheduler takes over the queue that this Scheduler was responsible for, the message reappears once its invisibility period expires. Scheduler 2 then fetches the message again and triggers the task as usual.
If the message has already started executing (step 3.2), then when the message reappears in Scheduler 2, we rely on the Agent in the user's VM for state management. Scheduler 2 sends an execution request to the corresponding Agent. The Agent finds that the message already exists in its memory, ignores the execution request, and reports the execution result to Scheduler 2 over this connection, which completes the failover recovery.
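The second case can be sketched as follows. The `Agent` type here is a hypothetical, heavily simplified stand-in for the VM-level Agent: it runs tasks synchronously, so a task that is still in progress when the second request arrives is not modeled. It only shows how remembering task IDs in memory keeps the user code from running twice.

```go
package main

import "fmt"

// Agent is a toy stand-in for the per-VM agent: it remembers every task it has run.
type Agent struct {
	results map[string]string // task ID -> recorded execution result
	Runs    int               // how many times user code actually ran
}

func NewAgent() *Agent { return &Agent{results: make(map[string]string)} }

// Execute runs the task the first time it sees an ID. Later requests for the same ID
// skip execution and return the recorded result instead.
func (a *Agent) Execute(taskID string, run func() string) string {
	if res, ok := a.results[taskID]; ok {
		return res
	}
	a.Runs++
	res := run()
	a.results[taskID] = res
	return res
}

func main() {
	agent := NewAgent()
	userCode := func() string { return "ok" }

	// Scheduler 1 dispatches the task, then goes down before deleting the message.
	fmt.Println("scheduler 1 got:", agent.Execute("task-1", userCode))

	// The message reappears; Scheduler 2 dispatches the same task to the same Agent.
	fmt.Println("scheduler 2 got:", agent.Execute("task-1", userCode))

	fmt.Println("user code runs:", agent.Runs)
}
```

Scheduler 2 still receives a result it can use to finish the message's life cycle, even though the second request did not execute anything.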
User side: business-level distribution deduplication
Function Compute can consume each message exactly once under a single point of failure. However, if the user side triggers function execution repeatedly for the same business data, Function Compute cannot tell whether different messages are logically the same task. This often happens during network partitions. In Figure 2, if the user's call in step 1 times out, there are two possible situations:
The message did not reach the Function Compute system, and the task was not submitted;
The message reached Function Compute and was queued, so the task was submitted successfully, but because of the timeout the user cannot know that the submission succeeded.
In most cases the user retries the submission. In the second situation, the same task is then submitted and executed more than once. Function Compute therefore needs a mechanism that keeps the business correct in this scenario.
Function Compute provides the concept of a TaskID (StatefulAsyncInvocationID). This ID is globally unique. Users can specify such an ID each time they submit a task. When a request times out, users can retry as many times as needed, and every repeated submission is checked on the Function Compute side. Function Compute stores task metadata in an internal DB. When a request with an existing ID enters the system, the request is rejected with a 400 error. From this the client can determine the submission status of the task.
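The server-side check can be sketched as a lookup keyed on the task ID. The `TaskStore` type below is a hypothetical in-memory stand-in for the metadata DB, not the actual Function Compute implementation. The status codes follow the text: 202 for an accepted task, 400 for an ID that already exists.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// TaskStore is a toy in-memory stand-in for the task metadata database.
type TaskStore struct {
	mu    sync.Mutex
	tasks map[string]bool
}

func NewTaskStore() *TaskStore { return &TaskStore{tasks: make(map[string]bool)} }

// Submit returns 202 for a new task ID and 400 when the ID already exists.
// The check and the insert happen under one lock, so two concurrent
// submissions with the same ID cannot both be accepted.
func (s *TaskStore) Submit(taskID string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.tasks[taskID] {
		return http.StatusBadRequest
	}
	s.tasks[taskID] = true
	return http.StatusAccepted
}

func main() {
	store := NewTaskStore()
	fmt.Println("first submission:", store.Submit("TaskUUID"))
	fmt.Println("retried submission:", store.Submit("TaskUUID"))
	fmt.Println("different task:", store.Submit("AnotherTaskUUID"))
}
```

The important property is that the existence check and the write are atomic; in a real database this role would typically be played by a uniqueness constraint on the ID.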
In practice, taking the Go SDK as an example, you can write the following code to trigger a task:
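import fc "github.com/aliyun/fc-go-sdk"

// SubmitJob submits an asynchronous task with a caller-chosen task ID.
// fcClient is assumed to be an *fc.Client created elsewhere.
func SubmitJob() {
	invokeInput := fc.NewInvokeFunctionInput("ServiceName", "FunctionName")
	// Asynchronous invocation plus a task ID lets Function Compute deduplicate retries.
	invokeInput = invokeInput.WithAsyncInvocation().WithStatefulAsyncInvocationID("TaskUUID")
	invokeOutput, err := fcClient.InvokeFunction(invokeInput)
	...
}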
This submits a task with a unique ID.
Summary
This article described the technical details of task trigger deduplication in Function Compute Serverless Task, which supports scenarios with strict requirements on the accuracy of task execution. With Serverless Task, you do not need to worry about the failover of any system component: each task you submit is executed exactly once. To deduplicate distribution at the level of business semantics, set a globally unique ID for a task when you submit it and let Function Compute deduplicate the task for you.