An in-depth analysis of the election mechanism in kubernetes
2022-06-29 22:38:00 【Hermokrates】
Overview
In Kubernetes, the kube-controller-manager, the kube-scheduler, and controller-runtime (the library underlying Operators, itself built on client-go) all support leader election in high-availability setups. This article works through how the leader election used by controller-runtime is implemented, down to the client-go code that Kubernetes controllers rely on.
Background
When running kube-controller-manager, several flags configure its leader election; refer to the official documentation for the full parameter reference.
--leader-elect Default: true
--leader-elect-renew-deadline duration Default: 10s
--leader-elect-resource-lock string Default: "leases"
--leader-elect-resource-name string Default: "kube-controller-manager"
--leader-elect-resource-namespace string Default: "kube-system"
--leader-elect-retry-period duration Default: 2s
...
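If you want to confirm which values your own control plane runs with, the flags appear on the kube-controller-manager command line; on a kubeadm-provisioned cluster the static pod manifest is a convenient place to look (the path below assumes the kubeadm layout, and the output will vary by cluster):
$ grep leader-elect /etc/kubernetes/manifests/kube-controller-manager.yaml
    - --leader-elect=true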
I had assumed these components ran their election through etcd, but while studying controller-runtime I found that it exposes no etcd-related configuration at all, which made me curious about the election mechanism. Searching for material on Kubernetes elections led to the official article Simple leader election with Kubernetes; what follows is a plain-language summary of it.
From that article we learn that the Kubernetes API itself provides an election mechanism: any container running in the cluster can implement leader election.
The Kubernetes API enables elections through two properties:
- ResourceVersion: every API object has a unique ResourceVersion
- Annotations: every API object can be annotated with arbitrary keys
Note: elections held this way increase the load on the APIServer, which in turn affects etcd.
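To make the ResourceVersion point concrete, here is a minimal sketch (not the client-go implementation) of the optimistic-concurrency pattern an election is built on; the ConfigMap name my-election-lock and the annotation key example.io/holder are made up for illustration:
package election

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func tryClaim(ctx context.Context, cs kubernetes.Interface, id string) (bool, error) {
	// Read the lock object; the returned copy carries its current ResourceVersion
	cm, err := cs.CoreV1().ConfigMaps("kube-system").Get(ctx, "my-election-lock", metav1.GetOptions{})
	if err != nil {
		return false, err
	}
	if cm.Annotations == nil {
		cm.Annotations = map[string]string{}
	}
	cm.Annotations["example.io/holder"] = id
	// The Update request carries the ResourceVersion we read; if another candidate
	// modified the object in between, the APIServer rejects the write with a Conflict
	if _, err = cs.CoreV1().ConfigMaps("kube-system").Update(ctx, cm, metav1.UpdateOptions{}); err != nil {
		if apierrors.IsConflict(err) {
			return false, nil // someone else won the race; back off and retry
		}
		return false, err
	}
	return true, nil
}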
With that background, let's look at who the cm leader is in a Kubernetes cluster (the cluster used here has a single node, so that node is the leader).
In Kubernetes, every service with leader election enabled creates an Endpoints object, and that Endpoints carries the annotation mentioned above to identify who the leader is.
$ kubectl get ep -n kube-system
NAME ENDPOINTS AGE
kube-controller-manager <none> 3d4h
kube-dns 3d4h
kube-scheduler <none> 3d4h
Taking kube-controller-manager as an example, let's see what information this Endpoints holds:
$ kubectl describe ep kube-controller-manager -n kube-system
Name: kube-controller-manager
Namespace: kube-system
Labels: <none>
Annotations: control-plane.alpha.kubernetes.io/leader:
{
"holderIdentity":"master-machine_06730140-a503-487d-850b-1fe1619f1fe1","leaseDurationSeconds":15,"acquireTime":"2022-06-27T15:30:46Z","re...
Subsets:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal LeaderElection 2d22h kube-controller-manager master-machine_76aabcb5-49ff-45ff-bd18-4afa61fbc5af became leader
Normal LeaderElection 9m kube-controller-manager master-machine_06730140-a503-487d-850b-1fe1619f1fe1 became leader
As you can see, the annotation control-plane.alpha.kubernetes.io/leader marks which node is the leader.
election in controller-runtime
The leader election part of controller-runtime lives under pkg/leaderelection, about 100 lines of code in total; let's see what it does.
As you can see, it only provides options for creating a resource lock:
type Options struct {
	// LeaderElection determines whether to run an election when the manager starts
	LeaderElection bool
	// Which resource lock to use; the default is a Lease
	LeaderElectionResourceLock string
	// The namespace in which the election takes place
	LeaderElectionNamespace string
	// Determines the name of the lock resource held by the leader
	LeaderElectionID string
}
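For context, a sketch of how these options are typically set when building a manager with controller-runtime; the election ID my-operator-lock is an arbitrary example value:
package main

import (
	"os"

	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// LeaderElectionID becomes the name of the lock resource (a Lease by default);
	// only the manager instance holding the lock runs its controllers
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		LeaderElection:             true,
		LeaderElectionResourceLock: "leases",
		LeaderElectionNamespace:    "kube-system",
		LeaderElectionID:           "my-operator-lock",
	})
	if err != nil {
		os.Exit(1)
	}
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		os.Exit(1)
	}
}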
From NewResourceLock you can see that this path leads into client-go/tools/leaderelection, and that package ships an example showing how to use it.
From the example you can see that the entry point of an election is the RunOrDie() function:
// Create a Lease lock; per the upstream comment, Leases are preferred because fewer components in the cluster watch them
lock := &resourcelock.LeaseLock{
LeaseMeta: metav1.ObjectMeta{
Name: leaseLockName,
Namespace: leaseLockNamespace,
},
Client: client.CoordinationV1(),
LockConfig: resourcelock.ResourceLockConfig{
Identity: id,
},
}
// Start the election cycle
leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
Lock: lock,
// Ensure that everything tied to your lease terminates before cancel() is invoked; otherwise a loop would still be running
ReleaseOnCancel: true,
LeaseDuration: 60 * time.Second,
RenewDeadline: 15 * time.Second,
RetryPeriod: 5 * time.Second,
Callbacks: leaderelection.LeaderCallbacks{
OnStartedLeading: func(ctx context.Context) {
// Fill in your code here: this is usually where the controller's
// main work runs, and it runs only while we hold the lease
run(ctx)
},
OnStoppedLeading: func() {
// we lost the lease: clean up resources here
klog.Infof("leader lost: %s", id)
os.Exit(0)
},
OnNewLeader: func(identity string) {
// we're notified when new leader elected
if identity == id {
// I just got the lock
return
}
klog.Infof("new leader elected: %s", identity)
},
},
})
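For completeness, a sketch of how the variables used above (client, id, leaseLockName, leaseLockNamespace) are typically wired up, assuming the imports k8s.io/client-go/rest, k8s.io/client-go/kubernetes, k8s.io/apimachinery/pkg/util/uuid and k8s.io/klog/v2; in client-go's own example these come from command-line flags, and the lock name and namespace below are illustrative values:
cfg, err := rest.InClusterConfig() // or clientcmd when running outside the cluster
if err != nil {
	klog.Fatal(err)
}
client := kubernetes.NewForConfigOrDie(cfg)

// identity convention hostname_uuid, matching the holderIdentity seen earlier
hostname, _ := os.Hostname()
id := hostname + "_" + string(uuid.NewUUID())
leaseLockName := "example-lock"
leaseLockNamespace := "default"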
At this point we understand the concept of a lock and how one is started; next, let's look at which locks client-go provides.
In the code, tools/leaderelection/resourcelock/interface.go defines the lock abstraction: an Interface that provides a generic API for the resources used as locks in leader elections.
type Interface interface {
	// Get returns the current election record
	Get(ctx context.Context) (*LeaderElectionRecord, []byte, error)
	// Create attempts to create a LeaderElectionRecord
	Create(ctx context.Context, ler LeaderElectionRecord) error
	// Update will update an existing LeaderElectionRecord
	Update(ctx context.Context, ler LeaderElectionRecord) error
	// RecordEvent is used to record events
	RecordEvent(string)
	// Identity returns the identity of the lock holder
	Identity() string
	// Describe is used to convert details on current resource lock into a string
	Describe() string
}
The implementations of this abstract interface are the concrete resource locks. As you can see, client-go provides four kinds (a constructor sketch follows the list):
- leaselock
- configmaplock
- multilock
- endpointlock
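Instead of constructing one of these structs by hand as in the example above, client-go also provides a resourcelock.New constructor that picks the implementation from a lock-type string; a sketch, with client and id as defined earlier and the lock name an illustrative value:
lock, err := resourcelock.New(
	resourcelock.LeasesResourceLock, // "leases"; other constants cover configmaps, endpoints and multilocks
	"kube-system",                   // namespace of the lock resource
	"example-lock",                  // name of the lock resource (illustrative)
	client.CoreV1(),
	client.CoordinationV1(),
	resourcelock.ResourceLockConfig{Identity: id},
)
if err != nil {
	klog.Fatal(err)
}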
leaselock
Lease is a resource in the Kubernetes control plane, persisted through etcd, that provides a control mechanism for distributed leases. For the API details see: Lease.
In a Kubernetes cluster, we can inspect the corresponding leases with the following commands:
$ kubectl get leases -A
NAMESPACE NAME HOLDER AGE
kube-node-lease master-machine master-machine 3d19h
kube-system kube-controller-manager master-machine_06730140-a503-487d-850b-1fe1619f1fe1 3d19h
kube-system kube-scheduler master-machine_1724e2d9-c19c-48d7-ae47-ee4217b27073 3d19h
$ kubectl describe leases kube-controller-manager -n kube-system
Name: kube-controller-manager
Namespace: kube-system
Labels: <none>
Annotations: <none>
API Version: coordination.k8s.io/v1
Kind: Lease
Metadata:
Creation Timestamp: 2022-06-24T11:01:51Z
Managed Fields:
API Version: coordination.k8s.io/v1
Fields Type: FieldsV1
fieldsV1:
f:spec:
f:acquireTime:
f:holderIdentity:
f:leaseDurationSeconds:
f:leaseTransitions:
f:renewTime:
Manager: kube-controller-manager
Operation: Update
Time: 2022-06-24T11:01:51Z
Resource Version: 56012
Self Link: /apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager
UID: 851a32d2-25dc-49b6-a3f7-7a76f152f071
Spec:
Acquire Time: 2022-06-27T15:30:46.000000Z
Holder Identity: master-machine_06730140-a503-487d-850b-1fe1619f1fe1
Lease Duration Seconds: 15
Lease Transitions: 2
Renew Time: 2022-06-28T06:09:26.837773Z
Events: <none>
Let's look at the implementation of LeaseLock, which realizes the resource-lock abstraction:
type LeaseLock struct {
	// LeaseMeta is metadata just like on any other resource type: the name,
	// namespace and other attributes of the Lease
	LeaseMeta metav1.ObjectMeta
	// Client is a typed client for the coordination.k8s.io Leases API
	Client coordinationv1client.LeasesGetter
	// LockConfig holds the Identity seen in the describe output above,
	// plus an event recorder used to record changes to the resource lock
	LockConfig ResourceLockConfig
	// lease is the Lease resource from the API; see the API output shown above
	lease *coordinationv1.Lease
}
Let's look at the methods LeaseLock implements.
Get
Get returns the election record extracted from the Lease's spec:
func (ll *LeaseLock) Get(ctx context.Context) (*LeaderElectionRecord, []byte, error) {
var err error
ll.lease, err = ll.Client.Leases(ll.LeaseMeta.Namespace).Get(ctx, ll.LeaseMeta.Name, metav1.GetOptions{
})
if err != nil {
return nil, nil, err
}
record := LeaseSpecToLeaderElectionRecord(&ll.lease.Spec)
recordByte, err := json.Marshal(*record)
if err != nil {
return nil, nil, err
}
return record, recordByte, nil
}
// As you can see, the record is built from the values filled into the resource's spec
func LeaseSpecToLeaderElectionRecord(spec *coordinationv1.LeaseSpec) *LeaderElectionRecord {
var r LeaderElectionRecord
if spec.HolderIdentity != nil {
r.HolderIdentity = *spec.HolderIdentity
}
if spec.LeaseDurationSeconds != nil {
r.LeaseDurationSeconds = int(*spec.LeaseDurationSeconds)
}
if spec.LeaseTransitions != nil {
r.LeaderTransitions = int(*spec.LeaseTransitions)
}
if spec.AcquireTime != nil {
r.AcquireTime = metav1.Time{
spec.AcquireTime.Time}
}
if spec.RenewTime != nil {
r.RenewTime = metav1.Time{
spec.RenewTime.Time}
}
return &r
}
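For reference, the inverse helper LeaderElectionRecordToLeaseSpec, used by Create and Update below, performs the opposite conversion; this is sketched from the same file, so check the client-go source for the exact version:
func LeaderElectionRecordToLeaseSpec(ler *LeaderElectionRecord) coordinationv1.LeaseSpec {
	leaseDurationSeconds := int32(ler.LeaseDurationSeconds)
	leaseTransitions := int32(ler.LeaderTransitions)
	return coordinationv1.LeaseSpec{
		HolderIdentity:       &ler.HolderIdentity,
		LeaseDurationSeconds: &leaseDurationSeconds,
		AcquireTime:          &metav1.MicroTime{Time: ler.AcquireTime.Time},
		RenewTime:            &metav1.MicroTime{Time: ler.RenewTime.Time},
		LeaseTransitions:     &leaseTransitions,
	}
}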
Create
Create attempts to create a Lease in the Kubernetes cluster. As you can see, Client is the REST client for the corresponding API resource, so the effect is that this Lease object is created in the cluster:
func (ll *LeaseLock) Create(ctx context.Context, ler LeaderElectionRecord) error {
var err error
ll.lease, err = ll.Client.Leases(ll.LeaseMeta.Namespace).Create(ctx, &coordinationv1.Lease{
ObjectMeta: metav1.ObjectMeta{
Name: ll.LeaseMeta.Name,
Namespace: ll.LeaseMeta.Namespace,
},
Spec: LeaderElectionRecordToLeaseSpec(&ler),
}, metav1.CreateOptions{
})
return err
}
Update
Update updates the Lease's spec:
func (ll *LeaseLock) Update(ctx context.Context, ler LeaderElectionRecord) error {
if ll.lease == nil {
return errors.New("lease not initialized, call get or create first")
}
ll.lease.Spec = LeaderElectionRecordToLeaseSpec(&ler)
lease, err := ll.Client.Leases(ll.LeaseMeta.Namespace).Update(ctx, ll.lease, metav1.UpdateOptions{
})
if err != nil {
return err
}
ll.lease = lease
return nil
}
RecordEvent
RecordEvent records events that occur during the election. Recall the became leader events we saw when describing the Endpoints in the cluster earlier; this is where such an event is built and attached to the object's metadata:
func (ll *LeaseLock) RecordEvent(s string) {
if ll.LockConfig.EventRecorder == nil {
return
}
events := fmt.Sprintf("%v %v", ll.LockConfig.Identity, s)
subject := &coordinationv1.Lease{ObjectMeta: ll.lease.ObjectMeta}
// Populate the type meta, so we don't have to get it from the schema
subject.Kind = "Lease"
subject.APIVersion = coordinationv1.SchemeGroupVersion.String()
ll.LockConfig.EventRecorder.Eventf(subject, corev1.EventTypeNormal, "LeaderElection", events)
}
At this point we have a general idea of what a resource lock is. The other resource-lock types are implemented in the same fashion, so I won't repeat them here; let's move on to the election process itself.
election workflow
The entry point of the election code is in leaderelection.go; here we continue from the example above and analyze the whole election flow.
As we saw earlier, the entry point of the election is the RunOrDie() function, so let's continue from there. RunOrDie is only a few lines: it builds an election client from the provided configuration, then blocks until ctx is cancelled or it stops holding the leader lease.
func RunOrDie(ctx context.Context, lec LeaderElectionConfig) {
le, err := NewLeaderElector(lec)
if err != nil {
panic(err)
}
if lec.WatchDog != nil {
lec.WatchDog.SetLeaderElection(le)
}
le.Run(ctx)
}
Next, what does NewLeaderElector do? LeaderElector is a struct, and this function merely validates the configuration and creates it; the struct holds everything needed during the election (LeaderElector is the election client that RunOrDie creates). Note the timing invariants being checked: with kube-controller-manager's defaults (LeaseDuration 15s, RenewDeadline 10s, RetryPeriod 2s, JitterFactor 1.2) both pass, since 15s > 10s and 10s > 2s × 1.2 = 2.4s.
func NewLeaderElector(lec LeaderElectionConfig) (*LeaderElector, error) {
if lec.LeaseDuration <= lec.RenewDeadline {
return nil, fmt.Errorf("leaseDuration must be greater than renewDeadline")
}
if lec.RenewDeadline <= time.Duration(JitterFactor*float64(lec.RetryPeriod)) {
return nil, fmt.Errorf("renewDeadline must be greater than retryPeriod*JitterFactor")
}
if lec.LeaseDuration < 1 {
return nil, fmt.Errorf("leaseDuration must be greater than zero")
}
if lec.RenewDeadline < 1 {
return nil, fmt.Errorf("renewDeadline must be greater than zero")
}
if lec.RetryPeriod < 1 {
return nil, fmt.Errorf("retryPeriod must be greater than zero")
}
if lec.Callbacks.OnStartedLeading == nil {
return nil, fmt.Errorf("OnStartedLeading callback must not be nil")
}
if lec.Callbacks.OnStoppedLeading == nil {
return nil, fmt.Errorf("OnStoppedLeading callback must not be nil")
}
if lec.Lock == nil {
return nil, fmt.Errorf("Lock must not be nil.")
}
le := LeaderElector{
config: lec,
clock: clock.RealClock{
},
metrics: globalMetricsFactory.newLeaderMetrics(),
}
le.metrics.leaderOff(le.config.Name)
return &le, nil
}
LeaderElector is the election client that was just created:
type LeaderElector struct {
	// config holds this election's configuration: timing parameters, health checks, etc.
	config LeaderElectionConfig
	// record-related fields: the last record observed on the lock,
	// its raw bytes, and when it was observed
	observedRecord    rl.LeaderElectionRecord
	observedRawRecord []byte
	observedTime      time.Time
	// used to implement OnNewLeader(), may lag slightly from the
	// value observedRecord.HolderIdentity if the transition has
	// not yet been reported.
	reportedLeader string
	// clock is wrapper around time to allow for less flaky testing
	clock clock.Clock
	// observedRecordLock guards access to observedRecord
	observedRecordLock sync.Mutex
	metrics leaderMetricsAdapter
}
You can see that Run implements the election logic around the three callbacks passed in when the client was initialized:
func (le *LeaderElector) Run(ctx context.Context) {
defer runtime.HandleCrash()
defer func() {
// On exit, invoke the OnStoppedLeading callback
le.config.Callbacks.OnStoppedLeading()
}()
if !le.acquire(ctx) {
return
}
ctx, cancel := context.WithCancel(ctx)
defer cancel()
go le.config.Callbacks.OnStartedLeading(ctx) // once elected, run OnStartedLeading
le.renew(ctx)
}
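Run's last step, renew, is not shown above; condensed from the client-go source (consult the file for the exact code, klog and metrics calls are omitted here), its loop keeps calling tryAcquireOrRenew within the RenewDeadline and cancels the context once a renewal fails:
func (le *LeaderElector) renew(ctx context.Context) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()
	wait.Until(func() {
		// Each renewal attempt gets at most RenewDeadline to succeed
		timeoutCtx, timeoutCancel := context.WithTimeout(ctx, le.config.RenewDeadline)
		defer timeoutCancel()
		err := wait.PollImmediateUntil(le.config.RetryPeriod, func() (bool, error) {
			return le.tryAcquireOrRenew(timeoutCtx), nil
		}, timeoutCtx.Done())
		le.maybeReportTransition()
		if err == nil {
			return // renewed successfully; go around again after RetryPeriod
		}
		// Renewal failed within the deadline: leadership is lost, stop the loop
		cancel()
	}, le.config.RetryPeriod, ctx.Done())

	// If configured, release the lease on the way out so others can take over sooner
	if le.config.ReleaseOnCancel {
		le.release()
	}
}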
Run calls acquire, which loops calling tryAcquireOrRenew until ctx delivers a termination signal:
func (le *LeaderElector) acquire(ctx context.Context) bool {
ctx, cancel := context.WithCancel(ctx)
defer cancel()
succeeded := false
desc := le.config.Lock.Describe()
klog.Infof("attempting to acquire leader lease %v...", desc)
// wait.JitterUntil runs the given func() periodically:
// RetryPeriod is the interval between runs
// JitterFactor is the jitter coefficient, similar to the factor in a delay queue (duration + maxFactor * duration)
// sliding controls whether the func's own execution time counts toward the period
// ctx.Done() propagates cancellation
wait.JitterUntil(func() {
succeeded = le.tryAcquireOrRenew(ctx)
le.maybeReportTransition()
if !succeeded {
klog.V(4).Infof("failed to acquire lease %v", desc)
return
}
le.config.Lock.RecordEvent("became leader")
le.metrics.leaderOn(le.config.Name)
klog.Infof("successfully acquired lease %v", desc)
cancel()
}, le.config.RetryPeriod, JitterFactor, true, ctx.Done())
return succeeded
}
The actual election work happens in tryAcquireOrRenew, so let's look at it. tryAcquireOrRenew tries to acquire a leader lease; if the lease is already held by us it renews it. It returns true if it ends up holding the lease, and false otherwise:
func (le *LeaderElector) tryAcquireOrRenew(ctx context.Context) bool {
	now := metav1.Now() // current time
	// Build an election record
	leaderElectionRecord := rl.LeaderElectionRecord{
		HolderIdentity:       le.config.Lock.Identity(),                  // the candidate's identity; for ep locks it is derived from the hostname
		LeaseDurationSeconds: int(le.config.LeaseDuration / time.Second), // defaults to 15s
		RenewTime:            now,                                        // renewal time
		AcquireTime:          now,                                        // acquisition time
	}
	// 1. Get or create the record from the API: if it can be fetched, a lease
	// already exists; otherwise a new lease is created
	oldLeaderElectionRecord, oldLeaderElectionRawRecord, err := le.config.Lock.Get(ctx)
	if err != nil {
		if !errors.IsNotFound(err) {
			klog.Errorf("error retrieving resource lock %v: %v", le.config.Lock.Describe(), err)
			return false
		}
		// Creating the lease means creating the corresponding resource. The Lock is one of
		// the four lock types provided by leaderelection, depending on what was passed to RunOrDie
		if err = le.config.Lock.Create(ctx, leaderElectionRecord); err != nil {
			klog.Errorf("error initially creating leader election record: %v", err)
			return false
		}
		// The lease has been obtained or created; record its attributes as the observed LeaderElectionRecord
		le.setObservedRecord(&leaderElectionRecord)
		return true
	}
	// 2. Compare the fetched record against the observed one, checking identity and time
	if !bytes.Equal(le.observedRawRecord, oldLeaderElectionRawRecord) {
		le.setObservedRecord(oldLeaderElectionRecord)
		le.observedRawRecord = oldLeaderElectionRawRecord
	}
	if len(oldLeaderElectionRecord.HolderIdentity) > 0 &&
		le.observedTime.Add(le.config.LeaseDuration).After(now.Time) &&
		!le.IsLeader() {
		// We are not the leader: another HolderIdentity holds the lock and its
		// lease has not yet expired, so bail out here
		klog.V(4).Infof("lock is held by %v and has not yet expired", oldLeaderElectionRecord.HolderIdentity)
		return false
	}
	// 3. We are going to try to update. The leaderElectionRecord was set to defaults above; correct it before updating.
	if le.IsLeader() {
		// We are already the leader: keep the original acquire time and transition count
		leaderElectionRecord.AcquireTime = oldLeaderElectionRecord.AcquireTime
		leaderElectionRecord.LeaderTransitions = oldLeaderElectionRecord.LeaderTransitions
	} else {
		// LeaderTransitions counts how many times leadership has changed hands;
		// since the lock is changing holders here, increment it by 1
		leaderElectionRecord.LeaderTransitions = oldLeaderElectionRecord.LeaderTransitions + 1
	}
	// Update the lock resource in the APIServer, i.e. update the attributes of the corresponding resource
	if err = le.config.Lock.Update(ctx, leaderElectionRecord); err != nil {
		klog.Errorf("Failed to update lock: %v", err)
		return false
	}
	// setObservedRecord updates the observed lock record with the new record;
	// the operation is safe because a mutex ensures only one goroutine enters the critical section
	le.setObservedRecord(&leaderElectionRecord)
	return true
}
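The setObservedRecord called above is the mutex-guarded setter mentioned in the comments; for reference, from the same file:
func (le *LeaderElector) setObservedRecord(observedRecord *rl.LeaderElectionRecord) {
	le.observedRecordLock.Lock()
	defer le.observedRecordLock.Unlock()

	le.observedRecord = *observedRecord
	le.observedTime = le.clock.Now()
}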
summary
At this point we have a full picture of how elections in Kubernetes are carried out. To briefly review, the leader election above proceeds in these steps:
- A service that wants a leader first creates a resource lock; the lock can be a lease, an endpoint, or another resource type.
- The instance that is already the leader keeps renewing its lease; the default lease duration is 15 seconds (leaseDuration), and the leader updates the renewal time (renewTime) before the lease expires.
- The other followers keep checking the resource lock. If a leader already exists, they check renewTime; once the lease has expired (renewTime plus leaseDuration is in the past), the leader is assumed to be in trouble and a new election is needed, until some follower is promoted to leader.
- To avoid the resource being preempted concurrently, the Kubernetes API uses ResourceVersion to prevent conflicting modifications (if the version number does not match the one on the server, the object has been modified in the meantime, and the APIServer returns an error).
Reference
Kubernetes: implementation principles of concurrency control and data consistency