Service mesh is still hard - CNCF
2020-11-09 00:40:00 【On jdon】
Service mesh is more mature than it was a year or two ago, but it is still hard for users. Service mesh involves two technical roles: platform owners and service owners. Platform owners (also called mesh administrators) own the service platform and define the overall strategy and implementation by which service owners adopt the mesh. Service owners own one or more services in the mesh.
For platform owners, using a service mesh has become easier, because the projects are working to simplify network configuration, security policy configuration, and visualization of the entire mesh. In Istio, for example, platform owners can set whatever Istio authentication or authorization policies they prefer. Platform owners can configure the gateway's host, port, and TLS settings while delegating the actual routing behavior and traffic policies of the target service to service owners. Service owners who implement well-tested, common scenarios benefit from Istio's usability improvements and can onboard their microservices to the mesh easily.
Service owners who implement less common scenarios, however, still face a steep learning curve.
I think service mesh is still hard, for the following reasons:
1. Lack of clear guidance on whether you need a service mesh
Before users start evaluating multiple service meshes or digging into a particular one, they need guidance on whether a service mesh can help at all. Unfortunately, this is not a simple yes/no question. There are many factors to consider:
- How many people are in your engineering organization?
- How many microservices do you have?
- What languages are these microservices written in?
- Do you have experience adopting open source projects?
- What platforms do you run your services on?
- What features do you need from the service mesh?
- Are those features stable in a given service mesh project?
To add to the complexity, the answers differ from one service mesh project to another. Even within Istio, we adopted microservices to take full advantage of the mesh in releases before Istio 1.5, but later decided to merge multiple Istio control plane components into a monolithic application to reduce operational complexity. In that case, it made more sense to run one monolithic service than four or five microservices.
2. Your service may break immediately after a sidecar is injected
Last Thanksgiving, I was trying to help a user run a Zookeeper service in the mesh using the latest Zookeeper Helm chart. Zookeeper runs as a Kubernetes StatefulSet. As soon as I tried to inject the Envoy sidecar proxy into each Zookeeper pod, the Zookeeper pods would not run and kept restarting, because they could not elect a leader or communicate among members.
By default, Zookeeper listens on the pod IP address for communication between servers. However, Istio and other service meshes require localhost (127.0.0.1) to be the listening address, which left the Zookeeper servers unable to communicate with each other.
Working with the upstream community, we added a configuration workaround for Zookeeper, Cassandra, Elasticsearch, Redis, and Apache NiFi. I am sure there are other applications that are not compatible with sidecars.
3. Your service may behave oddly at start or stop time
Kubernetes lacks a standard way to declare container dependencies. There is a Sidecar Kubernetes Enhancement Proposal (KEP), but it has not yet been implemented in a Kubernetes release, and it will take time for the feature to stabilize. In the meantime, service owners may observe unexpected behavior at start or stop time.
To help mitigate this problem, Istio has implemented a global configuration option that lets platform owners delay application startup until the sidecar is ready. Istio will soon allow service owners to configure this at the pod level as well.
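Where that option is not available, an application entrypoint can approximate the same ordering by polling the sidecar before doing any network work. The following is a minimal sketch, not Istio's implementation: the readiness URL (port 15021, path `/healthz/ready`) is an assumption about the sidecar in use, and the function names are hypothetical.

```python
import time
import urllib.request
from typing import Callable


def sidecar_ready(url: str = "http://127.0.0.1:15021/healthz/ready") -> bool:
    """Return True if the (assumed) sidecar readiness endpoint answers 200."""
    try:
        with urllib.request.urlopen(url, timeout=1) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, timeout, or HTTP error: treat as "not ready yet".
        return False


def wait_for_sidecar(probe: Callable[[], bool] = sidecar_ready,
                     timeout_s: float = 30.0,
                     interval_s: float = 0.5) -> bool:
    """Poll probe() until it succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

An entrypoint would call `wait_for_sidecar()` first and exit with an error if it returns `False`, so that Kubernetes restarts the container instead of letting it run without a working proxy.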
4. Zero configuration for your service is possible, but zero code change is not
One of the main goals of service mesh projects is to enable zero configuration for service owners. Some projects, such as Istio, have added intelligent protocol detection to help detect protocols and ease onboarding to the mesh. Even so, we still recommend that users declare the protocol explicitly in production. With the addition of the appProtocol setting in Kubernetes, service owners now have a standard way to configure the application protocol for Kubernetes services running on newer Kubernetes versions (for example, 1.19).
Unfortunately, zero code change is not possible if you want to take full advantage of the service mesh.
- For service owners and platform owners to observe service traces correctly, it is critical to propagate the trace headers between services.
- To avoid confusion and unexpected behavior, it is important to re-examine the retries and timeouts in the service code, to see whether they should be adjusted and to understand how they interact with the retries and timeouts configured on the sidecar proxy.
- For the sidecar proxy to inspect the traffic sent from the application container and use its contents to make decisions, such as request-based routing or header-based authorization, service owners must make sure the source service sends plain-text traffic to the target service and trust the sidecar proxy to upgrade the connection securely.
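The first point in the list above is the most common code change. A minimal sketch of it follows, assuming B3-style and W3C Trace Context header names. The exact set depends on the tracing backend configured for the mesh, and `trace_headers_to_forward` is a hypothetical helper, not part of any library.

```python
from typing import Dict, Mapping

# Header names assumed for illustration; check the tracing backend in use.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "traceparent",
    "tracestate",
)


def trace_headers_to_forward(incoming: Mapping[str, str]) -> Dict[str, str]:
    """Pick the trace headers out of an incoming request's headers.

    Header names are matched case-insensitively and returned in lowercase.
    """
    lowered = {name.lower(): value for name, value in incoming.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}
```

A service would merge the returned dictionary into the headers of every outbound call it makes while handling that request, so the sidecars can stitch the spans into a single trace.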
5. Service owners need to understand the nuances of client-side and server-side configuration
Before working on service mesh, I had no idea there were so many settings related to timeouts and retries in the Envoy proxy. Most users are familiar with request timeouts, idle timeouts, and the number of retries, but there are many nuances and complexities:
- For idle timeouts, there is an idle_timeout under the HTTP protocol options that applies to both the HTTP connection manager and upstream cluster HTTP connections. There is a stream_idle_timeout for a stream with no upstream or downstream activity, and even a route-level idle_timeout that can override the stream_idle_timeout.
- Automatic retries are also complicated. The retries setting is not simply the number of retries but the maximum number of retries allowed, which may not be the actual number of retries. The actual number depends on the retry conditions, the route request timeout, and the intervals between retries, all of which must fit within the overall request timeout and the retry budget.
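To see why the configured maximum and the actual retry count can differ, consider a deliberately simplified model. This is a sketch under stated assumptions, not Envoy's algorithm: every try is assumed to run into its per-try timeout, the interval between tries is fixed (real proxies typically use jittered backoff), and retry budgets are ignored. The function and parameter names are hypothetical.

```python
def attempts_within_timeout(max_retries: int,
                            request_timeout_ms: int,
                            per_try_timeout_ms: int,
                            retry_interval_ms: int) -> int:
    """Count the tries (first try plus retries) that can start in time.

    Assumes each try times out and is followed by a fixed wait.
    """
    if request_timeout_ms <= 0 or per_try_timeout_ms <= 0:
        raise ValueError("timeouts must be positive")
    attempts = 0
    elapsed_ms = 0
    while attempts <= max_retries:
        if elapsed_ms >= request_timeout_ms:
            break  # the overall request timeout has already expired
        attempts += 1
        elapsed_ms += per_try_timeout_ms + retry_interval_ms
    return attempts


# Up to 3 retries are allowed, but a 1 s overall timeout only leaves room
# for the first try and one retry when each cycle takes 500 ms.
print(attempts_within_timeout(3, 1000, 250, 250))  # 2
```

With a 10-second overall timeout, the same settings allow all four tries, which is the case the configured value alone suggests.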
Without a service mesh, there is only 1 connection pool between the source container and the target container, but with a service mesh there are 3 connection pools:
- Source container to source sidecar proxy
- Source sidecar proxy to target sidecar proxy
- Target sidecar proxy to target container
Each of these connection pools has its own separate configuration. Karl Stoney's blog describes these problems well: it explains the complexity, how any of the three pools can go wrong, and how to fix them.
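One simplified way to reason about the three pools, for a single source and target pair, is that the hop with the smallest limit caps end-to-end concurrency. The sketch below only illustrates that idea. The hop names and limits are hypothetical values, not defaults of any mesh.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PoolLimit:
    hop: str
    max_connections: int


def tightest_pool(pools: Iterable[PoolLimit]) -> PoolLimit:
    """Return the hop whose connection limit is reached first."""
    return min(pools, key=lambda pool: pool.max_connections)


# Hypothetical limits for the three hops listed above.
pools = [
    PoolLimit("source container -> source sidecar", 100),
    PoolLimit("source sidecar -> target sidecar", 20),
    PoolLimit("target sidecar -> target container", 50),
]
print(tightest_pool(pools).hop)  # source sidecar -> target sidecar
```

Raising the limit in the application's own HTTP client has no effect in this example until the sidecar-to-sidecar limit is raised as well.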
Copyright notice
This article was written by [On jdon]. Please include a link to the original when reposting. Thank you.