Service mesh is still hard - CNCF
2020-11-09 00:40:00 【On jdon】
Service mesh is more mature than it was a year or two ago, but it is still hard for users. There are two technical roles around a service mesh: platform owners and service owners. Platform owners (also called mesh administrators) own the service platform and define the overall strategy and implementation by which service owners adopt the mesh. Service owners own one or more services in the mesh.
For platform owners, using a service mesh has become easier, because the projects are working to simplify network configuration, security policy configuration, and visualization of the whole mesh. In Istio, for example, platform owners can set Istio authentication or authorization policies at whatever scope they prefer. They can configure the host, port, and TLS settings on the gateway while delegating the actual routing behavior and traffic policy of the target service to service owners. Service owners who implement well-tested, common scenarios benefit from Istio's usability improvements and can easily onboard their microservices to the mesh.
Service owners outside those common scenarios still face a steep learning curve.
I think service mesh is still hard, and here is why:
1. Lack of clear guidance on whether you need a service mesh
Before users start evaluating multiple service meshes or digging into a specific one, they need guidance on whether a service mesh can help at all. Unfortunately, this is not a simple yes/no question. There are many factors to consider:
- How many people are in your engineering organization?
- How many microservices do you have?
- What languages are these microservices written in?
- Do you have experience adopting open source projects?
- What platforms do you run your services on?
- What features do you need from a service mesh?
- Are those features stable in a given service mesh project?
The answers differ from one service mesh project to another, which adds to the complexity. Even within Istio, we adopted microservices to make full use of the mesh in releases before Istio 1.5, but then decided to turn multiple Istio control plane components into a monolithic application to reduce operational complexity. In that case, it made more sense to run one monolithic service than four or five microservices.
2. Your service may break immediately after a sidecar is injected
Last Thanksgiving, I tried to help a user run a Zookeeper service in the mesh using the latest Zookeeper Helm chart. Zookeeper runs as a Kubernetes StatefulSet. As soon as I injected the Envoy sidecar proxy into each Zookeeper pod, the Zookeeper pods would not run and kept restarting, because they could not elect a leader or communicate among members.
By default, Zookeeper listens on the pod IP address for communication between servers. However, Istio and other service meshes require localhost (127.0.0.1) as the listening address, which left the Zookeeper servers unable to communicate with each other.
Working with the upstream communities, we added configuration workarounds for Zookeeper, Cassandra, Elasticsearch, Redis, and Apache NiFi. I am sure there are other applications that are not compatible with sidecars.
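The listening-address problem above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `accepts_forwarded_traffic`, the sample pod IP, and the idea that the sidecar hands inbound traffic to 127.0.0.1 are taken from the description in this section, not from any Istio or Zookeeper API. The check is deliberately conservative and ignores dual-stack sockets.

```python
import ipaddress


def accepts_forwarded_traffic(listen_address: str, forward_to: str = "127.0.0.1") -> bool:
    """Return True if a server bound to listen_address would accept a
    connection that the sidecar delivers to forward_to (assumed localhost)."""
    bound = ipaddress.ip_address(listen_address)
    target = ipaddress.ip_address(forward_to)
    if bound.version != target.version:
        # Conservative simplification: dual-stack sockets are not modeled.
        return False
    # A wildcard bind (0.0.0.0) accepts traffic on every local address;
    # otherwise the bind address must be exactly the address traffic arrives on.
    return bound.is_unspecified or bound == target


# 10.244.1.17 is a made-up pod IP used purely for illustration.
for address in ("10.244.1.17", "127.0.0.1", "0.0.0.0"):
    print(address, accepts_forwarded_traffic(address))
```

A server bound only to the pod IP fails the check, which matches the restart loop described above. Zookeeper's `quorumListenOnAllIPs` option is one setting that produces a wildcard bind; whether it is the exact workaround the author refers to is not stated in this article.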
3. Your service may behave oddly at start or stop time
Kubernetes lacks a standard way to declare container dependencies. There is a Sidecar Kubernetes Enhancement Proposal (KEP), but it has not yet been implemented in a Kubernetes release, and the feature will take some time to stabilize. In the meantime, service owners may observe unexpected behavior at start or stop time.
To help with this problem, Istio has implemented a global configuration option for platform owners that delays application startup until the sidecar is ready. Istio will soon let service owners configure this at the pod level as well.
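Where the mesh-level option is not available, the same ordering can be approximated from inside the application. The sketch below is not how Istio's option works internally; it is an application-side stand-in. The helper names `wait_until_ready` and `http_probe` are hypothetical, and the readiness URL in the usage example is an assumption that should be checked against the sidecar version in use.

```python
import time
import urllib.request
from typing import Callable


def wait_until_ready(probe: Callable[[], bool],
                     timeout_s: float = 30.0,
                     interval_s: float = 0.5) -> bool:
    """Call probe repeatedly until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def http_probe(url: str) -> Callable[[], bool]:
    """Build a probe that reports True when url answers with HTTP 200."""
    def probe() -> bool:
        try:
            with urllib.request.urlopen(url, timeout=1.0) as response:
                return response.status == 200
        except OSError:
            # Connection refused, timeouts, and HTTP errors all mean "not ready".
            return False
    return probe


if __name__ == "__main__":
    # Assumed sidecar readiness endpoint; verify it for your mesh and version.
    sidecar_ready = http_probe("http://localhost:15021/healthz/ready")
    if not wait_until_ready(sidecar_ready, timeout_s=60.0):
        raise SystemExit("sidecar did not become ready in time")
    print("sidecar ready, starting application")
```

Passing the probe in as a callable keeps the waiting loop independent of how readiness is detected, so it can be exercised without a running proxy.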
4. Zero configuration for your service is possible, but zero code change is not
One of the main goals of service mesh projects is to offer zero configuration to service owners. Some projects, such as Istio, have added intelligent protocol detection to help detect protocols and simplify onboarding to the mesh. Even so, we still recommend that users declare the protocol explicitly in production. With the addition of the appProtocol setting in Kubernetes, service owners now have a standard way to configure the application protocol for Kubernetes services running on newer Kubernetes versions (for example, 1.19).
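As a rough illustration of declaring the protocol explicitly, the sketch below models a Kubernetes Service manifest as a Python dictionary and lists the ports that do not set `appProtocol`. The service name, port names, and port numbers are invented for the example, and the helper `ports_missing_app_protocol` is hypothetical rather than part of any Kubernetes client library.

```python
def ports_missing_app_protocol(service: dict) -> list:
    """Return the names of Service ports that do not declare appProtocol."""
    ports = service.get("spec", {}).get("ports", [])
    return [
        port.get("name", str(port.get("port")))
        for port in ports
        if not port.get("appProtocol")
    ]


# Hypothetical Service manifest; only the field layout mirrors Kubernetes.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "reviews"},
    "spec": {
        "selector": {"app": "reviews"},
        "ports": [
            {"name": "web", "port": 9080, "appProtocol": "http"},
            {"name": "metrics", "port": 9090},  # protocol left to detection
        ],
    },
}

print(ports_missing_app_protocol(service))  # ['metrics']
```

A check like this could run in review or CI so that ports relying on protocol detection are spotted before they reach production.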
Unfortunately, making full use of a service mesh is not possible with zero code change.
- For service owners and platform owners to observe service traces correctly, it is critical to propagate trace headers between services.
- To avoid confusion and unexpected behavior, it is important to re-examine the retries and timeouts in the service code, to see whether they should be adjusted and to understand how they interact with the retries and timeouts configured on the sidecar proxy.
- For the sidecar proxy to inspect the traffic sent from the application container and use its contents to make decisions, such as request-based routing or header-based authorization, service owners must ensure that the source service sends plain traffic to the target service and trust the sidecar proxy to upgrade the connection securely.
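Trace header propagation, the first item in the list above, is usually a small amount of code in each service: copy the tracing headers from the incoming request onto every outgoing request made while handling it. The sketch below assumes a header set made up of `x-request-id`, the B3 headers, and the W3C trace context headers; the right set depends on the tracing backend, so treat `TRACE_HEADERS` and the function name as illustrative assumptions.

```python
from typing import Dict, Mapping

# Assumed header set; adjust to match the tracing backend in use.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "b3",
    "traceparent",
    "tracestate",
)


def trace_headers_to_forward(incoming: Mapping[str, str]) -> Dict[str, str]:
    """Pick the tracing headers from an incoming request so they can be
    attached to outgoing requests made on its behalf."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in incoming.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


incoming_headers = {
    "X-Request-Id": "abc-123",
    "X-B3-TraceId": "80f198ee56343ba864fe8b2a57d3eff7",
    "Content-Type": "application/json",
}
print(trace_headers_to_forward(incoming_headers))
```

The returned dictionary can be merged into the headers of whatever HTTP client the service uses. Without this step the sidecars see each hop as an unrelated request and the trace falls apart into single spans.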
5. Service owners need to understand the nuances of client-side and server-side configuration
Before working with service mesh, I had no idea how many settings the Envoy proxy has for timeouts and retries. Most users are familiar with request timeouts, idle timeouts, and the number of retries, but there are many nuances and complexities:
- For idle timeouts, there is an idle_timeout under the HTTP protocol options that applies to both the HTTP connection manager and upstream cluster HTTP connections. There is also a stream_idle_timeout for a stream with no upstream or downstream activity, and even a route-level idle_timeout that can override stream_idle_timeout.
- Automatic retries are also complicated. The configured retry count is not simply the number of retries but the maximum number allowed, which may differ from the actual number. The actual number of retries depends on the retry conditions, the route request timeouts, and the intervals between retries, all of which must fit within the overall request timeout and the retry budgets.
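The gap between configured and actual retries can be made concrete with a little arithmetic. The sketch below is a simplified worst-case model, not Envoy's algorithm: it assumes every attempt runs until its per-try timeout and that retries are separated by a fixed interval, whereas real proxies typically use jittered backoff and also apply retry conditions and budgets. All names and numbers are illustrative.

```python
def attempts_within_timeout(max_retries: int,
                            per_try_timeout_s: float,
                            overall_timeout_s: float,
                            retry_interval_s: float) -> int:
    """Count how many attempts can start before the overall timeout expires,
    assuming every attempt uses its full per-try timeout (worst case)."""
    attempts = 0
    elapsed = 0.0
    while attempts < max_retries + 1:  # first try plus the allowed retries
        if attempts > 0:
            elapsed += retry_interval_s  # wait before each retry
        if elapsed >= overall_timeout_s:
            break  # no time left to start another attempt
        attempts += 1
        elapsed += per_try_timeout_s
    return attempts


# Up to 5 retries are allowed, but the 5 s overall timeout leaves room for
# only 2 attempts (1 retry) when each try may take 2 s with 0.5 s in between.
attempts = attempts_within_timeout(
    max_retries=5,
    per_try_timeout_s=2.0,
    overall_timeout_s=5.0,
    retry_interval_s=0.5,
)
print(attempts, "attempts, so", attempts - 1, "actual retries")
```

Under these assumed values, the configured maximum of five retries shrinks to a single actual retry, which is the kind of discrepancy described in the retry item above.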
Without a service mesh, there is only one connection pool between the source container and the target container. With a service mesh, there are three:
- Source container to source sidecar proxy
- Source sidecar proxy to target sidecar proxy
- Target sidecar proxy to target container
Each of these connection pools has its own separate configuration. Karl Stoney's blog describes these problems well: it explains the complexity, how any one of the three can go wrong, and how to fix them.
Copyright notice
This article was created by [On jdon]. Please include a link to the original when reposting. Thank you.