Best Practices for OpenTelemetry in a Service Mesh Architecture
2022-07-27 03:14:00 【Baidu cloud native computing】
【Baidu Cloud Native Guide】 OpenTelemetry was formed in 2019 by merging two well-known projects, OpenTracing and OpenCensus. It has a wide range of application scenarios and a strong development team. By 2021, OpenTelemetry had become a new industry standard: by establishing consistency across data sources, it makes data collection and processing more convenient. So how do you introduce OpenTelemetry to gain trace observability, and what problems will you run into in practice? This article summarizes those problems and their solutions, and closes with a hands-on demo of setting up OpenTelemetry with Istio.
1. Introduction to OpenTelemetry
OpenTelemetry (OTel for short) is a collection of tools, APIs, and SDKs. It is used to generate, collect, and export telemetry data (such as traces, metrics, and logs) that helps users analyze the performance and behavior of their applications.
OpenTelemetry is vendor-neutral and, as an industry standard, is supported by a large number of vendors. Its predecessors are OpenCensus (from Google) and OpenTracing.
The architecture is as follows:

It consists mainly of the following parts:
Instrumentation: integrating an application with OpenTelemetry, either automatically or manually through the API and SDK.
OTLP: the transport protocol defined by OpenTelemetry.
OpenTelemetry Collector: a vendor-independent component that receives, processes, and exports telemetry data.
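To make the receive, process, and export roles more concrete, here is a toy Java sketch of such a pipeline. It is only a mental model, not the Collector's actual code: the class ToyPipeline and its methods are hypothetical, spans are represented by plain strings, and the only "processor" is a simple size-based batcher.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model of a receive -> process (batch) -> export pipeline.
// All names are hypothetical; this is not how the Collector is implemented.
final class ToyPipeline {
    private final int batchSize;
    private final Consumer<List<String>> exporter;
    private final List<String> buffer = new ArrayList<>();

    ToyPipeline(int batchSize, Consumer<List<String>> exporter) {
        this.batchSize = batchSize;
        this.exporter = exporter;
    }

    // "Receiver": accept one span (represented here by its name only).
    void receive(String span) {
        buffer.add(span);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // "Processor" and "exporter": hand the buffered batch to the exporter.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        exporter.accept(new ArrayList<>(buffer));
        buffer.clear();
    }

    public static void main(String[] args) {
        ToyPipeline pipeline = new ToyPipeline(2, batch -> System.out.println("export " + batch));
        pipeline.receive("GET /consumer");
        pipeline.receive("GET /provider");
        pipeline.receive("SELECT users");
        pipeline.flush(); // flush the remaining partial batch
    }
}
```

The exporter is passed in as a function so that the same pipeline could, in principle, feed different backends, which mirrors the vendor-independent role described above.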
2. OpenTelemetry in Practice in a Service Mesh Architecture
Istio's current approach to trace observability can be summarized as follows:
The application must propagate header information: it reads the relevant headers from each incoming request and adds them to the outgoing requests it makes.
The Sidecar automatically sends span information to the monitoring system.
Analyzing this approach reveals several problems:
Trace context propagation intrudes into business code, because the application itself has to pass the relevant headers through.
All span data is generated on the Sidecar side, so method-level monitoring information inside the application cannot be collected.
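The first problem is easier to see in code. The following is a minimal Java sketch of the header forwarding that business code has to perform by hand in the plain Istio approach. The class name ManualB3Forwarding is hypothetical, requests are reduced to header maps, and the header list (the B3 headers plus x-request-id) is an assumption made for illustration rather than a complete list.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Sketch of manual trace header forwarding in business code.
// Names are hypothetical; the header list is an illustrative assumption.
final class ManualB3Forwarding {
    static final List<String> TRACE_HEADERS = List.of(
            "x-request-id", "x-b3-traceid", "x-b3-spanid",
            "x-b3-parentspanid", "x-b3-sampled", "x-b3-flags", "b3");

    // Copy the trace headers found on the incoming request onto the outgoing one.
    static Map<String, String> forward(Map<String, String> incoming) {
        // HTTP header names are case-insensitive, so normalize to lower case first.
        Map<String, String> lowerCased = new LinkedHashMap<>();
        incoming.forEach((name, value) -> lowerCased.put(name.toLowerCase(Locale.ROOT), value));

        Map<String, String> outgoing = new LinkedHashMap<>();
        for (String header : TRACE_HEADERS) {
            String value = lowerCased.get(header);
            if (value != null) {
                outgoing.put(header, value);
            }
        }
        return outgoing;
    }
}
```

Every outbound call in every service needs logic like this, which is exactly the intrusion that a non-intrusive agent is meant to remove.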
Among industry APM solutions, the Java agent (a bytecode technique that requires no changes to business code) is a mainstream approach. Likewise, OpenTelemetry auto-instrumentation uses the dynamic bytecode injection that Java agent technology provides to collect telemetry data from any Java application, and it supports many popular libraries and frameworks.
2.1 Non-intrusive Trace Context Propagation
For Java applications, the OpenTelemetry Java agent supports the following trace context propagation formats:
"tracecontext": W3C Trace Context (add baggage as well to include W3C baggage)
"baggage": W3C Baggage
"b3": B3 Single
"b3multi": B3 Multi
"jaeger": Jaeger (includes Jaeger baggage)
"xray": AWS X-Ray
"ottrace": OT Trace
The b3multi and b3 formats above (the B3 headers, such as those beginning with x-b3) are exactly the header information that Istio requires applications to propagate.
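As a rough illustration of the difference between the two B3 variants, the Java sketch below encodes the same trace context both ways. The class B3Formats and its methods are hypothetical helpers. The header layouts follow the B3 propagation format, and optional fields such as the parent span ID are left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the two B3 encodings. Names are hypothetical;
// optional fields (parent span ID, debug flag) are omitted.
final class B3Formats {
    // b3multi: one header per field.
    static Map<String, String> multi(String traceId, String spanId, boolean sampled) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("X-B3-TraceId", traceId);
        headers.put("X-B3-SpanId", spanId);
        headers.put("X-B3-Sampled", sampled ? "1" : "0");
        return headers;
    }

    // b3 (single): the value of one "b3" header, "{traceId}-{spanId}-{samplingState}".
    static String single(String traceId, String spanId, boolean sampled) {
        return traceId + "-" + spanId + "-" + (sampled ? "1" : "0");
    }

    public static void main(String[] args) {
        String traceId = "80f198ee56343ba864fe8b2a57d3eff7";
        String spanId = "e457b5a2e4d86bd1";
        System.out.println(multi(traceId, spanId, true));
        System.out.println("b3: " + single(traceId, spanId, true));
    }
}
```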
2.2 Method-level Monitoring Inside the Application
The OpenTelemetry Java agent currently supports a large number of libraries, frameworks, and application servers, so it can capture method-level execution inside the application.

2.3 Associating Trace Information with Business Logs
When troubleshooting, it is usually not enough to inspect the trace itself. You also need to look up the corresponding business logs by trace_id and span_id.
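As a small illustration of what this lookup amounts to, the Java sketch below filters log lines by trace ID. It assumes that every log line contains a field of the form trace_id=<id> followed by a space. The class LogCorrelation and the sample log layout are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch of trace-to-log correlation. The log layout ("trace_id=<id> ")
// and all names here are assumptions made for illustration.
final class LogCorrelation {
    // Keep only the log lines that carry the given trace ID.
    static List<String> linesForTrace(List<String> logLines, String traceId) {
        // The trailing space prevents one ID from matching a longer ID that starts with it.
        String marker = "trace_id=" + traceId + " ";
        return logLines.stream()
                .filter(line -> line.contains(marker))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> logs = List.of(
                "trace_id=aaa111 span_id=01 INFO calling provider",
                "trace_id=bbb222 span_id=07 WARN slow query",
                "trace_id=aaa111 span_id=02 INFO provider responded");
        linesForTrace(logs, "aaa111").forEach(System.out::println);
    }
}
```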
The OpenTelemetry Java agent already places information about the current span into the logging MDC (Mapped Diagnostic Context), so applications can print the corresponding trace_id and span_id with simple configuration. The following is a configuration example for logback in Spring Boot:
logging.pattern.level = trace_id=%mdc{trace_id} span_id=%mdc{span_id} trace_flags=%mdc{trace_flags} %5p
2.4 System Architecture

The Java agent sends the data it collects to the OpenTelemetry Collector.
The Sidecar sends the trace spans it generates to the OpenTelemetry Collector.
The OpenTelemetry Collector exports the data to a monitoring system such as Jaeger, or to Kafka.
Users view the trace data through the corresponding UI.
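Because the Sidecar and the Java agent propagate the same trace context, the spans they report separately can be stitched into a single trace by the backend. The Java sketch below illustrates the idea by grouping spans on their trace ID. The Span model and the TraceAssembly class are hypothetical simplifications and are not the data model of Jaeger or the Collector.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: spans reported by different sources are grouped by trace ID.
// The Span model and all names are hypothetical simplifications.
final class TraceAssembly {
    static final class Span {
        final String traceId;
        final String spanId;
        final String parentId; // null for a root span
        final String source;   // e.g. "sidecar" or "javaagent"

        Span(String traceId, String spanId, String parentId, String source) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.parentId = parentId;
            this.source = source;
        }
    }

    // Group spans by trace ID, preserving arrival order within each trace.
    static Map<String, List<Span>> groupByTrace(List<Span> spans) {
        Map<String, List<Span>> traces = new LinkedHashMap<>();
        for (Span span : spans) {
            traces.computeIfAbsent(span.traceId, id -> new ArrayList<>()).add(span);
        }
        return traces;
    }
}
```

In this model a trace can contain spans from both sources, which is what lets the UI show Sidecar-level and method-level spans in one view.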
2.5 Practice
This walkthrough is for demonstration purposes only and does not use a high-availability deployment.
2.5.1 Install Istio
Download the Istio release package (this article uses 1.13.2 as an example): https://github.com/istio/istio/releases
$ istioctl install -f istio-opentelemetry-iop.yaml
The contents of istio-opentelemetry-iop.yaml are as follows:
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      holdApplicationUntilProxyStarts: true
      tracing:
        sampling: 100
    enablePrometheusMerge: false
    # Define extension providers
    extensionProviders:
    - name: otel-trace
      zipkin:
        service: otel-collector.istio-system.svc.cluster.local
        port: 9411
        maxTagLength: 56
  components:
    ingressGateways:
    - name: istio-ingressgateway
      enabled: true
Note: the extensionProviders property above defines otel-trace, which points to the address of the OpenTelemetry Collector. The Collector can receive traces in Zipkin format, so the provider is defined with zipkin settings.
2.5.2 Install the OpenTelemetry Collector
Install the OpenTelemetry Collector by following https://github.com/open-telemetry/opentelemetry-operator. Assume the resulting Kubernetes Service is otel-collector.istio-system.svc.cluster.local:9411, where 9411 is the port that receives Zipkin-format data.
The contents of opentelemetry-collector-config.yaml are as follows:
# Collector receivers configuration
receivers:
  otlp:
    protocols:
      grpc:
      http:
  zipkin:
processors:
  batch:
  memory_limiter:
# Collector exporters configuration
exporters:
  zipkin:
    endpoint: "http://zipkin.istio-system.svc.cluster.local:9411/api/v2/spans"
service:
  pipelines:
    traces:
      # For trace data, accept input over the otlp and zipkin protocols
      receivers: [otlp, zipkin]
      processors: [memory_limiter, batch]
      # For trace data, export to zipkin at the end
      exporters: [zipkin]
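For readers unfamiliar with the Zipkin format used on both sides of this pipeline, the Java sketch below renders a span as Zipkin v2 JSON, the payload shape posted to an /api/v2/spans endpoint. The class ZipkinSpanJson is hypothetical and only a subset of the fields is shown. It builds strings directly and assumes the values need no JSON escaping, so it is not a substitute for a real JSON library.

```java
import java.util.Locale;

// Sketch of a Zipkin v2 JSON span. Names are hypothetical; only a subset
// of fields is shown, and values are assumed to need no JSON escaping.
final class ZipkinSpanJson {
    // Render one span; timestamp and duration are in microseconds.
    static String toJson(String traceId, String spanId, String name,
                         String serviceName, long timestampMicros, long durationMicros) {
        return String.format(Locale.ROOT,
                "{\"traceId\":\"%s\",\"id\":\"%s\",\"name\":\"%s\","
                        + "\"timestamp\":%d,\"duration\":%d,"
                        + "\"localEndpoint\":{\"serviceName\":\"%s\"}}",
                traceId, spanId, name, timestampMicros, durationMicros, serviceName);
    }

    // The request body is a JSON array of spans.
    static String toJsonArray(String... spans) {
        return "[" + String.join(",", spans) + "]";
    }

    public static void main(String[] args) {
        String span = toJson("463ac35c9f6413ad", "a2fb4a1d1a96d312",
                "get /hello", "provider-demo", 1658891640000000L, 1500L);
        System.out.println(toJsonArray(span));
    }
}
```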
2.5.3 Install Jaeger
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.14/samples/addons/jaeger.yaml
Because the Collector's exporter outputs trace data in Zipkin format, and Jaeger is compatible with Zipkin-format trace data, the final trace data can be displayed in Jaeger.
2.5.4 Submit the Telemetry CRD
Submit a Telemetry resource to change the tracing provider that Envoy reports to so that it is otel-trace, that is, the OpenTelemetry Collector.
$ kubectl apply -f mesh-trace-telemetry.yaml
The contents of mesh-trace-telemetry.yaml are as follows:
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-trace-telemtry
  namespace: istio-system
spec:
  tracing:
  - providers:
    - name: otel-trace
    randomSamplingPercentage: 100
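With randomSamplingPercentage set to 100, every request is traced, which suits a demo but would usually be lowered in production. As a rough mental model of percentage-based sampling, and not Envoy's actual implementation, consider the Java sketch below. The class PercentageSampler is hypothetical.

```java
import java.util.Random;

// Mental model of random percentage sampling. This is a hypothetical
// sketch and not how Envoy or Istio implement sampling.
final class PercentageSampler {
    // Returns true for roughly `percentage` percent of calls (0.0 to 100.0).
    static boolean shouldSample(double percentage, Random random) {
        // nextDouble() is in [0.0, 1.0), so 100.0 always samples and 0.0 never does.
        return random.nextDouble() * 100.0 < percentage;
    }

    public static void main(String[] args) {
        Random random = new Random();
        int sampled = 0;
        for (int i = 0; i < 10_000; i++) {
            if (shouldSample(10.0, random)) {
                sampled++;
            }
        }
        System.out.println("sampled " + sampled + " of 10000 requests at 10%");
    }
}
```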
2.5.5 Deploy the Mesh Applications
Enable automatic Sidecar injection for the default namespace:
$ kubectl label namespace default istio-injection=enabled --overwrite
Deploy the microservice applications consumer-demo and provider-demo in the usual Kubernetes way. The applications are:
consumer-demo: the service consumer. It exposes an HTTP interface that calls the HTTP interface provided by the service provider, provider-demo. The calls to provider-demo are made through both RestTemplate and Feign.
provider-demo: the service provider. It exposes an HTTP interface.
The following parameters must be added when starting each application:
# Specify the javaagent at startup
# -Dotel.resource.attributes=service.name=provider-demo sets the service information in spans
# -Dotel.propagators=b3multi propagates the b3multi headers
java -javaagent:/opentelemetry-javaagent.jar -Dotel.resource.attributes=service.name=provider-demo -Dotel.propagators=b3multi -jar provider-demo-1.0-SNAPSHOT.jar
At deployment time, add the following configuration to the container environment:
env:
- name: OTEL_TRACES_EXPORTER # the javaagent exports data using the OTLP protocol
  value: otlp
- name: OTEL_EXPORTER_OTLP_ENDPOINT # Collector address
  value: http://otel-collector.istio-system.svc.cluster.local:4317
- name: OTEL_JAVAAGENT_ENABLED # enable the javaagent
  value: 'true'
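The startup flags and the environment variables above are two spellings of the same kind of setting: the Java agent derives an environment variable name from a system property by upper-casing it and replacing dots and dashes with underscores, and a system property takes priority when both are set. The Java sketch below models that naming rule. The class OtelConfigNames and its resolve helper are hypothetical and are not the agent's own code.

```java
import java.util.Locale;
import java.util.Map;

// Sketch of the property-to-environment-variable naming rule.
// Names are hypothetical; this is not the agent's configuration code.
final class OtelConfigNames {
    // otel.exporter.otlp.endpoint -> OTEL_EXPORTER_OTLP_ENDPOINT
    static String toEnvName(String systemProperty) {
        return systemProperty.toUpperCase(Locale.ROOT).replace('.', '_').replace('-', '_');
    }

    // Look up a setting: the system property wins, then the environment variable.
    static String resolve(String property, Map<String, String> systemProps, Map<String, String> env) {
        String value = systemProps.get(property);
        return value != null ? value : env.get(toEnvName(property));
    }

    public static void main(String[] args) {
        System.out.println(toEnvName("otel.traces.exporter"));
        System.out.println(toEnvName("otel.javaagent.enabled"));
    }
}
```

Under this rule, -Dotel.propagators=b3multi could equally be supplied as an OTEL_PROPAGATORS environment variable in the container spec.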
2.5.6 Observe the Mesh Monitoring Data
The figure below shows the monitoring data generated by requests to consumer-demo and provider-demo in this article (using Jaeger as the monitoring system):

3. Summary
Introducing OpenTelemetry into a Service Mesh architecture brings the following benefits to the business:
OpenTelemetry auto-instrumentation provides non-intrusive trace context propagation, so Service Mesh observability becomes transparent to the application.
OpenTelemetry auto-instrumentation collects method-level monitoring data inside the application, which makes up for the Service Mesh limitation of collecting monitoring data only on the Sidecar side.
OpenTelemetry's logger MDC auto-instrumentation allows trace_id, span_id, and related information to be written to business logs through configuration, which simplifies troubleshooting.
To serve more customers, Baidu AI Cloud has launched its service mesh product CSM (Cloud Service Mesh), a highly reliable, operations-free, and open cloud native microservice governance product. It is currently in public beta, and developers interested in service mesh are welcome to discuss and practice with us.
For more product information and usage details, read the original article or visit the following link:
https://cloud.baidu.com/product/csm.html