
Microservices: how to solve the problem of link tracing

2020-11-06 01:15:00 Wang Zebin

Microservice architecture splits a single application into a set of small, interconnected services. Each service performs a single business function, is independent of and decoupled from the others, and can evolve on its own. Compared with a traditional monolithic application, microservices offer isolation, technology heterogeneity, scalability, and simpler deployment.

However, along with these benefits, a microservice architecture also adds considerable complexity to the system. It is a distributed system whose services are usually deployed across different data centers and server clusters. Moreover, a single microservice system may be developed by different teams in different languages. An application usually consists of multiple microservices that exchange data through remote procedure calls, so a request has to flow through many services and the call chain becomes complex. When something goes wrong, locating the problem and tracing the failure is very difficult.

A link tracing (distributed tracing) system is designed to solve exactly these problems. It tracks the complete call chain of every request and records, from the start of the request to its end, the name of each task invoked, its duration, its tag data, and its log information. It then analyzes and displays this data in a visual interface to help engineers pinpoint faulty services, find performance bottlenecks, map out call chains, and estimate system capacity.

The theoretical model of almost every link tracing system is based on the Google paper "Dapper, a Large-Scale Distributed Systems Tracing Infrastructure". Typical products include Uber's Jaeger, Twitter's Zipkin, and Taobao's EagleEye. They are implemented in different ways, but all share three core steps: data collection, data storage, and query and display.

The first and most fundamental step of a link tracing system is data collection. To collect trace data, the tracing system requires instrumentation code to be inserted into the user's application. Because the APIs of different tracing systems are incompatible with each other, the instrumentation code differs from one system to another, and users have to make extensive changes when they switch products. The OpenTracing standard was created to solve this problem by unifying the APIs of tracing systems.

2. OpenTracing standard

OpenTracing is a distributed tracing specification that is independent of platform and language. It defines a unified interface so that applications can easily plug into different distributed tracing systems.

The OpenTracing semantic specification is available at: https://github.com/opentracing/specification/blob/master/specification.md

2.1 Data model

The data model defined in the OpenTracing semantic specification consists of Trace, Span, and Reference.

2.1.1 Trace

A Trace represents a complete call chain, for example the execution of a transaction or a process. A Trace is a directed acyclic graph (DAG) made up of one or more Spans.

The figure below shows an example Trace made up of 8 Spans:

        [Span A]  ←←←(the root span)
            |
     +------+------+
     |             |
 [Span B]      [Span C] ←←←(Span C is a `ChildOf` Span A)
     |             |
 [Span D]      +---+-------+
               |           |
           [Span E]    [Span F] >>> [Span G] >>> [Span H]
                                       ↑
                                       ↑
                                       ↑
                         (Span G `FollowsFrom` Span F)

Laid out along a time axis, the same Trace is easier to read:

––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–> time

 [Span A···················································]
   [Span B··············································]
      [Span D··········································]
    [Span C········································]
         [Span E·······]        [Span F··] [Span G··] [Span H··]

2.1.2 Span

A Span represents an independent unit of work and is the basic building block of a trace. Examples include an RPC call, a function call, or an HTTP request.

Each Span encapsulates the following state:

  • Operation name

    Identifies the task that the Span represents, for example an RPC method name, a function name, or the name of a subtask within a larger task.

  • Start timestamp

    The time at which the task started.

  • End timestamp

    The time at which the task ended. The total duration of the Span is the difference between its end timestamp and its start timestamp.

  • A set of Span tags

    Each Span tag is a key-value pair. The key must be a string; the value can be a string, a boolean, or a numeric type. Common tag keys are listed at: https://github.com/opentracing/specification/blob/master/semantic_conventions.md

  • A set of Span logs

    Each Span log consists of a key-value pair and a corresponding timestamp. The key must be a string; the value can be of any type. Common log keys are listed at: https://github.com/opentracing/specification/blob/master/semantic_conventions.md
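
The state listed above can be pictured as a small data structure. The following is a minimal sketch using only the Go standard library; the type and method names (Span, LogRecord, NewSpan, SetTag, Log, Duration) are made up for illustration and are not the opentracing-go API. It only shows how the operation name, the two timestamps, the tags, and the logs fit together.

```go
package main

import (
	"fmt"
	"time"
)

// LogRecord is one timestamped key-value entry attached to a span.
type LogRecord struct {
	Timestamp time.Time
	Key       string
	Value     interface{} // log values may be of any type
}

// Span is a hypothetical, simplified model of the state listed above.
type Span struct {
	OperationName string
	Start         time.Time
	Finish        time.Time
	Tags          map[string]interface{}
	Logs          []LogRecord
}

// NewSpan creates a span that starts at the given time.
func NewSpan(operation string, start time.Time) *Span {
	return &Span{OperationName: operation, Start: start, Tags: map[string]interface{}{}}
}

// SetTag stores a tag. Only strings, booleans, and a few common numeric
// types are accepted, mirroring the rule for tag values.
func (s *Span) SetTag(key string, value interface{}) error {
	switch value.(type) {
	case string, bool, int, int64, uint64, float64:
		s.Tags[key] = value
		return nil
	default:
		return fmt.Errorf("unsupported tag type %T for key %q", value, key)
	}
}

// Log appends a timestamped key-value pair to the span.
func (s *Span) Log(at time.Time, key string, value interface{}) {
	s.Logs = append(s.Logs, LogRecord{Timestamp: at, Key: key, Value: value})
}

// Duration is the end timestamp minus the start timestamp.
func (s *Span) Duration() time.Duration { return s.Finish.Sub(s.Start) }

func main() {
	start := time.Date(2020, 11, 6, 1, 15, 0, 0, time.UTC)
	span := NewSpan("CheckBalance", start)
	_ = span.SetTag("db.instance", "customers")
	span.Log(start.Add(100*time.Millisecond), "event", "timed out")
	span.Finish = start.Add(333 * time.Millisecond)
	fmt.Println(span.OperationName, span.Duration()) // prints "CheckBalance 333ms"
}
```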

2.1.3 Reference

A Span can have a causal relationship with one or more other Spans; such a relationship is called a Reference. OpenTracing currently defines two kinds of reference: ChildOf (parent-child) and FollowsFrom (follows).

  • ChildOf relationship

    The parent Span depends on the result of the child Span. In this case, the child Span's Reference to the parent Span is ChildOf. For example, in an RPC call, the server-side Span (child) has a ChildOf relationship to the client-side calling Span (parent).

  • FollowsFrom relationship

    The parent Span does not depend on the result of the child Span. In this case, the child Span's Reference to the parent Span is FollowsFrom. FollowsFrom is often used for asynchronous calls, for example the relationship between a Consumer Span and a Producer Span in a message queue.
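
To make the two reference types concrete, the sketch below rebuilds the 8-Span example Trace from section 2.1.1 as a plain Go map and walks from Span H back to the root. All names here (RefType, Reference, Trace, PathToRoot) are hypothetical and chosen for this illustration only. It also assumes that Span H follows from Span G, which the diagram suggests but does not label.

```go
package main

import "fmt"

// RefType names the two reference kinds defined by OpenTracing.
type RefType string

const (
	ChildOf     RefType = "ChildOf"
	FollowsFrom RefType = "FollowsFrom"
)

// Reference points from a span to the span that caused it.
type Reference struct {
	Type RefType
	To   string // name of the referenced span
}

// Trace maps each span name to its references; the root span has none.
type Trace map[string][]Reference

// exampleTrace encodes the diagram from section 2.1.1.
func exampleTrace() Trace {
	return Trace{
		"A": nil,
		"B": {{ChildOf, "A"}},
		"C": {{ChildOf, "A"}},
		"D": {{ChildOf, "B"}},
		"E": {{ChildOf, "C"}},
		"F": {{ChildOf, "C"}},
		"G": {{FollowsFrom, "F"}},
		"H": {{FollowsFrom, "G"}}, // assumption: not labeled in the diagram
	}
}

// PathToRoot follows the first reference of each span until it reaches
// a span without references. It terminates because a trace is acyclic.
func PathToRoot(t Trace, span string) []string {
	path := []string{span}
	for len(t[span]) > 0 {
		span = t[span][0].To
		path = append(path, span)
	}
	return path
}

func main() {
	fmt.Println(PathToRoot(exampleTrace(), "H")) // prints "[H G F C A]"
}
```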

2.2 Application interface (API)

2.2.1 Tracer

The Tracer interface creates Spans and injects and extracts trace data across process boundaries. It typically provides the following capabilities:

  • Start a new span
    Creates and starts a new Span.
  • Inject
    Injects a SpanContext into a carrier.
  • Extract
    Extracts a SpanContext from a carrier.
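
Inject and Extract are easiest to understand as a round trip through a carrier. The following is a simplified, standard-library-only sketch, not the opentracing-go API: the SpanContext fields, the Carrier type, and the header names x-trace-id and x-span-id are assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// SpanContext is a hypothetical, minimal context holding just two IDs.
type SpanContext struct {
	TraceID string
	SpanID  string
}

// Carrier is a text map, comparable to HTTP headers or gRPC metadata.
type Carrier map[string]string

// The key names are invented for this sketch.
const (
	traceIDKey = "x-trace-id"
	spanIDKey  = "x-span-id"
)

// ErrNoContext is returned when the carrier holds no span context.
var ErrNoContext = errors.New("span context not found in carrier")

// Inject writes the span context into the carrier before a request leaves
// the current process.
func Inject(sc SpanContext, c Carrier) {
	c[traceIDKey] = sc.TraceID
	c[spanIDKey] = sc.SpanID
}

// Extract reads the span context back out of the carrier in the receiving
// process.
func Extract(c Carrier) (SpanContext, error) {
	traceID, okTrace := c[traceIDKey]
	spanID, okSpan := c[spanIDKey]
	if !okTrace || !okSpan {
		return SpanContext{}, ErrNoContext
	}
	return SpanContext{TraceID: traceID, SpanID: spanID}, nil
}

func main() {
	// Client side: inject the context into the outgoing headers.
	headers := Carrier{}
	Inject(SpanContext{TraceID: "t-1", SpanID: "s-1"}, headers)

	// Server side: extract it again and continue the same trace.
	sc, err := Extract(headers)
	fmt.Println(sc, err) // prints "{t-1 s-1} <nil>"
}
```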

2.2.2 Span

  • Retrieve a SpanContext
    Returns the SpanContext that belongs to the Span.

  • Overwrite the operation name
    Updates the operation name.

  • Set a span tag
    Sets a tag on the Span.

  • Log structured data
    Records structured data.

  • Set a baggage item
    A baggage item is a string key-value pair that belongs to a Span and propagates along with the Trace. Because every key-value pair is copied to every local and remote child Span, baggage can incur significant network and CPU overhead.

  • Get a baggage item
    Returns the value of a baggage item.

  • Finish
    Finishes the Span.

2.2.3 SpanContext

A SpanContext carries data across service boundaries, including the trace ID, the span ID, and the baggage data that must be propagated to downstream Spans. OpenTracing requires SpanContext instances to be immutable in order to avoid complicated lifetime issues around Span finish and references.
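
One way to picture an immutable SpanContext with baggage is a value type whose "setter" returns a new copy. The sketch below uses only the standard library; the names (SpanContext, WithBaggageItem, BaggageItem, Child) are illustrative assumptions rather than the opentracing-go API. A child created in the same process can share the baggage map because it is never modified after construction, whereas a remote child would need every item serialized into the carrier, which is where the network cost of baggage comes from.

```go
package main

import "fmt"

// SpanContext is a hypothetical immutable context. Its fields are never
// modified after construction.
type SpanContext struct {
	traceID string
	spanID  string
	baggage map[string]string
}

// NewSpanContext creates a context with empty baggage.
func NewSpanContext(traceID, spanID string) SpanContext {
	return SpanContext{traceID: traceID, spanID: spanID, baggage: map[string]string{}}
}

// WithBaggageItem returns a copy that contains the extra item. The
// receiver is left unchanged.
func (c SpanContext) WithBaggageItem(key, value string) SpanContext {
	next := make(map[string]string, len(c.baggage)+1)
	for k, v := range c.baggage {
		next[k] = v
	}
	next[key] = value
	return SpanContext{traceID: c.traceID, spanID: c.spanID, baggage: next}
}

// BaggageItem returns the value for key, or "" if it is not set.
func (c SpanContext) BaggageItem(key string) string { return c.baggage[key] }

// Child derives the context of a child span: same trace ID, new span ID,
// and the parent's baggage.
func (c SpanContext) Child(spanID string) SpanContext {
	return SpanContext{traceID: c.traceID, spanID: spanID, baggage: c.baggage}
}

func main() {
	parent := NewSpanContext("t-1", "s-1")
	withUser := parent.WithBaggageItem("user", "alice")
	child := withUser.Child("s-2")

	// The original context is untouched; the child sees the item.
	fmt.Println(parent.BaggageItem("user") == "", child.BaggageItem("user")) // prints "true alice"
}
```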

2.2.4 NoopTracer

Every implementation of the OpenTracing API must also provide some form of NoopTracer, which can be used to switch OpenTracing on or off with a flag, or to inject something harmless in tests.
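
The idea behind a no-op tracer can be sketched in a few lines: business code depends only on an interface, and a flag decides whether it receives a real tracer or one that does nothing. The interfaces and types below (Tracer, Span, NoopTracer, recordingTracer, NewTracer) are simplified stand-ins invented for this example, not the opentracing-go definitions.

```go
package main

import "fmt"

// Span and Tracer are hypothetical, trimmed-down interfaces.
type Span interface {
	SetTag(key string, value interface{})
	Finish()
}

type Tracer interface {
	StartSpan(operation string) Span
}

// noopSpan ignores everything.
type noopSpan struct{}

func (noopSpan) SetTag(string, interface{}) {}
func (noopSpan) Finish()                    {}

// NoopTracer hands out spans that do nothing.
type NoopTracer struct{}

func (NoopTracer) StartSpan(string) Span { return noopSpan{} }

// recordingTracer stands in for a real tracer; it only counts spans.
type recordingTracer struct{ started int }

func (t *recordingTracer) StartSpan(string) Span {
	t.started++
	return noopSpan{}
}

// NewTracer returns the recording tracer only when tracing is enabled.
func NewTracer(enabled bool) Tracer {
	if enabled {
		return &recordingTracer{}
	}
	return NoopTracer{}
}

// handle is business code. It never needs to know which tracer it got.
func handle(t Tracer) string {
	span := t.StartSpan("handle")
	defer span.Finish()
	span.SetTag("component", "demo")
	return "ok"
}

func main() {
	fmt.Println(handle(NewTracer(false)), handle(NewTracer(true))) // prints "ok ok"
}
```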

3. Jaeger

Jaeger is a distributed tracing system open-sourced by Uber, and its application interface fully follows the OpenTracing standard. Jaeger is written in Go, is cross-platform and cross-language, and provides client libraries for many languages, such as C++, Java, Go, Python, Ruby, PHP, and Node.js. Project address: https://github.com/jaegertracing

3.1 Jaeger Components

[Figure: https://i.loli.net/2020/04/13/bvTxdUkBRuawY1F.png]

  • jaeger-client

    The Jaeger client library, which implements the OpenTracing API. Once it is built into our application, it collects trace data and sends it to jaeger-agent. This is the only place where we need to write code.

  • jaeger-agent

    Receives the Trace/Span data sent by jaeger-client and forwards it to jaeger-collector.

  • jaeger-collector

    Receives the Trace/Span data sent by jaeger-agent, validates and indexes it, and then writes it to the back-end storage.

  • data store

    Stores the data. Storage is a pluggable component in Jaeger; Cassandra, Elasticsearch, and Kafka are currently supported.

  • jaeger-query & ui

    Queries the data and displays the results in the front-end UI.

    [Figure: https://i.loli.net/2020/04/13/UMoHYtlX1ydsx5Q.jpg]

3.2 Quick start

Jaeger officially provides an all-in-one image for quick and easy testing:

# Pull the image
$ docker pull jaegertracing/all-in-one:latest

# Run the image
$ docker run -d --name jaeger \
  -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 14268:14268 \
  -p 9411:9411 \
  -p 16686:16686 \
  jaegertracing/all-in-one:latest

Starting the all-in-one image shows that Jaeger uses quite a few ports. They are described below:

Port   Protocol  Component  Function
5775   UDP       agent      Receives Zipkin Thrift data in the compact Thrift protocol
6831   UDP       agent      Receives Jaeger Thrift data in the compact Thrift protocol
6832   UDP       agent      Receives Jaeger Thrift data in the binary Thrift protocol
5778   HTTP      agent      Serves configuration and sampling strategies
14268  HTTP      collector  Receives Jaeger Thrift data sent directly by clients
9411   HTTP      collector  Receives JSON or Thrift data sent by Zipkin
16686  HTTP      query      Serves the UI to the browser

Once it is running, we can open http://localhost:16686 in a browser to view and query the collected data.

Data collected by the all-in-one image is stored inside the Docker container and cannot be persisted, so this setup is suitable only for development and test environments, not for production. In production, deploy each component separately according to actual needs.

4. Using Jaeger in business code

Using Jaeger in a system is simple: only a small amount of code has to be inserted into the existing program. The following code simulates a business scenario that checks a user's account balance and then performs a deduction:

4.1 Jaeger initialization function

This function mainly configures the relevant parameters according to actual needs, such as the service name, the sampler type, and the sampling rate.

func initJaeger() (tracer opentracing.Tracer, closer io.Closer, err error) {
	// Build the configuration.
	cfg := &config.Configuration{
		// Set the service name.
		ServiceName: "ServiceAmount",
		// Set the sampling parameters.
		Sampler: &config.SamplerConfig{
			Type:  "const", // constant sampler: the same decision for every trace
			Param: 1,       // 1 = sample every trace
		},
	}

	// Create a new tracer.
	tracer, closer, err = cfg.NewTracer()
	if err == nil {
		// Register the tracer as the global singleton.
		opentracing.SetGlobalTracer(tracer)
	}
	return
}
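
The "const" sampler above makes the same decision for every trace. To illustrate what a sampling rate means, the following standard-library sketch contrasts a constant sampler with a probabilistic one that compares the trace ID against a threshold. It is a simplified illustration of the idea under stated assumptions, not code taken from the Jaeger client: the Sampler interface, the type names, and the use of a plain uint64 trace ID are all invented for this example.

```go
package main

import "fmt"

// Sampler decides whether a trace should be recorded.
type Sampler interface {
	IsSampled(traceID uint64) bool
}

// ConstSampler always returns the same decision, like Type "const" with
// Param 1 (sample everything) or Param 0 (sample nothing).
type ConstSampler struct{ Decision bool }

func (s ConstSampler) IsSampled(uint64) bool { return s.Decision }

// maxID masks the trace ID down to 63 bits so the threshold math stays
// within the range of uint64.
const maxID = 1<<63 - 1

// ProbabilisticSampler samples roughly the given fraction of traces by
// comparing the (assumed uniformly random) trace ID with a boundary.
type ProbabilisticSampler struct{ boundary uint64 }

// NewProbabilisticSampler clamps rate to [0, 1] and derives the boundary.
func NewProbabilisticSampler(rate float64) ProbabilisticSampler {
	if rate < 0 {
		rate = 0
	}
	if rate > 1 {
		rate = 1
	}
	return ProbabilisticSampler{boundary: uint64(float64(maxID) * rate)}
}

func (s ProbabilisticSampler) IsSampled(traceID uint64) bool {
	return traceID&maxID <= s.boundary
}

func main() {
	all := ConstSampler{Decision: true}
	half := NewProbabilisticSampler(0.5)

	// A small ID falls below the 50% boundary, a large one above it.
	fmt.Println(all.IsSampled(42), half.IsSampled(1), half.IsSampled(1<<62+1)) // prints "true true false"
}
```

Because the decision depends only on the trace ID, every service that sees the same trace reaches the same decision without coordination, which is one reason this style of sampler is common in tracing systems.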

4.2 Balance check function

Checks the user's balance and simulates a subtask Span.

func CheckBalance(request string, ctx context.Context) {
	// Create a child span of the span carried in ctx.
	span, _ := opentracing.StartSpanFromContext(ctx, "CheckBalance")
	// Finish the span when the function returns.
	defer span.Finish()

	// Simulate some work that takes 1/3 of a second.
	time.Sleep(time.Second / 3)

	// Example: record the information to be traced as tags.
	span.SetTag("request", request)
	span.SetTag("reply", "CheckBalance reply")

	log.Println("CheckBalance is done")
}

4.3 Deduction function

Deducts money from the user's account and simulates a subtask Span.

func Reduction(request string, ctx context.Context) {
	// Create a child span of the span carried in ctx.
	span, _ := opentracing.StartSpanFromContext(ctx, "Reduction")
	// Finish the span when the function returns.
	defer span.Finish()

	// Simulate some work that takes 1/2 of a second.
	time.Sleep(time.Second / 2)

	// Example: record the information to be traced as tags.
	span.SetTag("request", request)
	span.SetTag("reply", "Reduction reply")

	log.Println("Reduction is done")
}

4.4 Main function

Initializes the Jaeger environment, creates a tracer and a parent span, and calls the two subtasks: the balance check and the deduction. The package clause and imports shown here also cover the functions from sections 4.1 to 4.3.

package main

import (
	"context"
	"fmt"
	"github.com/opentracing/opentracing-go"
	"github.com/uber/jaeger-client-go/config"
	"io"
	"log"
	"time"
)

func main() {
	// Initialize Jaeger and create a new tracer.
	tracer, closer, err := initJaeger()
	if err != nil {
		panic(fmt.Sprintf("ERROR: cannot init Jaeger: %v\n", err))
	}
	defer closer.Close()

	// Create a new span to act as the parent span and start the billing process.
	span := tracer.StartSpan("CalculateFee")

	// Build a context that carries the parent span.
	ctx := opentracing.ContextWithSpan(context.Background(), span)

	// Example: set a tag on the span.
	span.SetTag("db.instance", "customers")
	// Example: write a log entry to the span.
	span.LogKV("event", "timed out")

	// Pass the parent span's context to the balance check function.
	CheckBalance("CheckBalance request", ctx)

	// Pass the parent span's context to the deduction function.
	Reduction("Reduction request", ctx)

	// Finish the parent span.
	span.Finish()
}

5. Using Jaeger in gRPC microservices

We again simulate a business scenario that checks a user's account balance and performs a deduction, this time implementing the balance check and the deduction as gRPC microservices:

5.1 gRPC server code

main.go:

The server relies on the third-party library github.com/grpc-ecosystem/go-grpc-middleware/tracing/opentracing, which wraps OpenTracing as generic gRPC middleware and plugs it seamlessly into the gRPC service through gRPC interceptors.

package main

import (
	"fmt"
	"github.com/grpc-ecosystem/go-grpc-middleware/tracing/opentracing"
	"github.com/opentracing/opentracing-go"
	"github.com/uber/jaeger-client-go/config"
	"google.golang.org/grpc"
	"google.golang.org/grpc/reflection"
	"grpc-jaeger-server/account"
	"io"
	"log"
	"net"
)

// initJaeger initializes Jaeger.
func initJaeger() (tracer opentracing.Tracer, closer io.Closer, err error) {
	// Build the configuration.
	cfg := &config.Configuration{
		// Set the service name.
		ServiceName: "ServiceAmount",

		// Set the sampling parameters.
		Sampler: &config.SamplerConfig{
			Type:  "const", // constant sampler: the same decision for every trace
			Param: 1,       // 1 = sample every trace
		},
	}

	// Create a new tracer.
	tracer, closer, err = cfg.NewTracer()
	if err == nil {
		// Register the tracer as the global singleton.
		opentracing.SetGlobalTracer(tracer)
	}
	return
}

func main() {
	// Initialize Jaeger and create a new tracer.
	tracer, closer, err := initJaeger()
	if err != nil {
		panic(fmt.Sprintf("ERROR: cannot init Jaeger: %v\n", err))
	}
	defer closer.Close()
	log.Println("succeed to init jaeger")

	// Create the gRPC server with the OpenTracing interceptor and register the account service.
	server := grpc.NewServer(grpc.UnaryInterceptor(grpc_opentracing.UnaryServerInterceptor(grpc_opentracing.WithTracer(tracer))))
	account.RegisterAccountServer(server, &AccountServer{})
	reflection.Register(server)
	log.Println("succeed to register account service")

	// Listen on the port of the gRPC account service.
	listener, err := net.Listen("tcp", ":8080")
	if err != nil {
		log.Println(err)
		return
	}
	log.Println("starting register account service")

	// Start serving the gRPC account service.
	if err := server.Serve(listener); err != nil {
		log.Println(err)
		return
	}
}
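
An interceptor is what lets tracing wrap every RPC without touching the handlers. The following standard-library sketch shows the general shape of that pattern: a function that receives the request together with the real handler, does its bookkeeping, and then calls the handler. The Handler and Interceptor types, the Recorder, and the method name "/account.Account/CheckBalance" are simplified assumptions for illustration; the real gRPC and go-grpc-middleware types carry more parameters.

```go
package main

import (
	"context"
	"fmt"
)

// Handler is a simplified stand-in for a unary gRPC handler.
type Handler func(ctx context.Context, req string) (string, error)

// Interceptor wraps a call to a Handler.
type Interceptor func(ctx context.Context, req, method string, next Handler) (string, error)

// Recorder stands in for a tracer; it collects the names of finished spans.
type Recorder struct{ Finished []string }

// TracingInterceptor returns an interceptor that records one span per call.
// A real tracing interceptor would also extract the parent span context
// from the request metadata and store the new span in ctx.
func TracingInterceptor(r *Recorder) Interceptor {
	return func(ctx context.Context, req, method string, next Handler) (string, error) {
		// "Finish" the span after the handler returns, even on error.
		defer func() { r.Finished = append(r.Finished, method) }()
		return next(ctx, req)
	}
}

// checkBalance is the business handler; it contains no tracing code.
func checkBalance(ctx context.Context, req string) (string, error) {
	return "CheckBalance Reply", nil
}

// serve simulates the server dispatching one request through the interceptor.
func serve(r *Recorder) (string, error) {
	intercept := TracingInterceptor(r)
	return intercept(context.Background(), "user account", "/account.Account/CheckBalance", checkBalance)
}

func main() {
	rec := &Recorder{}
	reply, err := serve(rec)
	fmt.Println(reply, err, rec.Finished) // prints "CheckBalance Reply <nil> [/account.Account/CheckBalance]"
}
```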

Billing microservice, accountsever.go:

package main

import (
	"github.com/opentracing/opentracing-go"
	"context"
	"grpc-jaeger-server/account"
	"time"
)

// AccountServer implements the billing service.
type AccountServer struct{}

// CheckBalance is the balance check microservice; it simulates a child span task.
func (s *AccountServer) CheckBalance(ctx context.Context, request *account.CheckBalanceRequest) (response *account.CheckBalanceResponse, err error) {
	response = &account.CheckBalanceResponse{
		Reply: "CheckBalance Reply", // processing result
	}

	// Create a child span of the span carried in ctx.
	span, _ := opentracing.StartSpanFromContext(ctx, "CheckBalance")
	// Finish the span when the function returns.
	defer span.Finish()

	// Simulate some work that takes 1/3 of a second.
	time.Sleep(time.Second / 3)

	// Record the information to be traced as tags.
	span.SetTag("request", request)
	span.SetTag("reply", response)

	return response, err
}

// Reduction is the deduction microservice; it simulates a child span task.
func (s *AccountServer) Reduction(ctx context.Context, request *account.ReductionRequest) (response *account.ReductionResponse, err error) {
	response = &account.ReductionResponse{
		Reply: "Reduction Reply", // processing result
	}

	// Create a child span of the span carried in ctx.
	span, _ := opentracing.StartSpanFromContext(ctx, "Reduction")
	// Finish the span when the function returns.
	defer span.Finish()

	// Simulate some work that takes 1/3 of a second.
	time.Sleep(time.Second / 3)

	// Record the information to be traced as tags.
	span.SetTag("request", request)
	span.SetTag("reply", response)

	return response, err
}

5.2 gRPC client code

main.go:

package main

import (
	"context"
	"fmt"
	"github.com/grpc-ecosystem/go-grpc-middleware/tracing/opentracing"
	"github.com/opentracing/opentracing-go"
	"github.com/uber/jaeger-client-go/config"
	"google.golang.org/grpc"
	"grpc-jaeger-client/account"
	"io"
	"log"
)

// initJaeger initializes Jaeger.
func initJaeger() (tracer opentracing.Tracer, closer io.Closer, err error) {
	// Build the configuration.
	cfg := &config.Configuration{
		// Set the service name.
		ServiceName: "ServiceAmount",

		// Set the sampling parameters.
		Sampler: &config.SamplerConfig{
			Type:  "const", // constant sampler: the same decision for every trace
			Param: 1,       // 1 = sample every trace
		},
	}

	// Create a new tracer.
	tracer, closer, err = cfg.NewTracer()
	if err == nil {
		// Register the tracer as the global singleton.
		opentracing.SetGlobalTracer(tracer)
	}
	return
}

func main() {
	// Initialize Jaeger and create a new tracer.
	tracer, closer, err := initJaeger()
	if err != nil {
		panic(fmt.Sprintf("ERROR: cannot init Jaeger: %v\n", err))
	}
	defer closer.Close()
	log.Println("succeed to init jaeger")

	// Create a new span to act as the parent span.
	span := tracer.StartSpan("CalculateFee")

	// Finish the span when the function returns.
	defer span.Finish()

	// Build a context that carries the span.
	ctx := opentracing.ContextWithSpan(context.Background(), span)

	// Connect to the gRPC server with the OpenTracing client interceptor.
	conn, err := grpc.Dial("localhost:8080",
		grpc.WithInsecure(),
		grpc.WithUnaryInterceptor(grpc_opentracing.UnaryClientInterceptor(grpc_opentracing.WithTracer(tracer))))
	if err != nil {
		log.Println(err)
		return
	}
	// Close the connection when the function returns.
	defer conn.Close()

	// Create the client of the gRPC billing service.
	client := account.NewAccountClient(conn)

	// Pass the parent span's context to the balance check gRPC microservice.
	checkBalanceResponse, err := client.CheckBalance(ctx,
		&account.CheckBalanceRequest{
			Account: "user account",
		})
	if err != nil {
		log.Println(err)
		return
	}
	log.Println(checkBalanceResponse)

	// Pass the parent span's context to the deduction gRPC microservice.
	reductionResponse, err := client.Reduction(ctx,
		&account.ReductionRequest{
			Account: "user account",
			Amount:  1,
		})
	if err != nil {
		log.Println(err)
		return
	}
	log.Println(reductionResponse)
}

Notes:
The source code for this article is available at: https://github.com/wangshizebin/micro-service
The development tool used in this article is GoLand.

Copyright notice
This article was written by [Wang Zebin]. Please include a link to the original when reposting. Thank you.