Performance test - GTI application service performance monitoring platform

2022-06-12 05:22:00 Government officials

GTI Application service performance monitoring platform

0. Downloads

  1. InfluxDB: https://portal.influxdata.com/downloads
  2. Grafana: https://grafana.com/grafana/download?platform=linux
  3. Telegraf: https://github.com/influxdata/telegraf/releases
  4. Collectd: http://collectd.org/download.shtml

1. Introduction to the GTI application service performance monitoring platform

To build a polished real-time monitoring platform like New Relic or OneAPM, all we need is InfluxDB, Collectd and Telegraf, and Grafana. The tools relate to each other as follows:
Collect data (Collectd & Telegraf) -> Store the data (InfluxDB) -> Display the data (Grafana)

  1. InfluxDB is an open source distributed time series database written in Go. It is well suited to storing metrics, events, and analytics data, and can be deployed independently on any host;
  2. Collectd is a system performance collection tool written in C. It can be deployed independently on any host. We use it to monitor the performance of Java applications (WebApp, NativeApp). Its working principle is shown in the figure:
     [Figure: how Collectd works]
  3. Telegraf is a system collection tool written in Go. It must be deployed on the system or middleware server being monitored. We use it to monitor the performance of server systems and mainstream middleware;
  4. Grafana is a front-end tool written in pure JavaScript, used to query InfluxDB, build custom reports, and display charts. It can be deployed independently on any host;
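To make the collect -> store -> display chain more concrete, here is a minimal sketch of the kind of record a collector hands to InfluxDB, assuming the InfluxDB 1.x line protocol. The function name `to_line_protocol` and the sample measurement, tag, and field names are illustrative and not part of the original setup; backslash escaping is left out for brevity.

```python
def _escape(text, chars):
    """Backslash-escape each of the given characters in text."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value):
    """Render one field value: booleans, integers (i suffix), floats, strings."""
    if isinstance(value, bool):  # check bool first, it is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return "{}i".format(value)
    if isinstance(value, float):
        return repr(value)
    return '"{}"'.format(str(value).replace('"', '\\"'))


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one line: measurement,tag=value field=value timestamp."""
    name = _escape(measurement, ", ")
    tag_part = "".join(
        ",{}={}".format(_escape(key, ",= "), _escape(str(value), ",= "))
        for key, value in sorted(tags.items())
    )
    field_part = ",".join(
        "{}={}".format(_escape(key, ",= "), _format_field(value))
        for key, value in sorted(fields.items())
    )
    return "{}{} {} {}".format(name, tag_part, field_part, timestamp_ns)


# Hypothetical sample point; the names are made up for illustration.
print(to_line_protocol("cpu", {"host": "web01"},
                       {"usage_idle": 97.5, "n_cpus": 4},
                       1654982520000000000))
# cpu,host=web01 n_cpus=4i,usage_idle=97.5 1654982520000000000
```

Telegraf and Collectd take care of this formatting themselves; the sketch only shows what ends up stored so that the Grafana queries later are easier to read.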

1. InfluxDB

  1. Install
    1. yum localinstall influxdb-1.2.4.x86_64.rpm
    2. Copy types.db to /usr/local/share/collectd/ (create the directory if it does not exist; the path can be customized)
    3. chmod 777 /usr/local/share/collectd/types.db
    4. Note: later InfluxDB versions removed the graphical admin page, so unless you have a specific need for a newer release, installing version 1.2.4 is recommended
  2. Configure
    1. Modify /etc/influxdb/influxdb.conf as follows:
      1. reporting-disabled = true (stops InfluxDB from uploading usage information to its official site)
      2. bind-address = ":8090" (add this entry; first run netstat -lnp | grep 8090 to check whether the port is already in use)
      3. [admin]
        enabled = true
        bind-address = ":8083"
      4. [http]
        enabled = true
        bind-address = ":8086"
      5. [[collectd]]
        enabled = true
        bind-address = "<server IP>:25826"
        database = "ktv" (naming the database after the business is recommended)
        typesdb = "/usr/local/share/collectd/types.db"
        batch-size = 5000
        batch-pending = 10
        batch-timeout = "10s"
        read-buffer = 0
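The port check mentioned for bind-address can also be scripted. The following is a rough sketch, not part of the original procedure: it tries to bind a TCP socket to each port and reports whether that succeeds. The helper name `port_is_free` is made up for illustration, and the check covers TCP only, so it does not apply to the collectd listener on 25826, which uses UDP.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False  # typically "address already in use"
    return True


# TCP ports used in the configuration above.
for port in (8083, 8086, 8090):
    print(port, "free" if port_is_free(port) else "in use")
```

Run it before starting InfluxDB; once the service is up, the same ports are expected to show as in use.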
  3. Run
    1. service influxdb start
  4. Troubleshooting
    1. Error message: Redirecting to /bin/systemctl start influxdb.service
      Cause: the Linux operating system is RedHat 7, CentOS 7, or Fedora, which manage services with systemd, so the service command is no longer the way to start a service and systemctl should be used instead
      Note: to view the current Linux operating system version, run lsb_release -a. If the command does not exist, install it with: yum install lsb
      Fix: systemctl start influxdb
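If the same start-up step has to run on both older and newer hosts, the choice between service and systemctl can be made at run time. This is only a sketch under the assumption that the presence of a systemctl binary is a good enough signal; `start_command` and `start_service` are hypothetical helper names.

```python
import shutil
import subprocess


def start_command(service, has_systemctl):
    """Build the command line that starts a service on this kind of host."""
    if has_systemctl:
        return ["systemctl", "start", service]
    return ["service", service, "start"]


def start_service(service):
    """Start the service with whichever tool is installed (needs root)."""
    has_systemctl = shutil.which("systemctl") is not None
    subprocess.run(start_command(service, has_systemctl), check=True)
```

Calling `start_service("influxdb")` as root would then run systemctl start influxdb on a systemd host and service influxdb start elsewhere.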
  5. Create the database
    1. Open http://<server IP>:8083 in a browser
    2. In the Query box, enter CREATE DATABASE "ktv" and press Enter
    3. Versions after InfluxDB 1.2.4 removed the admin page, so use the command-line client instead
      1. Connect to the server: influx -host 127.0.0.1 -port 8086
      2. Create the database: CREATE DATABASE "ktv"
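The database can also be created without the admin page or the influx client, through the HTTP API on port 8086. The sketch below assumes the InfluxDB 1.x /query endpoint and only builds the request; `create_database_request` is an illustrative name, and the commented lines at the end show how it might be sent to a reachable server.

```python
import urllib.parse
import urllib.request


def create_database_request(base_url, database):
    """Build (but do not send) the HTTP request that creates a database."""
    if '"' in database:
        raise ValueError("database name must not contain double quotes")
    statement = 'CREATE DATABASE "{}"'.format(database)
    body = urllib.parse.urlencode({"q": statement}).encode("ascii")
    # Passing data makes urllib issue a POST with a form-encoded body.
    return urllib.request.Request(base_url.rstrip("/") + "/query", data=body)


request = create_database_request("http://127.0.0.1:8086", "ktv")
print(request.get_method(), request.full_url)

# To actually send it (requires a running InfluxDB 1.x server):
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.status)
```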

2. Telegraf

1. Linux version
  1. Install
    yum localinstall telegraf-1.10.1-1.x86_64.rpm
  2. Upgrade
    1. Back up the configuration file: /etc/telegraf/telegraf.conf
    2. Query the installed version: rpm -qa | grep telegraf
    3. Uninstall the installed version: rpm -e --nodeps telegraf-1.2.1-1.x86_64
    4. Install the latest version: yum localinstall telegraf-1.10.1-1.x86_64.rpm
  3. Configure
    1. Basic configuration
      (1) Modify /etc/telegraf/telegraf.conf as follows:
          [agent]
              logfile = "/var/log/telegraf/telegraf.log"
          [[outputs.influxdb]]
              urls = ["http://<InfluxDB server IP>:8086"]
              database = "<actual database name>"
    2. Middleware configuration
           To monitor a piece of middleware, configure its IP, port, and access credentials. To monitor multiple instances, separate them with commas.
               Example:
                  [[inputs.redis]]
                      servers = ["tcp://192.168.57.10:6379", "tcp://192.168.57.10:6378"]
                  [[inputs.zookeeper]]
                      servers = ["192.168.57.10:2181", "192.168.57.10:2182"]
    3. Network monitoring configuration
           In /etc/telegraf/telegraf.conf, uncomment the following two input sections to turn on network monitoring:
              [[inputs.net]]
              [[inputs.netstat]]
    4. Process monitoring configuration
      1. Configure as follows to monitor one or more processes by matching against the process list
             Modify /etc/telegraf/telegraf.conf as follows:
                [[inputs.procstat]]
                    pattern = "processName"
             Use pgrep -f processName to check that the pattern matches the target process: the command should return exactly one process ID, and it should belong to the process to be monitored. Adjust processName until the result is unique. To monitor multiple processes, add multiple [[inputs.procstat]] sections, for example:
                [[inputs.procstat]]
                    pattern = "processName1"
                [[inputs.procstat]]
                    pattern = "processName2"
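The uniqueness check done with pgrep -f can be approximated in a few lines. This is a sketch under two assumptions: the host is Linux with /proc available, and Python's regular expressions behave closely enough to the ones pgrep uses for simple patterns. The helper names and the sample command lines in the usage note are hypothetical.

```python
import glob
import re


def matching_commands(pattern, command_lines):
    """Return the command lines that the regular expression matches."""
    regex = re.compile(pattern)
    return [cmd for cmd in command_lines if regex.search(cmd)]


def is_unique(pattern, command_lines):
    """True when exactly one command line matches the pattern."""
    return len(matching_commands(pattern, command_lines)) == 1


def read_command_lines():
    """Read the full command line of every process from /proc (Linux only)."""
    lines = []
    for path in glob.glob("/proc/[0-9]*/cmdline"):
        try:
            with open(path, "rb") as handle:
                raw = handle.read()
        except OSError:
            continue  # the process exited while we were scanning
        if raw:
            # Arguments are NUL-separated in /proc/<pid>/cmdline.
            lines.append(raw.replace(b"\0", b" ").decode(errors="replace").strip())
    return lines
```

For example, `is_unique("order-service", read_command_lines())` would confirm that a pattern such as "order-service" selects a single process before it is written into [[inputs.procstat]]. Unlike pgrep, this sketch does not exclude the checking script itself from the results.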
    5. GPU monitoring configuration
          Telegraf currently supports GPU monitoring for NVIDIA cards. It collects the data by calling nvidia-smi, the information tool bundled with the GPU driver. Add the following to telegraf.conf:
              [[inputs.nvidia_smi]]
                  ## Optional: path to nvidia-smi binary, defaults to $PATH via exec.LookPath
                  # bin_path = "/usr/bin/nvidia-smi"
                  ## Optional: timeout for GPU polling
                  # timeout = "5s"
    6. Extending monitoring
           When Telegraf cannot monitor the performance metrics of some application component or middleware and a monitoring script or command is needed instead, use the [[inputs.exec]] section. To monitor multiple instances, add multiple [[inputs.exec]] sections and use the name_suffix parameter to tell the instances apart.
           For example:
              [[inputs.exec]]
                  # To run several commands or scripts, separate them with commas. If a Redis password is configured, special characters in it must be escaped
                  commands = [
                      "/redis-3.0.5-master/src/redis-cli -h 192.168.57.10 -p 6379 slowlog len"
                      ]
                  # Table name; the default is exec, and it can be customized
                  name_override = "redis_slowlog"
                  timeout = "5s"
                  # Table name suffix, used to distinguish monitored instances; can be omitted when only one instance is monitored
                  name_suffix = "_192.168.57.10"
                  # Data format
                  data_format = "value"
                  data_type = "integer"
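With data_format set to "value" and data_type set to "integer", Telegraf expects the command to print a single integer. When a custom script wraps the command, it can normalize the output first. The sketch below assumes that redis-cli may print either a bare number or the interactive form "(integer) N"; `parse_integer_reply` and `run_and_print` are illustrative names, not part of Telegraf or Redis.

```python
import re
import subprocess


def parse_integer_reply(output):
    """Extract the integer from redis-cli output such as '3' or '(integer) 3'."""
    match = re.fullmatch(r"(?:\(integer\)\s*)?(-?\d+)", output.strip())
    if match is None:
        raise ValueError("unexpected redis-cli output: {!r}".format(output))
    return int(match.group(1))


def run_and_print(command):
    """Run the given command (a list of arguments) and print one integer."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    print(parse_integer_reply(result.stdout))
```

A wrapper script built on `run_and_print` could then be listed in commands in place of the raw redis-cli call, which keeps the value parser from failing on unexpected formatting.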
  4. Run
            service telegraf start
  5. Troubleshooting
         If Telegraf fails to start, or starts successfully but produces no monitoring data, check the console output or the logs under the /var/log/telegraf directory.

3. Collectd

    1. Install

         Copy collectd to /opt

    2. Configure
  1. Enable JMX remote monitoring for the WebApp (container-based) or NativeApp (standalone application):
             Example: JMX remote monitoring configuration for Tomcat
             Modify the /usr/local/tomcat7/bin/catalina.sh file.
             Find the following line:
            # ----- Execute The Requested Command -----
             Add the following above that line:
            CATALINA_OPTS="$CATALINA_OPTS
            -Dcom.sun.management.jmxremote
            -Djava.rmi.server.hostname=XXX.XXX.XXX.XXX
            -Dcom.sun.management.jmxremote.port=8888
            -Dcom.sun.management.jmxremote.ssl=false
            -Dcom.sun.management.jmxremote.authenticate=false"

             Example: JMX remote monitoring configuration for a standalone Java application
             Modify the start.sh file
             and add the following lines:
            JAVA_OPTS="$JAVA_OPTS
            -Dcom.sun.management.jmxremote
            -Djava.rmi.server.hostname=XXX.XXX.XXX.XXX
            -Dcom.sun.management.jmxremote.port=8888
            -Dcom.sun.management.jmxremote.ssl=false
            -Dcom.sun.management.jmxremote.authenticate=false"

             Example: JMX remote monitoring configuration for a Spring Boot application
             Modify the /etc/telegraf/telegraf.conf file.
             The following input sections can be configured:
            [[inputs.jolokia]]
            [[inputs.jolokia2_agent]]
            [[inputs.jolokia2_proxy]]
  2. Modify /opt/collectd/etc/collectd.conf as follows:
            <Plugin network>
                # client setup:
                    Server "<InfluxDB server IP>" "25826"
            </Plugin>
            # Before monitoring a Java application, enable JMX remote monitoring on it
            <Connection>
                Host "<Java application server IP>:<JMX port>"
                ServiceURL "service:jmx:rmi:///jndi/rmi://<Java application server IP>:<JMX port>/jmxrmi"
                Collect "memory-heap"
                Collect "memory-nonheap"
                Collect "memory_pool"
                Collect "cpu"
                Collect "thread"
                Collect "gc-count"
                Collect "gc-time"
            </Connection>
            # To monitor multiple Java applications, add multiple Connection blocks
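Before starting Collectd, it can save time to confirm that the JMX port of each Java application is reachable from the Collectd host. The sketch below is an assumption-laden shortcut rather than a full JMX handshake: it only tests that a TCP connection can be opened, and `can_connect` is a hypothetical helper name.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # refused, unreachable, or timed out
```

For instance, `can_connect("192.168.57.10", 8888)` would check the JMX port used in the examples above. A successful connection does not prove that JMX data can be read, because JMX may use a second port for the actual data exchange.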
    3. Run

        /opt/collectd/sbin/collectd

    4. Troubleshooting
  1. Error message: lt_dlopen ("/opt/collectd/lib/collectd/java.so") failed: file not found.
    Cause: libjvm.so is not loaded.
    Fix:
    (1) Open /etc/profile and add the following:
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/java/jdk1.7.0_79/jre/lib/amd64/server:/home/java/jdk1.7.0_79/jre/lib/amd64
    Tip: where the JDK libraries are loaded from depends on the JDK installation path, so adjust the paths above to match the actual location
    (2) Run source /etc/profile to make the setting take effect
    (3) Restart collectd: /opt/collectd/sbin/collectd
  2. Error message: "java.so not found" (still reported after the environment variable change from item 1)
    Cause: libjvm.so is not loaded.
    Fix: add the directory containing libjvm.so to /etc/ld.so.conf, then run ldconfig and restart collectd so that the change takes effect and java.so can be loaded
    Note: entries in ld.so.conf apply system-wide, whereas a variable exported in /etc/profile only reaches processes started from a shell that has sourced it
  3. Error message:
     [Figure: error screenshot]
    Fix: copy a lib directory into the var directory path
  4. Error message: Looking up "hostname" failed.
    Cause: the hostname is not bound to an IP address
    Fix:
    (1) Open /etc/hosts and add the following:
    <host IP address> <hostname>
    Tip: run the hostname command to find the hostname
    (2) Save the file; changes to /etc/hosts take effect immediately, so there is no need to source it
    (3) Stop and restart collectd
  5. Error message: Grafana shows no monitoring data and no JVM monitoring data is written to InfluxDB.
    Cause: when the JMX server is enabled with the JVM parameter
    -Dcom.sun.management.jmxremote.port=XXXX to listen on port XXXX, a second, randomly chosen data communication port YYYY is also opened
    Fix: pin this random port by adding the following to the JVM parameters:
    -Dcom.sun.management.jmxremote.rmi.port=XXXX (the same port as configured above)
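Putting the JMX flags from the configuration step together with the fixed RMI port, the full option list can be generated in one place so the two ports cannot drift apart. This is a sketch only; `jmx_options` is a made-up helper, and the host and port values are the example ones used earlier.

```python
def jmx_options(hostname, port, rmi_port=None):
    """Return JVM flags for unauthenticated, non-SSL JMX remote monitoring."""
    if rmi_port is None:
        rmi_port = port  # reuse the JMX port so no random port is opened
    return [
        "-Dcom.sun.management.jmxremote",
        "-Djava.rmi.server.hostname={}".format(hostname),
        "-Dcom.sun.management.jmxremote.port={}".format(port),
        "-Dcom.sun.management.jmxremote.rmi.port={}".format(rmi_port),
        "-Dcom.sun.management.jmxremote.ssl=false",
        "-Dcom.sun.management.jmxremote.authenticate=false",
    ]


# Print the flags as one string, ready to paste into CATALINA_OPTS or JAVA_OPTS.
print(" ".join(jmx_options("192.168.57.10", 8888)))
```

With both ports fixed, only one port per application needs to be opened in any firewall between Collectd and the Java host.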

4. Grafana

    1. Install

    yum localinstall grafana-5.4.2-1.x86_64.rpm

    2. Configure

     Modify /etc/grafana/grafana.ini as follows:
        [smtp]
            enabled = true

    3. Run

        service grafana-server start

    4. Monitor
  1. Log in
         Open http://<server IP>:3000 in a browser, enter admin/admin, and click [Login];
  2. Troubleshooting
    1. If grafana-server fails to start or nothing is listening on port 3000, check the logs under the /var/log/grafana directory for an entry like this:
      t=2018-08-10T10:36:34+0800 lvl=eror msg="Failed to verify pid directory" logger=server error="mkdir /var/run/grafana: permission denied"
      Cause: the /var/run/grafana directory cannot be created because of permissions, so the grafana-server.pid file cannot be generated, and Grafana cannot start normally without it.
      Fix: run the following commands in order:
      mkdir /var/run/grafana
      chmod -R 777 /var/run/grafana
    2. When viewing monitoring data, hovering the mouse over a graph shows data, but nothing is drawn and the panel reports "Data points outside time range":
      Cause: the system time on the machine running the browser is wrong, so the data cannot be displayed
      Fix: set the system time of the machine running the browser to the correct current time, close the browser, and reopen the monitoring page; it then displays normally
    3. The monitoring page shows no data, and in InfluxDB the time field plus 8 hours does not match the current time
      Cause: the time zone of the InfluxDB server is incorrect and needs to be changed to CST (China Standard Time).
      Fix: change the system time zone
      timedatectl set-timezone Asia/Shanghai
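The "plus 8 hours" check in the last item follows from InfluxDB returning timestamps in UTC while CST is UTC+8. As a small sketch, the conversion can be done like this; it assumes whole-second RFC 3339 timestamps ending in Z, and `influx_time_to_cst` is an illustrative name.

```python
from datetime import datetime, timedelta, timezone

# China Standard Time as a fixed UTC+8 offset (no daylight saving).
CST = timezone(timedelta(hours=8), "CST")


def influx_time_to_cst(timestamp):
    """Convert a UTC timestamp such as '2022-06-12T05:22:00Z' to CST."""
    utc = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return utc.replace(tzinfo=timezone.utc).astimezone(CST)


print(influx_time_to_cst("2022-06-12T05:22:00Z").isoformat())
# 2022-06-12T13:22:00+08:00
```

If the converted time of the newest point is far from the wall-clock time on the server, the time zone or the clock itself is the likely culprit.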

Copyright notice
This article was written by [Government officials]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/03/202203010618043865.html