Performance test - GTI application service performance monitoring platform
2022-06-12 05:22:00 【Government officials】
GTI Application service performance monitoring platform
0. Downloads
- InfluxDB: https://portal.influxdata.com/downloads
- Grafana: https://grafana.com/grafana/download?platform=linux
- Telegraf: https://github.com/influxdata/telegraf/releases
- Collectd: http://collectd.org/download.shtml
1. Introduction to the GTI application service performance monitoring platform
To build a real-time monitoring platform as polished as New Relic or OneAPM, we only need InfluxDB, Collectd and Telegraf, and Grafana. The tools relate to each other as follows:
Collect data (Collectd & Telegraf) -> Store data (InfluxDB) -> Display data (Grafana)
- InfluxDB is an open source distributed time series database written in Go, well suited to storing metrics, events, and analytics data. It can be deployed independently on any host;
- Collectd is a system performance collection tool written in C. It can be deployed independently on any host. We use it to monitor the performance of Java applications (WebApp, NativeApp). Its working principle is as follows:
[Figure: how Collectd works]
- Telegraf is a system metrics collection tool written in Go. It must be deployed on each system or middleware server to be monitored. We use it to monitor the performance of server systems and mainstream middleware;
- Grafana is a visualization tool with a JavaScript front end, used to query InfluxDB, build custom reports, and display charts. It can be deployed independently on any host.
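The collect -> store -> display chain can be made concrete with a small sketch of what a collector hands to the storage layer. The example below formats one data point in InfluxDB line protocol and builds the InfluxDB 1.x write URL. The function names, the sample measurement `cpu`, and the tag `host=app01` are illustrative assumptions, and escaping of spaces or commas in names is deliberately left out.

```python
from urllib.parse import urlencode


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as: measurement,tag=value field=value timestamp.

    Assumes names and values contain no spaces, commas, or equals signs,
    so no escaping is performed (a real collector must escape them).
    """
    tag_part = "".join(f",{key}={value}" for key, value in sorted(tags.items()))
    field_items = []
    for key, value in sorted(fields.items()):
        if isinstance(value, bool):
            field_items.append(f"{key}={'true' if value else 'false'}")
        elif isinstance(value, int):
            field_items.append(f"{key}={value}i")  # trailing i marks an integer
        elif isinstance(value, float):
            field_items.append(f"{key}={value}")
        else:
            field_items.append(f'{key}="{value}"')  # strings are double-quoted
    return f"{measurement}{tag_part} {','.join(field_items)} {timestamp_ns}"


def write_url(host, database, port=8086):
    """URL of the InfluxDB 1.x write endpoint for the given database."""
    return f"http://{host}:{port}/write?" + urlencode({"db": database})


if __name__ == "__main__":
    print(write_url("127.0.0.1", "ktv"))
    print(to_line_protocol("cpu", {"host": "app01"}, {"usage_idle": 97.5},
                           1654982520000000000))
```

Telegraf and Collectd produce and ship such points themselves; the sketch only shows the shape of the data that ends up in InfluxDB and that Grafana later queries.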

1. InfluxDB
- Install
- yum localinstall influxdb-1.2.4.x86_64.rpm
- Copy types.db to /usr/local/share/collectd/ (create the directory if it does not exist; the path can be customized)
- chmod 777 /usr/local/share/collectd/types.db
- Note: later InfluxDB versions removed the graphical admin page. Unless you have a special need, installing version 1.2.4 is recommended
- Configure
- Modify /etc/influxdb/influxdb.conf as follows:
- reporting-disabled = true (prevents InfluxDB from reporting usage information to its official site)
- bind-address = ":8090" (add this entry; first run netstat -lnp | grep 8090 to check whether the port is already in use)
- [admin]
enabled = true
bind-address = ":8083"
- [http]
enabled = true
bind-address = ":8086"
- [[collectd]]
enabled = true
bind-address = "server IP:25826"
database = "ktv" (naming the database after the business is recommended)
typesdb = "/usr/local/share/collectd/types.db"
batch-size = 5000
batch-pending = 10
batch-timeout = "10s"
read-buffer = 0
- Run
- service influxdb start
- Errors
- Message: Redirecting to /bin/systemctl start influxdb.service
Cause: the Linux distribution is RedHat 7, CentOS 7, or Fedora, which does not manage services through the service command; use the systemctl command to start the service instead.
Note: to view the Linux distribution version, run lsb_release -a. If the command does not exist, install it with yum install lsb
Fix: systemctl start influxdb
- Create the database
- Open http://server IP:8083 in a browser
- In the Query field, enter CREATE DATABASE "ktv" and press Enter
- InfluxDB versions after 1.2.4 remove the admin page, so use the command-line client instead
- Connect: influx -host 127.0.0.1 -port 8086
- Create the database: CREATE DATABASE "ktv"
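When neither the admin page nor the influx client is at hand, the same statement can be sent to the InfluxDB 1.x HTTP API, which accepts queries on the /query endpoint. The sketch below only builds the POST request; the helper name `build_create_db_request` is an illustrative assumption, and it assumes a plain database name with no quotes and an InfluxDB instance without authentication.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_create_db_request(host, database, port=8086):
    """Build a POST to /query carrying CREATE DATABASE "<database>"."""
    statement = f'CREATE DATABASE "{database}"'
    body = urlencode({"q": statement}).encode("ascii")
    return Request(f"http://{host}:{port}/query", data=body, method="POST")


if __name__ == "__main__":
    request = build_create_db_request("127.0.0.1", "ktv")
    print(request.get_method(), request.full_url)
    print(request.data.decode("ascii"))
    # To actually send it against a running server:
    # from urllib.request import urlopen
    # print(urlopen(request).read().decode("utf-8"))
```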
2. Telegraf
1. Linux version
- Install
yum localinstall telegraf-1.10.1-1.x86_64.rpm
- Upgrade
- Back up the configuration file: /etc/telegraf/telegraf.conf
- Query the installed version: rpm -qa | grep telegraf
- Uninstall the installed version: rpm -e --nodeps telegraf-1.2.1-1.x86_64
- Install the latest version: yum localinstall telegraf-1.10.1-1.x86_64.rpm
- Configure
- Basic configuration
(1) Modify /etc/telegraf/telegraf.conf as follows:
[agent]
logfile = "/var/log/telegraf/telegraf.log"
[[outputs.influxdb]]
urls = ["http://InfluxDB server IP:8086"]
database = "actual database name"
- Middleware configuration
To monitor a middleware, configure its IP, port, and access credentials. To monitor multiple instances, separate them with commas.
Example:
[[inputs.redis]]
servers = ["tcp://192.168.57.10:6379", "tcp://192.168.57.10:6378"]
[[inputs.zookeeper]]
servers = ["192.168.57.10:2181", "192.168.57.10:2182"]
- Network monitoring configuration
In /etc/telegraf/telegraf.conf, uncomment the following two input sections to turn on network monitoring:
[[inputs.net]]
[[inputs.netstat]]
- Process monitoring configuration
- Configure as follows to monitor one or more processes by matching the process list
Modify /etc/telegraf/telegraf.conf as follows:
[[inputs.procstat]]
pattern = "processName"
Use the command pgrep -f processName to check that the pattern matches the target process: the command should return exactly one process ID, and it should belong to the process to be monitored. Adjust processName until the result is unique. To monitor multiple processes, add multiple [[inputs.procstat]] sections, such as:
[[inputs.procstat]]
pattern = "processName1"
[[inputs.procstat]]
pattern = "processName2"
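The uniqueness check on the pgrep result can be automated before a pattern is written into telegraf.conf. The following is a minimal sketch: `pgrep_full` shells out to pgrep -f, and `is_unique_match` decides whether its output contains exactly one process ID. Both function names are illustrative assumptions. Note that if the pattern is passed on the command line of this script, the script itself may match, so the pattern is read interactively here.

```python
import subprocess


def parse_pids(pgrep_output):
    """Turn pgrep output (one PID per line) into a list of integers."""
    return [int(token) for token in pgrep_output.split()]


def is_unique_match(pgrep_output):
    """True only when exactly one process matched the pattern."""
    return len(parse_pids(pgrep_output)) == 1


def pgrep_full(pattern):
    """Run pgrep -f and return its standard output (empty if nothing matches)."""
    result = subprocess.run(["pgrep", "-f", pattern],
                            capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    pattern = input("procstat pattern to check: ")
    output = pgrep_full(pattern)
    if is_unique_match(output):
        print("pattern is unique, PID", parse_pids(output)[0])
    else:
        print("pattern matched", len(parse_pids(output)), "processes; refine it")
```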
- GPU monitoring configuration
Telegraf currently supports GPU monitoring for NVIDIA cards. It calls the information display program bundled with the GPU driver (nvidia-smi) to collect the data. Add the following to telegraf.conf:
[[inputs.nvidia_smi]]
## Optional: path to nvidia-smi binary, defaults to $PATH via exec.LookPath
# bin_path = "/usr/bin/nvidia-smi"
## Optional: timeout for GPU polling
# timeout = "5s"
- Monitoring configuration extensions
When Telegraf cannot monitor the performance metrics of some application component or middleware, and a monitoring script or command is needed instead, use the [[inputs.exec]] section. To monitor multiple instances, add multiple [[inputs.exec]] sections and distinguish the instances with the name_suffix parameter.
As shown below:
[[inputs.exec]]
---- To run multiple commands or scripts, separate them with commas. If the redis password contains special characters, they must be escaped
commands = [
"/redis-3.0.5-master/src/redis-cli -h 192.168.57.10 -p 6379 slowlog len"
]
---- Table (measurement) name. The default is exec; it can be customized
name_override = "redis_slowlog"
timeout = "5s"
---- Table name suffix, used to distinguish monitored instances. It can be omitted when only a single instance is monitored
name_suffix = "_192.168.57.10"
---- Data format
data_format = "value"
data_type = "integer"
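To see what the [[inputs.exec]] settings above amount to, the sketch below mimics two steps in plain Python: composing the table name from name_override and name_suffix, and turning the command output into a single integer field named value, as the "value" data format does. It is an illustration only, not Telegraf code. It assumes the command prints a bare integer, and it omits the tags (such as host) and the timestamp that Telegraf adds.

```python
def measurement_name(name_override="", name_suffix="", default="exec"):
    """Table name: the override (or the default) with the suffix appended."""
    return (name_override or default) + name_suffix


def value_to_line(command_output, name_override="", name_suffix=""):
    """Render command output as '<table> value=<n>i' (integer field)."""
    value = int(command_output.strip())
    return f"{measurement_name(name_override, name_suffix)} value={value}i"


if __name__ == "__main__":
    # Hypothetical output of the slowlog command: the single line "5"
    print(value_to_line("5\n", "redis_slowlog", "_192.168.57.10"))
```

With one [[inputs.exec]] section per Redis instance, each instance therefore ends up in its own table, which keeps Grafana queries simple.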
- Run
service telegraf start
- Errors
If Telegraf fails to start, or starts successfully but produces no monitoring data, check the console output or the logs under the /var/log/telegraf directory.
3. Collectd
1. Install
Copy collectd to /opt
2. Configure
- Turn on JMX remote monitoring for the WebApp (container-based) or NativeApp (standalone application):
Example: JMX remote monitoring configuration for Tomcat
Modify the file /usr/local/tomcat7/bin/catalina.sh:
Find the following line:
# ----- Execute The Requested Command -----
Add the following above this line:
CATALINA_OPTS="$CATALINA_OPTS
-Dcom.sun.management.jmxremote
-Djava.rmi.server.hostname=XXX.XXX.XXX.XXX
-Dcom.sun.management.jmxremote.port=8888
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false"
Example: JMX remote monitoring configuration for a standalone Java application
Modify the start.sh file
Add the following lines:
JAVA_OPTS="$JAVA_OPTS
-Dcom.sun.management.jmxremote
-Djava.rmi.server.hostname=XXX.XXX.XXX.XXX
-Dcom.sun.management.jmxremote.port=8888
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false"
Example: JMX remote monitoring configuration for a SpringBoot application
Modify the file /etc/telegraf/telegraf.conf
The following sections can be configured:
[[inputs.jolokia]]
[[inputs.jolokia2_agent]]
[[inputs.jolokia2_proxy]]
- Modify /opt/collectd/etc/collectd.conf as follows:
<Plugin network>
# client setup:
Server "InfluxDB server IP" "25826"
</Plugin>
---- JMX remote monitoring must be turned on before a Java application can be monitored
<Connection>
Host "Java application server IP:JMX port"
ServiceURL "service:jmx:rmi:///jndi/rmi://Java application server IP:JMX port/jmxrmi"
Collect "memory-heap"
Collect "memory-nonheap"
Collect "memory_pool"
Collect "cpu"
Collect "thread"
Collect "gc-count"
Collect "gc-time"
</Connection>
---- To monitor multiple Java applications, add multiple Connection blocks
3. Run
/opt/collectd/sbin/collectd
4. Errors
- Message: lt_dlopen ("/opt/collectd/lib/collectd/java.so") failed: file not found.
Cause: libjvm.so is not loaded.
Fix:
(1) Open /etc/profile and add the following:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/java/jdk1.7.0_79/jre/lib/amd64/server:/home/java/jdk1.7.0_79/jre/lib/amd64
Tip: the JDK library path depends on where the JDK is installed. Check the installation path and adjust the line accordingly
(2) Run source /etc/profile to apply the change
(3) Restart collectd: /opt/collectd/sbin/collectd
- Message: "Did not find java.so" (still reported after the environment variable change from the first problem)
Cause: libjvm.so is not loaded.
Fix: add the directory that contains libjvm.so to /etc/ld.so.conf, run ldconfig, and restart collectd. This resolves the missing java.so problem
Note: ld.so.conf applies system-wide, regardless of whether /etc/profile has been loaded by the shell that starts collectd
- Message:
Fix: copy a lib into the var folder path
- Message: Looking up "hostname" failed.
Cause: the hostname is not bound to an IP address
Fix:
(1) Open /etc/hosts and add the following:
host IP address hostname
Tip: query the hostname with the hostname command
(2) Save the file. Changes to /etc/hosts take effect immediately, so no reload command is needed
(3) Stop and restart collectd
- Message: Grafana shows no monitoring data and no JVM monitoring data is written to InfluxDB.
Cause: when the JMX server is started with the JVM parameter
-Dcom.sun.management.jmxremote.port=XXXX to listen on port XXXX, it also opens a second, randomly chosen data communication port YYYY, which a firewall between Collectd and the application may block
Fix: pin this random port by adding the following JVM parameter:
-Dcom.sun.management.jmxremote.rmi.port=XXXX (the same port as configured above)
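Because the registry port and the RMI data port must be pinned to the same value, and the same host and port reappear in the Collectd ServiceURL, it can help to generate both from one place. The sketch below does that. The function names and the sample host 192.168.57.10 are illustrative assumptions; the option names and the URL layout are the ones used in this article.

```python
def jmx_options(hostname, port):
    """JVM options for unauthenticated JMX with both ports pinned to `port`."""
    return [
        "-Dcom.sun.management.jmxremote",
        f"-Djava.rmi.server.hostname={hostname}",
        f"-Dcom.sun.management.jmxremote.port={port}",
        # Pin the otherwise random data communication port to the same value
        f"-Dcom.sun.management.jmxremote.rmi.port={port}",
        "-Dcom.sun.management.jmxremote.ssl=false",
        "-Dcom.sun.management.jmxremote.authenticate=false",
    ]


def jmx_service_url(hostname, port):
    """ServiceURL value for the matching Connection block in collectd.conf."""
    return f"service:jmx:rmi:///jndi/rmi://{hostname}:{port}/jmxrmi"


if __name__ == "__main__":
    print(" ".join(jmx_options("192.168.57.10", 8888)))
    print(jmx_service_url("192.168.57.10", 8888))
```

Disabling SSL and authentication, as the article does, is only reasonable on a trusted internal network.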
4. Grafana
1. Install
yum localinstall grafana-5.4.2-1.x86_64.rpm
2. Configure
Modify /etc/grafana/grafana.ini as follows:
[smtp]
enabled = true
3. Run
service grafana-server start
4. Monitor
- Log in
Open http://server IP:3000 in a browser, enter admin/admin, and click [Login] to sign in;
- Errors
- If grafana-server fails to start, or nothing is listening on port 3000, check the logs under the /var/log/grafana directory, for example:
t=2018-08-10T10:36:34+0800 lvl=eror msg="Failed to verify pid directory" logger=server error="mkdir /var/run/grafana: permission denied"
Cause: the directory /var/run/grafana cannot be created because of permissions, so the grafana-server.pid file cannot be generated. Grafana cannot start normally without this file.
Fix: run the following commands in order:
mkdir /var/run/grafana
chmod -R 777 /var/run/grafana
- Hovering the mouse over a panel shows data, but no graph is drawn, and the panel reports "Data points outside time range":
Cause: the system time of the machine running the browser is wrong, so the data cannot be displayed
Fix: set the system time of the machine running the browser to the correct current time, close the browser, and reopen the monitoring page. It then displays normally
- The monitoring page has no data, and in InfluxDB the time field plus 8 hours does not equal the current time
Cause: the time zone of the InfluxDB server is incorrect. It needs to be changed to the CST time zone.
Fix: change the system time zone
timedatectl set-timezone Asia/Shanghai
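The "plus 8 hours" check in the last item can be done with a few lines of code. The sketch below converts a UTC timestamp into CST (UTC+8) so it can be compared with the wall clock. It assumes the timestamp is rendered in RFC3339 form with second precision, such as 2022-06-11T21:22:00Z; the function name `influx_time_to_cst` is an illustrative assumption.

```python
from datetime import datetime, timedelta, timezone

# China Standard Time is a fixed offset of UTC+8 with no daylight saving
CST = timezone(timedelta(hours=8), "CST")


def influx_time_to_cst(rfc3339_utc):
    """Convert a UTC timestamp like 2022-06-11T21:22:00Z to CST."""
    utc_time = datetime.strptime(rfc3339_utc, "%Y-%m-%dT%H:%M:%SZ")
    return utc_time.replace(tzinfo=timezone.utc).astimezone(CST)


if __name__ == "__main__":
    stored = "2022-06-11T21:22:00Z"
    local = influx_time_to_cst(stored)
    print(stored, "->", local.strftime("%Y-%m-%d %H:%M:%S %Z"))
    # Compare with the current time in the same zone
    print("now:", datetime.now(CST).strftime("%Y-%m-%d %H:%M:%S %Z"))
```

If the converted time of the newest data point is far from the current time, the server clock or time zone is the likely culprit.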