Grafana fails to start with the error: Grafana-server Init Failed: Could not find config defaults, make sure homepath command
2022-06-10 08:21:00 【羌俊恩】
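I. Problem description
A project's monitoring server runs Prometheus + Grafana. Because of a planning oversight, the server was provisioned with too little disk space, and a single day of data filled the root partition, taking down both Prometheus and Grafana. After space was freed, Grafana still would not start and reported the following errors: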
Jun 09 11:40:49 2-bc-hb-56-centos7 systemd[1]: grafana-server.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Jun 09 11:40:49 2-bc-hb-56-centos7 systemd[1]: Failed to start Grafana instance.
-- Subject: Unit grafana-server.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit grafana-server.service has failed.
--
-- The result is failed.
Jun 09 11:40:49 2-bc-hb-56-centos7 systemd[1]: Unit grafana-server.service entered failed state.
Jun 09 11:40:49 2-bc-hb-56-centos7 systemd[1]: grafana-server.service failed.
Jun 09 11:40:49 2-bc-hb-56-centos7 systemd[1]: grafana-server.service holdoff time over, scheduling restart.
Jun 09 11:40:49 2-bc-hb-56-centos7 systemd[1]: start request repeated too quickly for grafana-server.service
Jun 09 11:40:49 2-bc-hb-56-centos7 systemd[1]: Failed to start Grafana instance.
-- Subject: Unit grafana-server.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit grafana-server.service has failed.
--
-- The result is failed.
Jun 09 11:40:49 2-bc-hb-56-centos7 systemd[1]: Unit grafana-server.service entered failed state.
Jun 09 11:40:49 2-bc-hb-56-centos7 systemd[1]: grafana-server.service failed.
On site, Grafana is started with systemctl start grafana-server.service.
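Because the outage began with a full root partition, it can help to confirm that there is free space again before each restart attempt. The following is a minimal sketch and not part of the original troubleshooting; the helper names usage_pct and has_headroom and the 90% threshold are assumptions chosen for illustration.

```shell
#!/bin/sh
# Print the used-space percentage (0-100) of the filesystem holding a path.
usage_pct() (
    # -P forces the POSIX output format: one line per filesystem,
    # with the "Capacity" column in field 5.
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
)

# Return 0 when usage is below the threshold (assumed default: 90).
has_headroom() (
    pct=$(usage_pct "$1") || exit 2
    [ "$pct" -lt "${2:-90}" ]
)

# Example (not run here):
#   has_headroom / 90 && systemctl start grafana-server
```

If systemd has already given up with "start request repeated too quickly", systemctl reset-failed grafana-server clears that state before the next attempt.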
II. Analysis and resolution
1. journalctl -xe shows no obvious error message, only the following stack trace from the program:
Jun 09 11:40:49 2-bc-hb-56-centos7 grafana-server[28554]: /drone/src/pkg/services/sqlstore/sqlstore.go:135 +0x6f
Jun 09 11:40:49 2-bc-hb-56-centos7 grafana-server[28554]: github.com/grafana/grafana/pkg/services/sqlstore.ProvideService(0xc0005da000, 0x18, {0x3bb1
Jun 09 11:40:49 2-bc-hb-56-centos7 grafana-server[28554]: /drone/src/pkg/services/sqlstore/sqlstore.go:67 +0xdc
Jun 09 11:40:49 2-bc-hb-56-centos7 grafana-server[28554]: github.com/grafana/grafana/pkg/server.Initialize({{0x7fff108b8cc3, 0x18}, {0x0, 0x0}, {0xc0
Jun 09 11:40:49 2-bc-hb-56-centos7 grafana-server[28554]: /drone/src/pkg/server/wire_gen.go:190 +0x1f8
Jun 09 11:40:49 2-bc-hb-56-centos7 grafana-server[28554]: github.com/grafana/grafana/pkg/cmd/grafana-server/commands.executeServer({0x7fff108b8cc3, 0
Jun 09 11:40:49 2-bc-hb-56-centos7 grafana-server[28554]: /drone/src/pkg/cmd/grafana-server/commands/cli.go:170 +0x625
Jun 09 11:40:49 2-bc-hb-56-centos7 grafana-server[28554]: github.com/grafana/grafana/pkg/cmd/grafana-server/commands.RunServer({{0x3b22c68, 0x5}, {0x
Jun 09 11:40:49 2-bc-hb-56-centos7 grafana-server[28554]: /drone/src/pkg/cmd/grafana-server/commands/cli.go:107 +0x785
Jun 09 11:40:49 2-bc-hb-56-centos7 grafana-server[28554]: main.main()
Jun 09 11:40:49 2-bc-hb-56-centos7 grafana-server[28554]: /drone/src/pkg/cmd/grafana-server/main.go:16 +0xc5
Jun 09 11:40:49 2-bc-hb-56-centos7 systemd[1]: grafana-server.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Jun 09 11:40:49 2-bc-hb-56-centos7 systemd[1]: Failed to start Grafana instance.
-- Subject: Unit grafana-server.service has failed

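As shown above, a program error appears to have caused an OS-level exception (a memory read/write fault). Note that the visible frames run through sqlstore.ProvideService, that is, the failure occurs while Grafana is initializing its database layer.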
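journalctl -xe shows only the tail of the trace. In a Go program, the line that states the cause begins with "panic:" and sits at the top of the trace, so it is worth pulling that line out explicitly. A minimal sketch, assuming the unit is named grafana-server as in this post; first_panic is a hypothetical helper name.

```shell
#!/bin/sh
# Read log text on stdin and print the first line containing "panic:"
# followed by the next N lines (default 5).
first_panic() (
    awk -v max="${1:-5}" '
        /panic:/ && !found { found = 1 }
        found { print; if (n++ >= max) exit }
    '
)

# Example (not run here): search the current boot's log for the unit.
#   journalctl -u grafana-server -b --no-pager | first_panic 8
```

The helper only filters text, so it works the same on /var/log/grafana/grafana.log or on a saved copy of the journal.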
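2. The Grafana application log, /var/log/grafana/grafana.log, shows no application-level errors either; almost every entry is at info level.
3. Start the server directly from the command line: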
/usr/sbin/grafana-server --config=${CONF_FILE} --pidfile=${PID_FILE_DIR}/grafana-server.pid --packaging=rpm cfg:default.paths.logs=${LOG_DIR} cfg:default.paths.data=${DATA_DIR} cfg:default.paths.plugins=${PLUGINS_DIR} cfg:default.paths.provisioning=${PROVISIONING_CFG_DIR}
This fails with:
Grafana server is running with elevated privileges. This is not recommended
Grafana-server Init Failed: Could not find config defaults, make sure homepath command line parameter is set or working directory is homepath
The following simplified command, however, starts normally:
/usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --homepath=/usr/share/grafana
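The error text itself names the two ways out: pass --homepath, or run from the home directory. Grafana looks for conf/defaults.ini under its home path, so a quick test of whether a directory qualifies can be sketched as follows. is_grafana_home is a hypothetical helper, and the check assumes the conf/defaults.ini layout described in the appendix.

```shell
#!/bin/sh
# Return 0 if DIR (default: current directory) contains conf/defaults.ini,
# i.e. it looks like a Grafana home path.
is_grafana_home() (
    [ -f "${1:-.}/conf/defaults.ini" ]
)

# Example (not run here): decide whether --homepath is needed.
#   is_grafana_home "$PWD" || echo "pass --homepath=/usr/share/grafana"
```

This may also explain why the packaged service does not need the flag: a unit file can set WorkingDirectory to the home path, which systemctl cat grafana-server would show.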
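4. Check the systemd unit file, /usr/lib/systemd/system/grafana-server.service. The service must run as the grafana user, which has no login permission on the system; that is why starting it as root above prints the elevated-privileges warning. As for the message that Grafana cannot find the config defaults, the configuration shows nothing abnormal and every path is correct.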
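The settings in /etc/sysconfig/grafana-server are explained below:
GRAFANA_USER=grafana #system user
GRAFANA_GROUP=grafana #system group
GRAFANA_HOME=/usr/share/grafana #home directory, default location of static assets; back it up before upgrading
LOG_DIR=/var/log/grafana #log directory
DATA_DIR=/var/lib/grafana #default data directory; back it up before upgrading
MAX_OPEN_FILES=10000 #maximum number of open files
CONF_DIR=/etc/grafana #configuration directory; back it up before upgrading
CONF_FILE=/etc/grafana/grafana.ini #main configuration file
RESTART_ON_UPGRADE=true #restart the service on upgrade
PLUGINS_DIR=/var/lib/grafana/plugins #plugin directory
PROVISIONING_CFG_DIR=/etc/grafana/provisioning #provisioning directory: data sources and dashboards are configured from files here rather than through the Grafana UI
#Only used on systemd systems
PID_FILE_DIR=/var/run/grafana #directory for the PID file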
/etc/sysconfig/grafana-server settings with modified paths:
GRAFANA_USER=grafana
GRAFANA_GROUP=grafana
GRAFANA_HOME=/usr/share/grafana
#modified logs and data directories
LOG_DIR=/opt/grafana/logs
DATA_DIR=/opt/grafana/data
MAX_OPEN_FILES=10000
#modified configuration paths
CONF_DIR=/opt/grafana/conf
CONF_FILE=/opt/grafana/conf/grafana.ini
RESTART_ON_UPGRADE=true
#modified plugins directory
PLUGINS_DIR=/opt/grafana/plugins
PROVISIONING_CFG_DIR=/etc/grafana/provisioning
#Only used on systemd systems
#modified PID file directory
PID_FILE_DIR=/opt/grafana
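Substituting these variables by hand is error-prone, so one option is to let the shell expand them. The sketch below sources an environment file in a subshell and prints the resulting command line without running it. print_grafana_cmd is a hypothetical helper; adding --homepath from GRAFANA_HOME is an assumption based on the simplified command that started successfully earlier.

```shell
#!/bin/sh
# Print the grafana-server command line that results from an environment
# file (default: /etc/sysconfig/grafana-server). Nothing is executed.
print_grafana_cmd() (
    env_file=${1:-/etc/sysconfig/grafana-server}
    # Source inside the subshell so the caller's variables are untouched.
    . "$env_file" || exit 1
    printf '%s\n' "/usr/sbin/grafana-server --config=${CONF_FILE} --pidfile=${PID_FILE_DIR}/grafana-server.pid --homepath=${GRAFANA_HOME} --packaging=rpm cfg:default.paths.logs=${LOG_DIR} cfg:default.paths.data=${DATA_DIR} cfg:default.paths.plugins=${PLUGINS_DIR} cfg:default.paths.provisioning=${PROVISIONING_CFG_DIR}"
)

# Example (not run here):
#   print_grafana_cmd /etc/sysconfig/grafana-server
```

Printing rather than executing makes it easy to compare the expanded command against what the unit file actually runs.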
5. Check the ownership of the Grafana files: some were owned by root. Changing them to grafana did not help.
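An ownership check like this one can be done in a single pass with find. A minimal sketch; not_owned_by is a hypothetical helper, and the directories in the usage comment are the package defaults listed earlier.

```shell
#!/bin/sh
# List everything under the given paths that is NOT owned by USER
# (a user name or a numeric uid).
not_owned_by() (
    user=$1
    shift
    find "$@" ! -user "$user" -print
)

# Example (not run here):
#   not_owned_by grafana /var/lib/grafana /var/log/grafana
```

Empty output means every file under those paths already belongs to the given user.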
6. Substitute the variables in the startup command with their values (taken from the environment file /etc/sysconfig/grafana-server):
/usr/sbin/grafana-server \
--config=/etc/grafana/grafana.ini \
--pidfile=/var/run/grafana/grafana-server.pid \
--packaging=rpm \
cfg:default.paths.logs=/var/log/grafana \
cfg:default.paths.data=/var/lib/grafana \
cfg:default.paths.plugins=/var/lib/grafana/plugins \
cfg:default.paths.provisioning=/etc/grafana/provisioning
The same error is still reported:
Grafana server is running with elevated privileges. This is not recommended
Grafana-server Init Failed: Could not find config defaults, make sure homepath command line parameter is set or working directory is homepath
Add the --homepath parameter and start again:
/usr/sbin/grafana-server \
--config=/etc/grafana/grafana.ini \
--pidfile=/var/run/grafana/grafana-server.pid \
--homepath=/usr/share/grafana \
--packaging=rpm \
cfg:default.paths.logs=/var/log/grafana \
cfg:default.paths.data=/var/lib/grafana \
cfg:default.paths.plugins=/var/lib/grafana/plugins \
cfg:default.paths.provisioning=/etc/grafana/provisioning
The previous error no longer appears; instead, the server panics.
Appending force_migration=true to the configuration file, however, did not help.
/usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid --homepath=/usr/share/grafana --packaging=rpm cfg:default.paths.logs=/var/log/grafana
Run:
grafana-cli admin reset-admin-password "123456" --homepath "/usr/share/grafana"

The password can also be changed through the HTTP API:
curl -X PUT -H "Content-Type: application/json" -d '{ "oldPassword": "admin", "newPassword": "newpass", "confirmNew": "newpass" }' http://admin:admin@<your_grafana_host>:3000/api/user/password
The SQLite database file that Grafana uses, /var/lib/grafana/grafana.db, already has mode 640 and needs no change.
7. Experience suggests that inconsistent Grafana data files can also prevent startup. Check as follows:
du -sh /usr/share/grafana/data/grafana.db /var/lib/grafana/grafana.db
944K /usr/share/grafana/data/grafana.db
2.5M /var/lib/grafana/grafana.db
#sync the files
rsync -av /var/lib/grafana/grafana.db /usr/share/grafana/data/grafana.db
sending incremental file list
grafana.db
sent 2,560,734 bytes received 35 bytes 5,121,538.00 bytes/sec
total size is 2,560,000 speedup is 1.00
#check again
du -sh /usr/share/grafana/data/grafana.db /var/lib/grafana/grafana.db
2.5M /usr/share/grafana/data/grafana.db
2.5M /var/lib/grafana/grafana.db
#restart to verify
systemctl start grafana-server #still fails
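du -sh compares only rounded sizes, so two files of equal size can still differ. A byte-for-byte comparison is a stricter check. A minimal sketch; compare_files is a hypothetical helper built on cmp.

```shell
#!/bin/sh
# Print "identical" (exit 0) or "different" (exit 1) for two files.
# A missing or unreadable file also counts as "different".
compare_files() (
    if cmp -s "$1" "$2"; then
        echo identical
    else
        echo different
        exit 1
    fi
)

# Example (not run here):
#   compare_files /usr/share/grafana/data/grafana.db /var/lib/grafana/grafana.db
```

Even identical copies do not show which file Grafana actually opens; that depends on the data path in effect at startup.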
Try starting with the data directory changed to /usr/share/grafana:
/usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid --homepath=/usr/share/grafana --packaging=rpm cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/usr/share/grafana
The server starts, but logging in fails.
Run again:
/usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid --homepath=/usr/share/grafana --packaging=rpm cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/usr/share/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins cfg:default.paths.provisioning=/etc/grafana/provisioning

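In summary, this error is tied to the cfg:default.paths.data=/usr/share/grafana setting. A comparison shows that /var/lib/grafana/ lacks the default data directory, which is suspected to be why the files cannot be read.
mkdir /var/lib/grafana/data
chown -R grafana:grafana /var/lib/grafana/data
#debug from the command line: starts normally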
/usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid --homepath=/usr/share/grafana --packaging=rpm cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/var/lib/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins cfg:default.paths.provisioning=/etc/grafana/provisioning

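The login failure is fixed by correcting the permissions on the data directory. Note also that /etc/grafana/grafana.ini takes precedence over the configuration files in the default directory. After the original grafana.db was copied into the new data directory, Grafana again failed to start. From this we can conclude that a corrupted or otherwise abnormal grafana.db was what prevented Grafana from starting.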
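A conclusion like this can be checked directly with SQLite's built-in integrity check, which prints ok for a healthy database file. The sketch below assumes the sqlite3 command-line tool is installed, which the original post does not mention; check_sqlite_db is a hypothetical helper. It is safest to run it against a copy of the file, or with Grafana stopped.

```shell
#!/bin/sh
# Exit codes: 0 = integrity check passed, 1 = check failed,
#             2 = file missing, 3 = sqlite3 CLI not installed.
check_sqlite_db() (
    db=$1
    [ -f "$db" ] || { echo "missing: $db" >&2; exit 2; }
    command -v sqlite3 >/dev/null 2>&1 || {
        echo "sqlite3 CLI not installed" >&2
        exit 3
    }
    # PRAGMA integrity_check returns the single row "ok" when no
    # corruption is found; anything else is treated as a failure.
    result=$(sqlite3 "$db" 'PRAGMA integrity_check;' 2>&1)
    [ "$result" = "ok" ]
)

# Example (not run here):
#   check_sqlite_db /var/lib/grafana/grafana.db && echo "database looks intact"
```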
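Delete all files under /var/lib/grafana/, set the data directory to /var/lib/grafana/ in /etc/grafana/grafana.ini, and restart Grafana.
After resetting the admin password, log in and reconfigure the data sources and dashboards.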
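Deleting the data directory discards every dashboard and data source, so it may be worth keeping a timestamped copy first in case the database turns out to be recoverable. A minimal sketch; backup_dir is a hypothetical helper, and the destination in the usage comment is an assumed location that should sit on a filesystem with enough free space.

```shell
#!/bin/sh
# Copy SRC into DEST_ROOT/<basename of SRC>-<timestamp>, preserving
# ownership and modes, and print the path of the copy.
backup_dir() (
    src=$1
    dest_root=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    dest="$dest_root/$(basename "$src")-$stamp"
    mkdir -p "$dest_root" && cp -a "$src" "$dest" && printf '%s\n' "$dest"
)

# Example (not run here; /root/grafana-backup is an assumed location):
#   backup_dir /var/lib/grafana /root/grafana-backup
```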
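III. Appendix
1) How Grafana loads its configuration:
Grafana reads its configuration from three places: the defaults in $WORKING_DIR/conf/defaults.ini, then the user's $WORKING_DIR/conf/custom.ini, and finally any file given with the --config parameter at startup, which overrides the others. For deb or rpm installs, the default configuration file is /etc/grafana/grafana.ini, which the init.d startup script passes via the --config parameter.
Example script: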
### END INIT INFO
# tested on
# 1. New lsb that define start-stop-daemon
# 3. Centos with initscripts package installed
PATH=/bin:/usr/bin:/sbin:/usr/sbin
NAME=grafana-server
DESC="Grafana Server"
GRAFANA_USER=grafana
GRAFANA_GROUP=grafana
GRAFANA_HOME=/usr/share/grafana
DATA_DIR=/var/lib/grafana
PLUGINS_DIR=/var/lib/grafana/plugins
LOG_DIR=/var/log/grafana
CONF_DIR=/etc/grafana
CONF_FILE=$CONF_DIR/grafana.ini
PROVISIONING_CFG_DIR=$CONF_DIR/provisioning
MAX_OPEN_FILES=10000
PID_FILE=/var/run/$NAME.pid
DAEMON=/usr/sbin/$NAME
if [ ! -x $DAEMON ]; then
echo "Program not installed or not executable"
exit 5
fi
#
# init.d / servicectl compatibility (openSUSE)
#
if [ -f /etc/rc.status ]; then
. /etc/rc.status
rc_reset
fi
#
# Source function library.
#
if [ -f /etc/rc.d/init.d/functions ]; then
. /etc/rc.d/init.d/functions
fi
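# overwrite settings from default file
[ -e /etc/sysconfig/$NAME ] && . /etc/sysconfig/$NAME
# options passed to the daemon in the start) branch; same flags as the command line in section II
DAEMON_OPTS="--pidfile=${PID_FILE} --config=${CONF_FILE} --packaging=rpm cfg:default.paths.provisioning=${PROVISIONING_CFG_DIR} cfg:default.paths.data=${DATA_DIR} cfg:default.paths.logs=${LOG_DIR} cfg:default.paths.plugins=${PLUGINS_DIR}"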
function isRunning() {
status -p $PID_FILE $NAME > /dev/null 2>&1
}
function checkUser() {
if [ `id -u` -ne 0 ]; then
echo "You need root privileges to run this script"
exit 4
fi
}
case "$1" in
start)
checkUser
isRunning
if [ $? -eq 0 ]; then
echo "Already running."
exit 0
fi
# Prepare environment
mkdir -p "$LOG_DIR" "$DATA_DIR" && chown "$GRAFANA_USER":"$GRAFANA_GROUP" "$LOG_DIR" "$DATA_DIR"
touch "$PID_FILE" && chown "$GRAFANA_USER":"$GRAFANA_GROUP" "$PID_FILE"
if [ -n "$MAX_OPEN_FILES" ]; then
ulimit -n $MAX_OPEN_FILES
fi
# Start Daemon
cd $GRAFANA_HOME
action $"Starting $DESC: ..." su -s /bin/sh -c "nohup ${DAEMON} ${DAEMON_OPTS} >> /dev/null 2>&1 &" $GRAFANA_USER 2> /dev/null
return=$?
if [ $return -eq 0 ]
then
sleep 1
# check if pid file has been written to
if ! [[ -s $PID_FILE ]]; then
echo "FAILED"
exit 1
fi
i=0
timeout=10
# Wait for the process to be properly started before exiting
until { cat "$PID_FILE" | xargs kill -0; } >/dev/null 2>&1
do
sleep 1
i=$(($i + 1))
if [ $i -gt $timeout ]; then
echo "FAILED"
exit 1
fi
done
fi
exit $return
;;
stop)
checkUser
echo -n "Stopping $DESC: ..."
if [ -f "$PID_FILE" ]; then
killproc -p $PID_FILE -d 20 $NAME
if [ $? -eq 1 ]; then
echo "$DESC is not running but pid file exists, cleaning up"
elif [ $? -eq 3 ]; then
PID="`cat $PID_FILE`"
echo "Failed to stop $DESC (pid $PID)"
exit 1
fi
rm -f "$PID_FILE"
echo ""
exit 0
else
echo "(not running)"
fi
exit 0
;;
status)
status -p $PID_FILE $NAME
exit $?
;;
restart|force-reload)
if [ -f "$PID_FILE" ]; then
$0 stop
sleep 1
fi
$0 start
;;
*)
echo "Usage: $0 {start|stop|restart|force-reload|status}"
exit 3
;;
esac
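Explanation of the configuration file settings:
app_mode: application mode; default is production
[paths]
data: path where Grafana stores the sqlite3 database, temporary files, and sessions
logs: path where Grafana stores its logs
[server]
http_addr: IP address to listen on; default is 0.0.0.0
http_port: port to listen on; default is 3000
protocol: http or https; default is http
domain: used as part of root_url; the public domain name used to reach Grafana from a browser; default is localhost
enforce_domain: redirect to the correct domain if the Host header does not match domain; default is false
root_url: the full URL used to access Grafana from a web browser; default is %(protocol)s://%(domain)s:%(http_port)s/
router_logging: whether to log web requests; default is false
cert_file: required when using https
cert_key: required when using https
[database]
Grafana needs a database to store users and dashboards. It uses sqlite3 by default, but another database can be used instead.
type: mysql, postgres, or sqlite3; default is sqlite3
path: sqlite3 only; location of the sqlite3 database file
host: mysql and postgres only; default is 127.0.0.1:3306
name: name of the Grafana database; default is grafana
user: database user to connect as
password: password of the database user
ssl_mode: postgres only
[security]
admin_user: name of the default Grafana admin user; default is admin
admin_password: password of the default admin user; default is admin
login_remember_days: number of days to stay logged in
secret_key: key used to sign the stay-logged-in cookie
disable_gravatar:
[users]
allow_sign_up: whether users may sign up (create accounts) themselves; if false, self-registration is disabled, but the admin can still create users who can then log in to Grafana; default is true
allow_org_create: set to false to prevent users from creating new organizations; default is true
auto_assign_org: when true, new users are automatically added to the organization with id 1; when false, a new organization is created for each new user
auto_assign_org_role: role assigned to new users; default is Viewer, other values are Admin and Editor
[auth.anonymous]
enabled: set to true to allow anonymous access; default is false
org_name: organization to use for anonymous users
org_role: role for anonymous users; default is Viewer
[auth.github]
For GitHub login, as the name makes obvious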
enabled = false
allow_sign_up = false
client_id = some_id
client_secret = some_secret
scopes = user:email
auth_url = https://github.com/login/oauth/authorize
token_url = https://github.com/login/oauth/access_token
api_url = https://api.github.com/user
team_ids =
allowed_domains =
allowed_organizations =
[auth.google]
For Google apps login
enabled = false
allow_sign_up = false
client_id = some_client_id
client_secret = some_client_secret
scopes = https://www.googleapis.com/auth/userinfo.profile https://www.googleapis.com/auth/userinfo.email
auth_url = https://accounts.google.com/o/oauth2/auth
token_url = https://accounts.google.com/o/oauth2/token
api_url = https://www.googleapis.com/oauth2/v1/userinfo
allowed_domains =
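[auth.basic]
enabled: when true, the HTTP API accepts basic authentication
[auth.ldap]
enabled: set to true to enable LDAP authentication; default is false
config_file: if LDAP is enabled, the path to the LDAP configuration file, /etc/grafana/ldap.toml
[auth.proxy]
Lets an HTTP reverse proxy handle authentication
enabled: default is false
header_name: default is X-WEBAUTH-USER
header_property: default is username
auto_sign_up: default is true; automatically signs up users that do not yet exist in the Grafana DB
[analytics]
reporting_enabled: when true, anonymous usage statistics are sent to stats.grafana.org, mainly to track running instances, versions, dashboards, and error counts; default is true
google_analytics_ua_id: to use Google Analytics, enter your GA ID here
[dashboards.json]
If you have a system that automatically generates dashboards as JSON, you can try enabling this feature
enabled: default is false
path: full path to the directory containing your JSON dashboards; default is /var/lib/grafana/dashboards
[session]
provider: default is file; other values are memory, mysql, postgres
provider_config: depends on provider. For file it is a path such as data/xxxx; for mysql it is user:password@tcp(127.0.0.1:3306)/database_name; for postgres it is user=a password=b host=localhost port=5432 dbname=c sslmode=disable
cookie_name: name of the Grafana cookie
cookie_secure: set to true if Grafana is served over https only; default is false
session_life_time: session lifetime; default is 86400 seconds (24 hours)
#The following sections are in the configuration file but not in the official documentation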
[smtp]
enabled = false
host = localhost:25
user =
password =
cert_file =
key_file =
skip_verify = false
from_address = admin@grafana.localhost
[emails]
welcome_email_on_sign_up = false
templates_pattern = emails/*.html
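[log]
mode: console or file; the default is both console and file; multiple modes can be given, separated by commas
buffer_len: buffer length of the channel; default is 10000
level: one of "Trace", "Debug", "Info", "Warn", "Error", "Critical"; default is info
[log.console]
level: log level
[log.file]
level: log level
log_rotate: whether to enable automatic log rotation
max_lines: maximum number of lines per log file; default is 1000000
max_size_shift: maximum size of a single log file, expressed as a bit shift; default is 28, meaning 1 << 28 bytes = 256MB
daily_rotate: whether to rotate logs daily; default is true
max_days: log retention in days; default is 7, after which files are deleted
Note: every setting in the configuration file above can be overridden with an environment variable, using the following syntax:
GF_<SectionName>_<KeyName>
For example: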
export GF_AUTH_GOOGLE_CLIENT_SECRET=newS3cretKey
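The naming rule can be applied mechanically: uppercase the section and key, and replace periods and dashes with underscores. A minimal sketch; gf_env_name is a hypothetical helper name.

```shell
#!/bin/sh
# Build the GF_<SectionName>_<KeyName> variable name for a setting.
# Periods and dashes become underscores; letters are uppercased.
gf_env_name() (
    printf 'GF_%s_%s\n' "$1" "$2" | LC_ALL=C tr 'a-z.-' 'A-Z__'
)

# Example (not run here):
#   export "$(gf_env_name server http_port)=3001"
```

For the section and key in the example above, gf_env_name auth.google client_secret yields GF_AUTH_GOOGLE_CLIENT_SECRET.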