当前位置:网站首页>苏宁基于知识图谱的大规模告警收敛和根因定位实践

苏宁基于知识图谱的大规模告警收敛和根因定位实践

2020-11-09 12:09:00 InfoQ

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"一、概述"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"知识图谱有较强的知识表达能力、直观的信息呈现能力和较好的推理可解释性,因此知识图谱在推荐系统、问答系统、搜索引擎、医疗健康、生物制药等领域有着广泛的应用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"运维知识图谱构建相对于其他领域的知识图谱构建而言,具有天然的优势,网络设备固有的拓扑结构、系统应用的调用关系可以快速的构成软硬件知识图谱中的实体和关系。历史的告警数据蕴含着大量的相关、因果关系,使用因果发现算法,也可以有效的构建告警知识图谱。基于知识图谱上的权重进行路径搜索,可以给出根因的传播路径,便于运维人员快速的做出干预决策。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"苏宁通过CMDB、调用链等数据构建软硬件知识图谱,在此基础上通过历史告警数据构建告警知识图谱,并最终应用知识图谱进行告警收敛和根因定位。本文主要包括运维知识图谱构建、知识图谱存储、告警收敛及根因定位等内容。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"二、 痛点及产品对策演进"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"痛点"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"苏宁内部系统和服务的复杂性:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"6000+系统,数量还在增加;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"系统间调用方式复杂:大部分使用RSF,也有HTTP、HESSIAN等;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"苏宁业务的复杂:既有线上新业务又有线下老业务,这些业务系统之间会有大量的关联。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":2,"normalizeStart":2},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"基础环境的复杂性:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"多数据中心,每个数据中心会划分多个逻辑机房和部署环境;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"服务器规模30w+,例如,缓存服务器就有可能有上千台服务器;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"设备复杂性:多品牌的交换机,路由器,负载均衡,OpenStack, KVM, k8s下docker,swarm下的docker等。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}

版权声明
本文为[InfoQ]所创,转载请带上原文链接,感谢
https://www.infoq.cn/article/kVCO7V7fdPkSkbN3hP2j?utm_source=rss&utm_medium=article