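[Knowledge Graph] Practice: A Question-Answering System Built on a Medical Knowledge Graph (Part 5, Final): Information Retrieval and Result Assembly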
2022-07-25 16:38:00 【科皮子菊】
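Previous articles in this series:
- [Knowledge Graph] Practice: A Question-Answering System Built on a Medical Knowledge Graph (Part 1): Project Introduction and Environment Setup
- [Knowledge Graph] Practice: A Question-Answering System Built on a Medical Knowledge Graph (Part 2): Preparing and Importing the Graph Data
- [Knowledge Graph] Practice: A Question-Answering System Built on a Medical Knowledge Graph (Part 3): Rule-Based Question Classification
- [Knowledge Graph] Practice: A Question-Answering System Built on a Medical Knowledge Graph (Part 4): Question Parsing and Retrieval Query Generation Based on Question Classification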
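Background
In the previous parts we implemented question classification, question parsing, and generation of the retrieval queries each question needs. This part chains those modules together and assembles the query results into answers.
Result assembly
Result assembly produces the output that matches each type of question. The implementation is as follows: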
KGQAMedicine\answer_search\raw_answer_search.py
from py2neo import Graph

from utils.config import SysConfig


class RawAnswerSearcher(object):
    """Runs the generated queries and assembles the rows into answer strings."""

    def __init__(self):
        self.graph = Graph(SysConfig.NEO4J_HOST + ":" + str(SysConfig.NEO4J_PORT),
                           auth=(SysConfig.NEO4J_USER, SysConfig.NEO4J_PASSWORD))
        # Maximum number of items listed in a single answer.
        self.num_limit = 20

    def search(self, sql_list: list):
        """Run every query group and return the list of formatted answers."""
        final_answers = []
        for sql in sql_list:
            question_kind = sql['question_kind']
            answers = []
            for query in sql['sql']:
                # Each row is a dict keyed by the returned column, e.g. 'm.name'.
                answers += self.graph.run(query).data()
            final_answer = self._answer_standard(question_kind, answers)
            if final_answer:
                final_answers.append(final_answer)
        return final_answers

    def _join_unique(self, items):
        """Deduplicate, truncate to num_limit and join with ';'.

        dict.fromkeys keeps the query order; a set would shuffle the items.
        """
        return ';'.join(list(dict.fromkeys(items))[:self.num_limit])

    def _answer_standard(self, question_kind, answers):
        """Format the rows of one question kind; return '' when nothing applies."""
        final_answer = ''
        if not answers:
            return final_answer
        if question_kind == 'disease_symptom':
            # Disease -> its symptoms.
            desc = [i['n.name'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}的症状包括:{1}'.format(subject, self._join_unique(desc))
        elif question_kind == 'symptom_disease':
            # Symptom -> diseases that may cause it.
            desc = [i['m.name'] for i in answers]
            subject = answers[0]['n.name']
            final_answer = '症状{0}可能染上的疾病有:{1}'.format(subject, self._join_unique(desc))
        elif question_kind == 'disease_cause':
            desc = [i['m.cause'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}可能的成因有:{1}'.format(subject, self._join_unique(desc))
        elif question_kind == 'disease_prevent':
            desc = [i['m.prevent'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}的预防措施包括:{1}'.format(subject, self._join_unique(desc))
        elif question_kind == 'disease_lasttime':
            # How long treatment usually lasts.
            desc = [i['m.cure_lasttime'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}治疗可能持续的周期为:{1}'.format(subject, self._join_unique(desc))
        elif question_kind == 'disease_cureway':
            # m.cure_way is a list per row, so flatten it to one string first.
            desc = [';'.join(i['m.cure_way']) for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}可以尝试如下治疗:{1}'.format(subject, self._join_unique(desc))
        elif question_kind == 'disease_cureprob':
            desc = [i['m.cured_prob'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}治愈的概率为(仅供参考):{1}'.format(subject, self._join_unique(desc))
        elif question_kind == 'disease_easyget':
            # Groups of people who are susceptible to the disease.
            desc = [i['m.easy_get'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}的易感人群包括:{1}'.format(subject, self._join_unique(desc))
        elif question_kind == 'disease_desc':
            desc = [i['m.desc'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0},熟悉一下:{1}'.format(subject, self._join_unique(desc))
        elif question_kind == 'disease_acompany':
            # Accompanying diseases: collect both ends of the relation, minus the subject.
            desc1 = [i['n.name'] for i in answers]
            desc2 = [i['m.name'] for i in answers]
            subject = answers[0]['m.name']
            desc = [i for i in desc1 + desc2 if i != subject]
            # The original template said "symptoms" here (copy-paste slip); these are complications.
            final_answer = '{0}的并发症包括:{1}'.format(subject, self._join_unique(desc))
        elif question_kind == 'disease_not_food':
            desc = [i['n.name'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}忌食的食物包括有:{1}'.format(subject, self._join_unique(desc))
        elif question_kind == 'disease_do_food':
            # Two relation types share this question kind: suitable foods and recommended recipes.
            do_desc = [i['n.name'] for i in answers if i['r.name'] == '宜吃']
            recommend_desc = [i['n.name'] for i in answers if i['r.name'] == '推荐食谱']
            subject = answers[0]['m.name']
            final_answer = '{0}宜食的食物包括有:{1}\n推荐食谱包括有:{2}'.format(
                subject, self._join_unique(do_desc), self._join_unique(recommend_desc))
        elif question_kind == 'food_not_disease':
            desc = [i['m.name'] for i in answers]
            subject = answers[0]['n.name']
            final_answer = '患有{0}的人最好不要吃{1}'.format(self._join_unique(desc), subject)
        elif question_kind == 'food_do_disease':
            desc = [i['m.name'] for i in answers]
            subject = answers[0]['n.name']
            final_answer = '患有{0}的人建议多试试{1}'.format(self._join_unique(desc), subject)
        elif question_kind == 'disease_drug':
            desc = [i['n.name'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}通常的使用的药品包括:{1}'.format(subject, self._join_unique(desc))
        elif question_kind == 'drug_disease':
            desc = [i['m.name'] for i in answers]
            subject = answers[0]['n.name']
            final_answer = '{0}主治的疾病有{1},可以试试'.format(subject, self._join_unique(desc))
        elif question_kind == 'disease_check':
            desc = [i['n.name'] for i in answers]
            subject = answers[0]['m.name']
            final_answer = '{0}通常可以通过以下方式检查出来:{1}'.format(subject, self._join_unique(desc))
        elif question_kind == 'check_disease':
            desc = [i['m.name'] for i in answers]
            subject = answers[0]['n.name']
            final_answer = '通常可以通过{0}检查出来的疾病有{1}'.format(subject, self._join_unique(desc))
        return final_answer
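Most branches above differ only in which column holds the subject, which column holds the values, and which sentence template is used. As a rough sketch of that observation (not the project's actual code), the same formatting can be driven by a lookup table. The names `Rule`, `RULES` and `format_answer` are hypothetical, only three question kinds are shown, and the sample rows are made up for illustration:

```python
from collections import namedtuple

# Hypothetical rule table: question kind -> (subject column, value column, template).
# The column names follow the 'm.name' / 'n.name' keys used in the class above.
Rule = namedtuple("Rule", ["subject_key", "value_key", "template"])

RULES = {
    "disease_symptom": Rule("m.name", "n.name", "{0}的症状包括:{1}"),
    "symptom_disease": Rule("n.name", "m.name", "症状{0}可能染上的疾病有:{1}"),
    "disease_drug": Rule("m.name", "n.name", "{0}通常的使用的药品包括:{1}"),
}


def format_answer(question_kind, rows, num_limit=20):
    """Format query rows into one answer string, or '' if nothing applies."""
    rule = RULES.get(question_kind)
    if rule is None or not rows:
        return ""
    subject = rows[0][rule.subject_key]
    # dict.fromkeys removes duplicates while keeping the original row order.
    values = list(dict.fromkeys(row[rule.value_key] for row in rows))[:num_limit]
    return rule.template.format(subject, ";".join(values))


# Made-up rows in the shape returned by graph.run(query).data().
rows = [
    {"m.name": "感冒", "n.name": "咳嗽"},
    {"m.name": "感冒", "n.name": "发热"},
    {"m.name": "感冒", "n.name": "咳嗽"},
]
print(format_answer("disease_symptom", rows))
```

Kinds with special handling, such as disease_do_food with its two relation types, would still need their own branch or a custom formatter function in the table.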
Assembling the modules and building the QA class
This module chains the individual modules of the pipeline together, as follows:
KGQAMedicine\chatbot.py
from question_classify.rule_question_classify import RuleQuestionClassifier
from question_parser.rule_question_parser import RuleQuestionParser
from answer_search.raw_answer_search import RawAnswerSearcher


class ChatBot(object):
    """Chains classification, parsing and answer search into one call."""

    def __init__(self):
        self.classifier = RuleQuestionClassifier()
        self.parser = RuleQuestionParser()
        self.answer_generate = RawAnswerSearcher()
        # Fallback reply used when the question cannot be classified or nothing is found.
        self.common_answer = "您好,我是科皮子菊的医药私人助手,希望可以为您解答。如果答案不满意,可以通过:https://github.com/Htring 联系我哦。祝您身体健康,远离我哦!"

    def answer(self, question):
        question_classify = self.classifier.classify(question)
        if not question_classify:
            return self.common_answer
        res_sql = self.parser.parser(question_classify)
        final_answers = self.answer_generate.search(res_sql)
        if not final_answers:
            return self.common_answer
        return "\n".join(final_answers)


if __name__ == '__main__':
    chat_bot = ChatBot()
    # Simple console loop: read a question, print the answer.
    while True:
        question = input("用户:")
        answer = chat_bot.answer(question)
        print("科皮子菊:", answer)
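To see the control flow of `answer` without a running Neo4j instance, the three components can be replaced by stubs. The sketch below is an illustration only: `StubClassifier`, `StubParser`, `StubSearcher` and `PipelineBot` are hypothetical names, and the dictionary returned by `classify` is an assumed shape rather than the real classifier's output. Only the `question_kind` / `sql` item shape is taken from the `search` method shown earlier.

```python
class StubClassifier:
    """Hypothetical stand-in for RuleQuestionClassifier (return shape is assumed)."""

    def classify(self, question):
        if "感冒" in question:
            return {"question_types": ["disease_symptom"]}
        return {}


class StubParser:
    """Hypothetical stand-in for RuleQuestionParser."""

    def parser(self, question_classify):
        # Same item shape that RawAnswerSearcher.search reads: question_kind + sql.
        return [{"question_kind": kind, "sql": ["<query placeholder>"]}
                for kind in question_classify["question_types"]]


class StubSearcher:
    """Hypothetical stand-in for RawAnswerSearcher; returns canned answers."""

    CANNED = {"disease_symptom": "感冒的症状包括:咳嗽;发热"}

    def search(self, sql_list):
        answers = [self.CANNED.get(item["question_kind"], "") for item in sql_list]
        return [a for a in answers if a]


class PipelineBot:
    """Same control flow as ChatBot.answer, with the components injected."""

    def __init__(self, classifier, parser, searcher, fallback):
        self.classifier = classifier
        self.parser = parser
        self.searcher = searcher
        self.fallback = fallback

    def answer(self, question):
        question_classify = self.classifier.classify(question)
        if not question_classify:
            return self.fallback
        final_answers = self.searcher.search(self.parser.parser(question_classify))
        return "\n".join(final_answers) if final_answers else self.fallback


bot = PipelineBot(StubClassifier(), StubParser(), StubSearcher(), fallback="FALLBACK")
print(bot.answer("感冒有什么症状"))  # classified, so the canned answer is returned
print(bot.answer("你好"))            # not classified, so the fallback is returned
```

Passing the components in through the constructor is a design choice of this sketch; it makes the fallback paths easy to exercise in isolation.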
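Demo:
Summary
Overall, this project gives a fairly clear walkthrough of the workflow for question answering with a knowledge graph. The techniques used to answer the questions are relatively dated, but the results are still reasonably good. The source code is on my GitHub: https://github.com/Htring/KGQAMedicine. If you are interested, download it and try running it; the repository includes instructions for running it.
To improve the results further, many newer techniques could be introduced. For example, question classification could use a deep-learning-based classifier, and question parsing could use deep-learning-based named entity recognition (NER) followed by entity alignment. I will not go into these here.
In addition, a knowledge graph should be built around business requirements: when a requirement comes in, analyze the existing data, design a business-specific schema, and then build the graph with natural language processing techniques. The original project collected its data with a web crawler; NLP techniques could also be applied there, for example to clean up the extracted data.
"The mighty pass may be hard as iron, yet today we set out to cross it from the start." I am only just getting started, and this is a new beginning. Going forward, I will add more content on deep-learning-based methods for building and applying knowledge graphs.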