Knowledge Question Answering Based on apache-jena
2022-07-06 09:13:00 【zkkkkkkkkkkkkk】
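Preface
This article explains how to query apache-jena interactively from Python. For how the triple data is built, converted, and imported, see "Knowledge Q&A: Preparing the Triple Data" (知识问答三元组数据准备阶段). This article picks up where that one leaves off.
Note: the example code in this article comes from https://github.com/zhangtao-seu/Jay_KG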
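1. Code directory structure
The original post shows the directory structure of the Jay_KG project in a screenshot. The two important files are query_main.py, the program's main entry point, and question_temp.py, which defines the question-answering templates.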
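2. Implementing knowledge Q&A
2.1 Obtaining the triple data
This example models the ontology in Protégé and exports it as an OWL file, then follows section 2.2 of "Knowledge Q&A: Preparing the Triple Data" step by step. The original post shows the resulting Protégé model in a screenshot.
2.2 Defining the entities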
Jay_KG/KB_query/external_dict contains a file named sanguo.txt. This is the entity file defined by the author; the original post shows its contents in a screenshot.
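Because the contents of the entity file survive only as an image, the following is a minimal sketch rather than the author's actual file. It assumes sanguo.txt follows the jieba user-dictionary convention of one entry per line in the form `word [frequency] [POS tag]`, with persons tagged `nr` (the tag the templates in this article test for). The sample entries and the names `DictEntry` and `parse_user_dict` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class DictEntry(NamedTuple):
    word: str
    freq: Optional[int]
    pos: Optional[str]


def parse_user_dict(text: str) -> List[DictEntry]:
    """Parse lines of the assumed form 'word [freq] [pos]'."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        word, freq, pos = parts[0], None, None
        for field in parts[1:]:
            if field.isdigit():
                freq = int(field)  # numeric field is the frequency
            else:
                pos = field  # anything else is taken as the POS tag
        entries.append(DictEntry(word, freq, pos))
    return entries


# Hypothetical sample; the real sanguo.txt may differ.
SAMPLE = """关羽 10 nr
刘备 10 nr
张飞 10 nr
"""

entries = parse_user_dict(SAMPLE)
persons = [e.word for e in entries if e.pos == "nr"]
```

Tagging every entity as `nr` matters because the rules in question_temp.py locate the person in a question by part of speech, not by name.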
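2.3 Defining the question templates
Open question_temp.py and define the templates as in the code below. The o_name and rides functions of the QuestionSet class define two templates: "关羽字(别名、小名)是什么?" (What is Guan Yu's courtesy name, alias, or nickname?) and "关羽的坐骑(骑、战马)是什么?" (What is Guan Yu's mount, ride, or warhorse?).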
# encoding=utf-8
"""
@desc:
Define the question templates and the SPARQL statement for each one. The demo provides the templates below.
"""
from refo import finditer, Predicate, Star, Any, Disjunction
import re

# TODO: SPARQL prefix and template
# SPARQL_PREXIX = u"""
# PREFIX owl: <http://www.w3.org/2002/07/owl#>
# PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
# PREFIX : <http://www.semanticweb.org/张涛/ontologies/2019/1/untitled-ontology-32#>
# """
SPARQL_PREXIX = u"""
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX : <http://www.semanticweb.org/dell/ontologies/2022/0/untitled-ontology-3#>
"""

# Doubled braces ({{ and }}) are literal braces in str.format templates.
SPARQL_SELECT_TEM = u"{prefix}\n" + \
                    u"SELECT {select} WHERE {{\n" + \
                    u"{expression}\n" + \
                    u"}}\n"


class W(Predicate):
    def __init__(self, token=".*", pos=".*"):
        self.token = re.compile(token + "$")
        self.pos = re.compile(pos + "$")
        super(W, self).__init__(self.match)

    def match(self, word):
        m1 = self.token.match(word.token.decode("utf-8"))
        m2 = self.pos.match(word.pos)
        return m1 and m2


class Rule(object):
    def __init__(self, condition_num, condition=None, action=None):
        assert condition and action
        self.condition = condition
        self.action = action
        self.condition_num = condition_num

    # word_object: [word, part of speech]
    def apply1(self, sentence):
        matches = []
        # [person_entity]
        for m in finditer(self.condition, sentence):
            i, j = m.span()
            matches.extend(sentence[i:j])
        return self.action(matches), self.condition_num


class QuestionSet:
    def __init__(self):
        pass

    @staticmethod
    def o_name(word_object):
        # 关羽字什么? (What is Guan Yu's courtesy name?)
        select = u"?o"
        sparql = None
        for w in word_object:
            if w.pos == pos_person:
                e = u" :{person} :字 ?o.".format(person=w.token.decode('utf-8'))
                sparql = SPARQL_SELECT_TEM.format(prefix=SPARQL_PREXIX,
                                                  select=select,
                                                  expression=e)
                print(sparql)
                break
        return sparql

    @staticmethod
    def rides(word_object):
        # 关羽战马是什么? (What is Guan Yu's warhorse?)
        select = u"?o"
        sparql = None
        for w in word_object:
            if w.pos == pos_person:
                e = u" :{person} :骑 ?o.".format(person=w.token.decode('utf-8'))
                sparql = SPARQL_SELECT_TEM.format(prefix=SPARQL_PREXIX,
                                                  select=select,
                                                  expression=e)
                print(sparql)
                break
        return sparql


# TODO: define the keywords
pos_person = "nr"
person_entity = (W(pos=pos_person))
other_name = (W("字") | W("别名") | W("小名"))
ride = (W("骑") | W("坐骑") | W("战马"))

# TODO: question templates / matching rules
"""
# 关羽字什么? (What is Guan Yu's courtesy name?)
# 关羽的战马是什么? (What is Guan Yu's warhorse?)
"""
rules = [
    # 关羽字什么? (What is Guan Yu's courtesy name?)
    Rule(condition_num=0,
         condition=person_entity + Star(Any(), greedy=False) + other_name + Star(Any(), greedy=False),
         action=QuestionSet.o_name),
    # 关羽的战马是什么? (What is Guan Yu's warhorse?)
    Rule(condition_num=0,
         condition=person_entity + Star(Any(), greedy=False) + ride + Star(Any(), greedy=False),
         action=QuestionSet.rides),
]
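To make the flow of the listing above concrete, here is a simplified stand-in that runs without refo or jieba. It takes words that are already tagged, looks for a person followed by a known keyword, and fills in the same SPARQL template. The `Word` tuple, `KEYWORD_TO_PREDICATE`, and `question_to_sparql` are hypothetical names used only for illustration; the actual project matches rules with refo's `finditer`, and the part-of-speech tags in the sample are assigned by hand, not taken from a real tagger.

```python
from collections import namedtuple

# Hypothetical stand-in for the tagged-word objects (token, part of speech).
Word = namedtuple("Word", ["token", "pos"])

SPARQL_PREFIX = (
    "PREFIX : <http://www.semanticweb.org/dell/ontologies/2022/0/untitled-ontology-3#>"
)
# Doubled braces render as literal braces after str.format.
SPARQL_SELECT_TEM = "{prefix}\nSELECT {select} WHERE {{\n{expression}\n}}\n"

# Keyword -> ontology property, mirroring other_name and ride in the listing.
KEYWORD_TO_PREDICATE = {
    "字": "字", "别名": "字", "小名": "字",
    "骑": "骑", "坐骑": "骑", "战马": "骑",
}


def question_to_sparql(words):
    """Return a query for the first person that is followed by a known keyword."""
    person = None
    for w in words:
        if person is None:
            if w.pos == "nr":  # "nr" marks a person name, as in the listing
                person = w.token
        elif w.token in KEYWORD_TO_PREDICATE:
            expression = " :{person} :{pred} ?o.".format(
                person=person, pred=KEYWORD_TO_PREDICATE[w.token])
            return SPARQL_SELECT_TEM.format(
                prefix=SPARQL_PREFIX, select="?o", expression=expression)
    return None  # no template matched


# Hand-tagged words for "关羽的战马是什么?" (tags are assumed for illustration).
tagged = [Word("关羽", "nr"), Word("的", "uj"), Word("战马", "n"),
          Word("是", "v"), Word("什么", "r")]
sparql = question_to_sparql(tagged)
print(sparql)
```

The sketch keeps the same ordering constraint as the refo rules: the person must appear before the keyword, and any words in between or after are ignored.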
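3. Running the code to ask questions
Run query_main.py, then type a question that matches one of the templates defined above into the console. The answer is printed in response.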
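The article does not show how the generated SPARQL reaches apache-jena. As a rough sketch using only the standard library, and assuming the data is served by a Fuseki instance, a query can be sent with the standard SPARQL 1.1 Protocol (an HTTP POST with a form-encoded `query` parameter) and the answer read from the SPARQL JSON results. The endpoint URL, the dataset name `sanguo`, and the function names are assumptions; Jay_KG's own client code may work differently. The sample response is hand-written to exercise the parsing step offline.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: Fuseki's default port with a hypothetical dataset "sanguo".
ENDPOINT = "http://localhost:3030/sanguo/query"


def build_request(sparql, endpoint=ENDPOINT):
    """Build a SPARQL 1.1 Protocol POST request (form-encoded 'query')."""
    body = urllib.parse.urlencode({"query": sparql}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def extract_values(result, var="o"):
    """Collect the values bound to `var` from a SPARQL JSON result."""
    bindings = result["results"]["bindings"]
    return [b[var]["value"] for b in bindings if var in b]


def run_query(sparql, endpoint=ENDPOINT):
    """Send the query and return the bound values; needs a running server."""
    with urllib.request.urlopen(build_request(sparql, endpoint)) as resp:
        return extract_values(json.load(resp))


# Offline check of the parsing step with a hand-written sample response.
sample = {"head": {"vars": ["o"]},
          "results": {"bindings": [{"o": {"type": "literal", "value": "云长"}}]}}
values = extract_values(sample)
```

With a server running, `run_query(sparql)` would be called with the string produced by one of the templates; an empty list means the pattern matched no triples.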