Knowledge Engineering Assignment 2: Introduction to Knowledge Engineering Related Fields
2022-08-02 03:33:00 【woshicaiji12138】
Natural Language Processing
Knowledge engineering is a research field that grew out of the construction of expert systems and has since become an interdisciplinary field. Its main research areas include soft computing, natural language processing, and logic and reasoning [1]. This article introduces the field of natural language processing.
I. Overview of Natural Language Processing
Natural language processing, NLP for short, is a science that integrates linguistics, computer science, and mathematics. It is therefore closely related to linguistics, but differs from it in important ways. NLP does not study natural language in general; rather, it aims to develop computer systems, especially software systems, that can effectively carry out natural language communication. Hence it is part of computer science [2].
At present, the classic tasks in natural language processing include stemming, lemmatization, word vectorization, part-of-speech tagging, named entity disambiguation, named entity recognition, sentiment analysis, text semantic similarity analysis, and text summarization.
II. Stemming
This article selects stemming for a detailed introduction. Stemming is the process of removing the inflectional or derivational affixes of a word and reducing the word to its stem [3]. For example, in English, the stems of "beautiful" and "beautifully" are both "beauti", and "stemmer", "stemming", and "stemmed" all reduce to the stem "stem".
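To make the idea concrete, here is a minimal sketch of suffix stripping in Python. It is a toy for illustration only and is not the Porter algorithm; the suffix list and the names `SUFFIXES`, `toy_stem`, and `group_by_stem` are assumptions made for this example.

```python
# Toy suffix stripper for illustration only; this is NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "er", "s")  # hypothetical, deliberately tiny suffix list


def toy_stem(word: str) -> str:
    """Strip one known suffix, then collapse a doubled final consonant."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip when at least three letters remain, so "red" stays "red".
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # "stemm" -> "stem": undo the consonant doubling English adds before a suffix.
    if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word


def group_by_stem(words):
    """Map each stem to the list of surface forms that reduce to it."""
    groups = {}
    for w in words:
        groups.setdefault(toy_stem(w), []).append(w)
    return groups


if __name__ == "__main__":
    print(group_by_stem(["stemmer", "stemming", "stemmed", "stem"]))
```

A rule set this small over-strips many ordinary words (for example, it would shorten "ball" to "bal"), which is why practical stemmers attach conditions to each rule.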
III. Porter Stemming
The classic approach to this task is Martin Porter's Porter stemming algorithm. Dr. Porter also received the 2000 Tony Kent Strix Award for his work in stemming and information retrieval. An implementation first defines a class that holds an array (buffer) for the word being stemmed; only then can the processing steps of the algorithm begin. The first step deals with plurals and with words ending in "ed" and "ing". The second step checks whether a word ends in "y" and contains a vowel before it; if so, the "y" is changed to "i". The third step maps double suffixes to single ones, for example "-ization" to "-ize". The fourth step deals with suffixes such as "-ic", "-full", and "-ness". The remaining steps remove "-ant", "-ence", a final "-e", and similar endings where appropriate. Finally, a stem() method runs these steps in order and returns the stem of the word.
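The first two steps described above (plurals, "-ed"/"-ing", and "y" to "i") can be sketched in Python as follows. This is a partial sketch based on the rules in Porter's published description, not a complete or reference implementation: the later suffix-mapping and suffix-removal steps are omitted, and all function names here are chosen for this example.

```python
VOWELS = "aeiou"


def is_consonant(word: str, i: int) -> bool:
    """Porter's definition: 'y' counts as a vowel when it follows a consonant."""
    ch = word[i]
    if ch in VOWELS:
        return False
    if ch == "y":
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem: str) -> int:
    """Count vowel-to-consonant transitions (the measure 'm' in Porter's paper)."""
    m, prev_vowel = 0, False
    for i in range(len(stem)):
        cons = is_consonant(stem, i)
        if cons and prev_vowel:
            m += 1
        prev_vowel = not cons
    return m


def has_vowel(stem: str) -> bool:
    return any(not is_consonant(stem, i) for i in range(len(stem)))


def ends_double_consonant(word: str) -> bool:
    return len(word) >= 2 and word[-1] == word[-2] and is_consonant(word, len(word) - 1)


def ends_cvc(word: str) -> bool:
    """Consonant-vowel-consonant ending whose last letter is not w, x, or y."""
    n = len(word)
    return (n >= 3 and is_consonant(word, n - 3) and not is_consonant(word, n - 2)
            and is_consonant(word, n - 1) and word[-1] not in "wxy")


def step1a(word: str) -> str:
    """Plurals: sses -> ss, ies -> i, ss -> ss, s -> (removed)."""
    if word.endswith("sses") or word.endswith("ies"):
        return word[:-2]
    if word.endswith("ss"):
        return word
    if word.endswith("s"):
        return word[:-1]
    return word


def step1b(word: str) -> str:
    """Handle -eed, -ed, -ing, then tidy the stem that is left."""
    if word.endswith("eed"):
        return word[:-1] if measure(word[:-3]) > 0 else word
    for suffix in ("ed", "ing"):
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and has_vowel(stem):
            if stem.endswith(("at", "bl", "iz")):
                return stem + "e"          # conflat(ed) -> conflate
            if ends_double_consonant(stem) and stem[-1] not in "lsz":
                return stem[:-1]           # hopp(ing) -> hop
            if measure(stem) == 1 and ends_cvc(stem):
                return stem + "e"          # fil(ing) -> file
            return stem
    return word


def step1c(word: str) -> str:
    """Turn a final 'y' into 'i' when the rest of the word contains a vowel."""
    if word.endswith("y") and has_vowel(word[:-1]):
        return word[:-1] + "i"
    return word


def porter_step1(word: str) -> str:
    """Apply steps 1a, 1b, 1c in order; very short words are left unchanged."""
    word = word.lower()
    if len(word) <= 2:
        return word
    return step1c(step1b(step1a(word)))
```

Calling `porter_step1` on "caresses", "hopping", and "happy" exercises one rule from each sub-step. A full stemmer would continue with the remaining steps before returning the final stem.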
IV. Recent Methods
Since the advent of the Porter stemming method, new stemming methods have kept emerging. Some are improvements built on the Porter stemming method [4]; others are newer and smarter approaches, such as methods based on n-gram analysis. Such a method exploits the context of a word to extract the correct stem, which can make stemming considerably more practical.
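The text does not say which n-gram method it has in mind. As one illustration of the n-gram idea, the sketch below compares words by the character bigrams they share (a Dice coefficient), so that related word forms can be grouped without any suffix rules. It ignores sentence context entirely, so it should be read as an assumption-laden example of n-gram matching rather than as the method referred to above; the function names are chosen for this example.

```python
def char_ngrams(word: str, n: int = 2) -> set:
    """Return the set of distinct character n-grams in a word."""
    word = word.lower()
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def dice_similarity(a: str, b: str, n: int = 2) -> float:
    """Dice coefficient over shared n-grams: 2 * |A & B| / (|A| + |B|)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0  # words shorter than n have no n-grams to compare
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


if __name__ == "__main__":
    for pair in [("stemming", "stemmed"), ("stemming", "beautiful")]:
        print(pair, round(dice_similarity(*pair), 3))
```

Word pairs whose similarity exceeds a chosen threshold could then be treated as sharing a stem; the threshold itself would have to be tuned on real data.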
References
[1] Huang Ronghuai, Li Maoguo, Sha Jingrong, Knowledge Engineering: A New Important Research Field [E], Electronic Education Research, 2004, (10): 1-7
[2] Li Changyun, Wang Zhibing, Intelligent perception technology and its application in electrical engineering, University of Electronic Science and Technology of China Press, 2017.05, p. 163
[3] Common 10 Natural Language Processing Technologies, September 2, 2021, https://blog.csdn.net/Harrytsz/article/details/120053267
[4] Widjaja M, Seng H. Implementation of Modified Porter Stemming Algorithm to Indonesian Word Error Detection Plugin Application[J]. Int J Hum Cult Stud, 2015, 6(2):139.