CSDN Q&A Tag Skill Tree (1): Building the Basic Framework
2022-07-06 10:41:00 【Alexxinlu】
Series articles
- CSDN Q&A Tag Skill Tree (1): Building the Basic Framework
- CSDN Q&A Tag Skill Tree (2): Effect Optimization
- CSDN Q&A Tag Skill Tree (3): The Python Skill Tree
- CSDN Q&A Tag Skill Tree (4): The Java Skill Tree
- CSDN Q&A Tag Skill Tree (5): The Cloud Native Skill Tree
Team blog: CSDN AI team
1 Problem definition
1.1 Background
At present, questions in the CSDN Q&A module are only coarsely classified into broad categories such as Python, Java, and C. They are not mapped to specific knowledge points within those categories. In the example below, for instance, the question belongs to data visualization within the Python category.
Classifying questions at a finer granularity lets the asker see more clearly where a question sits in the knowledge system, and lets the system recommend relevant learning and reference material more accurately.
To address this, the article first builds a programming-language skill tree for each category, then maps previously adopted questions (those with an accepted answer) to specific nodes in the skill tree. Finally, a new question is matched to its most similar node in the tree, and the adopted questions attached to that node are recommended.
2 Solution
2.1 Knowledge gathering
Building a programming skill tree starts with collecting the relevant knowledge. This article uses the Python programming language as the worked example.
Online search and research turned up the following two sources:
Book tables of contents crawled from an online bookstore (referred to in the original only as "a certain Dong"):
- Search the bookstore with the keyword "python" and select the top N books
- Extract the contents of the table-of-contents field from each book's details page to obtain the raw, unprocessed table of contents
Learning paths on tutorial websites and forums:
- For example: Liao Xuefeng's official website, Novice Tutorial, and Geek College
- Use a web crawler to fetch the table of contents from each site. Liao Xuefeng's official website is used as the example, as shown in the figure below:
2.2 Construction of skill tree
After the knowledge resources are collected, they need to be stored in a tree structure. This article uses the treelib package for that.
To make merging the trees in the next section easier, each table of contents is limited to a 4-layer structure:
- Part title, for example: Part 1
- Chapter title, for example: Chapter 1
- Section title, for example: 1.1
- Subsection title, for example: 1.1.1
The resulting tree structure is shown in the following figure:
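The 4-layer structure above can be sketched as follows. This is a minimal illustration rather than the article's code: it uses a small hand-written `Node` dataclass from the standard library instead of treelib so that it runs without dependencies (treelib's `Tree.create_node` could take its place), and the numbering rules in `heading_depth` as well as the sample headings in `toc` are assumptions made up for the example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One heading in the skill tree; depth 0 is the synthetic root."""
    title: str
    depth: int = 0
    children: list = field(default_factory=list)


def heading_depth(title: str) -> int:
    """Guess the layer (1-4) of a heading from its numbering (hypothetical rules)."""
    if re.match(r"Part\s+\w+", title):
        return 1
    if re.match(r"Chapter\s+\d+", title):
        return 2
    if re.match(r"\d+\.\d+\.\d+", title):
        return 4
    if re.match(r"\d+\.\d+", title):
        return 3
    raise ValueError(f"unrecognized heading: {title!r}")


def build_tree(headings, root_title="Python"):
    """Attach each heading to the nearest preceding heading of smaller depth."""
    root = Node(root_title)
    stack = [root]  # path from the root to the most recently added node
    for title in headings:
        node = Node(title, heading_depth(title))
        while stack[-1].depth >= node.depth:
            stack.pop()
        stack[-1].children.append(node)
        stack.append(node)
    return root


def render(node, indent=0):
    """Return the tree as indented lines, one heading per line."""
    lines = ["  " * indent + node.title]
    for child in node.children:
        lines.extend(render(child, indent + 1))
    return lines


# Hypothetical table of contents, in reading order.
toc = [
    "Part 1 Basics",
    "Chapter 1 Getting started",
    "1.1 Installing Python",
    "1.1.1 Windows",
    "1.1.2 Linux",
    "1.2 Hello world",
    "Chapter 2 Variables",
]
tree = build_tree(toc)
print("\n".join(render(tree)))
```

Fixing the depth of every heading up front is what makes the later layer-by-layer merge straightforward: nodes from different sources can be compared only against nodes of the same depth.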
2.3 Merging of skill trees
After a skill tree has been built from each table of contents and knowledge-system resource, the separate trees need to be merged into a single Python skill tree.
The merge takes the following into account:
- Merge layer by layer, starting from the root node
- Merge multiple trees recursively
- Merge similar nodes within the same layer
- Use a heuristic clustering method (one that does not need the number of clusters fixed in advance) to divide the nodes into clusters
- Compute similarity during clustering by combining the longest common subsequence ratio with the Levenshtein ratio (an edit-distance-based ratio)
- Replace each merged group with a new node whose value is the longest common subsequence of its members' titles. For example, the three nodes "Using if statements", "Processing lists with if statements", and "Formatting if statements" share the longest common subsequence "if statement", so "if statement" becomes the value of the merged node.
- Remove useless nodes
- Use tree pruning together with a dictionary of unwanted titles to remove useless nodes from the skill tree, such as "Chapter summary", "Further reading", and "Project" chapter nodes.
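The clustering and relabeling steps above might look roughly like the sketch below. Several details are assumptions, since the article does not spell them out: the two ratios are simply averaged, the Levenshtein ratio is normalized as 1 - distance / longest length (the library the author used may normalize differently), clustering is a greedy single pass against each cluster's first member with a hypothetical `threshold`, and titles are compared word by word so that the English sample titles behave sensibly (for Chinese titles, comparing character by character would be the natural choice).

```python
def lcs(a, b):
    """Longest common subsequence of two token sequences (dynamic programming)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    # Walk back through the table to recover one longest subsequence.
    out, i, j = [], m, n
    while i and j:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i, j = i - 1, j - 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def edit_distance(a, b):
    """Levenshtein distance with unit-cost insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def similarity(title_a, title_b):
    """Average of the LCS ratio and a normalized Levenshtein ratio (assumed combination)."""
    a, b = title_a.split(), title_b.split()
    longest = max(len(a), len(b)) or 1
    lcs_ratio = len(lcs(a, b)) / longest
    lev_ratio = 1 - edit_distance(a, b) / longest
    return (lcs_ratio + lev_ratio) / 2


def cluster(titles, threshold=0.35):
    """Greedy clustering: join the first cluster whose first member is similar enough."""
    clusters = []
    for title in titles:
        for members in clusters:
            if similarity(title, members[0]) >= threshold:
                members.append(title)
                break
        else:
            clusters.append([title])  # no preset cluster count is needed
    return clusters


def merged_label(titles):
    """Fold the pairwise LCS over all titles to label the merged node."""
    tokens = titles[0].split()
    for title in titles[1:]:
        tokens = lcs(tokens, title.split())
    return " ".join(tokens)


# Hypothetical same-layer node titles collected from several source trees.
layer = [
    "using if statements",
    "for loop basics",
    "formatting if statements",
    "processing lists with if statements",
    "nested for loop",
]
for members in cluster(layer):
    print(merged_label(members), "<-", members)
```

Folding the pairwise LCS is a heuristic and is not guaranteed to give the true longest common subsequence of all titles at once. It also shows how a loose cluster can collapse to a very short label, which is the weakness the summary section points out.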
The merged skill tree is shown in the figure below:
2.4 Matching questions to the skill tree
Once the skill tree is built, all adopted questions in the Python category are mapped to their corresponding nodes. A new question is then matched to its most similar node in the tree, and the adopted questions on that node are recommended.
The matching algorithm used in this article is the Levenshtein ratio (an edit-distance-based ratio): the ratio between the question and each node is computed, and the node with the highest ratio is taken as the best match.
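A minimal sketch of this matching step is given below, under stated assumptions: the ratio is normalized as 1 - distance / longest length, which is one common form and may differ from the library the author used, and the node titles and questions are invented for illustration.

```python
from collections import defaultdict


def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance with unit costs."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def levenshtein_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means the two strings are identical."""
    longest = max(len(a), len(b)) or 1
    return 1 - edit_distance(a, b) / longest


def best_node(question: str, node_titles):
    """Return the node title with the highest ratio to the question."""
    q = question.lower()
    return max(node_titles, key=lambda title: levenshtein_ratio(q, title.lower()))


# Hypothetical skill-tree nodes and previously adopted questions.
nodes = ["matplotlib line charts", "file input and output", "regular expressions"]
adopted_questions = [
    "How do I plot matplotlib line charts?",
    "Why does my regular expression not match?",
]

# Step 1: attach every adopted question to its best-matching node.
questions_by_node = defaultdict(list)
for question in adopted_questions:
    questions_by_node[best_node(question, nodes)].append(question)

# Step 2: match a new question and recommend the adopted questions on that node.
new_question = "matplotlib line chart"
node = best_node(new_question, nodes)
print(node, "->", questions_by_node[node])
```

Because the ratio compares raw strings, it rewards questions that happen to reuse a node's wording. Questions phrased very differently from the node titles will match poorly, which is the limitation noted in the summary.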
3 Summary and next steps
Summary
This article covers building and merging programming-language skill trees, and matching questions to nodes in the skill tree. Only a preliminary version has been implemented so far, and the results need further optimization. The main problems at present are:
- Irrelevant nodes are not removed thoroughly enough
- The similarity calculation used in clustering, and the use of the longest common subsequence as the value of the merged node, are not reasonable. For example, "Python version run" and "Python code snippet" are placed in the same cluster and merged into "Python"
- Questions and skill-tree nodes are written in very different styles: one asks a question, the other names a piece of knowledge. Using the Levenshtein ratio (an edit-distance-based ratio) to measure similarity between questions and nodes is therefore not reasonable
- ……
Next steps
To address the current problems, the next steps are to:
- Further improve the quality of the merged skill tree
- Improve the matching between questions and the tree