CSDN Q&A tag skill tree (I) -- Construction of the basic framework
2022-07-06 10:41:00 【Alexxinlu】
Series articles
- CSDN Q&A tag skill tree (I) -- Construction of the basic framework
- CSDN Q&A tag skill tree (II) -- Effect optimization
- CSDN Q&A tag skill tree (III) -- Python skill tree
- CSDN Q&A tag skill tree (IV) -- Java skill tree
- CSDN Q&A tag skill tree (V) -- Cloud native skill tree
Team blog: CSDN AI team
1 Problem definition
1.1 Background
At present, questions in CSDN's Q&A module are only classified coarsely, into categories such as Python, Java, and C. They are not mapped to specific knowledge points within each category. In the example below, for instance, the question belongs to data visualization within the Python category.
Fine-grained classification helps the questioner see more clearly where a question sits in the knowledge system, and lets the system recommend relevant material for study and reference more accurately.
To address this, this article first builds a programming-language skill tree for each category, then maps previously adopted questions to specific nodes in the tree. Finally, a new question is matched to its most similar node in the constructed skill tree, and the adopted questions on that node are recommended.
2 Solution
2.1 Knowledge gathering
Building a programming skill tree starts with collecting the relevant knowledge. This article uses the Python programming language as the example for the concrete implementation.
Online search and research turned up the following two channels:
Crawl book tables of contents from an e-commerce site (name withheld in the original):
- Search the site with the keyword "python" and select the top N books
- Extract the contents of the table-of-contents field from each book's details page to obtain the raw, unprocessed table of contents
Learning paths on tutorial websites and forums:
- For example: Liao Xuefeng's official website, Novice Tutorial, Geek College
- Use a web crawler to crawl the tables of contents on these websites. Liao Xuefeng's official website is taken as the example, as shown in the figure below:
2.2 Construction of skill tree
Once the knowledge resources have been collected, they need to be stored in a tree structure. This article uses the treelib package for this.
To make merging the trees easier in the next section, this article limits each table of contents to a 4-level structure:
- Part title, for example: Part 1
- Chapter title, for example: Chapter 1
- Section title, for example: 1.1
- Subsection title, for example: 1.1.1
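The four-level structure above can be sketched in a few lines of Python. The article stores its trees with treelib. To stay dependency-free, this sketch swaps in a plain dataclass, and the same parent/child bookkeeping could equally drive treelib's `Tree.create_node`. The heading patterns, the `Node` class, the helper names, and the sample table of contents are all assumptions made for illustration, not the author's code.

```python
import re
from dataclasses import dataclass, field

# Hypothetical heading patterns; real tables of contents would need their own rules.
LEVEL_PATTERNS = [
    (4, re.compile(r"^\d+\.\d+\.\d+\s")),   # e.g. "1.1.1 ..."
    (3, re.compile(r"^\d+\.\d+\s")),        # e.g. "1.1 ..."
    (2, re.compile(r"^Chapter\s+\d+")),     # e.g. "Chapter 1 ..."
    (1, re.compile(r"^Part\s+\d+")),        # e.g. "Part 1 ..."
]


@dataclass
class Node:
    title: str
    level: int  # 0 = root, 1 = part, 2 = chapter, 3 = section, 4 = subsection
    children: list = field(default_factory=list)


def infer_level(line):
    """Return the heading level (1-4) of a line, or None if it matches no pattern."""
    for level, pattern in LEVEL_PATTERNS:
        if pattern.match(line):
            return level
    return None


def build_tree(root_title, lines):
    """Build a tree of at most four levels from headings listed in document order."""
    root = Node(root_title, 0)
    stack = [root]  # stack[i] holds the most recent node at level i
    for line in lines:
        title = line.strip()
        level = infer_level(title)
        if level is None:
            continue  # skip lines that are not recognised headings
        level = min(level, len(stack))  # never skip over a missing ancestor level
        del stack[level:]               # pop back to this heading's parent
        node = Node(title, level)
        stack[-1].children.append(node)
        stack.append(node)
    return root


def render(node, indent=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * indent + node.title]
    for child in node.children:
        lines.extend(render(child, indent + 1))
    return lines


# Made-up table of contents, used only to exercise the sketch.
toc = [
    "Part 1 Basics",
    "Chapter 1 Getting started",
    "1.1 Installing Python",
    "1.1.1 Windows",
    "1.2 Hello world",
    "Chapter 2 Variables",
]
tree = build_tree("Python", toc)
print("\n".join(render(tree)))
```

Keeping every source at the same four levels means that nodes at the same depth in different trees are comparable, which is what a layer-by-layer merge relies on.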
The resulting tree structure is shown in the figure below:
2.3 Merging of skill trees
After building skill trees from the tables of contents and knowledge-system resources of different sources, the separate trees need to be merged into a single Python skill tree.
For merging the trees, this article mainly considers the following aspects:
- Merge layer by layer, starting from the root node
- Use recursion to merge multiple trees
- Similar nodes in the same layer need to be merged
  - Use a heuristic clustering method (one that does not need the number of clusters fixed in advance) to divide the nodes into clusters
  - The similarity measure used in clustering combines the longest common subsequence ratio with the Levenshtein ratio (edit distance ratio)
  - The merged node takes the longest common subsequence of its member titles as its value. For example, the three nodes "if statement usage", "if statement for processing lists", and "if statement format" have the longest common subsequence "if statement", so "if statement" becomes the value of the merged node.
- Remove useless nodes
  - Use tree pruning plus a dictionary to remove useless nodes from the skill tree, for example chapter nodes such as "Summary of this chapter", "Extended reading", and "Project".
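A minimal sketch of merging one layer of sibling titles might look like the following. Several choices here are assumptions, not details from the article: titles are compared as whitespace-separated word tokens (the original Chinese titles were presumably compared character by character), the two ratios are combined by a plain average, the clustering is a greedy threshold scheme with a hypothetical `threshold=0.4`, and the multi-title longest common subsequence is approximated by folding pairwise. The titles are English renderings of the article's if-statement example plus made-up extras.

```python
# Dictionary of titles to prune; entries taken from the article's examples.
USELESS_TITLES = {"summary of this chapter", "extended reading", "project"}


def lcs(a, b):
    """Longest common subsequence of two sequences (classic dynamic programme)."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, start=1):
        for j, y in enumerate(b, start=1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back from the bottom-right corner to recover one LCS.
    out, i, j = [], len(a), len(b)
    while i and j:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i, j = i - 1, j - 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def edit_distance(a, b):
    """Levenshtein distance with unit cost for insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(title_a, title_b):
    """Assumed combination: the mean of the LCS ratio and the edit-distance ratio."""
    a, b = title_a.split(), title_b.split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    lcs_ratio = len(lcs(a, b)) / longest
    edit_ratio = 1 - edit_distance(a, b) / longest
    return (lcs_ratio + edit_ratio) / 2


def cluster(titles, threshold=0.4):
    """Greedy clustering: join the first cluster whose first member is similar enough."""
    clusters = []
    for title in titles:
        for members in clusters:
            if similarity(title, members[0]) >= threshold:
                members.append(title)
                break
        else:
            clusters.append([title])  # no match, so the title starts a new cluster
    return clusters


def merged_label(titles):
    """Fold the pairwise LCS over all titles (an approximation of the multi-way LCS)."""
    common = titles[0].split()
    for title in titles[1:]:
        common = lcs(common, title.split())
    return " ".join(common)


siblings = [
    "if statement usage",
    "Summary of this chapter",
    "if statement with lists",
    "for loops over lists",
    "if statement format",
]
kept = [t for t in siblings if t.lower() not in USELESS_TITLES]
clusters = cluster(kept)
labels = [merged_label(members) for members in clusters]
print(labels)
```

Because the number of clusters falls out of the threshold instead of being fixed up front, the scheme fits the "no preset cluster count" requirement, but the result depends on the order of the titles and on the threshold chosen.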
The merged skill tree is shown in the figure below :
2.4 Matching questions to the skill tree
Once the skill tree is built, all adopted questions in the Python field need to be mapped to their corresponding nodes. A new question is then matched to its most similar node in the constructed skill tree, and the adopted questions on that node are recommended.
The matching algorithm used in this article is the Levenshtein ratio (edit distance ratio): the Levenshtein ratio between the question and each node is computed, and the node that best matches the question is chosen.
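The matching step can be sketched as follows. The normalisation used here, one minus the edit distance divided by the longer length, is one common definition of the ratio and is an assumption; Levenshtein libraries may normalise differently. The function names, node titles, and questions are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, with unit costs."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def levenshtein_ratio(a, b):
    """Similarity in [0, 1]; assumed form: 1 - distance / longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest


def best_node(question, node_titles):
    """Return the node title with the highest ratio against the question."""
    return max(
        node_titles,
        key=lambda title: levenshtein_ratio(question.lower(), title.lower()),
    )


def index_questions(adopted_questions, node_titles):
    """Map every adopted question onto its best-matching node."""
    index = {title: [] for title in node_titles}
    for adopted_question in adopted_questions:
        index[best_node(adopted_question, node_titles)].append(adopted_question)
    return index


def recommend(new_question, index):
    """Recommend the adopted questions stored on the best-matching node."""
    return index[best_node(new_question, list(index))]


# Made-up node titles and questions, used only to exercise the sketch.
nodes = ["data visualization", "string formatting", "file handling", "functions"]
adopted = ["data visualization colors", "3d data visualization"]
index = index_questions(adopted, nodes)

question = "python data visualization with matplotlib"
print(best_node(question, nodes))
print(recommend(question, index))
```

The example question contains the node title verbatim, which is the easy case. A question phrased in everyday language shares far fewer characters with a terse node title, so a purely character-based ratio can rank an unrelated node higher.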
3 Summary and next steps
Summary
This article implements the construction and merging of programming-language skill trees, and the matching between questions and nodes in the skill tree. Only preliminary functionality has been implemented so far, and the results need further optimization. The current problems mainly include:
- Irrelevant nodes are not removed cleanly enough
- The similarity measure used in clustering, and the use of the longest common subsequence as the value of a merged node, are both poorly suited to the task. For example, "Python version run" and "Python code snippet" are placed in the same cluster and merged into "Python"
- The descriptive styles of questions and skill-tree nodes differ greatly: one poses a question, the other names a piece of knowledge. Using the Levenshtein ratio (edit distance ratio) to compute similarity when matching questions to nodes is therefore poorly suited to the task
- ……
Next steps
To address the current problems, the next steps are to:
- Further improve the quality of the merged skill tree
- Improve the matching between questions and the tree