CSDN Q&A tag skill tree (1): Building the basic framework
2022-07-06 10:41:00 【Alexxinlu】
Series articles
- CSDN Q&A tag skill tree (1): Building the basic framework
- CSDN Q&A tag skill tree (2): Effect optimization
- CSDN Q&A tag skill tree (3): Python skill tree
- CSDN Q&A tag skill tree (4): Java skill tree
- CSDN Q&A tag skill tree (5): Cloud-native skill tree
Team blog: CSDN AI team
1 Problem definition
1.1 Background
At present, questions in the CSDN Q&A module are classified only coarsely, into categories such as Python, Java, and C. They are not mapped to specific knowledge points within each category. For example, the question in the figure below belongs to data visualization in Python.
Fine-grained classification lets the questioner see more clearly where a question sits in the knowledge system, and it lets the system recommend relevant learning and reference material more accurately.
To address this, this article first builds a programming-language skill tree for each category, then maps previously adopted questions (questions with an accepted answer) to specific nodes in the skill tree. Finally, a new question is matched to its most similar node in the tree, and the adopted questions on that node are recommended.
2 Solution
2.1 Knowledge gathering
To build a programming skill tree, we first need to collect the relevant knowledge. This article uses the Python programming language as the example for the concrete implementation.
Online search and research turned up the following two channels:
Book tables of contents crawled from an online bookstore (named only as 某东 in the original):
- Search the store for the keyword "python" and select the top N books
- Extract the contents of the table-of-contents field from each book's detail page to obtain the raw, unprocessed table of contents

Learning paths on tutorial websites and forums:
- For example: Liao Xuefeng's official website, Runoob, and Jikexueyuan (Geek College)
- Use a web crawler to collect the tables of contents on these sites. Taking Liao Xuefeng's official website as an example, the result is shown in the figure below:

2.2 Construction of skill tree
Once the knowledge resources have been collected, they need to be stored in a tree structure. This article uses the treelib package for that.
To make merging the trees in the next section easier, this article limits each table of contents to a 4-level structure:
- Part title, for example: Part 1
- Chapter title, for example: Chapter 1
- Section title, for example: 1.1
- Subsection title, for example: 1.1.1
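The 4-level limit can be illustrated with a small sketch. The article itself uses treelib; to stay dependency-free, this example swaps in a plain dataclass, and the names `Node`, `add_path`, `render`, and `MAX_DEPTH` are hypothetical helpers invented for illustration, not part of the original implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List

MAX_DEPTH = 4  # part, chapter, section, subsection (assumed from the 4-level limit)


@dataclass
class Node:
    """One heading in the table of contents (hypothetical stand-in for a treelib node)."""
    title: str
    children: Dict[str, "Node"] = field(default_factory=dict)


def add_path(root: Node, path: List[str]) -> Node:
    """Insert a heading path below the root, reusing nodes that already exist."""
    if len(path) > MAX_DEPTH:
        raise ValueError(f"path deeper than {MAX_DEPTH} levels: {path}")
    node = root
    for title in path:
        node = node.children.setdefault(title, Node(title))
    return node


def render(node: Node, indent: int = 0) -> List[str]:
    """Return the tree as indented lines, one per node, in insertion order."""
    lines = ["  " * indent + node.title]
    for child in node.children.values():
        lines.extend(render(child, indent + 1))
    return lines


if __name__ == "__main__":
    root = Node("Python")
    add_path(root, ["Part 1", "Chapter 1", "1.1", "1.1.1"])
    add_path(root, ["Part 1", "Chapter 1", "1.2"])
    print("\n".join(render(root)))
```

Keeping every source at the same fixed depth means two trees can later be compared level by level without first aligning their structures.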
The resulting tree is shown in the figure below:

2.3 Merging of skill trees
After a skill tree has been built from each source's table of contents or learning path, the separate trees need to be merged into a single Python skill tree.
The merge considers the following aspects:
- Merge layer by layer, starting from the root node
  - Merge multiple trees recursively
- Merge similar nodes within the same layer
  - Use a heuristic clustering method (one that does not need the number of clusters fixed in advance) to divide the nodes into clusters
  - Compute similarity during clustering as the longest common subsequence ratio plus the Levenshtein ratio (edit distance ratio)
  - Name each merged node with the longest common subsequence of the titles it replaces. For example, the 3 nodes "if statement usage", "if statement list handling", and "if statement format" share the longest common subsequence "if statement", so "if statement" becomes the value of the merged node.
- Remove useless nodes
  - Use tree pruning together with a dictionary to remove useless nodes from the skill tree, for example chapter nodes such as "Chapter summary", "Further reading", and "Project".
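The merge steps for a single layer can be sketched as follows. The article does not specify how the two ratios are normalized or combined, which clustering heuristic is used, or what the pruning dictionary contains, so everything below is an assumption made for illustration: the ratios are averaged, clustering is a greedy pass that compares each title with the first member of every existing cluster, the 0.5 threshold is arbitrary, the common subsequence of several titles is approximated by folding pairwise, and `USELESS_TITLES` is a made-up dictionary.

```python
from typing import List

# Hypothetical pruning dictionary; the real one is not given in the article.
USELESS_TITLES = {"Chapter summary", "Further reading", "Project"}


def lcs(a: str, b: str) -> str:
    """Longest common subsequence of two strings via dynamic programming."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    # Walk back through the table to recover one longest subsequence.
    out, i, j = [], m, n
    while i and j:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i, j = i - 1, j - 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit cost for insert, delete, substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Average of an LCS ratio and an edit-distance ratio (assumed combination)."""
    if not a or not b:
        return 0.0
    lcs_ratio = 2 * len(lcs(a, b)) / (len(a) + len(b))
    edit_ratio = 1 - edit_distance(a, b) / max(len(a), len(b))
    return (lcs_ratio + edit_ratio) / 2


def drop_useless(titles: List[str]) -> List[str]:
    """Dictionary-based pruning of titles that carry no knowledge point."""
    return [t for t in titles if t not in USELESS_TITLES]


def cluster_titles(titles: List[str], threshold: float = 0.5) -> List[List[str]]:
    """Greedy clustering; the number of clusters is not fixed in advance."""
    clusters: List[List[str]] = []
    for title in titles:
        for cluster in clusters:
            if similarity(title, cluster[0]) >= threshold:
                cluster.append(title)
                break
        else:
            clusters.append([title])
    return clusters


def merged_label(titles: List[str]) -> str:
    """Approximate the common subsequence of many titles by folding pairwise."""
    label = titles[0]
    for title in titles[1:]:
        label = lcs(label, title)
    return label.strip()


if __name__ == "__main__":
    layer = ["if statement usage", "Chapter summary",
             "if statement format", "for loop basics"]
    for cluster in cluster_titles(drop_useless(layer)):
        print(cluster, "->", repr(merged_label(cluster)))
```

Because a subsequence does not have to be contiguous, the folded label can pick up stray characters or collapse to something very generic. Any real implementation would need to guard against that.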
The merged skill tree is shown in the figure below:

2.4 Matching questions to the skill tree
Once the skill tree is built, all adopted questions in the Python category need to be mapped to their corresponding nodes. A new question is then matched to its most similar node in the tree, and the adopted questions on that node are recommended.
The matching algorithm used in this article is the Levenshtein ratio (edit distance ratio): the ratio between the question and each node is computed, and the node with the best ratio is taken as the match.
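A minimal sketch of this matching step is shown below. It assumes the ratio is normalized as 1 minus the edit distance divided by the longer string's length (libraries may normalize differently), and the function names and the sample `adopted` mapping from node titles to adopted questions are hypothetical.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit cost for insert, delete, substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def edit_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]; the normalization by max length is an assumption."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest


def best_node(question: str, nodes: List[str]) -> Tuple[str, float]:
    """Return the node title most similar to the question, with its score."""
    scored = [(edit_ratio(question.lower(), node.lower()), node) for node in nodes]
    score, node = max(scored)
    return node, score


def recommend(question: str, adopted: Dict[str, List[str]]) -> List[str]:
    """Recommend the adopted questions stored on the best-matching node."""
    node, _ = best_node(question, list(adopted))
    return adopted[node]


if __name__ == "__main__":
    # Made-up sample data: node title -> adopted questions mapped to it.
    adopted = {
        "data visualization": ["How do I draw a bar chart with matplotlib?"],
        "file input and output": ["How do I read a CSV file line by line?"],
        "if statement": ["Why does my elif branch never run?"],
    }
    print(best_node("Python data visualization", list(adopted)))
    print(recommend("Python data visualization", adopted))
```

A character-level ratio rewards questions that reuse a node's wording almost verbatim. A long, conversational question scores low against every node, even when the topic is clearly the same.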
3 Summary and next steps
Summary
This article covers building and merging programming-language skill trees, and matching questions to nodes in the skill tree. Only preliminary functionality has been implemented so far, and the results need further optimization. The main problems at present are:
- Irrelevant nodes are not removed thoroughly enough
- The similarity measure used in clustering, and the use of the longest common subsequence to name the node that replaces a merged cluster, are both unreasonable. For example, "Running Python versions" and "Python code snippets" are placed in the same cluster and merged into "Python"
- The wording of questions differs greatly from the wording of skill tree nodes: one asks a question, the other names a piece of knowledge. Using the Levenshtein ratio (edit distance ratio) to measure similarity between them is therefore unreasonable
- ……
Next steps
To address the current problems, the next steps are:
- Further improve the quality of the merged skill tree
- Improve how well questions are matched to the tree