CSDN Q&A tag skill tree (I) -- Building the basic framework
2022-07-06 10:41:00 【Alexxinlu】
Series articles
- CSDN Q&A tag skill tree (I) -- Building the basic framework
- CSDN Q&A tag skill tree (II) -- Effect optimization
- CSDN Q&A tag skill tree (III) -- Python skill tree
- CSDN Q&A tag skill tree (IV) -- Java skill tree
- CSDN Q&A tag skill tree (V) -- Cloud native skill tree
Team blog: CSDN AI team
1 Problem definition
1.1 Background
At present, questions in the CSDN Q&A module are only coarsely classified into categories such as Python, Java, and C. They are not mapped to specific knowledge points within each category. In the example below, for instance, the question belongs to data visualization in Python.
Fine-grained classification lets the asker see more clearly where a question sits in the knowledge system, and it lets the system recommend relevant material more accurately for study and reference.
To address this, this article first builds a skill tree for each programming-language category, then maps previously adopted questions (that is, questions with an accepted answer) to specific nodes in the tree. For a new question, it matches the question to the most similar node in the tree and recommends the adopted questions attached to that node.
2 Solution
2.1 Knowledge gathering
Building a programming skill tree starts with collecting the relevant knowledge. This article uses the Python programming language as its worked example.
Online search and research turned up the following two sources:
Book tables of contents crawled from a certain online retailer ("a certain East" in the original):
- Search the site for the keyword "python" and select the top N books
- Extract the contents of the table-of-contents field from each book's detail page to obtain the raw table of contents

Learning paths on tutorial websites and forums:
- For example: Liao Xuefeng's official website, Novice Tutorial, Geek College
- Use a web crawler to collect the table of contents from each site. Liao Xuefeng's official website, for example, is shown in the figure below:

2.2 Construction of skill tree
Once the knowledge resources are collected, they need to be stored in a tree structure. This article uses the treelib package for this.
To simplify merging the trees in the next section, this article limits each table of contents to a four-level structure:
- Part title, for example: Part 1
- Chapter title, for example: Chapter 1
- Section title, for example: 1.1
- Subsection title, for example: 1.1.1
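A minimal sketch of how the four-level limit could be applied to a raw table of contents. It is not the article's treelib code: a small hypothetical `Node` class stands in for treelib to keep the example dependency-free, and the regular expressions in `LEVEL_PATTERNS` are assumptions about how part, chapter, section, and subsection titles are written. Lines that match none of the patterns are skipped.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Hypothetical patterns for the four levels; real tables of contents vary widely.
# The subsection pattern is listed before the section pattern because
# "1.1.1" would otherwise also match the section pattern.
LEVEL_PATTERNS = [
    (1, re.compile(r"^Part\s+\d+\b")),       # part title, e.g. "Part 1"
    (2, re.compile(r"^Chapter\s+\d+\b")),    # chapter title, e.g. "Chapter 1"
    (4, re.compile(r"^\d+\.\d+\.\d+\b")),    # subsection title, e.g. "1.1.1"
    (3, re.compile(r"^\d+\.\d+\b")),         # section title, e.g. "1.1"
]


@dataclass
class Node:
    title: str
    level: int
    children: List["Node"] = field(default_factory=list)


def level_of(line: str) -> int:
    """Return the level (1-4) of a table-of-contents line, or 0 if unrecognized."""
    for level, pattern in LEVEL_PATTERNS:
        if pattern.match(line):
            return level
    return 0


def build_tree(lines, root_title="Python") -> Node:
    """Build a tree of at most four levels below the root from raw lines."""
    root = Node(root_title, 0)
    stack = [root]  # path from the root to the most recently added node
    for raw in lines:
        line = raw.strip()
        level = level_of(line)
        if level == 0:
            continue  # skip lines that fit none of the four levels
        # Climb back up until the top of the stack is a valid parent.
        while stack[-1].level >= level:
            stack.pop()
        node = Node(line, level)
        stack[-1].children.append(node)
        stack.append(node)
    return root


def show(node: Node, indent: int = 0) -> None:
    """Print the tree with two spaces of indentation per level."""
    print("  " * indent + node.title)
    for child in node.children:
        show(child, indent + 1)


toc = ["Part 1 Basics", "Chapter 1 Getting Started",
       "1.1 Installing Python", "1.1.1 Windows", "1.2 Hello World"]
show(build_tree(toc))
```

With treelib, the same walk would presumably call `Tree.create_node` with the parent's identifier instead of appending to `children`.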
The resulting tree structure is shown in the figure below:

2.3 Merging of skill trees
After building one skill tree per source from the collected tables of contents and knowledge-system resources, the separate trees need to be merged into a single Python skill tree.
The merge takes the following into account:
- Merge layer by layer, starting from the root node
  - Merge multiple trees recursively
- Merge similar nodes within the same layer
  - Use a heuristic clustering method (one that does not need the number of clusters fixed in advance) to divide the nodes into clusters
  - Compute similarity within the clustering by combining the longest-common-subsequence ratio with the Levenshtein ratio (edit-distance ratio)
  - Name each merged node after the longest common subsequence of its members' titles. For example, the three nodes "if statement usage", "if statement list processing", and "if statement formatting" share the longest common subsequence "if statement", so "if statement" becomes the value of the merged node.
- Remove useless nodes
  - Use tree pruning plus a dictionary to remove useless nodes such as "Chapter summary", "Further reading", and "Project" from the skill tree.
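The same-layer merge can be sketched roughly as follows. The article does not say how the two ratios are combined or which clustering heuristic is used, so everything here is an assumption: titles are split into words (per-character comparison would pass `list(title)` instead), the two ratios are averaged, clustering is a greedy pass with a hypothetical `threshold`, and the merged name is obtained by reducing the longest common subsequence pairwise, which only approximates a true multi-sequence LCS.

```python
from functools import reduce


def tokens(title):
    """Split a title into lowercase words."""
    return title.lower().split()


def lcs(a, b):
    """Longest common subsequence of two token lists (dynamic programming)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # Walk back through the table to recover one longest subsequence.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i, j = i - 1, j - 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def similarity(title_a, title_b):
    """Average of the LCS ratio and an edit-distance ratio (an assumed combination)."""
    a, b = tokens(title_a), tokens(title_b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    lcs_ratio = len(lcs(a, b)) / longest
    edit_ratio = 1 - edit_distance(a, b) / longest
    return (lcs_ratio + edit_ratio) / 2


def cluster(titles, threshold=0.4):
    """Greedy clustering: join the first cluster whose first member is similar enough."""
    clusters = []
    for title in titles:
        for members in clusters:
            if similarity(title, members[0]) >= threshold:
                members.append(title)
                break
        else:
            clusters.append([title])
    return clusters


def merged_label(titles):
    """Name a merged node after the pairwise-reduced LCS of its members' titles."""
    return " ".join(reduce(lcs, [tokens(t) for t in titles]))


layer = ["if statement usage", "if statement list processing",
         "if statement formatting", "while loop basics"]
for members in cluster(layer):
    print(merged_label(members), "<-", members)
```

On this sample layer the three "if statement" titles end up in one cluster named "if statement", while "while loop basics" stays on its own. The number of clusters is never fixed in advance; it follows from the threshold.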
The merged skill tree is shown in the figure below :

2.4 Match the problem with the skill tree
Once the skill tree is built, all adopted questions in the Python category are mapped to their corresponding nodes. A new question is then matched to the most similar node in the tree, and the adopted questions on that node are recommended.
The matching algorithm used here is the Levenshtein ratio (edit-distance ratio): the node with the highest Levenshtein ratio against the question is taken as the best match.
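A rough sketch of this matching step, under stated assumptions: the ratio is normalized as 1 minus the edit distance divided by the longer length (the article does not give its exact formula, and library implementations may normalize differently), comparison is per character on lowercased text, and the node titles and questions are made up for illustration. The helper names `best_node` and `recommend` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def levenshtein_ratio(a, b):
    """Similarity in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest


def best_node(question, node_titles):
    """Return (title, score) for the node title most similar to the question."""
    scored = [(levenshtein_ratio(question.lower(), t.lower()), t) for t in node_titles]
    score, title = max(scored)
    return title, score


def recommend(question, adopted_by_node):
    """Match a new question to a node and return that node's adopted questions."""
    title, _ = best_node(question, list(adopted_by_node))
    return title, adopted_by_node[title]


# Made-up index: node title -> adopted questions already mapped to that node.
adopted_by_node = {
    "data visualization with matplotlib": ["How do I plot a line chart?"],
    "file input and output": ["How do I read a text file?"],
    "if statement": [],
}
print(recommend("How to draw a bar chart with matplotlib", adopted_by_node))
```

Building the index itself would use the same `best_node` call on every adopted question.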
3 Summary and next steps
Summary
This article builds and merges programming-language skill trees and matches questions to nodes in the resulting tree. Only preliminary functionality is in place so far, and the results need further optimization. The main problems at present are:
- Irrelevant nodes are not removed cleanly enough
- The similarity measure used in clustering, and the use of the longest common subsequence to name a merged node, are not sound. For example, "Python version run" and "Python code snippet" fall into the same cluster and are merged into "Python"
- Questions and skill-tree nodes differ greatly in style: one poses a question, the other names a piece of knowledge. Using the Levenshtein ratio (edit-distance ratio) to measure their similarity is therefore a poor fit
- ……
Next steps
To address these problems, the next steps are to:
- Further improve the quality of the merged skill tree
- Improve the matching between questions and the tree