CSDN Q&A Tag Skill Tree (I): Building the Basic Framework
2022-07-06 10:41:00 【Alexxinlu】
Series articles
- CSDN Q&A Tag Skill Tree (I): Building the Basic Framework
- CSDN Q&A Tag Skill Tree (II): Effect Optimization
- CSDN Q&A Tag Skill Tree (III): Python Skill Tree
- CSDN Q&A Tag Skill Tree (IV): Java Skill Tree
- CSDN Q&A Tag Skill Tree (V): Cloud-Native Skill Tree
Team blog: CSDN AI team
1 Problem definition
1.1 Background
At present, questions in the CSDN Q&A module are classified only coarsely, into broad categories such as Python, Java, and C. They are not mapped to specific knowledge points within a category. For example, the question in the example below belongs to data visualization within the Python language.
Classifying questions at a finer granularity helps the asker see more clearly where a question sits in the knowledge system. It also lets the system recommend relevant learning and reference material to the asker more accurately.
To address this, this article first builds a programming-language skill tree for each category, then maps previously accepted questions to specific nodes in the skill tree. Finally, a new question is matched to its most similar node in the constructed tree, and the accepted questions on that node are recommended.
2 Solution
2.1 Knowledge gathering
To build a programming skill tree, we first need to collect the relevant knowledge. This article uses the Python programming language as the example for the concrete implementation.
Online search and research turned up the following two channels:
Crawl book tables of contents from a certain e-commerce site ("某东"):
- Search the site for the keyword "python" and select the top N books
- Extract the table-of-contents field from each book's detail page to obtain the raw, unprocessed directory
Learning paths on tutorial websites and forums:
- For example: Liao Xuefeng's official website, Runoob (菜鸟教程), Geek Academy (极客学院)
- Use a web crawler to crawl the directory on each site. Liao Xuefeng's official website is used as the example, as shown in the figure below:
2.2 Construction of the skill tree
Once the knowledge resources have been collected, they need to be stored in a tree structure. This article uses the treelib package for this.
To make the tree merging in the next section easier, the directory is limited to a four-level structure:
- Part title, for example: Part I
- Chapter title, for example: Chapter 1
- Section title, for example: 1.1
- Subsection title, for example: 1.1.1
The resulting tree structure is shown in the following figure:
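The four-level limit above can be sketched in a few lines. The following is a minimal illustration, not the article's implementation: it uses plain nested dictionaries instead of treelib so that it runs without extra packages, and the heading patterns ("Part ...", "Chapter N", "1.1", "1.1.1"), the function names, and the sample directory are all assumptions made for the example.

```python
import re

# Hypothetical heading patterns for an English-style directory.
# Real book or website directories would need their own rules.
LEVEL_PATTERNS = [
    (1, re.compile(r"^Part\s+\S+")),        # e.g. "Part I"
    (2, re.compile(r"^Chapter\s+\d+")),     # e.g. "Chapter 1"
    (4, re.compile(r"^\d+\.\d+\.\d+\s")),   # e.g. "1.1.1", checked before "1.1"
    (3, re.compile(r"^\d+\.\d+\s")),        # e.g. "1.1"
]


def heading_level(line):
    """Return the level (1 to 4) of a directory line, or None if unrecognized."""
    for level, pattern in LEVEL_PATTERNS:
        if pattern.match(line):
            return level
    return None


def build_tree(lines, root_name="Python"):
    """Turn a flat directory listing into a nested {title, children} tree."""
    root = {"title": root_name, "children": []}
    stack = [(0, root)]  # (level, node) pairs on the path from the root
    for raw in lines:
        line = raw.strip()
        level = heading_level(line)
        if level is None:
            continue  # skip lines that do not look like headings
        node = {"title": line, "children": []}
        # Pop back to the nearest ancestor with a smaller level.
        while stack[-1][0] >= level:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((level, node))
    return root


def render(node, depth=0):
    """Return the tree as indented text lines, one per node."""
    lines = ["  " * depth + node["title"]]
    for child in node["children"]:
        lines.extend(render(child, depth + 1))
    return lines


toc = [
    "Part I Basics",
    "Chapter 1 Getting Started",
    "1.1 Installing Python",
    "1.1.1 Windows",
    "1.2 Hello World",
    "Chapter 2 Variables",
]
tree = build_tree(toc)
print("\n".join(render(tree)))
```

Keeping every source in the same four levels means that nodes at the same depth are comparable across trees, which is what the layer-by-layer merge in the next section relies on.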
2.3 Merging of skill trees
After a skill tree has been built from each source's directory or learning path, the separate trees need to be merged into a single Python skill tree.
The merge mainly takes the following into account:
- Merge layer by layer, starting from the root node
  - Merge multiple trees recursively
- Merge similar nodes within the same layer
  - Use a heuristic clustering method (one that does not need the number of clusters in advance) to divide the nodes into clusters
  - Compute similarity within the clustering as longest common subsequence ratio + Levenshtein ratio (edit distance ratio)
  - Label the merged node with the longest common subsequence of its members' titles. For example, for the three nodes "Using if statements", "Processing lists with if statements", and "Formatting if statements", the longest common subsequence is "if statements", which becomes the value of the merged node.
- Remove useless nodes
  - Use tree pruning plus a dictionary to remove useless nodes from the skill tree, for example chapter nodes such as "Chapter summary", "Further reading", and "Project".
The merged skill tree is shown in the figure below:
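The merge steps for one layer of sibling titles can be sketched as follows. This is an illustration under several assumptions, not the article's code. The two ratios are simply added, the longest common subsequence is taken over whitespace-separated words (for Chinese titles, characters would be the natural unit), clustering is a greedy single pass against the first title of each cluster, and the merged label is a pairwise fold of LCS rather than an exact multi-sequence LCS. The threshold of 0.9 and the NOISE_TITLES dictionary are hypothetical.

```python
# Hypothetical dictionary of chapter titles that carry no knowledge point.
NOISE_TITLES = {"chapter summary", "further reading", "project"}


def lcs_tokens(a, b):
    """Longest common subsequence of two token lists (dynamic programming)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    # Walk back through the table to recover one LCS.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i, j = i - 1, j - 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def similarity(s, t):
    """LCS ratio on words plus edit-distance ratio on characters (range 0 to 2)."""
    a, b = s.lower().split(), t.lower().split()
    lcs_ratio = len(lcs_tokens(a, b)) / max(len(a), len(b), 1)
    edit_ratio = 1 - edit_distance(s.lower(), t.lower()) / max(len(s), len(t), 1)
    return lcs_ratio + edit_ratio


def drop_noise(titles):
    """Dictionary-based removal of useless sibling titles."""
    return [t for t in titles if t.strip().lower() not in NOISE_TITLES]


def cluster(titles, threshold=0.9):
    """Greedy clustering; the number of clusters is not fixed in advance."""
    clusters = []
    for title in titles:
        for group in clusters:
            if similarity(title, group[0]) >= threshold:
                group.append(title)
                break
        else:
            clusters.append([title])
    return clusters


def merged_label(group):
    """Fold pairwise LCS over the group; fall back to the first title."""
    common = group[0].split()
    for title in group[1:]:
        common = lcs_tokens(common, title.split())
    return " ".join(common) or group[0]


siblings = [
    "Using if statements",
    "Processing lists with if statements",
    "Formatting if statements",
    "Chapter summary",
    "Defining a class",
]
for group in cluster(drop_noise(siblings)):
    print(merged_label(group), "<-", group)
```

Because the label is whatever the members have in common, a cluster of loosely related titles collapses to a very short label, so the threshold has a direct effect on label quality.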
2.4 Matching questions to the skill tree
After the skill tree is built, all accepted questions in the Python field need to be mapped to their corresponding nodes. A new question is then matched to its most similar node in the tree, and the accepted questions on that node are recommended.
The matching algorithm used in this article is the Levenshtein ratio (edit distance ratio): the Levenshtein ratio between the question and each node is calculated to determine the node that best matches the question.
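A minimal sketch of this matching step is shown below. The normalization used here, 1 minus the edit distance divided by the longer length, is one common definition of the ratio and is an assumption; libraries may normalize differently. The function names, node titles, and stored questions are invented for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def levenshtein_ratio(a, b):
    """Similarity in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1 - edit_distance(a, b) / longest


def best_node(question, node_titles):
    """Return the node title with the highest ratio to the question."""
    return max(
        node_titles,
        key=lambda title: levenshtein_ratio(question.lower(), title.lower()),
    )


def recommend(question, accepted_by_node):
    """Match a new question to a node and return that node's accepted questions."""
    node = best_node(question, list(accepted_by_node))
    return node, accepted_by_node[node]


# Hypothetical skill-tree nodes with previously accepted questions attached.
accepted_by_node = {
    "data visualization": ["How do I draw a bar chart with matplotlib?"],
    "if statements": ["Why does my if statement always run?"],
    "classes": ["How do I define a constructor?"],
}
node, related = recommend("python data visualization", accepted_by_node)
print(node, related)
```

The sample question is worded almost like a node title, which is the favorable case. A question phrased as a full sentence shares far fewer characters with a short node title, so its ratios against all nodes tend to be low and close together.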
3 Summary and next steps
Summary
This article implements the construction and merging of programming-language skill trees, and the matching of questions to nodes in the skill tree. Only preliminary functionality is in place so far, and the results need further optimization. The current problems mainly include:
- Irrelevant nodes are not removed cleanly enough
- The similarity measure used in clustering, and the use of the longest common subsequence as the value of a merged node, are not reasonable. For example, "Python version run" and "Python code snippet" were placed in the same cluster and merged into "Python"
- Questions and skill-tree nodes are written in very different styles: one asks a question, the other names a piece of knowledge. Using the Levenshtein ratio (edit distance ratio) to measure similarity when matching questions to nodes is therefore not reasonable
- ……
Next steps
To address the current problems, the next steps are to:
- Further improve the quality of the merged skill tree
- Improve the matching between questions and the tree