当前位置:网站首页>CSDN question and answer module Title Recommendation task (I) -- Construction of basic framework
CSDN question and answer module Title Recommendation task (I) -- Construction of basic framework
2022-07-06 10:41:00 【Alexxinlu】
Catalog
Series articles
- CSDN Q & a module Title recommended tasks ( One ) —— Construction of basic framework
- CSDN Q & a module Title recommended tasks ( Two ) —— Effect optimization
Team blog : CSDN AI team
1 Problem definition
1.1 background
stay CSDN Of Q & a module in , Many beginners' question titles lack effective information , for example :
Help the children !
Boss, help me !!! Help me finish this topic
Ask for the great God !!!

In the example above , A better title would be “ How to add scroll bars to mobile pages ?”
In the title of this kind of question , It contains very little useful information , Unable to quickly understand the meaning of the problem from the title , To a certain extent, it will affect the efficiency of question respondents and the user experience . Besides , Such data will further affect the downstream NLP The effect of the mission , for example : Problem classification , Question recommendation, etc .
Therefore, in order to improve the quality of question titles , Need information based on problem description , After the user enters the title and problem description , Recommend more accurate titles to users , And prompt the user to change .
1.2 Input
The available input information is shown below :
"id": 998678,
"title": “ For help , Very simple. C# problem ”,
"body": “ The teacher asked to build a score management system , Classification login is required , But this cheng’xu If the user is a student , Even if comboBox You can log in smoothly without selecting students ,admin There is no such problem , Ask the boss why ?\r\n\t\r\n\t\r\n\tprivate void button1_Click(object sender, EventArgs e)\r\n {\r\n string sUser = txtUser.Text.ToString();\r\n string sPassword = txtPassword.Text.ToString();\r\n\r\n if (sUser == “admin” && sPassword == “1234” && comboBoxLeixing.Text == “ Administrators ”)\r\n {\r\n Menuadmin main = new Menuadmin();\r\n main.Show();\r\n this.Hide();\r\n }\r\n\r\n if (sUser == “ Xuguangrui ” || sUser == “ Cao Guang ” || sUser == “ Cao ziyue ” || sUser == “ Chen Sijia ” || sUser == “ Chen Xu ” || sUser == “ Huang Wenguang ” ||\r\n sUser == “ Lei Zhangshu ” || sUser == “ Liuqingqing ” || sUser == “ Qi SHIMENG ” || sUser == “ Shen bin ” || sUser == “ Shuaixing ” || sUser == “ Sun Quanwei ” ||\r\n sUser == “ Wang Heng ” || sUser == “ Wang Rui ” || sUser == “ Xiang Meng ” || sUser == “ Zhang Guoliang ” || sUser == “ Zhangzongyou ” || sUser == “ Zhang Shumin ”\r\n && sPassword == “1234” && comboBoxLeixing.Text == “ Student ”)\r\n {\r\n Menustudent main = new Menustudent();\r\n main.Show();\r\n this.Hide();\r\n }\r\n\r\n if (sUser == “ Liu Zhaoliang ” || sUser == “ Longlong ” || sUser == “ Feng Wei ” || sUser == “ Liu shanyong ” ||\r\n sUser == “ India forest ” || sUser == “ Cheng Leli ” || sUser == “ Liu Yan ” || sUser == “ Zhao Junwei ”\r\n && sPassword == “1234” && comboBoxLeixing.Text== “ Teachers' ”)\r\n {\r\n Menuteacher main = new Menuteacher();\r\n main.Show();\r\n this.Hide();\r\n }\r\n \r\n else\r\n label3.Text = “ Wrong user name or password , Please re-enter !”;\r\n }”,
"tag_id": 95,
"tag_name": “c Language ”
The input mainly includes the above five fields , among title Is the title that needs improvement .
Currently only title and body Two fields as input .
1.3 Output
Improved question Title .
2 Solution
This paper further abstracts the problem as NLP Text summary task in , The specific implementation steps are as follows :
2.1 Data preprocessing
At present, the following preprocessing operations are mainly done :
- Remove irrelevant information . for example : Code segment 、URL、 Irrelevant characters, etc ;
- Cut the paragraph into sentences . Segmentation based on delimiters , for example : A newline 、 Full stop 、 question mark 、 Exclamation marks, etc .
2.2 Model
2.2.1 Rough sort
The current scheme uses classic Extraction model TextRank, Rank all sentences entered , The final choice TopN Sentence to recommend .
2.2.2 Fine sorting
Because this article is to recommend the title of the question , Therefore, questions should be given priority .
A dictionary based approach is used here , Identify all questions in the input . Then the result of rough sorting , Put the questions at the top .
2.3 Experimental results and error data analysis
The preliminary analysis results are shown in the figure below :
It can be seen from the above figure , At present, the main problems include :
- Sample question : Some questions body There are only pictures in 、 Code snippets and so on , It does not contain useful Chinese text information .
- The title is too long : The current preprocessing method is too simple , Lead to segmentation , Some sentences are too long , And the current model is the extraction text summarization algorithm , The input sentence will not be modified . Therefore, some recommended titles are too long . And the title of the question is generally more concise .
3 Next step
- Classify the samples , For samples with only images or code snippets , You need to identify and judge the information , Then make a title recommendation .
- Simplify the title , Consider using problem templates or generative text summarization methods for improvement .
P.S.
This series of articles will be continuously updated . What we are doing now is too simple , The effect is not satisfactory , hope NLP Colleagues in other fields 、 Teachers and experts can provide valuable advice , thank you !
边栏推荐
- A brief introduction to the microservice technology stack, the introduction and use of Eureka and ribbon
- Use xtrabackup for MySQL database physical backup
- MySQL combat optimization expert 12 what does the memory data structure buffer pool look like?
- UnicodeDecodeError: ‘utf-8‘ codec can‘t decode byte 0xd0 in position 0成功解决
- MySQL底层的逻辑架构
- MySQL combat optimization expert 04 uses the execution process of update statements in the InnoDB storage engine to talk about what binlog is?
- Time complexity (see which sentence is executed the most times)
- MySQL combat optimization expert 10 production experience: how to deploy visual reporting system for database monitoring system?
- [Julia] exit notes - Serial
- MySQL33-多版本并发控制
猜你喜欢

Complete web login process through filter

MySQL26-性能分析工具的使用

Not registered via @enableconfigurationproperties, marked (@configurationproperties use)

Mysql22 logical architecture
![[unity] simulate jelly effect (with collision) -- tutorial on using jellysprites plug-in](/img/1f/93a6c6150ec2b002f667a882569b7b.jpg)
[unity] simulate jelly effect (with collision) -- tutorial on using jellysprites plug-in

Nanny hand-in-hand teaches you to write Gobang in C language

MySQL19-Linux下MySQL的安装与使用

CSDN博文摘要(一) —— 一个简单的初版实现

API learning of OpenGL (2003) gl_ TEXTURE_ WRAP_ S GL_ TEXTURE_ WRAP_ T

Export virtual machines from esxi 6.7 using OVF tool
随机推荐
MySQL24-索引的数据结构
Global and Chinese markets for aprotic solvents 2022-2028: Research Report on technology, participants, trends, market size and share
How to find the number of daffodils with simple and rough methods in C language
Introduction tutorial of typescript (dark horse programmer of station B)
Adaptive Bezier curve network for real-time end-to-end text recognition
ByteTrack: Multi-Object Tracking by Associating Every Detection Box 论文阅读笔记()
评估方法的优缺点
Mysql21 user and permission management
Const decorated member function problem
解决扫描不到xml、yml、properties文件配置
Mysql21 - gestion des utilisateurs et des droits
如何搭建接口自动化测试框架?
Not registered via @EnableConfigurationProperties, marked(@ConfigurationProperties的使用)
Use xtrabackup for MySQL database physical backup
第一篇博客
[leectode 2022.2.13] maximum number of "balloons"
What is the difference between TCP and UDP?
PyTorch RNN 实战案例_MNIST手写字体识别
MySQL27-索引優化與查詢優化
Good blog good material record link