当前位置:网站首页>CSDN question and answer module Title Recommendation task (I) -- Construction of basic framework
CSDN question and answer module Title Recommendation task (I) -- Construction of basic framework
2022-07-06 10:41:00 【Alexxinlu】
Catalog
Series articles
- CSDN Q & a module Title recommended tasks ( One ) —— Construction of basic framework
- CSDN Q & a module Title recommended tasks ( Two ) —— Effect optimization
Team blog : CSDN AI team
1 Problem definition
1.1 background
stay CSDN Of Q & a module in , Many beginners' question titles lack effective information , for example :
Help the children !
Boss, help me !!! Help me finish this topic
Ask for the great God !!!

In the example above , A better title would be “ How to add scroll bars to mobile pages ?”
In the title of this kind of question , It contains very little useful information , Unable to quickly understand the meaning of the problem from the title , To a certain extent, it will affect the efficiency of question respondents and the user experience . Besides , Such data will further affect the downstream NLP The effect of the mission , for example : Problem classification , Question recommendation, etc .
Therefore, in order to improve the quality of question titles , Need information based on problem description , After the user enters the title and problem description , Recommend more accurate titles to users , And prompt the user to change .
1.2 Input
The available input information is shown below :
"id": 998678,
"title": “ For help , Very simple. C# problem ”,
"body": “ The teacher asked to build a score management system , Classification login is required , But this cheng’xu If the user is a student , Even if comboBox You can log in smoothly without selecting students ,admin There is no such problem , Ask the boss why ?\r\n\t\r\n\t\r\n\tprivate void button1_Click(object sender, EventArgs e)\r\n {\r\n string sUser = txtUser.Text.ToString();\r\n string sPassword = txtPassword.Text.ToString();\r\n\r\n if (sUser == “admin” && sPassword == “1234” && comboBoxLeixing.Text == “ Administrators ”)\r\n {\r\n Menuadmin main = new Menuadmin();\r\n main.Show();\r\n this.Hide();\r\n }\r\n\r\n if (sUser == “ Xuguangrui ” || sUser == “ Cao Guang ” || sUser == “ Cao ziyue ” || sUser == “ Chen Sijia ” || sUser == “ Chen Xu ” || sUser == “ Huang Wenguang ” ||\r\n sUser == “ Lei Zhangshu ” || sUser == “ Liuqingqing ” || sUser == “ Qi SHIMENG ” || sUser == “ Shen bin ” || sUser == “ Shuaixing ” || sUser == “ Sun Quanwei ” ||\r\n sUser == “ Wang Heng ” || sUser == “ Wang Rui ” || sUser == “ Xiang Meng ” || sUser == “ Zhang Guoliang ” || sUser == “ Zhangzongyou ” || sUser == “ Zhang Shumin ”\r\n && sPassword == “1234” && comboBoxLeixing.Text == “ Student ”)\r\n {\r\n Menustudent main = new Menustudent();\r\n main.Show();\r\n this.Hide();\r\n }\r\n\r\n if (sUser == “ Liu Zhaoliang ” || sUser == “ Longlong ” || sUser == “ Feng Wei ” || sUser == “ Liu shanyong ” ||\r\n sUser == “ India forest ” || sUser == “ Cheng Leli ” || sUser == “ Liu Yan ” || sUser == “ Zhao Junwei ”\r\n && sPassword == “1234” && comboBoxLeixing.Text== “ Teachers' ”)\r\n {\r\n Menuteacher main = new Menuteacher();\r\n main.Show();\r\n this.Hide();\r\n }\r\n \r\n else\r\n label3.Text = “ Wrong user name or password , Please re-enter !”;\r\n }”,
"tag_id": 95,
"tag_name": “c Language ”
The input mainly includes the above five fields , among title Is the title that needs improvement .
Currently only title and body Two fields as input .
1.3 Output
Improved question Title .
2 Solution
This paper further abstracts the problem as NLP Text summary task in , The specific implementation steps are as follows :
2.1 Data preprocessing
At present, the following preprocessing operations are mainly done :
- Remove irrelevant information . for example : Code segment 、URL、 Irrelevant characters, etc ;
- Cut the paragraph into sentences . Segmentation based on delimiters , for example : A newline 、 Full stop 、 question mark 、 Exclamation marks, etc .
2.2 Model
2.2.1 Rough sort
The current scheme uses classic Extraction model TextRank, Rank all sentences entered , The final choice TopN Sentence to recommend .
2.2.2 Fine sorting
Because this article is to recommend the title of the question , Therefore, questions should be given priority .
A dictionary based approach is used here , Identify all questions in the input . Then the result of rough sorting , Put the questions at the top .
2.3 Experimental results and error data analysis
The preliminary analysis results are shown in the figure below :
It can be seen from the above figure , At present, the main problems include :
- Sample question : Some questions body There are only pictures in 、 Code snippets and so on , It does not contain useful Chinese text information .
- The title is too long : The current preprocessing method is too simple , Lead to segmentation , Some sentences are too long , And the current model is the extraction text summarization algorithm , The input sentence will not be modified . Therefore, some recommended titles are too long . And the title of the question is generally more concise .
3 Next step
- Classify the samples , For samples with only images or code snippets , You need to identify and judge the information , Then make a title recommendation .
- Simplify the title , Consider using problem templates or generative text summarization methods for improvement .
P.S.
This series of articles will be continuously updated . What we are doing now is too simple , The effect is not satisfactory , hope NLP Colleagues in other fields 、 Teachers and experts can provide valuable advice , thank you !
边栏推荐
- 解决在window中远程连接Linux下的MySQL
- [paper reading notes] - cryptographic analysis of short RSA secret exponents
- How to build an interface automation testing framework?
- 数据库中间件_Mycat总结
- MySQL21-用户与权限管理
- MySQL real battle optimization expert 11 starts with the addition, deletion and modification of data. Review the status of buffer pool in the database
- [paper reading notes] - cryptographic analysis of short RSA secret exponents
- MySQL28-数据库的设计规范
- Copy constructor template and copy assignment operator template
- Set shell script execution error to exit automatically
猜你喜欢

How to find the number of daffodils with simple and rough methods in C language

Solve the problem of remote connection to MySQL under Linux in Windows

Security design verification of API interface: ticket, signature, timestamp

MySQL27-索引優化與查詢優化

MySQL18-MySQL8其它新特性

Super detailed steps for pushing wechat official account H5 messages

Pytorch RNN actual combat case_ MNIST handwriting font recognition

API learning of OpenGL (2003) gl_ TEXTURE_ WRAP_ S GL_ TEXTURE_ WRAP_ T

Database middleware_ MYCAT summary

Complete web login process through filter
随机推荐
[Julia] exit notes - Serial
Advantages and disadvantages of evaluation methods
MySQL33-多版本并发控制
MySQL29-数据库其它调优策略
MySQL real battle optimization expert 11 starts with the addition, deletion and modification of data. Review the status of buffer pool in the database
Implement context manager through with
In fact, the implementation of current limiting is not complicated
MySQL combat optimization expert 05 production experience: how to plan the database machine configuration in the real production environment?
Mysql21 user and permission management
Unicode decodeerror: 'UTF-8' codec can't decode byte 0xd0 in position 0 successfully resolved
高并发系统的限流方案研究,其实限流实现也不复杂
MySQL transaction log
[C language] deeply analyze the underlying principle of data storage
Valentine's Day is coming, are you still worried about eating dog food? Teach you to make a confession wall hand in hand. Express your love to the person you want
MySQL30-事务基础知识
Mysql36 database backup and recovery
Texttext data enhancement method data argument
Mysql33 multi version concurrency control
Ueeditor internationalization configuration, supporting Chinese and English switching
[paper reading notes] - cryptographic analysis of short RSA secret exponents