I didn't know it until I graduated: how the HowNet (CNKI) duplication check works, with examples of reducing the similarity rate
2022-07-07 05:43:00 【Panda aiqia rice】
Hey, everyone, hello there! I'm Panda.
Although I have graduated, I still remember the days when my graduation thesis had a firm grip on me. Anyway, I'm free now, so off I went to wander around ~ ~ ~
Colleges and universities now use a detection system for master's and doctoral dissertations that was developed by HowNet (CNKI). But I never knew the software's specific algorithm or its judging criteria.
This article was obtained from HowNet internal staff. It reveals the algorithm of the HowNet anti-plagiarism detection system, how it judges whether a paper is plagiarized, and the tricks for revising a paper so that it passes. I'm sharing it for everyone's benefit.
Save a copy for yourself, okay?
Quoted text:
1、 Format requirements
HowNet checks a degree thesis as a whole uploaded document, and the format may affect the results, so submit the paper in its final format to minimize that effect. The effect is limited to small segments of a few dozen words that may go undetected, and it will not affect whether the paper passes. The system's algorithm is complex: each time a paper is revised and retested, a small piece of plagiarism that was not detected the first time may show up (2 years of practical experience suggest that such a piece will not exceed 200 words, and that the plagiarism rate of the revised paper is generally much lower on the second check).
2、 Comparison Library
The comparison libraries are: the General Library of Chinese Academic Journals Online Publishing, the China Doctoral Dissertation Full-text Database / China Excellent Master's Thesis Full-text Database, the Full-text Database of Papers of Important Conferences in China, the Full-text Database of Important Chinese Newspapers, the China Patent Full-text Database, the personal comparison library, and other comparison libraries. Some books are not in HowNet, so plagiarism from them cannot be detected. The HowNet library is the nationally designated comparison library for paper detection, and the state designates the HowNet degree thesis detection system as the thesis detection system for universities. It is currently the best and most extensive official detection system. All colleges and universities use the HowNet detection system, which the Ministry of Education put in place for the sake of fairness in handling academic misconduct nationwide.
3、 Results by segment or by chapter
After the paper is uploaded, the system automatically detects its chapter information. If your school's table-of-contents settings meet the chapter criteria built into the HowNet system, the system checks the paper chapter by chapter; otherwise the results are split into segments. Whether the check runs by segment or by chapter mainly matters for the threshold described in section 4. A word of advice: whether it is by chapter or by segment, just keep it consistent with what your school uses.
4、 Can quotations be detected?
Some students ask: "I clearly quoted other people's paragraphs or sentences. Why weren't they detected?" Others ask: "My quotation is marked with its source. Why does it count as plagiarism?" First, whether a quotation counts as plagiarism has nothing to do with whether the source is marked, and whether a reference is detected has nothing to do with the accuracy of the system. Both depend on the system's threshold. CNKI has set a threshold for the sensitivity of the detection system, and that threshold is 3%, calculated against the word count of each segment (or chapter). Plagiarism or citation of a single document that stays below 3% goes undetected. This is common for short sentences or concepts inside a long passage.
For instance: if segment 1 (chapter 1) of the paper you test has 10,000 words, then quoting up to 300 words from document A (10,000 × 3% = 300) will not be detected. If the text quoted from document B exceeds 300 words, everything taken from document B in chapter 1 will be marked in red, no matter where it appears in the chapter. Even if it is broken up into sentences, any run of more than 20 words will be marked. ① This actually points to one way of revising: never take a whole passage from a single article. Draw on as many documents as possible and take only a few sentences from each, and it won't be detected.
② Some students ask why a quotation also counts as plagiarism. This is mainly because of the HowNet threshold: anything above 3% is uniformly treated as plagiarism. In other words, the boundary between quotation and plagiarism is at 3%. Once you exceed it, marking the quotation won't help, so please pay attention.
Here is an example. If the first chapter of a paper has 5,000 words, then within that chapter we can quote fewer than 150 words from document A; otherwise the system will treat it as plagiarism. If chapter 2 has 4,000 words, we can quote fewer than 120 words from document A. If chapter 3 has 8,000 words and chapter 4 has 7,000 words, the limits are 240 and 210 words respectively, and so on.
In summary, whether a quotation exceeds the limit is calculated chapter by chapter, the same way as plagiarism.
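The chapter arithmetic above can be sketched in a few lines of Python. This only illustrates the 3% rule as the article describes it and is not HowNet's actual algorithm; the function name `per_source_limit` and the chapter labels are made up for the example.

```python
def per_source_limit(chapter_words: int, threshold_percent: int = 3) -> int:
    """Word count at which a single source reaches the threshold in one chapter.

    Hypothetical helper: integer arithmetic is used so that 3% of the
    chapter length is computed without floating-point rounding.
    """
    return chapter_words * threshold_percent // 100


# Chapter lengths taken from the article's own example.
chapters = {"Chapter 1": 5000, "Chapter 2": 4000, "Chapter 3": 8000, "Chapter 4": 7000}

limits = {name: per_source_limit(words) for name, words in chapters.items()}

for name, limit in limits.items():
    print(f"{name}: a single source reaches 3% at {limit} words")
```

Because the limit is computed per chapter, the same amount of text from one source can fall under the threshold in a long chapter and over it in a short one.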
5、 How does the system flag a sentence as plagiarized?
How is plagiarism in a paper detected? In the HowNet check, any similar or copied run of 20 or more characters is marked in red, but only if the premise in section 4 is met: the total text you quoted or copied from document A must reach 3% of the segment (chapter) being checked.
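The two conditions in this section can be combined into a small sketch. It is a simplified reading of the rule as stated in the article, not the real detection system: the function `marked_runs`, the input format, and the choice to add up every overlapping run from a source when computing its share are all assumptions made for illustration.

```python
from collections import defaultdict

MIN_RUN = 20           # shortest matching run that gets marked (per the article)
THRESHOLD_PERCENT = 3  # per-source, per-chapter threshold (per the article)


def marked_runs(chapter_length, matches):
    """Return the (source, run_length) pairs that would be marked in red.

    chapter_length: length of the chapter, in the unit the article counts in.
    matches: list of (source, run_length) overlaps found in that chapter.
    """
    # Step 1: total overlap per source within this chapter (assumed to
    # include every run, even the short ones).
    totals = defaultdict(int)
    for source, run_length in matches:
        totals[source] += run_length

    # Step 2: sources whose share reaches the threshold.
    # Integer form of: total / chapter_length >= 3 / 100
    over = {s for s, t in totals.items()
            if t * 100 >= chapter_length * THRESHOLD_PERCENT}

    # Step 3: only runs of at least MIN_RUN from those sources are marked.
    return [(s, n) for s, n in matches if s in over and n >= MIN_RUN]


matches = [("A", 100), ("A", 150), ("B", 200), ("B", 25), ("B", 90), ("B", 10)]
print(marked_runs(10000, matches))
```

In this made-up input, source A totals 250 out of 10,000 and stays under the threshold, so none of its runs are marked; source B totals 325, so its runs of 20 or more are marked while the run of 10 is not.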
6、 Revising flagged passages
Besides the threshold-based approach described above, you can also change words, change sentences, change the way something is described (turn the original sentence into an inverted, passive, or active sentence, and so on), rearrange paragraphs, and delete key words and key sentences. Practice has shown that using these methods in combination effectively reduces the duplication ratio and helps the paper pass smoothly.
Overall, the revised sentence must still read smoothly while differing from the original sentence as much as possible on the surface.
Example 1: Take the following sentence:
Overheating in an overheating fault differs from the heating that occurs under normal transformer operation. The heat source during normal operation is the winding and the iron core, namely copper loss and iron loss, whereas the overheating fault of a transformer is the accelerated deterioration of insulation caused by effective thermal stress, with a medium level of energy density.
Almost all of it was marked red, which indicates overlap and high similarity with related literature. Combining the methods above, the sentence can be changed to:
Overheating in an overheating fault is easily confused with the heating that occurs under normal transformer operation. The latter results from the copper loss and iron loss of the windings and iron core, and is simply the heating of normal operation, whereas the overheating fault of a transformer is the accelerated deterioration of insulation caused by effective thermal stress, with a medium level of energy density.
This modification can reduce the plagiarism rate by almost half.
Notes to section 4:
① The 300 words in this example is a rough value, not a critical value. The fewer words you quote, the less likely they are to be detected.
② The updated CNKI academic misconduct detection system has adjusted this threshold to 3%; it used to be 5%. This means the detection system now has stricter requirements for quotations, but it is still not very difficult to deal with using the methods described later.
Example 2: Look at the following passage:
Put a small amount of fiber into the clear water of a transparent cup and stir. You can see directly that the fibers disperse into a three-dimensional suspension and do not change much even after standing for a long time, which indicates that the synthetic fiber is of good quality. Poor-quality fibers may disperse after stirring, but after a short time they float up and form a flocculent layer. Poor-quality fibers are not easy to disperse evenly in the actual preparation of concrete.
This passage was marked red in its entirety. There is only one way to revise it: break up the order and reorganize it.
Put a small amount of fiber into a transparent container filled with water and observe how the fibers change while stirring. If the synthetic fiber is of good quality, you can see directly that the fibers disperse into a three-dimensional suspension, and their position does not change significantly as time goes on. If the synthetic fiber is of poor quality, the fibers may disperse during stirring but will easily float up to form a flocculent layer. Poor-quality fibers are not easy to disperse evenly in the actual preparation of concrete.
Thinking back to those years at school when I painstakingly revised my thesis, it really was painful … It's really not easy to graduate smoothly …
I'm Panda. I hope this article helps you, and I'll see you in the next one (*◡‿◡)