I only found out after graduating: how the CNKI (HowNet) duplication check works, with examples of lowering the similarity rate
2022-07-07 05:43:00 【Panda aiqia rice】
Hi everyone! Panda here.
I've graduated, but I still remember the days when my graduation thesis had me firmly in its grip. Anyway, I'm free now, so off I go for a stroll ~ ~ ~
Colleges and universities now use a detection system for master's and doctoral theses that was developed by CNKI (HowNet). Back then I had no idea what algorithm the software used or what criteria it judged by.
The article below was obtained from CNKI insiders. It reveals the algorithm behind CNKI's anti-plagiarism detection system, how it decides whether a thesis is plagiarized, and the tricks for revising a thesis so it passes. I'm posting it for everyone's benefit.
Bookmark it for yourself, okay?
Quoted text follows:
1. Format requirements
CNKI checks a degree thesis as a single whole-document upload. Formatting can affect the result, so submit for checking the same final-format file you will hand in, which keeps that effect to a minimum. The effect may leave small fragments of a few dozen characters undetected, but it will not change whether the thesis passes. The system's algorithm is complex, so each time a thesis is revised and rechecked, a small copied passage that was missed the first time may turn up (two years of practical experience show that such a passage will not exceed 200 characters, and that the plagiarism rate of a revised thesis generally drops sharply).
2. Comparison libraries
The comparison libraries are: the China Academic Journals Network Publishing Database, the China Doctoral Dissertations Full-text Database / China Excellent Master's Theses Full-text Database, the full-text database of papers from important Chinese conferences, the full-text database of important Chinese newspapers, the China Patent Full-text Database, the personal comparison library, and other comparison libraries. Some books are not in CNKI, so copying from them cannot be detected. The CNKI library is the nationally designated comparison library for thesis checking, and the state has designated the CNKI degree thesis detection system as the system for university theses. It is currently the best and most extensive official detection system, and every university uses it. The Ministry of Education put this in place so that academic misconduct is judged fairly nationwide.
3. Results by segment or by chapter
After you upload a thesis, the system automatically detects its chapter information. If your school's table-of-contents format meets the chapter-recognition conditions built into the CNKI system, the thesis is checked chapter by chapter. Otherwise the system splits it into segments and reports the results by segment. Whether the check runs by segment or by chapter mainly matters for the threshold described in section 4. A reminder: whichever it is, just make sure your check is done the same way as your school's.
4. Can quotations be detected?
Some students ask: "I clearly quoted someone else's paragraph or sentence, so why wasn't it detected?" Others ask: "I marked the source of my quotation, so why does it count as plagiarism?" First, whether a quotation counts as plagiarism has nothing to do with whether the source is marked, and whether a quotation is detected has nothing to do with the system's accuracy. Both depend on the system's threshold. CNKI has set a sensitivity threshold for the detection system, and that threshold is 3%. Character counts are taken per paragraph (or chapter), and copying or quotation from any single source that totals less than 3% is not detected. This is common with short sentences or concepts inside a long paragraph.
For example: if the paragraph being checked (say, Chapter 1) has 10,000 characters, then quoting up to 300 characters from source A (10,000 × 3% = 300) will not be detected. If you quote more than 300 characters from source B, the text taken from B will be marked in red wherever it appears in Chapter 1. Even if it is broken up into separate sentences, any run of more than 20 characters gets marked.① This actually suggests a way to revise: when borrowing passages, never draw on just one article. Pick as many sources as you can and take a few sentences from each, and it won't be detected.②
Some students ask why a quotation counts as plagiarism. That comes down to CNKI's threshold: anything above 3% is treated as plagiarism across the board, so 3% is the dividing line between quotation and plagiarism. Once you go over it, marking the quotation won't help, so take care. Another example: if Chapter 1 of a thesis has 5,000 characters, you can quote fewer than 150 characters from source A in that chapter, or the system will treat it as plagiarism. If Chapter 2 has 4,000 characters, the limit for source A is under 120. With 8,000 characters in Chapter 3 and 7,000 in Chapter 4, the limits are under 240 and under 210 respectively, and so on.
In short, whether quotation goes over the limit is calculated chapter by chapter, the same way as plagiarism.
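To make the per-chapter arithmetic concrete, here is a minimal Python sketch. It assumes only the 3% per-source threshold described above. The function name `per_source_cutoff` is made up for illustration, and the real system's rules are not public.

```python
# Assumption: a flat 3% per-source threshold applied to each chapter,
# as described in the text. This is not CNKI's published algorithm.
THRESHOLD_PERCENT = 3


def per_source_cutoff(chapter_chars: int, percent: int = THRESHOLD_PERCENT) -> int:
    """Return the character count at which one source hits the threshold."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return chapter_chars * percent // 100


if __name__ == "__main__":
    # Chapter lengths taken from the example in the text.
    chapters = {"Chapter 1": 5000, "Chapter 2": 4000,
                "Chapter 3": 8000, "Chapter 4": 7000}
    for name, length in chapters.items():
        print(name, per_source_cutoff(length))
```

For the four chapters in the example this gives 150, 120, 240 and 210 characters, matching the figures quoted in the text.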
5. How does the system decide a sentence is plagiarized?
How does plagiarism in a thesis get detected? CNKI's condition is that any similar or copied run of more than 20 characters is marked in red. But the premise from section 4 has to hold first: the total text you quoted or copied from source A must reach 3% of the segment (chapter) being checked.
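The two conditions can be put together in a toy model. This is only a sketch of the rule as the text states it, not CNKI's actual algorithm. The function name, the input format, and the choice to read the threshold as "more than 3%" (the text is not precise about the exact boundary) are all assumptions.

```python
from collections import defaultdict

MIN_RUN = 20           # runs longer than this can be marked (per the text)
THRESHOLD_PERCENT = 3  # per-source, per-chapter threshold (per the text)


def flagged_runs(chapter_chars, matches):
    """Return the matches that would be marked red under the two conditions.

    matches is a list of (source_id, run_length) pairs found in one chapter.
    """
    # Condition 1: total matched text per source must exceed the threshold.
    totals = defaultdict(int)
    for source, length in matches:
        totals[source] += length
    over = {s for s, total in totals.items()
            if total * 100 > chapter_chars * THRESHOLD_PERCENT}
    # Condition 2: only runs longer than MIN_RUN characters are marked.
    return [(s, n) for s, n in matches if s in over and n > MIN_RUN]


if __name__ == "__main__":
    # Hypothetical matches in a 10,000-character chapter.
    found = [("A", 120), ("A", 100), ("B", 200), ("B", 150), ("B", 15)]
    print(flagged_runs(10000, found))
```

In this made-up input, source A totals 220 characters and stays under the 300-character cutoff, so none of it is marked. Source B totals 365, so its two long runs are marked while the 15-character fragment is not.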
6. Revising flagged text
Besides the approach in section 4 (spreading quotations across many sources), there are other methods: changing words, changing sentences, changing the phrasing (turning the original into an inverted sentence, a passive sentence, an active sentence, and so on), rearranging paragraphs, and deleting key words and key sentences. Practice has shown that using these methods in combination effectively lowers the duplication rate and helps the thesis pass.
Overall, the revised sentence has to read smoothly while differing as much as possible from the original on the surface.
Example 1: take the following sentence:
Overheating in an overheating fault differs from the heating that occurs during normal transformer operation. The heat source in normal operation is the windings and the iron core, that is, copper loss and iron loss, whereas a transformer overheating fault is accelerated insulation deterioration caused by effective thermal stress, and it has a medium level of energy density.
Almost all of it was marked red, which shows overlap and high similarity with related literature. Combining the methods above, the sentence can be changed to:
Overheating in an overheating fault is easily confused with the heating that occurs during normal transformer operation. The latter comes from the copper loss and iron loss of the windings and iron core and is simply normal operating heat, whereas a transformer overheating fault is accelerated insulation deterioration caused by effective thermal stress, and it has a medium level of energy density.
This revision cuts the plagiarism rate almost in half.
Notes to section 4:
① The 300 characters in that example is a rough figure, not a hard cutoff. The less you quote, the less likely it is to be detected.
② The updated CNKI academic misconduct detection system lowered this threshold to 3%. It used to be 5%. That means the system is stricter about quotation, but with the methods described later it is not hard to deal with.
Example 2: look at the following passage:
Put a small amount of fiber into a transparent cup of clear water and stir. You can see directly that the fibers disperse into a three-dimensional suspension that changes little even after standing for a long time, which shows the synthetic fiber is of good quality. Poor-quality fibers may disperse after stirring, but after a short time they float up into a flocculent layer. Poor-quality fibers are hard to disperse evenly when concrete is actually mixed.
This passage was marked red in its entirety. There is only one way to fix it: break up the order and reorganize.
Put a small amount of fiber into a transparent container of water and watch how the fibers change while stirring. If the synthetic fiber is of good quality, you can see directly that the fibers disperse into a three-dimensional suspension whose position does not change noticeably over time. If the synthetic fiber is of poor quality, the fibers may disperse during stirring but easily float up to form a flocculent layer. Poor-quality fibers are hard to disperse evenly when concrete is actually mixed.
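If you want a rough feel for how much literal overlap remains after a rewrite like this, Python's standard `difflib` module can compare two strings. This is not how CNKI measures similarity (that algorithm is not public). It is just a quick character-level check, and the helper names are invented here.

```python
from difflib import SequenceMatcher


def literal_overlap(original: str, revised: str) -> float:
    """Return a 0.0-1.0 character-level similarity ratio for two texts."""
    return SequenceMatcher(None, original, revised, autojunk=False).ratio()


def longest_shared_run(original: str, revised: str) -> int:
    """Return the length of the longest run of characters both texts share."""
    matcher = SequenceMatcher(None, original, revised, autojunk=False)
    match = matcher.find_longest_match(0, len(original), 0, len(revised))
    return match.size


if __name__ == "__main__":
    # Short made-up strings standing in for an original and its rewrite.
    before = "the fibers are dispersed in a three-dimensional suspension"
    after = "a three-dimensional suspension forms as the fibers disperse"
    print(round(literal_overlap(before, after), 2))
    print(longest_shared_run(before, after))
```

A long shared run is the part most worth rewording, since the text says runs of more than 20 characters are what get marked.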
Thinking back to those years of painfully revising my thesis at school, it really was agony… Just trying to graduate smoothly was truly hard…
Panda here. I hope this article helps you, and I'll see you in the next one (*◡‿◡)