How to verify whether the contents of two files are the same
2022-07-01 03:01:00 【Lingxiyun】
While working on a file upload feature today, I ran into a requirement that files with identical content must not be uploaded twice. The requirement seemed simple, so I handed it to a new colleague who had just entered the industry. When merging the code, I found that the colleague had treated two files as identical whenever they had the same file name and the same file size. Is that condition reliable?
In terms of probability, the chance of encountering two different files with the same name and size is very small, so this check could run stably in production for a while. But even the smallest probability is still a possibility, and it would be better if the check could be made 100% reliable.
You have probably downloaded small tools developed by kind people. Some of them come with a published checksum that lets you verify the download, which guards against malicious tampering and ensures the tool can be used safely.
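To see why name and size alone are not enough, here is a minimal sketch that builds two files with the same name and the same size but different content. The class `NameSizeCheck`, its helper methods, and the temporary files are all made up for illustration and are not the code from the review.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical demo class, not part of the original upload code.
final class NameSizeCheck {

    // The naive rule: same file name and same size.
    static boolean sameNameAndSize(Path a, Path b) throws IOException {
        return a.getFileName().equals(b.getFileName())
                && Files.size(a) == Files.size(b);
    }

    // Byte-for-byte comparison, used here as the ground truth.
    static boolean sameContent(Path a, Path b) throws IOException {
        return Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
    }

    public static void main(String[] args) throws IOException {
        Path dirA = Files.createTempDirectory("a");
        Path dirB = Files.createTempDirectory("b");
        // Same name, same length (9 bytes), different content.
        Path first = Files.write(dirA.resolve("report.txt"),
                "port=8080".getBytes(StandardCharsets.UTF_8));
        Path second = Files.write(dirB.resolve("report.txt"),
                "port=9090".getBytes(StandardCharsets.UTF_8));

        System.out.println(sameNameAndSize(first, second)); // true
        System.out.println(sameContent(first, second));     // false
    }
}
```

The naive rule reports a duplicate even though the two files differ, which is exactly the case the requirement wants to rule out.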

If the contents of two files are the same, their digests should be the same. Can this principle help us tell whether two files are identical?
Implementing a file digest in Java
/**
 * Extracts the checksum of a file.
 *
 * @param path      full path of the file
 * @param algorithm algorithm name, for example MD5, SHA-1, SHA-256
 * @return the checksum as a hexadecimal string
 * @throws NoSuchAlgorithmException if the algorithm is not available
 * @throws IOException              if the file cannot be read
 */
public static String extractChecksum(String path, String algorithm) throws NoSuchAlgorithmException, IOException {
    // Initialize the digest algorithm from the algorithm name
    MessageDigest digest = MessageDigest.getInstance(algorithm);
    // Read all bytes of the file
    byte[] fileBytes = Files.readAllBytes(Paths.get(path));
    // Feed the bytes into the digest
    digest.update(fileBytes);
    // Complete the hash computation and get the digest bytes
    byte[] digested = digest.digest();
    // Convert to a hexadecimal string (HexUtils is not a JDK class; it comes from a third-party library)
    return HexUtils.toHexString(digested);
}
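The method above loads the whole file into memory with `Files.readAllBytes`, which may be a concern for large uploads. As a hedged alternative, the sketch below feeds the file to the digest in fixed-size chunks and does the hex conversion with the JDK only. The class name `StreamingChecksum`, the 8192-byte buffer size, and the `toHex` helper are my own assumptions, not part of the original code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical streaming variant of extractChecksum.
final class StreamingChecksum {

    static String extractChecksum(String path, String algorithm)
            throws NoSuchAlgorithmException, IOException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        byte[] buffer = new byte[8192]; // assumed chunk size
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            int read;
            // Update the digest chunk by chunk until end of file.
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        return toHex(digest.digest());
    }

    // Lowercase hexadecimal encoding using only the JDK.
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Because a digest computed incrementally is the same as one computed over the full byte array, this variant should return the same value as the original method for the same file and algorithm.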
Unchanged content
First we need to check whether a file's digest changes while its content stays the same. Running the following code multiple times, the assertion always passes.
String path = "C:\\Users\\s1\\IdeaProjects\\demo\\src\\main\\resources\\application.yml";
String checksum = extractChecksum(path, "SHA-1");
String hash = "6bf4d6c101b4a7821226d3ec1f8d778a531bf265";
Assertions.assertEquals(hash,checksum);
I then renamed the file to application-dev.yml, and even to application-dev.txt, and the digest stayed the same. When I changed the contents of the yml file, the assertion failed. This shows that for a single file, as long as the content stays the same, the hash stays the same.
Copying the file
I made a copy of the yml file, changed its file name and type, and placed it, with its content unchanged, in another directory to test whether the two digests differ.
String path1 = "C:\\Users\\s1\\IdeaProjects\\demo\\src\\main\\resources\\application.yml";
String path2 = "C:\\Users\\s1\\IdeaProjects\\demo\\src\\main\\resources\\templates\\application-dev.txt";
String checksum1 = extractChecksum(path1, "SHA-1");
String checksum2 = extractChecksum(path2, "SHA-1");
String hash = "6bf4d6c101b4a7821226d3ec1f8d778a531bf265";
Assertions.assertEquals(hash,checksum1);
Assertions.assertEquals(hash,checksum2);
Both assertions passed. However, after changing the content of either file, the assertions no longer pass.
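The experiments above depend on a specific file on my machine and its hard-coded hash. A self-contained rerun on temporary files might look like the sketch below. The class `DigestExperiment`, the `sha1` helper, and the sample file content are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical, self-contained rerun of the experiments on temporary files.
final class DigestExperiment {

    // SHA-1 digest of a file as lowercase hex.
    static String sha1(Path file) throws IOException, NoSuchAlgorithmException {
        byte[] hash = MessageDigest.getInstance("SHA-1").digest(Files.readAllBytes(file));
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("digest-demo");
        Path original = Files.write(dir.resolve("application.yml"),
                "server:\n  port: 8080\n".getBytes(StandardCharsets.UTF_8));

        // Copy under a different name and extension: the content is untouched.
        Path copy = Files.copy(original, dir.resolve("application-dev.txt"));
        System.out.println(sha1(original).equals(sha1(copy))); // true

        // Append one byte to the copy: the digests now differ.
        Files.write(copy, "#".getBytes(StandardCharsets.UTF_8), StandardOpenOption.APPEND);
        System.out.println(sha1(original).equals(sha1(copy))); // false
    }
}
```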
New empty file
A new empty file here means a newly created file that has not been modified in any way. Under a given algorithm, a new empty file always yields a fixed value. For example, under SHA-1 the digest of an empty file is:
da39a3ee5e6b4b0d3255bfef95601890afd80709
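The fixed value can be reproduced without touching the file system, because an empty file is simply zero bytes of input. A minimal sketch follows; the class `EmptyInputDigest` and the `hexDigest` helper are hypothetical names, and the MD5 and SHA-256 lines are added only for comparison.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper that digests an in-memory byte array.
final class EmptyInputDigest {

    static String hexDigest(String algorithm, byte[] data) throws NoSuchAlgorithmException {
        byte[] hash = MessageDigest.getInstance(algorithm).digest(data);
        StringBuilder hex = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] empty = new byte[0]; // same input as an empty file
        System.out.println(hexDigest("SHA-1", empty));   // da39a3ee5e6b4b0d3255bfef95601890afd80709
        System.out.println(hexDigest("MD5", empty));     // d41d8cd98f00b204e9800998ecf8427e
        System.out.println(hexDigest("SHA-256", empty)); // e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
    }
}
```

Each algorithm has its own fixed value for empty input, so digests are only comparable when they were produced by the same algorithm.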
Conclusion
The experiments show that, under the same algorithm:
- The digest of any new empty file is a fixed value.
- The digests of any two files with the same content are the same, regardless of path, file name, or file type.
- A file's digest changes when its content changes.
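Applied back to the upload requirement, these conclusions suggest rejecting an upload when its digest has been seen before. The sketch below is one possible shape for that check and not the code from the project. `UploadRegistry` and `register` are hypothetical names, SHA-256 is an assumed choice of algorithm, and the in-memory set stands in for whatever storage a real service would use.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory registry of digests of already uploaded files.
final class UploadRegistry {

    private final Set<String> seenDigests = ConcurrentHashMap.newKeySet();

    // Returns true if the content is new and was recorded,
    // false if content with the same digest was registered before.
    boolean register(byte[] content) throws NoSuchAlgorithmException {
        byte[] hash = MessageDigest.getInstance("SHA-256").digest(content);
        StringBuilder hex = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        // Set.add returns false when the digest is already present.
        return seenDigests.add(hex.toString());
    }
}
```

A caller would compute nothing from the file name or size: `register` returning false is the signal to refuse the upload as a duplicate.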