Microservice architecture | how to solve the problem of fragment uploading of large attachments?
2022-06-23 21:11:00 【Code Farmer Architecture】
Reading guide: chunked upload and resumable (breakpoint) upload should be familiar terms to anyone who has built or worked with file uploads. This article summarizes the topic in the hope of helping or inspiring readers who do related work.
When a file is very large, the upload takes a long time over one long-lived connection. What if the network fluctuates, or the connection drops partway through? Any instability during such a long transfer means everything uploaded so far is lost and the upload has to start over.
Chunked upload splits the file into blocks of a certain size (each called a Part) and uploads them separately; once all parts arrive, the server assembles them back into the original file. Chunked upload not only avoids restarting from the beginning of the file when the network is poor, it also lets different blocks be sent concurrently from multiple threads, improving transfer efficiency and reducing total send time.
1. Background
After a sudden surge in users, and to better serve the customized needs of different groups, the business gradually rolled out C-end (consumer-facing) user-defined layouts and configuration, which caused configuration-read IO to spike.
To optimize this scenario, user-defined configuration is managed statically: the generated configuration file is served as a static file. Generating those static files ran into a thorny problem, though. When a configuration file is too large, the upload to the file server takes a long time, dragging down the performance of the whole business scenario.
2. Generating the Configuration File
Three elements of generating a file:
- File name
- File content
- File storage format
File content and storage format are straightforward to handle. I have previously written up the encryption methods commonly used in microservices:
- Microservice architecture | What are the common encryption methods for microservices (1)
- Microservice architecture | What are the common encryption methods for data encryption (2)
As a side note, those are worth considering if the file content needs to be encrypted. The configuration in this article's scenario has low confidentiality requirements, so encryption is not expanded on here.
The file-naming convention is decided by the business scenario, usually profile name plus timestamp. Such a convention easily produces file-name conflicts, though, causing unnecessary trouble downstream.
So the file name gets special treatment here. Anyone who has dealt with content hashing for front-end route assets will find this familiar: the file name can instead be a hash value generated from the content.
Since Spring 3.0, a digest utility has been provided:
DigestUtils#md5DigestAsHex
Returns a hexadecimal string representation of the MD5 digest of the given bytes.
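Spring's DigestUtils ultimately delegates to java.security.MessageDigest. As a hedged, stdlib-only sketch of the same idea (the class and method names here are illustrative, not Spring's):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ContentHashName {

    // Hex-encode the MD5 digest of the given bytes, mirroring
    // what DigestUtils#md5DigestAsHex returns.
    public static String md5Hex(byte[] bytes) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(bytes);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        String config = "{\"layout\":\"grid\"}";
        // Content-derived file name: identical content always maps to the same name.
        String fileName = md5Hex(config.getBytes(StandardCharsets.UTF_8)) + ".json";
        System.out.println(fileName);
    }
}
```

Identical content always produces the same name, which is exactly the property the conflict-free naming scheme relies on.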
md5DigestAsHex Source code
/**
 * Calculate the MD5 digest of the given bytes.
 * @param bytes the bytes to digest
 * @return a hexadecimal string representation of the MD5 digest of the given bytes
 */
public static String md5DigestAsHex(byte[] bytes) {
    return digestAsHexString(MD5_ALGORITHM_NAME, bytes);
}

Once the file name, content, and suffix (storage format) are determined, the file can be generated directly:
/**
 * Generate a file directly from its content.
 */
public static void generateFile(String destDirPath, String fileName, String content) throws FileZipException {
    File targetFile = new File(destDirPath + File.separator + fileName);
    // Ensure that the parent directory exists
    if (!targetFile.getParentFile().exists()) {
        if (!targetFile.getParentFile().mkdirs()) {
            throw new FileZipException("path is not found");
        }
    }
    // Write with the configured file encoding
    try (PrintWriter writer = new PrintWriter(new BufferedWriter(
            new OutputStreamWriter(new FileOutputStream(targetFile), ENCODING)))) {
        writer.write(content);
    } catch (Exception e) {
        throw new FileZipException("create file error", e);
    }
}

The advantage of deriving file names from content is self-evident: it greatly reduces the need to actively regenerate files, because content comparison is built into the name. If the content is large but the resulting file name is unchanged, the content has not changed, and no subsequent file-update work is needed.
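To illustrate that advantage, here is a self-contained sketch (the class and method names are hypothetical, not the article's code) that names the file after its content hash and skips the write when a file with that name already exists:

```java
import java.io.IOException;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class StaticConfigWriter {

    // Write content to <destDir>/<md5-of-content>.json, skipping the write
    // when that file already exists (same hash => content unchanged).
    // Returns true if a new file was written, false if it was skipped.
    public static boolean writeIfChanged(Path destDir, String content) throws IOException {
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        Path target = destDir.resolve(md5Hex(bytes) + ".json");
        if (Files.exists(target)) {
            return false; // unchanged content, nothing to update
        }
        Files.createDirectories(destDir);
        Files.write(target, bytes);
        return true;
    }

    private static String md5Hex(byte[] bytes) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(bytes);
            return String.format("%032x", new BigInteger(1, d));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("cfg");
        System.out.println(writeIfChanged(dir, "{\"theme\":\"dark\"}")); // first write
        System.out.println(writeIfChanged(dir, "{\"theme\":\"dark\"}")); // same content, skipped
    }
}
```

The existence check stands in for the "no subsequent update needed" decision described above; a real system might also compare sizes or use a stronger hash.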
3. Chunked Attachment Upload
As described above, chunked upload splits the file into blocks of a certain size (Parts), uploads them separately, and has the server assemble the uploaded parts back into the original file, optionally sending blocks concurrently to cut transfer time.
Chunked upload mainly suits the following scenarios:
- Poor network environment: when an upload fails, only the failed Part needs to be retried; the other Parts do not need to be re-uploaded.
- Resumable upload: after a pause, uploading can continue from the last completed Part.
- Faster uploads: when the local file to upload to OSS is large, multiple Parts can be uploaded in parallel to speed things up.
- Streaming upload: uploading can start before the total file size is known, a common case in video surveillance and similar applications.
- Large files: in general, chunked upload is the default choice for relatively large files.
The overall chunked-upload flow is roughly:
- Split the file into equal-sized blocks according to a splitting rule;
- Initialize a chunked-upload task and receive a unique identifier for this upload;
- Send each data block according to some strategy (serial or parallel);
- After sending completes, the server checks whether all data has been uploaded; if so, it assembles the blocks into the original file.
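The steps above can be sketched end to end. The sketch below uses hypothetical names and a fake in-memory "transport": it splits a buffer into fixed-size Parts, sends each one with an independent retry, and reassembles the original, which is the property the poor-network scenario relies on:

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class ChunkedUploadSketch {

    // Step 1: split the payload into parts of at most partSize bytes.
    public static List<byte[]> split(byte[] data, int partSize) {
        List<byte[]> parts = new ArrayList<>();
        for (int off = 0; off < data.length; off += partSize) {
            parts.add(Arrays.copyOfRange(data, off, Math.min(off + partSize, data.length)));
        }
        return parts;
    }

    // Steps 2-3: send each part; a failed Part is retried on its own,
    // without re-uploading the parts that already succeeded.
    public static void uploadAll(List<byte[]> parts, Function<byte[], Boolean> send, int maxRetries) {
        for (byte[] part : parts) {
            int attempt = 0;
            while (!send.apply(part)) {
                if (++attempt > maxRetries) {
                    throw new IllegalStateException("part upload failed");
                }
            }
        }
    }

    // Step 4: the server merges the received parts back into the original file.
    public static byte[] merge(List<byte[]> parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] part : parts) {
            out.write(part, 0, part.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] file = "hello chunked upload".getBytes();
        List<byte[]> received = new ArrayList<>();
        // Fake transport that always succeeds and records each part.
        uploadAll(split(file, 5), p -> received.add(p), 3);
        System.out.println(new String(merge(received)));
    }
}
```

A real implementation would carry the upload's unique identifier and a part index in each request so the server can merge parts in order.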
▐ Define the slice size
By default, files are force-split at 20 MB:
/**
 * Forced slice size (20 MB)
 */
long FORCE_SLICE_FILE_SIZE = 20L * 1024 * 1024;
For ease of debugging, the forced-slice threshold is lowered to 1 KB in what follows.
▐ Define the chunk-upload object
As shown in the figure above, with the file slices numbered in red, the chunk-upload object's basic attributes include the attachment file name, the original file size, the original file's MD5 value, the total number of slices, the size of each slice, the current slice size, and the current slice number.
These attributes are defined to make reasonable file splitting and later slice merging easy to build on; extension attributes can of course be added per business scenario.
- Total number of slices
long totalSlices = fileSize % forceSliceSize == 0 ?
        fileSize / forceSliceSize : fileSize / forceSliceSize + 1;
- Size of each slice
long eachSize = fileSize % totalSlices == 0 ?
        fileSize / totalSlices : fileSize / totalSlices + 1;
- MD5 value of the original file
MD5Util.hex(file)
For example:
with a current attachment size of 3382 KB and the forced slice size limited to 1024 KB,
the calculation above gives 4 slices of 846 KB each.
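That arithmetic can be checked directly. A minimal sketch reproducing the two formulas above with the example numbers (sizes kept in KB for readability; the class name is illustrative):

```java
public class SliceMath {

    // Total slices: ceil(fileSize / forceSliceSize), written as in the article.
    public static long totalSlices(long fileSize, long forceSliceSize) {
        return fileSize % forceSliceSize == 0
                ? fileSize / forceSliceSize
                : fileSize / forceSliceSize + 1;
    }

    // Per-slice size: ceil(fileSize / totalSlices), written as in the article.
    public static long eachSize(long fileSize, long totalSlices) {
        return fileSize % totalSlices == 0
                ? fileSize / totalSlices
                : fileSize / totalSlices + 1;
    }

    public static void main(String[] args) {
        long fileSizeKb = 3382;   // attachment size from the example
        long forceSliceKb = 1024; // forced slice threshold from the example
        long slices = totalSlices(fileSizeKb, forceSliceKb);
        System.out.println(slices + " slices of " + eachSize(fileSizeKb, slices) + " KB");
        // prints: 4 slices of 846 KB
    }
}
```

Note that eachSize rounds up, so the final slice is slightly smaller than the others (4 x 846 = 3384 > 3382).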
▐ Read each slice's bytes
Track the current byte offset and read the bytes of the 4 slices in a loop:
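The loop that follows calls a readSliceBytes helper that the article does not show. As a hedged, self-contained sketch of what such a helper might do (reading up to eachSize bytes per slice, with a shorter final slice; names and signature are assumptions):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class SliceReader {

    // Read the i-th slice from the stream: up to eachSize bytes,
    // fewer for the final slice. Assumes slices are read in order.
    public static byte[] readSliceBytes(InputStream in, long fileSize, long eachSize, int index)
            throws IOException {
        long remaining = fileSize - (long) index * eachSize;
        int toRead = (int) Math.min(eachSize, remaining);
        byte[] buf = new byte[toRead];
        int read = 0;
        // InputStream.read may return fewer bytes than requested, so loop until full.
        while (read < toRead) {
            int n = in.read(buf, read, toRead - read);
            if (n < 0) {
                throw new IOException("unexpected end of stream");
            }
            read += n;
        }
        return buf;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10];
        Arrays.fill(data, (byte) 7);
        InputStream in = new ByteArrayInputStream(data);
        // 3 slices of up to 4 bytes: lengths 4, 4, 2
        for (int i = 0; i < 3; i++) {
            System.out.println(readSliceBytes(in, data.length, 4, i).length);
        }
    }
}
```

The read loop matters: InputStream.read is allowed to return partial data, so a single read call is not enough to fill a slice reliably.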
try (InputStream inputStream = new FileInputStream(uploadVO.getFile())) {
    for (int i = 0; i < sliceBytesVO.getFdTotalSlices(); i++) {
        // Read the bytes of the current slice
        this.readSliceBytes(i, inputStream, sliceBytesVO);
        // Invoke the chunk-upload API function
        String result = sliceApiCallFunction.apply(sliceBytesVO);
        if (StringUtils.isEmpty(result)) {
            continue;
        }
        return result;
    }
} catch (IOException e) {
    throw e;
}

4. Summary
To recap, chunked upload splits the file into blocks of a certain size (Parts) and uploads them separately.
Handling large files and slicing mainly comes down to three points:
- File slicing granularity
- How slices are read
- How slices are stored
This article focused on how to compare the content of large files during upload and how to slice them: setting a reasonable slice threshold, and reading and marking each slice. I hope it helps or inspires readers doing related work. A follow-up will cover slice storage, marking, and merging files in detail.