System design: File Hosting Service
Requirements
Let's design a file hosting service like Dropbox or Google Drive. Cloud file storage lets users store their data on remote servers. These servers are typically maintained by cloud storage providers and made available to users over a network (usually the Internet). Users pay for their cloud data storage on a monthly basis. Similar services: OneDrive, Google Drive.
Difficulty level: Medium
1. Why cloud storage?
Cloud file storage services have become very popular recently because they simplify the storage and exchange of digital resources across multiple devices. The shift from using a single personal computer to using multiple devices with different platforms and operating systems, such as smartphones and tablets, each accessible from different geographical locations at any time, is considered a major driver behind the popularity of cloud storage services. Here are some of the main benefits of such services:
Availability: The motto of cloud storage services is to provide data availability anytime, anywhere. Users can access their files/photos from any device, whenever and wherever they like.
Reliability and durability: Another benefit of cloud storage is that it offers 100% reliability and durability of data. Cloud storage keeps multiple copies of the data on servers in different geographical locations, ensuring that users never lose their data.
Scalability: Users never have to worry about running out of storage space. With cloud storage, you have unlimited storage as long as you are willing to pay for it.
If you have not used dropbox.com before, we highly recommend creating an account there, uploading/editing a file, and exploring the different options their service offers. It will help you a lot in understanding this article.
2. System requirements and objectives
You should always clarify requirements at the beginning of the interview. Be sure to ask questions to find the exact scope of the system the interviewer has in mind.
What do we wish to achieve from a cloud storage system? Here are the top-level requirements of our system:
1. Users should be able to upload and download their files/photos from any device.
2. Users should be able to share files or folders with other users.
3. Our service should support automatic synchronization between devices, i.e., after updating a file on one device, it should get synchronized on all devices.
4. The system should support storing large files, up to a few GB.
5. ACID-ity is required. Atomicity, Consistency, Isolation and Durability of all file operations should be guaranteed.
6. Our system should support offline editing. Users should be able to add/delete/modify files while offline, and as soon as they come online, all changes should be synced to the remote servers and other online devices.
Extended requirements
• The system should support snapshotting of the data, so that users can go back to any version of their files.
3. Some design considerations
• We should expect huge read and write volumes.
• The read-to-write ratio is expected to be nearly the same.
• Internally, files can be stored in small parts or chunks (say 4MB); this provides a lot of benefits, e.g., all failed operations only need to be retried for the smaller parts of a file. If a user fails to upload a file, then only the failing chunk will be retried.
• We can reduce the amount of data exchanged by transferring only the updated chunks.
• By removing duplicate chunks, we can save storage space and bandwidth usage.
• Keeping a local copy of the metadata (file name, size, etc.) with the client can save us a lot of round trips to the server.
• For small changes, clients can intelligently upload only the diffs instead of the whole chunk, i.e., incremental uploads.
4. Capacity estimation and constraints
• Let's assume that we have 500 million total users and 100 million daily active users (DAU).
• Let's assume that, on average, each user connects from three different devices.
• On average, if a user has 200 files/photos, we will have 100 billion total files.
• Let's assume an average file size of 100KB; this would give us ten petabytes of total storage.
100 billion files * 100KB => 10PB
• Let's also assume that we will have one million active connections per minute.
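As a quick sanity check of these numbers, here is a rough back-of-the-envelope sketch in Python; all figures are simply the assumptions listed above.

```python
# Back-of-the-envelope check of the capacity estimates above.
total_users = 500_000_000           # 500 million total users
daily_active_users = 100_000_000    # 100 million DAU
files_per_user = 200
avg_file_size_bytes = 100 * 1000    # ~100 KB per file

total_files = total_users * files_per_user               # 100 billion files
total_storage_bytes = total_files * avg_file_size_bytes

print(f"total files:   {total_files:,}")                      # 100,000,000,000
print(f"total storage: {total_storage_bytes / 1e15:.0f} PB")  # ~10 PB
```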
5. High-level design
The user will specify a folder as the workspace on their device. Any file/photo/folder placed in this folder will be uploaded to the cloud, and whenever a file is modified or deleted, the change will be reflected in the same way in cloud storage. The user can specify similar workspaces on all their devices, and any modification done on one device will be propagated to all other devices, so that they have the same view of the workspace everywhere.
At a high level, we need to store files and their metadata, such as file name, file size, directory, and who the file is shared with. So, we need some servers that can help the clients upload/download files to/from cloud storage, and some servers that can facilitate updating the metadata about files and users. We also need some mechanism to notify all clients whenever an update happens, so they can synchronize their files.
As shown in the diagram below, block servers will work with the clients to upload/download files to/from cloud storage, and metadata servers will keep the metadata of files updated in an SQL or NoSQL database. Synchronization servers will handle the workflow of notifying all clients about the different changes for synchronization.
6. Component design
Let's go through the major components of our system one by one:
A. Client
The client application monitors the workspace folder on the user's machine and syncs all files/folders in it with the remote cloud storage. The client application works with the storage servers to upload, download, and modify the actual files in backend cloud storage. The client also interacts with the remote Synchronization Service to handle any file metadata updates, e.g., changes in the file name, size, modification date, etc.
Here are some of the essential operations of the client:
1. Upload and download files.
2. Detect file changes in the workspace folder.
3. Handle conflicts due to offline or concurrent updates.
How do we handle file transfers efficiently?
As mentioned above, we can break each file into smaller chunks so that we transfer only the chunks that were modified, not the whole file. Let's say we divide each file into fixed-size 4MB chunks. We can statically calculate what an optimal chunk size could be based on: 1) the storage devices we use in the cloud, to optimize space utilization and input/output operations per second (IOPS); 2) network bandwidth; 3) the average file size in our storage, etc. In our metadata, we should also keep a record of each file and the chunks that constitute it.
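As a rough illustration, a fixed-size chunker on the client could look like the sketch below. This is a minimal sketch, assuming 4MB chunks and SHA-256 content hashes as described in the text; the function name is illustrative.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB fixed-size chunks, as discussed above

def chunk_file(path):
    """Split a file into fixed-size chunks and hash each one.

    Returns a list of (chunk_index, sha256_hex, chunk_bytes) tuples. The
    hashes let the client and server compare chunk versions later.
    """
    chunks = []
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(CHUNK_SIZE)
            if not data:
                break
            chunks.append((index, hashlib.sha256(data).hexdigest(), data))
            index += 1
    return chunks
```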
Should we keep a copy of the metadata with the client?
Keeping a local copy of the metadata not only enables us to do offline updates but also saves a lot of round trips to update the remote metadata.
How can clients efficiently listen to changes happening on other clients?
One solution could be that the clients periodically check with the server whether there are any changes. The problem with this approach is that there will be a delay in reflecting changes locally, because clients check for changes periodically while the server could notify them right away when a change happens. If the client checks the server frequently for changes, it will not only waste bandwidth, as the server has to return an empty response most of the time, but it will also keep the server busy. Pulling information in this manner is not scalable.
A solution to the above problem could be to use HTTP long polling.
With long polling, the client requests information from the server with the expectation that the server may not respond immediately. If the server has no new data for the client when the poll is received, instead of sending an empty response, the server holds the request open and waits for response information to become available. Once it does have new information, the server immediately sends an HTTP/S response to the client, completing the open HTTP/S request. Upon receipt of the server response, the client can immediately issue another server request for future updates.
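A minimal long-polling loop on the client could look like the sketch below. The `/changes` endpoint, the `cursor` parameter, and the 90-second timeout are hypothetical choices for illustration, not part of any real API.

```python
import requests  # third-party HTTP client (pip install requests)

SERVER_URL = "https://sync.example.com/changes"  # hypothetical endpoint
cursor = None  # opaque marker of the last change this client has seen

def apply_changes(changes):
    # Placeholder: download updated chunks and refresh local metadata.
    for change in changes:
        print("change received:", change)

def poll_for_changes():
    global cursor
    while True:
        # The server holds this request open until a change is available
        # (or its own timeout expires) instead of returning an empty body.
        resp = requests.get(SERVER_URL, params={"cursor": cursor}, timeout=90)
        if resp.status_code == 200:
            body = resp.json()
            if body.get("changes"):
                cursor = body.get("cursor")
                apply_changes(body["changes"])
        # Immediately issue the next request to wait for future updates.
```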
Based on the above considerations, we can divide our client into the following four parts:
I. Internal Metadata Database: keeps track of all the files, chunks, their versions, and their locations in the file system.
II. Chunker: splits files into smaller chunks. It is also responsible for reconstructing a file from its chunks. Our chunking algorithm will detect the parts of the files that have been modified by the user and only transfer those parts to cloud storage; this will save us bandwidth and synchronization time.
III. Watcher: monitors the local workspace folder and notifies the Indexer (discussed below) of any action performed by the user, e.g., when the user creates, deletes, or updates a file or folder. The Watcher also listens for any changes happening on other clients that are broadcast by the Synchronization Service.
IV. Indexer: processes the events received from the Watcher and updates the internal metadata database with information about the chunks of the modified files. Once the chunks are successfully committed to or downloaded from cloud storage, the Indexer communicates with the remote Synchronization Service to broadcast the changes to other clients and to update the remote metadata database.
How should clients handle slow servers?
Clients should exponentially back off if the server is busy or not responding. Meaning, if the server is too slow to respond, clients should delay their retries, and the delay should increase exponentially.
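A minimal sketch of such exponential backoff is shown below; the retry count, base delay, and jitter are arbitrary illustrative choices.

```python
import random
import time

def request_with_backoff(send_request, max_retries=6, base_delay=1.0):
    """Retry a request with exponential backoff (plus a little jitter).

    `send_request` is any callable that raises on failure; the delay
    roughly doubles after every failed attempt, as described above.
    """
    for attempt in range(max_retries):
        try:
            return send_request()
        except Exception:
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
    raise RuntimeError("server still unavailable after retries")
```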
Should mobile clients sync remote changes immediately?
Unlike desktop or web clients, mobile clients usually sync on demand to save the user's bandwidth and space.
B. Metadata Database
The Metadata Database is responsible for maintaining the versioning and metadata information about files/chunks, users, and workspaces. The Metadata Database can be a relational database such as MySQL, or a NoSQL database service such as DynamoDB. Regardless of the type of database, the Synchronization Service should be able to provide a consistent view of the files using the database, especially if more than one user is working on the same file simultaneously. Since NoSQL data stores do not support ACID properties in favor of scalability and performance, we would need to incorporate support for ACID properties programmatically in the logic of our Synchronization Service if we opt for this kind of database. However, using a relational database can simplify the implementation of the Synchronization Service, as relational databases natively support ACID properties.
The metadata database should store information about the following objects (a rough sketch of these objects follows the list):
1. Chunks
2. Files
3. Users
4. Devices
5. Workspace (sync folders)
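Here is one way to picture those objects. This is a simplified sketch; the field names are illustrative, not a prescribed schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Chunk:
    chunk_id: str            # e.g. SHA-256 hash of the chunk contents
    size_bytes: int
    storage_location: str    # key/URL in cloud block storage

@dataclass
class File:
    file_id: str
    name: str
    size_bytes: int
    version: int
    chunk_ids: List[str] = field(default_factory=list)  # ordered list of chunks

@dataclass
class User:
    user_id: str
    email: str

@dataclass
class Device:
    device_id: str
    user_id: str
    last_sync_cursor: str = ""   # where this device's sync left off

@dataclass
class Workspace:
    workspace_id: str            # the synced folder
    owner_id: str
    file_ids: List[str] = field(default_factory=list)
```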
C. Synchronization Service
The Synchronization Service is the component that processes file updates made by a client and applies these changes to other subscribed clients. It also synchronizes clients' local databases with the information stored in the remote Metadata Database. The Synchronization Service is the most important part of the system architecture due to its critical role in managing the metadata and synchronizing users' files. Desktop clients communicate with the Synchronization Service to either obtain updates from cloud storage or send files and updates to cloud storage and, potentially, other users. If a client was offline for a period of time, it polls the system for new updates as soon as it comes back online. When the Synchronization Service receives an update request, it checks with the Metadata Database for consistency and then proceeds with the update. Subsequently, a notification is sent to all subscribed users or devices to report the file update.
The Synchronization Service should be designed in such a way that it transmits less data between clients and cloud storage, to achieve a better response time. To meet this design goal, the Synchronization Service can employ a differencing algorithm to reduce the amount of data that needs to be synchronized. Instead of transmitting entire files from clients to the server or vice versa, we can just transmit the difference between two versions of a file. Therefore, only the part of the file that has been changed is transmitted. This also decreases bandwidth consumption and cloud data storage for the end user. As described above, we will divide our files into 4MB chunks and will transfer only modified chunks. The server and clients can calculate a hash (e.g., SHA-256) to see whether to update the local copy of a chunk or not. On the server, if we already have a chunk with a similar hash (even from another user), we don't need to create another copy; we can use the same chunk. This is discussed in detail later under Data Deduplication.
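The chunk-level diff can be sketched as below. It assumes both sides keep ordered per-chunk SHA-256 hashes, as discussed; the helper name and data shapes are illustrative.

```python
def chunks_to_upload(local_chunks, remote_hashes):
    """Return only the chunks whose hashes differ from the server's copy.

    `local_chunks`  : list of (index, sha256_hex, data) from the chunker
    `remote_hashes` : dict {index: sha256_hex} describing the server's version
    """
    changed = []
    for index, digest, data in local_chunks:
        if remote_hashes.get(index) != digest:
            changed.append((index, digest, data))
    return changed
```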
To be able to provide an efficient and scalable synchronization protocol, we can consider using communication middleware between clients and the Synchronization Service. The messaging middleware should provide scalable message queuing and change notification to support a high number of clients using pull or push strategies. This way, multiple Synchronization Service instances can receive requests from a global request queue, and the communication middleware will be able to balance their load.
D. Message Queuing Service
An important part of our architecture is a messaging middleware that should be able to handle a substantial number of requests. A scalable Message Queuing Service that supports asynchronous message-based communication between clients and the Synchronization Service best fits the requirements of our application. The Message Queuing Service supports asynchronous and loosely coupled message-based communication between distributed components of the system. The Message Queuing Service should be able to efficiently store any number of messages in highly available, reliable, and scalable queues.
The Message Queuing Service will implement two types of queues in our system. The Request Queue is a global queue shared by all clients. Clients' requests to update the Metadata Database are first sent to the Request Queue, from where the Synchronization Service picks them up to update the metadata. The Response Queues, which correspond to individual subscribed clients, are responsible for delivering the update messages to each client. Since a message is deleted from the queue once it is received by a client, we need to create a separate Response Queue for each subscribed client in order to share the update messages.
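Below is a toy, in-process model of the one global Request Queue plus per-client Response Queues. A real deployment would use a distributed message broker; this only illustrates the message flow.

```python
import queue
from collections import defaultdict

request_queue = queue.Queue()               # global queue shared by all clients
response_queues = defaultdict(queue.Queue)  # one queue per subscribed client

def client_sends_update(client_id, update):
    request_queue.put({"from": client_id, "update": update})

def sync_service_step(subscribers):
    """Take one request off the global queue and fan it out to subscribers."""
    msg = request_queue.get()
    # ... the real service would validate and persist metadata here ...
    for client_id in subscribers:
        if client_id != msg["from"]:
            response_queues[client_id].put(msg["update"])

# Example: client A updates a file shared with B and C.
client_sends_update("A", {"file_id": "f1", "version": 2})
sync_service_step(subscribers=["A", "B", "C"])
print(response_queues["B"].get())   # {'file_id': 'f1', 'version': 2}
```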
E. Cloud/Block Storage
Cloud/Block Storage stores the chunks of the files uploaded by the users. Clients interact directly with the storage to send and receive objects from it. Separating the metadata from storage enables us to use any storage, either in the cloud or in-house.
7. File processing workflow
The sequence below shows the interaction between the components of the application when Client A updates a file that is shared with Clients B and C, so they should receive the update too. If the other clients are not online at the time of the update, the Message Queuing Service keeps the update notifications in separate response queues for them until they come online later.
1. Client A uploads chunks to cloud storage.
2. Client A updates metadata and commits changes.
3. Client A gets confirmation, and notifications are sent to Clients B and C about the changes.
4. Clients B and C receive the metadata changes and download the updated chunks.
8. Data deduplication
Data deduplication is a technique used for eliminating duplicate copies of data to improve storage utilization. It can also be applied to network data transfers to reduce the number of bytes that must be sent. For each new incoming chunk, we can calculate a hash of it and compare that hash with the hashes of all the existing chunks to see if we already have the same chunk in our storage.
There are two ways we can implement deduplication in our system:
A. Post-process deduplication
With post-process deduplication, new chunks are first stored on the storage device, and later some process analyzes the data looking for duplication. The benefit is that clients do not need to wait for the hash calculation or lookup to complete before storing the data, thereby ensuring that storage performance does not degrade. The drawbacks of this approach are: 1) we will unnecessarily be storing duplicate data, though only for a short time; 2) duplicate data will be transferred, which consumes bandwidth.
B. In-line deduplication
Alternatively, deduplication hash calculations can be done in real time as the clients enter data on their device. If our system identifies a chunk that it has already stored, only a reference to the existing chunk is added in the metadata, rather than a full copy of the chunk. This approach will give us optimal network and storage usage.
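A minimal sketch of in-line deduplication over a content-addressed block store follows; the in-memory dict stands in for cloud block storage.

```python
import hashlib

block_store = {}  # sha256 hex -> chunk bytes (stands in for cloud block storage)

def store_chunk_inline_dedup(data: bytes) -> str:
    """Store a chunk only if its hash is new; otherwise reuse the existing copy.

    Returns the chunk's content hash, which the metadata DB references.
    """
    digest = hashlib.sha256(data).hexdigest()
    if digest not in block_store:    # new content: store it once
        block_store[digest] = data
    return digest                    # existing content: just add a reference

ref1 = store_chunk_inline_dedup(b"hello world")
ref2 = store_chunk_inline_dedup(b"hello world")
assert ref1 == ref2 and len(block_store) == 1   # duplicate stored only once
```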
9. Metadata partitioning
To scale out the metadata database, we need to partition it so that it can store information about millions of users and billions of files/chunks. We need to come up with a partitioning scheme that divides and stores our data on different DB servers.
1. Vertical partitioning:
We can partition our database in such a way that we store tables related to one particular feature on one server. For example, we can store all user-related tables in one database and all file/chunk-related tables in another database. Although this approach is straightforward to implement, it has some issues:
1. Will we still have scaling issues? What if we need to store trillions of chunks and our database cannot support such a huge number of records? How would we further partition such tables?
2. Joining two tables that live in two separate databases can cause performance and consistency issues. How frequently do we have to join the user and file tables?
2. Range-based partitioning:
What if we store files/chunks in separate partitions based on the first letter of the file path? In that case, we save all the files starting with the letter "A" in one partition, those starting with the letter "B" in another partition, and so on. This approach is called range-based partitioning. We can even combine certain less frequently occurring letters into one database partition. We should come up with this partitioning scheme statically, so that we can always store and find a file in a predictable manner.
The main problem with this approach is that it can lead to unbalanced servers. For example, we might decide to put all files starting with the letter "E" into one DB partition, only to realize later that there are so many files starting with the letter "E" that they cannot fit into a single DB partition.
3. Hash-based partitioning:
In this scheme, we take a hash of the object we are storing and, based on this hash, calculate the DB partition to which the object should go. In our case, we can take the hash of the "FileID" of the file object we are storing to determine the partition where the file will be stored. Our hashing function will randomly distribute objects into different partitions; for example, our hashing function can always map any ID to a number between [1…256], and this number would be the partition where we store our object.
This approach can still lead to overloaded partitions, which can be solved by using consistent hashing.
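A sketch of both ideas follows: simple modulo-of-hash partitioning into 256 partitions, and a small consistent-hash ring with virtual nodes. The partition count and replica count are illustrative.

```python
import bisect
import hashlib

NUM_PARTITIONS = 256

def partition_for(file_id: str) -> int:
    """Map a FileID onto one of 256 partitions via a hash (simple scheme)."""
    digest = hashlib.sha256(file_id.encode()).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

# A minimal consistent-hash ring: adding or removing a server only remaps the
# keys between it and its neighbour, which helps relieve overloaded partitions.
class ConsistentHashRing:
    def __init__(self, servers, replicas=100):
        self.ring = []                      # sorted list of (point, server)
        for server in servers:
            for i in range(replicas):       # virtual nodes smooth the load
                point = int(hashlib.sha256(f"{server}#{i}".encode()).hexdigest(), 16)
                self.ring.append((point, server))
        self.ring.sort()

    def server_for(self, key: str) -> str:
        point = int(hashlib.sha256(key.encode()).hexdigest(), 16)
        idx = bisect.bisect(self.ring, (point,)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["db1", "db2", "db3"])
print(partition_for("file-42"), ring.server_for("file-42"))
```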
10. Caching
We can have two kinds of caches in our system. To deal with hot files/chunks, we can introduce a cache for block storage. We can use an off-the-shelf solution like Memcached, which can store whole chunks with their respective IDs/hashes, and block servers, before hitting block storage, can quickly check whether the cache has the desired chunk. Based on clients' usage patterns, we can determine how many cache servers we need. A high-end commercial server can have up to 144GB of memory; one such server can cache 36K chunks.
Which cache replacement policy would best fit our needs? When the cache is full and we want to replace a chunk with a newer/hotter chunk, how would we choose? Least Recently Used (LRU) can be a reasonable policy for our system. Under this policy, we discard the least recently used chunk first. Similarly, we can have a cache for the Metadata Database.
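Below is a minimal LRU cache for chunks, assuming the 4MB chunks and roughly 36K-chunk capacity mentioned above (144GB / 4MB per chunk); the implementation details are illustrative.

```python
from collections import OrderedDict

class LRUBlockCache:
    """A tiny LRU cache for hot chunks; evicts the least recently used one."""

    def __init__(self, capacity=36_000):     # ~144 GB / 4 MB per chunk
        self.capacity = capacity
        self._blocks = OrderedDict()          # chunk hash -> chunk bytes

    def get(self, chunk_hash):
        if chunk_hash not in self._blocks:
            return None                        # cache miss: go to block storage
        self._blocks.move_to_end(chunk_hash)   # mark as most recently used
        return self._blocks[chunk_hash]

    def put(self, chunk_hash, data):
        if chunk_hash in self._blocks:
            self._blocks.move_to_end(chunk_hash)
        self._blocks[chunk_hash] = data
        if len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)   # evict the least recently used
```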
11. Load Balancer (LB)
We can add a load balancing layer at two places in our system:
1) Between clients and block servers, and
2) Between clients and metadata servers.
Initially, a simple round-robin approach can be adopted that distributes incoming requests equally among the backend servers. This LB is simple to implement and does not introduce any overhead. Another benefit of this approach is that if a server is dead, the LB will take it out of the rotation and stop sending any traffic to it.
A problem with round-robin LB is that it does not take server load into consideration. If a server is overloaded or slow, the LB will not stop sending new requests to that server. To handle this, a more intelligent LB solution can be placed that periodically queries the backend servers about their load and adjusts traffic based on that.
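A toy round-robin balancer that skips dead servers is sketched below. This is an illustration only; a production system would rely on a real health-checking load balancer such as HAProxy or a cloud LB.

```python
import itertools

class RoundRobinBalancer:
    """Plain round robin over healthy servers; a crashed server is skipped.

    A smarter variant would also track per-server load and steer traffic
    away from slow backends, as discussed above.
    """
    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(servers)
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        self.healthy.discard(server)   # take a crashed server out of rotation

    def mark_up(self, server):
        self.healthy.add(server)

    def next_server(self):
        for _ in range(len(self.servers)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backend servers available")

lb = RoundRobinBalancer(["block-1", "block-2", "block-3"])
lb.mark_down("block-2")
print([lb.next_server() for _ in range(4)])  # "block-2" never appears
```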
12. Security, permissions and file sharing
One of the primary concerns users have when storing their files in the cloud is the privacy and security of their data, especially since in our system users can share their files with other users or even make them public to share with everyone. To handle this, we will store the permissions of each file in our Metadata Database to reflect which files are visible or modifiable by any user.
Reference material
grok_system_design_interview.pdf