当前位置:网站首页>[system design] proximity service
[system design] proximity service
2022-07-03 06:08:00 【xuhss_ com】
High quality resource sharing
Learning route guidance ( Click unlock ) | Knowledge orientation | Crowd positioning |
---|---|---|
🧡 Python Actual wechat ordering applet 🧡 | Progressive class | This course is python flask+ Perfect combination of wechat applet , From the deployment of Tencent to the launch of the project , Create a full stack ordering system . |
Python Quantitative trading practice | beginner | Take you hand in hand to create an easy to expand 、 More secure 、 More efficient quantitative trading system |
In this paper , We will design a proximity service , Used to find places near users , Like restaurants , The hotel , Shopping malls, etc .
The design requirements
Start with a story about Xiao Ming's interview .
interviewer : Hello , I want to test your design ability , If you design a neighborhood service , Used to search for businesses near users , What would you do ?
Xiao Ming : well , Can users specify the search radius ? If there are not enough merchants in the search scope , Whether the system supports expanding the search scope ?
interviewer : Yes , Users can modify as needed , There are a few options ,0.5km,1km,2km,5km,10km,20km.
Xiao Ming : Um. , Are there any other system requirements ?
interviewer : Another thing to consider is , Low latency of the system , High availability , And scalability , And data privacy .
Xiao Ming : well , I understand .
To sum up , Need to do a proximity service , According to the user's location ( Longitude and latitude ) And the search radius returns to nearby businesses , The radius can be modified . Because the user's location information is sensitive data , We may need to comply with data privacy laws .
High level design
The high-level design drawing is shown below , The system consists of two parts : Location based services (location-based service)LBS And business (bussiness) Related services .
Let's look at each component of the system .
Load Balancer
The load balancer can distribute traffic to multiple back-end services according to the route .
Location based services (LBS)
LBS Service is the core of the system , Find nearby businesses by location and radius .LBS It has the following characteristics :
- There is no request , But there are a lot of queries
- QPS Very high , Especially during peak hours in dense areas .
- Service is stateless , Support horizontal scaling .
Business service
Merchant creation , to update , Delete merchant information , And users can view the business information .
Database cluster
The database cluster can use master-slave configuration , Improve availability and performance . The data is first saved to the master database , Then copy to the slave Library , The master database handles all writes , Multiple slave databases are used for read operations .
Next , We discuss location services in detail LBS The implementation of the .
1. Two dimensional search
This method is simple , It works , Draw a circle according to the user's position and search radius , Then find all the merchants in the circle , As shown below .
The latitude of the merchant is latitude Express , Longitude longitude Express . The latitude and longitude of the same user can be user_latitude and user_longitude Express , The radius uses radius Express .
The above search process can be translated into the following pseudo SQL .
SELECT business_id, latitude, longitude,
FROM business
WHERE
latitude >= (@user_latitude - radius) AND latitude < (@user_latitude + radius)
AND
longitude >= (@user_longitude - radius) AND longitude < (@user_longitude + radius)
This way we can achieve our needs , But in fact, the efficiency is not high , Because we need to scan the entire table . Although we can index longitude and latitude , Efficiency has improved , But not enough , We also need to compute the union of the results of the index .
2. Geohash
We said that up there , The effect of indexing by two-dimensional longitude and latitude is not obvious . and Geohash You can convert two-dimensional longitude and latitude into one-dimensional string , By algorithm , Each additional bit recursively divides the world into smaller and smaller grids , Let's see how it works .
First , Divide the earth into four quadrants through the prime meridian and the equator , as follows
- Latitude range [-90, 0] use 0 Express
- Latitude range [0, 90] use 1 Express
- Longitude range [-180, 0] use 0 Express
- Longitude range [0, 180] use 1 Express
then , Then divide each grid into four small grids .
Repeat the process , Until the size of the grid meets our needs ,Geohash Usually use base32 Express . Let's look at two examples .
- Google Headquarters Geohash( The length is 6):
1001 10110 01001 10000 11011 11010 (base32 convert) → 9q9hvu (base32)
- Facebook Headquarters Geohash( length by 6):
1001 10110 01001 10001 10000 10111 (base32 convert) → 9q9jhr (base32)
Geohash Yes 12 A precision ( Also known as level ), It can control the size of each grid , The longer the string , The smaller the split mesh , as follows
In the actual , Choose the right one according to the specific scenario Geohash precision .
In this way , Finally, the map is divided into small grids below , One Geohash The string represents a grid , In this way, you can query the merchant information in each grid , Search is very efficient .
Maybe you've found some rules , In each grid above , They all have the same prefix wtw3
. Yes ,Geohash Is characterized by , The longer the same prefix part of the two grids , It means that their positions are adjacent .
On the other hand , Two adjacent grids , Their Geohash Do strings have to be similar ?
not always , Because of existence The boundary problem . When both meshes are at the edge , Although they are adjacent , however Geohash The value of is different from the first , Here's the picture , Two purple dots are next to each other .
The following is a grid with high accuracy , Some adjacent grids Geohash The value of is totally different .
Another boundary problem is , For the user ( Orange ) Come on , The merchants in the next grid ( violet ) Perhaps more than their own grid businesses ( violet ) The distance is even closer , Here's the picture
therefore , When inquiring about nearby businesses , You can't just limit yourself to the grid where the user resides , To expand to users adjacent to 4 A or 9 Grid (s) , Then calculate the distance , Screening , Finally, find a business with the right distance .
in addition , When users are in remote suburbs , We can do it in the following way , Expand your search , Return a sufficient number of merchants .
Geohash Is widely used , in addition Redis and MongoDB Both provide corresponding functions , You can use it directly .
3 . quadtree
Another popular solution is the quadtree , This method can recursively divide the two-dimensional space into four quadrants , Until the number of merchants in each grid meets the requirements .
Here's the picture , For example, make sure that the number of grids does not exceed 10, If exceeded , It is divided into four small grids .
Please note that , A quadtree is a memory data structure , It is not a database solution . It runs on every LBS Service , The data structure is built at service startup .
Next , Take a look at what information is stored in the nodes ?
Internal node
Coordinates of the upper left and lower right corners of the grid , And pointing 4 individual Pointer to the child node .
Leaf node
Coordinates of the upper left and lower right corners of the grid , And the businesses in the grid ID Array .
Real world quadtree example
Yext A picture is provided , Shows the quadtree constructed by one of the cities . We need smaller 、 Finer grained meshes are used in dense areas , Larger grids are used in remote suburbs .
Google S2 and Hilbert curve
Google S2 Libraries are another important player in this field , Similar to a quadtree , It is a memory solution . It maps the sphere to one-dimensional index based on Hilbert curve .
and Hilbert curve It is a fractal curve that can fill a plane square ( Space filling curve ), David · Hilbert is in 1891 in , as follows
How Hilbert curves are generated ?
The simplest first order Hilbert curve , First divide the square into four grids , Then start at the exact center of one of the grids , Follow the direction , Connect each grid .
Second order Hilbert curve , Each grid is a first-order Hilbert curve , Then connect them end to end .
Third order Hilbert curve
n Hilbert curve of order , Realize a line connecting the whole plane .
Again , Hilbert curves can also fill the entire three-dimensional space .
An important characteristic of Hilbert curve is Dimension reduction , You can convert a multidimensional space into a one-dimensional array , You can see how it is implemented through animation .
The search efficiency in one-dimensional space is much higher than that in two-dimensional space .
Multiple data centers and high availability
We can LBS Services are deployed to multiple regions , Users in different regions connect to the nearest Data Center , This can improve the access speed and high availability of the system , And according to the actual scene , Expand .
Final design drawings
- Users need to find nearby 500 Rice's restaurant . The client puts the user location ( Longitude and latitude ), radius (500m) Send to back end .
- The load balancer forwards the request to LBS.
- Based on user location and radius information ,LBS Found a match for the search geohash length .
- LBS Calculate adjacent Geohash And add them to the list .
- call Redis The service obtains the corresponding merchant ID.
- LBS According to the returned merchant list , Calculate the distance between users and merchants , And rank , And then back to the client .
summary
In this paper , We designed a proximity service , It introduces 4 There are three common implementation methods , Two dimensional search ,Geohash, Quadtree and Google S2. They have their own advantages and disadvantages , You can base on the actual business scenario , Choose the right implementation .
Reference
https://halfrost.com/go_spatial_search/#toc-25
https://www.amazon.com/System-Design-Interview-Insiders-Guide/dp/1736049119
边栏推荐
- Method of converting GPS coordinates to Baidu map coordinates
- Leetcode solution - 02 Add Two Numbers
- Simple handwritten ORM framework
- Kubernetes notes (V) configuration management
- 理解 期望(均值/估计值)和方差
- Zhiniu stock project -- 04
- What's the difference between using the Service Worker Cache API and regular browser cache?
- Migrate data from Amazon aurora to tidb
- Es remote cluster configuration and cross cluster search
- Pytorch builds the simplest version of neural network
猜你喜欢
[teacher Zhao Yuqiang] redis's slow query log
Core principles and source code analysis of disruptor
[teacher Zhao Yuqiang] Flink's dataset operator
[teacher Zhao Yuqiang] kubernetes' probe
@Import annotation: four ways to import configuration classes & source code analysis
Oauth2.0 - explanation of simplified mode, password mode and client mode
The most responsible command line beautification tutorial
Project summary --01 (addition, deletion, modification and query of interfaces; use of multithreading)
Project summary --04
技术管理进阶——你了解成长的全貌吗?
随机推荐
Kubernetes notes (III) controller
How does win7 solve the problem that telnet is not an internal or external command
Migrate data from Mysql to tidb from a small amount of data
Cesium 点击获取模型表面经纬度高程坐标(三维坐标)
[teacher Zhao Yuqiang] RDB persistence of redis
Detailed explanation of contextclassloader
Project summary --2 (basic use of jsup)
Method of converting GPS coordinates to Baidu map coordinates
[teacher Zhao Yuqiang] Cassandra foundation of NoSQL database
Alibaba cloud OOS file upload
Simple solution of small up main lottery in station B
Using the ethtool command by example
The server data is all gone! Thinking caused by a RAID5 crash
Apple submitted the new MAC model to the regulatory database before the spring conference
Cesium 点击获三维坐标(经纬度高程)
JS implements the problem of closing the current child window and refreshing the parent window
技术管理进阶——你了解成长的全貌吗?
1. Somme des deux nombres
深度学习,从一维特性输入到多维特征输入引发的思考
arcgis创建postgre企业级数据库