当前位置:网站首页>[system design] proximity service
[system design] proximity service
2022-07-03 06:08:00 【xuhss_ com】
High quality resource sharing
Learning route guidance ( Click unlock ) | Knowledge orientation | Crowd positioning |
---|---|---|
🧡 Python Actual wechat ordering applet 🧡 | Progressive class | This course is python flask+ Perfect combination of wechat applet , From the deployment of Tencent to the launch of the project , Create a full stack ordering system . |
Python Quantitative trading practice | beginner | Take you hand in hand to create an easy to expand 、 More secure 、 More efficient quantitative trading system |
In this paper , We will design a proximity service , Used to find places near users , Like restaurants , The hotel , Shopping malls, etc .
The design requirements
Start with a story about Xiao Ming's interview .
interviewer : Hello , I want to test your design ability , If you design a neighborhood service , Used to search for businesses near users , What would you do ?
Xiao Ming : well , Can users specify the search radius ? If there are not enough merchants in the search scope , Whether the system supports expanding the search scope ?
interviewer : Yes , Users can modify as needed , There are a few options ,0.5km,1km,2km,5km,10km,20km.
Xiao Ming : Um. , Are there any other system requirements ?
interviewer : Another thing to consider is , Low latency of the system , High availability , And scalability , And data privacy .
Xiao Ming : well , I understand .
To sum up , Need to do a proximity service , According to the user's location ( Longitude and latitude ) And the search radius returns to nearby businesses , The radius can be modified . Because the user's location information is sensitive data , We may need to comply with data privacy laws .
High level design
The high-level design drawing is shown below , The system consists of two parts : Location based services (location-based service)LBS And business (bussiness) Related services .
Let's look at each component of the system .
Load Balancer
The load balancer can distribute traffic to multiple back-end services according to the route .
Location based services (LBS)
LBS Service is the core of the system , Find nearby businesses by location and radius .LBS It has the following characteristics :
- There is no request , But there are a lot of queries
- QPS Very high , Especially during peak hours in dense areas .
- Service is stateless , Support horizontal scaling .
Business service
Merchant creation , to update , Delete merchant information , And users can view the business information .
Database cluster
The database cluster can use master-slave configuration , Improve availability and performance . The data is first saved to the master database , Then copy to the slave Library , The master database handles all writes , Multiple slave databases are used for read operations .
Next , We discuss location services in detail LBS The implementation of the .
1. Two dimensional search
This method is simple , It works , Draw a circle according to the user's position and search radius , Then find all the merchants in the circle , As shown below .
The latitude of the merchant is latitude Express , Longitude longitude Express . The latitude and longitude of the same user can be user_latitude and user_longitude Express , The radius uses radius Express .
The above search process can be translated into the following pseudo SQL .
SELECT business_id, latitude, longitude,
FROM business
WHERE
latitude >= (@user_latitude - radius) AND latitude < (@user_latitude + radius)
AND
longitude >= (@user_longitude - radius) AND longitude < (@user_longitude + radius)
This way we can achieve our needs , But in fact, the efficiency is not high , Because we need to scan the entire table . Although we can index longitude and latitude , Efficiency has improved , But not enough , We also need to compute the union of the results of the index .
2. Geohash
We said that up there , The effect of indexing by two-dimensional longitude and latitude is not obvious . and Geohash You can convert two-dimensional longitude and latitude into one-dimensional string , By algorithm , Each additional bit recursively divides the world into smaller and smaller grids , Let's see how it works .
First , Divide the earth into four quadrants through the prime meridian and the equator , as follows
- Latitude range [-90, 0] use 0 Express
- Latitude range [0, 90] use 1 Express
- Longitude range [-180, 0] use 0 Express
- Longitude range [0, 180] use 1 Express
then , Then divide each grid into four small grids .
Repeat the process , Until the size of the grid meets our needs ,Geohash Usually use base32 Express . Let's look at two examples .
- Google Headquarters Geohash( The length is 6):
1001 10110 01001 10000 11011 11010 (base32 convert) → 9q9hvu (base32)
- Facebook Headquarters Geohash( length by 6):
1001 10110 01001 10001 10000 10111 (base32 convert) → 9q9jhr (base32)
Geohash Yes 12 A precision ( Also known as level ), It can control the size of each grid , The longer the string , The smaller the split mesh , as follows
In the actual , Choose the right one according to the specific scenario Geohash precision .
In this way , Finally, the map is divided into small grids below , One Geohash The string represents a grid , In this way, you can query the merchant information in each grid , Search is very efficient .
Maybe you've found some rules , In each grid above , They all have the same prefix wtw3
. Yes ,Geohash Is characterized by , The longer the same prefix part of the two grids , It means that their positions are adjacent .
On the other hand , Two adjacent grids , Their Geohash Do strings have to be similar ?
not always , Because of existence The boundary problem . When both meshes are at the edge , Although they are adjacent , however Geohash The value of is different from the first , Here's the picture , Two purple dots are next to each other .
The following is a grid with high accuracy , Some adjacent grids Geohash The value of is totally different .
Another boundary problem is , For the user ( Orange ) Come on , The merchants in the next grid ( violet ) Perhaps more than their own grid businesses ( violet ) The distance is even closer , Here's the picture
therefore , When inquiring about nearby businesses , You can't just limit yourself to the grid where the user resides , To expand to users adjacent to 4 A or 9 Grid (s) , Then calculate the distance , Screening , Finally, find a business with the right distance .
in addition , When users are in remote suburbs , We can do it in the following way , Expand your search , Return a sufficient number of merchants .
Geohash Is widely used , in addition Redis and MongoDB Both provide corresponding functions , You can use it directly .
3 . quadtree
Another popular solution is the quadtree , This method can recursively divide the two-dimensional space into four quadrants , Until the number of merchants in each grid meets the requirements .
Here's the picture , For example, make sure that the number of grids does not exceed 10, If exceeded , It is divided into four small grids .
Please note that , A quadtree is a memory data structure , It is not a database solution . It runs on every LBS Service , The data structure is built at service startup .
Next , Take a look at what information is stored in the nodes ?
Internal node
Coordinates of the upper left and lower right corners of the grid , And pointing 4 individual Pointer to the child node .
Leaf node
Coordinates of the upper left and lower right corners of the grid , And the businesses in the grid ID Array .
Real world quadtree example
Yext A picture is provided , Shows the quadtree constructed by one of the cities . We need smaller 、 Finer grained meshes are used in dense areas , Larger grids are used in remote suburbs .
Google S2 and Hilbert curve
Google S2 Libraries are another important player in this field , Similar to a quadtree , It is a memory solution . It maps the sphere to one-dimensional index based on Hilbert curve .
and Hilbert curve It is a fractal curve that can fill a plane square ( Space filling curve ), David · Hilbert is in 1891 in , as follows
How Hilbert curves are generated ?
The simplest first order Hilbert curve , First divide the square into four grids , Then start at the exact center of one of the grids , Follow the direction , Connect each grid .
Second order Hilbert curve , Each grid is a first-order Hilbert curve , Then connect them end to end .
Third order Hilbert curve
n Hilbert curve of order , Realize a line connecting the whole plane .
Again , Hilbert curves can also fill the entire three-dimensional space .
An important characteristic of Hilbert curve is Dimension reduction , You can convert a multidimensional space into a one-dimensional array , You can see how it is implemented through animation .
The search efficiency in one-dimensional space is much higher than that in two-dimensional space .
Multiple data centers and high availability
We can LBS Services are deployed to multiple regions , Users in different regions connect to the nearest Data Center , This can improve the access speed and high availability of the system , And according to the actual scene , Expand .
Final design drawings
- Users need to find nearby 500 Rice's restaurant . The client puts the user location ( Longitude and latitude ), radius (500m) Send to back end .
- The load balancer forwards the request to LBS.
- Based on user location and radius information ,LBS Found a match for the search geohash length .
- LBS Calculate adjacent Geohash And add them to the list .
- call Redis The service obtains the corresponding merchant ID.
- LBS According to the returned merchant list , Calculate the distance between users and merchants , And rank , And then back to the client .
summary
In this paper , We designed a proximity service , It introduces 4 There are three common implementation methods , Two dimensional search ,Geohash, Quadtree and Google S2. They have their own advantages and disadvantages , You can base on the actual business scenario , Choose the right implementation .
Reference
https://halfrost.com/go_spatial_search/#toc-25
https://www.amazon.com/System-Design-Interview-Insiders-Guide/dp/1736049119
边栏推荐
猜你喜欢
ThreadLocal的简单理解
[teacher Zhao Yuqiang] Flink's dataset operator
Method of converting GPS coordinates to Baidu map coordinates
Alibaba cloud OOS file upload
[teacher Zhao Yuqiang] index in mongodb (Part 1)
Clickhouse learning notes (2): execution plan, table creation optimization, syntax optimization rules, query optimization, data consistency
Maximum likelihood estimation, divergence, cross entropy
JDBC connection database steps
轻松上手Fluentd,结合 Rainbond 插件市场,日志收集更快捷
理解 YOLOV1 第一篇 预测阶段
随机推荐
卷积神经网络CNN中的卷积操作详解
GPS坐标转百度地图坐标的方法
Kubernetes notes (IX) kubernetes application encapsulation and expansion
BeanDefinitionRegistryPostProcessor
Installation du plug - in CAD et chargement automatique DLL, Arx
Common exceptions when Jenkins is released (continuous update...)
QT read write excel -- qxlsx insert chart 5
Kubernetes notes (VIII) kubernetes security
Oauth2.0 - using JWT to replace token and JWT content enhancement
CAD插件的安裝和自動加載dll、arx
[teacher Zhao Yuqiang] the most detailed introduction to PostgreSQL architecture in history
@Import annotation: four ways to import configuration classes & source code analysis
The most responsible command line beautification tutorial
How to create and configure ZABBIX
[Zhao Yuqiang] deploy kubernetes cluster with binary package
Kubernetes notes (V) configuration management
Kubesphere - Multi tenant management
Zhiniu stock project -- 05
[teacher Zhao Yuqiang] RDB persistence of redis
[teacher Zhao Yuqiang] use the catalog database of Oracle