System design: index
2022-06-24 05:03:00 【Xiaochengxin post station】
When someone mentions indexing, databases are probably the first thing that comes to mind. So what problem does an index solve? Take a slow SQL query: when that happens, one of the first things to check is whether the slow query is using a database index.
The purpose of creating an index on a particular table in a database is to make it faster to search the table and find the required rows. An index can be created on one or more columns of a table, which provides the basis for both rapid random lookups and efficient access to ordered records.
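As a rough illustration of checking whether a query uses an index, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The `books` table, its rows, and the index name `idx_books_title` are made up for this example; the exact wording of the plan text differs between SQLite versions, so treat it as a sketch rather than a reference.

```python
import sqlite3

# In-memory database with a hypothetical "books" table (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (title TEXT, author TEXT, subject TEXT, published TEXT)"
)
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?, ?)",
    [
        ("Dune", "Herbert", "Fiction", "1965"),
        ("Neuromancer", "Gibson", "Fiction", "1984"),
        ("Clean Code", "Martin", "Software", "2008"),
    ],
)


def query_plan(sql, params=()):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail text.
    return " | ".join(row[3] for row in rows)


lookup = "SELECT * FROM books WHERE title = ?"

# Without an index, the only option is to scan the whole table.
plan_before = query_plan(lookup, ("Dune",))

# A single-column index on title lets the same query search the index instead.
conn.execute("CREATE INDEX idx_books_title ON books (title)")
plan_after = query_plan(lookup, ("Dune",))

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index exists the plan reports a table scan; afterwards it names `idx_books_title`. Most relational databases offer a comparable `EXPLAIN` facility for diagnosing slow queries.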
Example: library catalog
A library catalog is a register that lists the books found in a library. The catalog is organized like a database table, generally with four columns: title, author, subject, and publication date. There are usually two such catalogs: one sorted by title and one sorted by author name. That way, you can either think of a writer you want to read and look through their books, or look up a specific title you know you want to read when you don't know the author's name. These catalogs are like indexes for the database of books. They provide a sorted list of data that is easy to search by the relevant information.
In short, an index is a data structure that can be thought of as a table of contents pointing us to where the actual data lives. So when we create an index on a column of a table, we store that column, together with a pointer to the whole row, in the index. Suppose we have a table containing a list of books: an index on the "Title" column would hold the titles in sorted order, each with a pointer to its full row.
Just as with a traditional relational data store, we can apply this concept to larger datasets. The trick with indexes is that we must carefully consider how users will access the data. For datasets that are many terabytes in size but have very small payloads (for example, 1 KB), indexes are a necessity for optimizing data access. Finding a small payload in such a large dataset can be a real challenge, because we cannot iterate over that much data in any reasonable amount of time. In addition, a dataset this large is very likely spread over several physical devices, which means we need some way to find the correct physical location of the data we want. Indexes are the best way to do this.
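To make the "column plus pointer" idea concrete, here is a minimal sketch in Python. The book rows are invented for illustration, and a sorted list searched with `bisect` stands in for the tree structures that real databases typically use; the list position of a row plays the role of the pointer.

```python
import bisect

# Hypothetical book table: a row's position in the list stands in for
# its physical location on disk.
books = [
    ("The Pragmatic Programmer", "Hunt", "Software", 1999),
    ("Clean Code", "Martin", "Software", 2008),
    ("Dune", "Herbert", "Fiction", 1965),
    ("Neuromancer", "Gibson", "Fiction", 1984),
]

# The index keeps only the indexed column and a pointer (the row position),
# sorted by title so that it can be binary-searched.
title_index = sorted((row[0], pos) for pos, row in enumerate(books))


def find_by_title(title):
    """Binary-search the index, then follow the pointer to the full row."""
    i = bisect.bisect_left(title_index, (title, -1))
    if i < len(title_index) and title_index[i][0] == title:
        return books[title_index[i][1]]
    return None


print(find_by_title("Dune"))
print(find_by_title("Unknown Title"))
```

The table itself stays in insertion order; only the small index is sorted. A second index sorted by author could be built the same way, just as the library keeps two catalogs.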
Why do indexes degrade write performance?
An index can dramatically speed up data retrieval, but because of the additional keys it stores, the index itself may be large, which slows down data insertion and updates.
When adding rows to a table with an active index, or updating existing rows, we have to write not only the data but also update the index. This reduces write performance, and the degradation applies to all insert, update, and delete operations on the table. For this reason, avoid adding unnecessary indexes to tables, and remove indexes that are no longer used. To reiterate, the point of adding indexes is to improve the performance of search queries. If the goal of the database is to provide a data store that is written to often and rarely read, then slowing down the more common operation (writing) is probably not worth the gain in read performance. For more detail, see the Wikipedia article on database indexes: https://en.wikipedia.org/wiki/Database_index
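The extra work per write can be sketched with a toy table in Python. `IndexedTable` and its `index_writes` counter are hypothetical names invented for this example, and real databases maintain their indexes quite differently; the only point is that every insert touches every index.

```python
import bisect


class IndexedTable:
    """Toy table that keeps one sorted index per indexed column."""

    def __init__(self, indexed_columns):
        self.rows = []
        self.indexes = {col: [] for col in indexed_columns}
        self.index_writes = 0  # how many index entries we had to write

    def insert(self, row):
        # 1. Write the data itself.
        pos = len(self.rows)
        self.rows.append(row)
        # 2. Update every index: this is the extra cost paid on each write.
        for col, index in self.indexes.items():
            bisect.insort(index, (row[col], pos))
            self.index_writes += 1
        return pos


rows = [
    {"title": "Dune", "author": "Herbert", "subject": "Fiction"},
    {"title": "Clean Code", "author": "Martin", "subject": "Software"},
]

no_index = IndexedTable([])
three_indexes = IndexedTable(["title", "author", "subject"])
for row in rows:
    no_index.insert(row)
    three_indexes.insert(row)

print(no_index.index_writes, three_indexes.index_writes)
```

With three indexes, each row costs three additional index writes on top of the data write, which is why unused indexes are worth dropping on write-heavy tables.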
Off topic: the author's supplement
The Google system design guide lays out the advantages and disadvantages of indexing. Thinking about it more deeply, an index is a solution to the read problem, while data storage is a solution to the write problem. When we design systems and middleware, we find that many designs separate reads from writes: for example, writes to disk are sequential, while reads from disk are random. So using an index is a trade-off, and the question to ask is whether the index actually helps us. If there is only one data record, no index is needed. If the dataset is very large, building many redundant indexes undoubtedly makes writes harder.
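One way to picture this split between sequential writes and random reads is an append-only file paired with an in-memory index of byte offsets. The sketch below is an assumption-laden toy, not a description of any particular middleware: `AppendOnlyStore` and `new_store` are made-up names, and keys and values are assumed to contain no tabs or newlines.

```python
import os
import tempfile


class AppendOnlyStore:
    """Writes go sequentially to the end of a file; reads seek via an index."""

    def __init__(self, path):
        self.path = path
        self.offsets = {}  # index: key -> byte offset of the latest record
        open(path, "ab").close()  # make sure the file exists

    def put(self, key, value):
        record = f"{key}\t{value}\n".encode("utf-8")
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # position where this record starts
            f.write(record)  # sequential write: always at the end of the file
        self.offsets[key] = offset  # the index must be updated on every write

    def get(self, key):
        offset = self.offsets.get(key)
        if offset is None:
            return None
        with open(self.path, "rb") as f:
            f.seek(offset)  # random read: jump straight to the record
            line = f.readline().decode("utf-8").rstrip("\n")
        return line.split("\t", 1)[1]


def new_store():
    """Create a store backed by a file in a fresh temporary directory."""
    return AppendOnlyStore(os.path.join(tempfile.mkdtemp(), "data.log"))


store = new_store()
store.put("a", "1")
store.put("b", "2")
store.put("a", "3")  # a newer record for the same key is simply appended
print(store.get("a"), store.get("b"), store.get("missing"))
```

Writes never seek backwards, so they stay cheap, while the index is what makes a read possible without scanning the whole file. With only a handful of records, scanning would be just as good, which mirrors the trade-off described above.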
Reference material
grok_system_design_interview.pdf