System design: index
2022-06-24 05:03:00 【Xiaochengxin post station】
When someone mentions indexing, databases are probably the first thing that comes to mind. So what problem does an index solve? Take a slow SQL query: when one shows up, one of the first things to check is whether the query is actually using a database index.
The purpose of creating an index on a table is to make it faster to search the table and find the required rows. An index can be built on one or more columns of a table, and it supports both fast random lookups and efficient access to records in sorted order.
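As a rough illustration of the effect, the sketch below uses Python's built-in sqlite3 module with a made-up books table. The table, column, and index names are assumptions for the example, not something taken from the text above. It asks SQLite for its query plan before and after creating an index on the title column.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# Hypothetical table with illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, author TEXT)"
)
conn.executemany(
    "INSERT INTO books (title, author) VALUES (?, ?)",
    [(f"Title {i}", f"Author {i % 100}") for i in range(1000)],
)

sql = "SELECT * FROM books WHERE title = ?"

# Without an index, SQLite has to scan every row of the table.
plan_before = query_plan(conn, sql, ("Title 42",))

# Create an index on the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_books_title ON books (title)")

# With the index, SQLite can search it instead of scanning the table.
plan_after = query_plan(conn, sql, ("Title 42",))

print(plan_before)
print(plan_after)
```

Before the index exists, the plan reports a scan of the whole table; afterwards it reports a search that uses idx_books_title. The exact wording of the plan differs between SQLite versions.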
Example : Library catalog
A library catalog is a register listing the books held in a library. The catalog is organized like a database table, usually with four columns: title, author, subject, and publication date. There are usually two such catalogs: one sorted by title, the other sorted by author name. That way you can think of an author you want to read and browse their books, or look up a specific title you want to read even if you don't know who wrote it. These catalogs act like indexes on a database of books: they provide a sorted list of data that is easy to search for the relevant information.
In short, an index is a data structure that can be thought of as a table of contents pointing us to where the actual data lives. When we create an index on a column of a table, the index stores that column's values together with a pointer to the full row. For a table containing a list of books, an index on the "Title" column is a list of titles in sorted order, each paired with a pointer to its row.
The same concept applies beyond traditional relational data stores, to much larger data sets. The trick with indexes is that we must think carefully about how users will access the data. For a data set that is many terabytes in size but whose payloads are very small (for example, 1 KB), indexes are a necessity for optimizing data access. Finding one small payload in such a large data set is a real challenge, because we cannot iterate over that much data in any reasonable time. In addition, a data set that large is very likely spread across several physical devices, so we need some way to find the correct physical location of the data we want. Indexes are the best way to do this.
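The structure described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database implements its indexes (real systems typically use B-trees or similar structures). The book rows and the helper names build_index and lookup are made up for the example.

```python
import bisect

# The "table": rows kept in insertion order. A row's position in the list
# plays the role of the pointer to the full row. Illustrative data only.
books = [
    ("Dune", "Herbert", "Fiction", 1965),
    ("Beloved", "Morrison", "Fiction", 1987),
    ("Cosmos", "Sagan", "Science", 1980),
    ("Emma", "Austen", "Fiction", 1815),
]

TITLE = 0  # column number of the title


def build_index(rows, column):
    """Return a sorted list of (column value, row position) pairs."""
    return sorted((row[column], pos) for pos, row in enumerate(rows))


def lookup(index, rows, key):
    """Binary-search the index, then follow the pointers to the rows."""
    # (key, -1) sorts before every real entry with the same key,
    # so bisect_left lands on the first matching entry.
    i = bisect.bisect_left(index, (key, -1))
    matches = []
    while i < len(index) and index[i][0] == key:
        matches.append(rows[index[i][1]])
        i += 1
    return matches


title_index = build_index(books, TITLE)
print(title_index)
print(lookup(title_index, books, "Cosmos"))
```

The table itself stays in insertion order; only the index is sorted, which is what allows a binary search instead of a scan over every row.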
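A toy version of this idea: small payloads are appended to one of several storage buffers, and an index records where each one landed. The class name ShardedStore, the round-robin placement, and the in-memory bytearrays standing in for physical devices are all assumptions made for this sketch.

```python
class ShardedStore:
    """Small payloads spread over several 'devices', located via an index."""

    def __init__(self, device_count):
        # Each bytearray stands in for one physical device.
        self.devices = [bytearray() for _ in range(device_count)]
        # key -> (device number, byte offset, payload length)
        self.index = {}

    def put(self, key, payload):
        # Placement policy is arbitrary here: simple round-robin.
        device_no = len(self.index) % len(self.devices)
        device = self.devices[device_no]
        offset = len(device)
        device.extend(payload)  # append-only write
        self.index[key] = (device_no, offset, len(payload))

    def get(self, key):
        # One index lookup tells us the device and the exact byte range,
        # so no device has to be scanned.
        device_no, offset, length = self.index[key]
        return bytes(self.devices[device_no][offset:offset + length])


store = ShardedStore(device_count=3)
for n in range(10):
    store.put(f"key-{n}", f"payload number {n}".encode())

print(store.index["key-4"])
print(store.get("key-4"))
```

Without the index, finding key-4 would mean reading through every device; with it, a single lookup yields the device and byte range to read.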
Why do indexes degrade write performance?
An index can speed up data retrieval dramatically, but because of the extra keys it stores, the index itself can become large, and maintaining it slows down inserts and updates.
When we add rows to a table with an active index, or update existing rows, we have to write not only the data but also update the index. This reduces write performance, and the penalty applies to every insert, update, and delete on the table. So avoid adding unnecessary indexes to tables, and drop indexes that are no longer used. To reiterate, the point of adding an index is to improve the performance of search queries. If the database is meant to serve as a data store that is written to often and read rarely, then slowing down the more common operation (writes) is probably not worth the read performance gained. For more, see the Wikipedia article on database indexes: https://en.wikipedia.org/wiki/Database_index
Off topic: the author's supplement
The Google system design guide spells out the advantages and disadvantages of indexes described above. Thinking about it more deeply, an index is a solution to the read problem, while data storage is a solution to the write problem. When we design systems and middleware, we find that many designs separate reads from writes; for example, disk writes are sequential while disk reads are random. So using an index is a trade-off: does the index actually help us? If the table holds only a single record, no index is needed. If the data set is very large and many redundant indexes have been built, writes undoubtedly become harder.
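The extra work per write can be made visible with a small counting sketch. The Table class below is hypothetical and keeps each index as a sorted list; real databases use different structures, but the shape of the cost is the same: one row write plus one index update per index.

```python
import bisect


class Table:
    """A toy table that keeps one sorted index per indexed column."""

    def __init__(self, indexed_columns):
        self.rows = []
        self.indexes = {col: [] for col in indexed_columns}
        self.index_writes = 0  # counts index updates caused by inserts

    def insert(self, row):
        # Step 1: write the row itself.
        pos = len(self.rows)
        self.rows.append(row)
        # Step 2: every index must also be updated to stay sorted.
        for col, index in self.indexes.items():
            bisect.insort(index, (row[col], pos))
            self.index_writes += 1
        return pos


# Illustrative rows only.
rows = [{"title": f"Title {i}", "author": f"Author {i % 3}", "year": 2000 + i}
        for i in range(10)]

unindexed = Table(indexed_columns=[])
indexed = Table(indexed_columns=["title", "author", "year"])
for row in rows:
    unindexed.insert(row)
    indexed.insert(row)

print(unindexed.index_writes, indexed.index_writes)
```

With three indexes, ten inserts trigger thirty index updates on top of the ten row writes, while the unindexed table performs none. This is why each additional index should earn its place through the reads it speeds up.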
Reference material
grok_system_design_interview.pdf