MongoDB, the distributed document store database: index management

2020-11-10 00:10:00 Linux-1874

   The previous post introduced MongoDB, its installation, and CRUD operations on collections; see https://www.cnblogs.com/qiuhom-1874/p/13941797.html. Today, let's talk about indexes in MongoDB.

  1. Why build indexes? What do they do?

   MongoDB is usually used behind web sites, in scenarios with a lot of data. With that much data, whether MongoDB can answer a query quickly becomes especially important, and that is the point of an index: it helps us quickly find the data we want in a large data set. When a document is inserted, MongoDB automatically adds an _id field to it; MongoDB maintains this field internally and we usually do not need to care about it. In a relational database we can build an index on a single field or on several fields; we index several fields because query conditions often involve more than one field. So the rule for building indexes is to follow the query conditions. For example, to find the users who are older than 30, we build the index on the age field; an index on any other field does nothing for that condition. In practice this means first understanding which queries users run most often, then indexing the fields those queries filter on, which effectively speeds them up. The same holds for MongoDB: indexes exist to make queries faster.

  2. Why does an index make lookups fast?

   An index is built on the fields we specify: their values are extracted, sorted in advance (or arranged by some other rule), and stored in a separate structure next to the collection. When a query arrives, MongoDB checks whether its conditions match an index. If they do, the index tells it where the matching documents are, so the data is found quickly. If no index matches the query, MongoDB has to traverse the collection to find the data, which is slow. Indexes therefore pay off on large data sets. On a small collection a full traversal is already fast, so an index does little for lookups and still has to be maintained. And even on a large data set, an index on a field that queries never hit serves no purpose.
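   The idea can be sketched in plain JavaScript. This is a toy model, not MongoDB's real storage engine: the collection is an array, the index is a sorted list of [value, position] pairs, and the names docs, nameIndex, findByScan, and findByIndex are made up for the illustration.

```javascript
// Toy model only: an unordered "collection" plus a sorted "index" on name.
const docs = [];
for (let i = 1; i <= 100000; i++) {
  docs.push({ name: "people" + i, age: i % 120 });
}

// Build the index: extract the field, remember each position, sort by value.
const nameIndex = docs
  .map((doc, pos) => [doc.name, pos])
  .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));

// Without an index: compare documents one by one until a match is found.
function findByScan(name) {
  let steps = 0;
  for (const doc of docs) {
    steps++;
    if (doc.name === name) return { doc, steps };
  }
  return { doc: null, steps };
}

// With the index: binary search the sorted keys, then fetch by position.
function findByIndex(name) {
  let lo = 0;
  let hi = nameIndex.length - 1;
  let steps = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    steps++;
    const key = nameIndex[mid][0];
    if (key === name) return { doc: docs[nameIndex[mid][1]], steps };
    if (key < name) lo = mid + 1;
    else hi = mid - 1;
  }
  return { doc: null, steps };
}

const scan = findByScan("people90000");
const viaIndex = findByIndex("people90000");
console.log("scan steps:", scan.steps, "index steps:", viaIndex.steps);
```

   The scan needs tens of thousands of comparisons for this document, while the sorted structure needs at most 17, because each step halves the remaining range.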

  3. Indexes cost some write performance

   Once a field is indexed, every write usually costs an extra write I/O. Without an index, a write request needs one I/O for the document; with an index, the index also has to be written each time a document is written. This affects write performance to some degree. But indexes are normally built for read-heavy, write-light workloads, and when there are not many write requests, one extra write I/O is acceptable compared with the pressure of read requests. In addition, some databases support delayed index writes: when a user inserts data, the index is not updated immediately but a little later, which reduces the impact of index writes on write requests.

   The figure above mainly describes the relationship between an index and the documents. An index holds the values of the fields we specify, arranged in a particular order. When a user queries some data, the location of the matching documents can be read directly from the index, without traversing the documents one by one. That is why an index makes lookups fast.
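   A rough sketch of that extra cost, again as a toy model: ToyCollection and its writes counter are hypothetical names used only here, and a real database counts I/O very differently. Each insert touches the document store once and every index once.

```javascript
// Toy collection that counts how many structures each insert has to update.
class ToyCollection {
  constructor() {
    this.docs = [];
    this.indexes = new Map(); // field name -> sorted array of [value, position]
    this.writes = 0;
  }

  createIndex(field) {
    const entries = this.docs.map((doc, pos) => [doc[field], pos]);
    entries.sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
    this.indexes.set(field, entries);
  }

  insert(doc) {
    const pos = this.docs.push(doc) - 1;
    this.writes++; // one write for the document itself
    for (const [field, entries] of this.indexes) {
      // Binary search for the slot that keeps the entries sorted.
      let lo = 0;
      let hi = entries.length;
      while (lo < hi) {
        const mid = (lo + hi) >> 1;
        if (entries[mid][0] < doc[field]) lo = mid + 1;
        else hi = mid;
      }
      entries.splice(lo, 0, [doc[field], pos]);
      this.writes++; // one extra write per index
    }
    return pos;
  }
}

const plain = new ToyCollection();
plain.insert({ name: "people1", age: 1 });

const indexed = new ToyCollection();
indexed.createIndex("name");
indexed.createIndex("age");
indexed.insert({ name: "people1", age: 1 });

console.log(plain.writes, indexed.writes); // 1 3
```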

  4. Index types

   Indexes come in types. Different types organize their entries differently and suit different queries. Common index types include the B+ tree (balanced tree) index, hash index, spatial index, and full-text index. MongoDB also has several index types, but they are named from a different angle: B+ tree and hash describe the internal structure of an index, while MongoDB mostly describes an index by where it is built. MongoDB has single-field indexes, compound indexes, multikey indexes, geospatial indexes, text indexes, and hashed indexes.

   A single-field index is built on one field. A compound index is built on several fields. A multikey index is built on a field whose value is an array: MongoDB adds one index entry per array element, and the elements may themselves be subdocuments. (Documents can be nested, so the value of a key can also be a subdocument; a field inside a subdocument is indexed with dot notation, and that alone is still a single-field index.) A geospatial index supports location-based queries, and it only takes effect when the query uses the corresponding location operators. A text index supports searching the text content of documents, and is usually also called a full-text index.

   A hashed index stores the hash of a field's value. A lookup works much like a key-value lookup: first find the hash bucket, then the matching hash value inside it, so an equality lookup takes roughly constant time, O(1) on average, however much data there is. The biggest difference from a B+ tree index is range queries: a tree index keeps its keys in order and can return a range, while a hash index can only find an exact value. Adjacent values do not hash to adjacent positions, so the hashes of 30 and 31 may land in different buckets; a query for users older than 30 is therefore a poor fit for a hash index.
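   The contrast between the two structures can be illustrated with a small JavaScript sketch. It is an analogy under simple assumptions: a Map stands in for the hash index and a sorted array for the tree index, and hashIndex, orderedIndex, and firstGreaterThan are names invented for this example.

```javascript
const people = [];
for (let i = 1; i <= 1000; i++) {
  people.push({ name: "people" + i, age: i % 120 });
}

// Hash-style index: age -> list of positions. Equality lookups are direct.
const hashIndex = new Map();
people.forEach((p, pos) => {
  if (!hashIndex.has(p.age)) hashIndex.set(p.age, []);
  hashIndex.get(p.age).push(pos);
});

// Ordered index: [age, position] pairs sorted by age.
const orderedIndex = people
  .map((p, pos) => [p.age, pos])
  .sort((a, b) => a[0] - b[0]);

// Binary search for the first entry whose key is strictly greater than value.
function firstGreaterThan(entries, value) {
  let lo = 0;
  let hi = entries.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (entries[mid][0] <= value) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Equality (age == 30): one lookup in the hash index.
const exactly30 = hashIndex.get(30).map((pos) => people[pos]);

// Range (age > 30): one search in the ordered index, then a contiguous slice.
const olderThan30 = orderedIndex
  .slice(firstGreaterThan(orderedIndex, 30))
  .map(([, pos]) => people[pos]);

// The hash index has no order to exploit, so a range must visit every bucket.
let bucketsVisited = 0;
const olderViaHash = [];
for (const [age, positions] of hashIndex) {
  bucketsVisited++;
  if (age > 30) positions.forEach((pos) => olderViaHash.push(people[pos]));
}

console.log(exactly30.length, olderThan30.length, bucketsVisited);
```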

  5. Creating indexes in a MongoDB database

   Prepare the data

> use testdb
switched to db testdb
> for (i=1;i<=1000000;i++) db.peoples.insert({name:"people"+i,age:(i%120),classes:(i%20)})
WriteResult({ "nInserted" : 1 })
> db.peoples.find().count()
1000000
> db.peoples.find()
{ "_id" : ObjectId("5fa943987a7deafb9e543326"), "name" : "people1", "age" : 1, "classes" : 1 }
{ "_id" : ObjectId("5fa943987a7deafb9e543327"), "name" : "people2", "age" : 2, "classes" : 2 }
{ "_id" : ObjectId("5fa943987a7deafb9e543328"), "name" : "people3", "age" : 3, "classes" : 3 }
{ "_id" : ObjectId("5fa943987a7deafb9e543329"), "name" : "people4", "age" : 4, "classes" : 4 }
{ "_id" : ObjectId("5fa943987a7deafb9e54332a"), "name" : "people5", "age" : 5, "classes" : 5 }
{ "_id" : ObjectId("5fa943987a7deafb9e54332b"), "name" : "people6", "age" : 6, "classes" : 6 }
{ "_id" : ObjectId("5fa943987a7deafb9e54332c"), "name" : "people7", "age" : 7, "classes" : 7 }
{ "_id" : ObjectId("5fa943987a7deafb9e54332d"), "name" : "people8", "age" : 8, "classes" : 8 }
{ "_id" : ObjectId("5fa943987a7deafb9e54332e"), "name" : "people9", "age" : 9, "classes" : 9 }
{ "_id" : ObjectId("5fa943987a7deafb9e54332f"), "name" : "people10", "age" : 10, "classes" : 10 }
{ "_id" : ObjectId("5fa943987a7deafb9e543330"), "name" : "people11", "age" : 11, "classes" : 11 }
{ "_id" : ObjectId("5fa943987a7deafb9e543331"), "name" : "people12", "age" : 12, "classes" : 12 }
{ "_id" : ObjectId("5fa943987a7deafb9e543332"), "name" : "people13", "age" : 13, "classes" : 13 }
{ "_id" : ObjectId("5fa943987a7deafb9e543333"), "name" : "people14", "age" : 14, "classes" : 14 }
{ "_id" : ObjectId("5fa943987a7deafb9e543334"), "name" : "people15", "age" : 15, "classes" : 15 }
{ "_id" : ObjectId("5fa943987a7deafb9e543335"), "name" : "people16", "age" : 16, "classes" : 16 }
{ "_id" : ObjectId("5fa943987a7deafb9e543336"), "name" : "people17", "age" : 17, "classes" : 17 }
{ "_id" : ObjectId("5fa943987a7deafb9e543337"), "name" : "people18", "age" : 18, "classes" : 18 }
{ "_id" : ObjectId("5fa943987a7deafb9e543338"), "name" : "people19", "age" : 19, "classes" : 19 }
{ "_id" : ObjectId("5fa943987a7deafb9e543339"), "name" : "people20", "age" : 20, "classes" : 0 }
Type "it" for more
> it
{ "_id" : ObjectId("5fa943987a7deafb9e54333a"), "name" : "people21", "age" : 21, "classes" : 1 }
{ "_id" : ObjectId("5fa943987a7deafb9e54333b"), "name" : "people22", "age" : 22, "classes" : 2 }
{ "_id" : ObjectId("5fa943987a7deafb9e54333c"), "name" : "people23", "age" : 23, "classes" : 3 }
{ "_id" : ObjectId("5fa943987a7deafb9e54333d"), "name" : "people24", "age" : 24, "classes" : 4 }
{ "_id" : ObjectId("5fa943987a7deafb9e54333e"), "name" : "people25", "age" : 25, "classes" : 5 }
{ "_id" : ObjectId("5fa943987a7deafb9e54333f"), "name" : "people26", "age" : 26, "classes" : 6 }
{ "_id" : ObjectId("5fa943987a7deafb9e543340"), "name" : "people27", "age" : 27, "classes" : 7 }
{ "_id" : ObjectId("5fa943987a7deafb9e543341"), "name" : "people28", "age" : 28, "classes" : 8 }
{ "_id" : ObjectId("5fa943987a7deafb9e543342"), "name" : "people29", "age" : 29, "classes" : 9 }
{ "_id" : ObjectId("5fa943987a7deafb9e543343"), "name" : "people30", "age" : 30, "classes" : 10 }
{ "_id" : ObjectId("5fa943987a7deafb9e543344"), "name" : "people31", "age" : 31, "classes" : 11 }
{ "_id" : ObjectId("5fa943987a7deafb9e543345"), "name" : "people32", "age" : 32, "classes" : 12 }
{ "_id" : ObjectId("5fa943987a7deafb9e543346"), "name" : "people33", "age" : 33, "classes" : 13 }
{ "_id" : ObjectId("5fa943987a7deafb9e543347"), "name" : "people34", "age" : 34, "classes" : 14 }
{ "_id" : ObjectId("5fa943987a7deafb9e543348"), "name" : "people35", "age" : 35, "classes" : 15 }
{ "_id" : ObjectId("5fa943987a7deafb9e543349"), "name" : "people36", "age" : 36, "classes" : 16 }
{ "_id" : ObjectId("5fa943987a7deafb9e54334a"), "name" : "people37", "age" : 37, "classes" : 17 }
{ "_id" : ObjectId("5fa943987a7deafb9e54334b"), "name" : "people38", "age" : 38, "classes" : 18 }
{ "_id" : ObjectId("5fa943987a7deafb9e54334c"), "name" : "people39", "age" : 39, "classes" : 19 }
{ "_id" : ObjectId("5fa943987a7deafb9e54334d"), "name" : "people40", "age" : 40, "classes" : 0 }
Type "it" for more
> 

   Tip: test data can be generated with a loop, written the same way as a for loop in C. When a query returns many documents, the mongo shell does not print them all at once; it pages the output, showing 20 documents per page by default. Type it to show the next page.

   To create an index in MongoDB, the syntax is db.mycoll.ensureIndex(keypattern[,options]) or db.mycoll.createIndex(keypattern[,options]); ensureIndex is a deprecated alias of createIndex.

   Create an index on the name field

> db.peoples.ensureIndex({name:1})
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 1,
        "numIndexesAfter" : 2,
        "ok" : 1
}
> 

   Tip: name here is the field name, not the index name. The 1 after it means ascending order; -1 means descending order.

   View the indexes

> db.peoples.getIndices()
[
        {
                "v" : 2,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_"
        },
        {
                "v" : 2,
                "key" : {
                        "name" : 1
                },
                "name" : "name_1"
        }
]
> 

   Tip: the result shows two indexes on the peoples collection. One is named _id_, on the _id field in ascending order; the other is named name_1, on the name field in ascending order. When no index name is given, the default name is the field name, an underscore, and the number that marks ascending or descending order, for example name_1 or age_-1.

   Delete indexes
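   The default naming rule can be written out as a small JavaScript function. defaultIndexName is a hypothetical helper for illustration, not a MongoDB API; it only reproduces the names that appear in the shell output in this post, and it does not cover the built-in _id_ index, whose name is fixed.

```javascript
// Join each field with its direction, then join the pairs with underscores.
function defaultIndexName(keyPattern) {
  return Object.entries(keyPattern)
    .map(([field, direction]) => `${field}_${direction}`)
    .join("_");
}

console.log(defaultIndexName({ name: 1 }));         // name_1
console.log(defaultIndexName({ age: -1 }));         // age_-1
console.log(defaultIndexName({ name: 1, age: 1 })); // name_1_age_1
```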

> db.peoples.dropIndex("name_1")
{ "nIndexesWas" : 3, "ok" : 1 }
> db.peoples.dropIndex("age_-1")
{ "nIndexesWas" : 2, "ok" : 1 }
> db.peoples.getIndices()
[ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_" } ]
> 

   Tip: dropIndex takes the index name, which must be quoted as a string; it also accepts the key pattern document that was used to create the index.

   Build a unique index on the name field

   Tip: to create a unique index, just add the unique:true option. Unique means that every value in the indexed field must be different. If a document being inserted repeats a value that already exists in that field, the insert fails.

   Verification: insert a document whose name field is people23 and see whether the insert succeeds.

   Tip: once the unique index exists on the name field, inserting a document with a name value that is already present is rejected with a duplicate-key error. This shows that the unique index we created is in effect.
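   What the unique option enforces can be sketched in JavaScript. This is a toy model: UniqueIndex and its error text are invented for illustration and are not MongoDB's implementation or its actual error message. In the shell, the index itself would presumably be created with createIndex({name:1},{unique:true}), as the tip above describes.

```javascript
// Toy unique index: remembers every value seen for one field.
class UniqueIndex {
  constructor(field) {
    this.field = field;
    this.seen = new Set();
  }

  // Called for each insert; rejects a value that is already indexed.
  insert(doc) {
    const value = doc[this.field];
    if (this.seen.has(value)) {
      throw new Error(`duplicate key: ${this.field} = ${value}`);
    }
    this.seen.add(value);
  }
}

const uniqueName = new UniqueIndex("name");
for (let i = 1; i <= 100; i++) {
  uniqueName.insert({ name: "people" + i, age: i % 120 });
}

let rejected = false;
try {
  uniqueName.insert({ name: "people23", age: 50 }); // name already exists
} catch (err) {
  rejected = true;
  console.log(err.message);
}
console.log("rejected:", rejected);
```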

   Rebuild index

> db.peoples.reIndex()
{
        "nIndexesWas" : 2,
        "nIndexes" : 2,
        "indexes" : [
                {
                        "v" : 2,
                        "key" : {
                                "_id" : 1
                        },
                        "name" : "_id_"
                },
                {
                        "v" : 2,
                        "unique" : true,
                        "key" : {
                                "name" : 1
                        },
                        "name" : "name_1"
                }
        ],
        "ok" : 1
}
> 

   Tip: to change an index, drop it and create it again. reIndex above only rebuilds the existing indexes; it cannot modify their attributes.

   Build an index in the background, so that the current shell is not blocked (since MongoDB 4.2 the background option is deprecated and ignored, so on the 4.4.1 server used here it has no effect)

> db.peoples.createIndex({age:-1},{background:true})
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 2,
        "numIndexesAfter" : 3,
        "ok" : 1
}
> db.peoples.getIndices()
[
        {
                "v" : 2,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_"
        },
        {
                "v" : 2,
                "unique" : true,
                "key" : {
                        "name" : 1
                },
                "name" : "name_1"
        },
        {
                "v" : 2,
                "key" : {
                        "age" : -1
                },
                "name" : "age_-1",
                "background" : true
        }
]
> 

   Delete all manually built indexes

> db.peoples.dropIndexes()
{
        "nIndexesWas" : 3,
        "msg" : "non-_id indexes dropped for collection",
        "ok" : 1
}
> db.peoples.getIndices()
[ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_" } ]
> 

   Create a compound index

> db.peoples.createIndex({name:1,age:1},{background:true})
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 1,
        "numIndexesAfter" : 2,
        "ok" : 1
}
> db.peoples.getIndices()
[
        {
                "v" : 2,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_"
        },
        {
                "v" : 2,
                "key" : {
                        "name" : 1,
                        "age" : 1
                },
                "name" : "name_1_age_1",
                "background" : true
        }
]
> 

   Query with a condition on the name field and look at how MongoDB plans the query

> db.peoples.find({name:"people1221"}).explain()
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "testdb.peoples",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "name" : {
                                "$eq" : "people1221"
                        }
                },
                "queryHash" : "01AEE5EC",
                "planCacheKey" : "4C5AEA2C",
                "winningPlan" : {
                        "stage" : "FETCH",
                        "inputStage" : {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                        "name" : 1,
                                        "age" : 1
                                },
                                "indexName" : "name_1_age_1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                        "name" : [ ],
                                        "age" : [ ]
                                },
                                "isUnique" : false,
                                "isSparse" : false,
                                "isPartial" : false,
                                "indexVersion" : 2,
                                "direction" : "forward",
                                "indexBounds" : {
                                        "name" : [
                                                "[\"people1221\", \"people1221\"]"
                                        ],
                                        "age" : [
                                                "[MinKey, MaxKey]"
                                        ]
                                }
                        }
                },
                "rejectedPlans" : [ ]
        },
        "serverInfo" : {
                "host" : "node01.test.org",
                "port" : 27017,
                "version" : "4.4.1",
                "gitVersion" : "ad91a93a5a31e175f5cbf8c69561e788bbc55ce1"
        },
        "ok" : 1
}
>

   Tip: the result above shows that this query is planned as an IXSCAN (index scan) followed by a FETCH, which is why it is fast. The plan also shows details of the index that is used.

   Query with conditions on both name and age and see whether the index is hit

> db.peoples.find({$and:[{age:{$lt:80}},{name:{$gt:"people200"}}]}).explain()
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "testdb.peoples",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "$and" : [
                                {
                                        "age" : {
                                                "$lt" : 80
                                        }
                                },
                                {
                                        "name" : {
                                                "$gt" : "people200"
                                        }
                                }
                        ]
                },
                "queryHash" : "96038BC4",
                "planCacheKey" : "E71214BA",
                "winningPlan" : {
                        "stage" : "FETCH",
                        "inputStage" : {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                        "name" : 1,
                                        "age" : 1
                                },
                                "indexName" : "name_1_age_1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                        "name" : [ ],
                                        "age" : [ ]
                                },
                                "isUnique" : false,
                                "isSparse" : false,
                                "isPartial" : false,
                                "indexVersion" : 2,
                                "direction" : "forward",
                                "indexBounds" : {
                                        "name" : [
                                                "(\"people200\", {})"
                                        ],
                                        "age" : [
                                                "[-inf.0, 80.0)"
                                        ]
                                }
                        }
                },
                "rejectedPlans" : [ ]
        },
        "serverInfo" : {
                "host" : "node01.test.org",
                "port" : 27017,
                "version" : "4.4.1",
                "gitVersion" : "ad91a93a5a31e175f5cbf8c69561e788bbc55ce1"
        },
        "ok" : 1
}
>

   Tip: as the plan shows, a range query that combines conditions on both fields also uses an index scan.
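   Both queries above constrain name, the first field of the compound index. A simplified JavaScript sketch of that prefix rule follows; usablePrefix is a hypothetical helper, and the real query planner weighs far more than this, so treat it as a rule of thumb rather than a description of MongoDB's planner.

```javascript
// Return the leading index fields that the query constrains, in index order.
// An empty result means the query does not start at the index's first field.
function usablePrefix(keyPattern, queryFields) {
  const prefix = [];
  for (const field of Object.keys(keyPattern)) {
    if (!queryFields.includes(field)) break;
    prefix.push(field);
  }
  return prefix;
}

const nameAgeIndex = { name: 1, age: 1 };

console.log(usablePrefix(nameAgeIndex, ["name"]));        // [ 'name' ]
console.log(usablePrefix(nameAgeIndex, ["age", "name"])); // [ 'name', 'age' ]
console.log(usablePrefix(nameAgeIndex, ["age"]));         // []
```

   Under this rule a query on age alone has no usable prefix of {name:1, age:1}, so it would generally need its own index on age.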

Copyright notice
This article was written by [Linux-1874]. Please include a link to the original when reposting. Thanks.