
MySQL advanced - personal notes

2022-06-21 10:46:00 Code knower

  Learning video: Shang Silicon Valley — MySQL Database Advanced, MySQL Optimization, Database Optimization (Bilibili)

This document is a copy of Shang Silicon Valley's review materials, kept purely for my own convenience when reviewing.


Catalog

MySQL Logical Architecture Overview

1. Overall architecture

1.1 Connection layer

1.2 Service layer

1.3 Engine layer

1.4 Storage layer

2. show profile

2.1 Enabling profiling

2.2 Using profile

2.3 General query flow

2.4 SQL execution order

2.5 MyISAM and InnoDB

Chapter 4: SQL Warm-up

1. Common join query diagrams

2. Join examples

2.1 Table creation statements

2.2 Case studies

Chapter 5: Index Optimization Analysis

1. The concept of an index

1.1 What is an index?

1.2 Advantages and disadvantages

2. MySQL's indexes

2.1 BTree indexes

2.2 B+Tree indexes

2.3 Clustered and non-clustered indexes

2.4 Time complexity (extension)

3. MySQL index classification

3.1 Single-column indexes

3.2 Unique indexes

3.3 Primary keys

3.4 Composite indexes

3.5 Basic syntax

4. When to create indexes

4.1 Cases suited to index creation

4.2 Cases not suited to index creation

5. Performance analysis

5.1 MySQL Query Optimizer

5.2 Common MySQL bottlenecks

Chapter 6: Explain Performance Analysis

1. Concept

2. Explain preparation

3. id

4. select_type

4.1 SIMPLE

4.2 PRIMARY

4.3 DERIVED

4.4 SUBQUERY

4.5 DEPENDENT SUBQUERY

4.6 UNCACHEABLE SUBQUERY

4.7 UNION

4.8 UNION RESULT

5. table

6. type

6.1 system

6.2 const

6.3 eq_ref

6.4 ref

6.5 range

6.6 index

6.7 all

6.8 index_merge

6.9 ref_or_null

6.10 index_subquery

6.11 unique_subquery

7. possible_keys

8. key

9. key_len

10. ref

11. rows

12. Extra

12.1 Using filesort

12.2 Using temporary

12.3 Using index

12.4 Using where

12.5 Using join buffer

12.6 impossible where

12.7 select tables optimized away

Chapter 7: Bulk Data Scripts

1. Inserting data

1.1 Table creation statement

1.2 Setting parameters

1.3 Writing random functions

1.4 Creating the stored procedure

1.5 Calling the stored procedure

1.6 Batch-deleting all indexes on a table

Chapter 8: Common Index Failures on Single-Table Indexes

1. Full-value matching is my favorite

1.1 Given the following SQL statements

1.2 Indexes

2. The leftmost-prefix rule

3. Do no computation on indexed columns

3.1 A function is applied to the query column

3.2 A conversion is applied to the query column

4. No range queries on indexed columns

5. Use covering indexes where possible

6. Using not-equal (!= or <>)

7. Field is not null and is null

8. LIKE with leading/trailing wildcards

9. Reduce the use of or

10. Exercises

11. Mnemonic formula

Chapter 9: Join Query Optimization

1. Table creation statements

2. Case studies

2.1 left join

2.2 inner join

2.3 Analysis of four join query cases

Chapter 10: Subquery Optimization

1. Case study

Chapter 11: Sorting and Grouping Optimization

1. No filter, no index use

2. Wrong column order forces a sort

3. Reversed direction forces a sort

4. Index selection

5. using filesort

5.1 MySQL sort algorithms

5.2 How to optimize

6. Using covering indexes

7. group by

Chapter 12: Exercises

1. Case 1

2. Case 2

3. Case 3

4. Case 4

5. Case 5

6. Case 6

7. Case 7


MySQL Logical Architecture Overview

1. Overall architecture

Compared with other databases, MySQL's architecture is a little different: it can be applied to many different scenarios and perform well in them. This is mainly reflected in its storage-engine architecture. The pluggable storage-engine design separates query processing and other system tasks from data storage and extraction, so that a suitable storage engine can be chosen according to the needs of the business.
Introduction to each layer:

1.1 Connection layer

At the top are the clients and connection services, including local socket communication and most of the client/server tools implemented over TCP/IP. This layer mainly handles connection processing, authorization and authentication, and the related security schemes. The concept of a thread pool is introduced here to provide threads for clients that pass the security authentication. SSL-based secure connections can also be implemented at this layer, and the server verifies the operation permissions of every client that connects securely.

1.2 Service layer

Management Services & Utilities: system management and control tools.

SQL Interface: accepts the user's SQL commands and returns the query results the user asks for. A statement such as SELECT ... FROM passes through the SQL Interface.

Parser: SQL commands handed to the parser are validated and parsed into a parse tree.

Optimizer: the query optimizer optimizes a SQL statement before it is executed; for example, given a WHERE condition, the optimizer decides whether to project or to filter first.

Cache and Buffer: the query cache. If the query cache contains a hit for the query, the result can be fetched directly from it. The caching mechanism consists of a series of small caches: the table cache, the record cache, the key cache, the privilege cache, and so on.

1.3 Engine layer

The storage engine layer is what actually handles the storage and retrieval of data in MySQL; the server communicates with the storage engines through an API. Different storage engines have different capabilities, so one can be chosen according to actual needs.

1.4 Storage layer

The data storage layer stores the data on the file system (possibly running on raw devices) and handles the interaction with the storage engine.

2. show profile

Using show profile, you can see the execution stages of a SQL statement!

2.1 Enabling profiling

Check whether profiling is enabled: show variables like '%profiling%'

If it is not enabled, run set profiling=1 to turn it on!

2.2 Using profile

Run the show profiles command to view recent queries.

Then, using a Query_ID from the list, run show profile cpu,block io for query <Query_ID> to view the detailed execution steps of that SQL statement.
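Put together, a minimal profiling session might look like the sketch below (session-level settings; the exact stage names in the output vary by MySQL version, and in recent versions show profile is deprecated in favor of the Performance Schema):

```sql
-- Check whether profiling is available and enabled in this session
SHOW VARIABLES LIKE '%profiling%';

-- Enable profiling for the current session
SET profiling = 1;

-- Run the statement you want to analyze
SELECT COUNT(*) FROM t_emp;

-- List recent statements with their Query_ID and duration
SHOW PROFILES;

-- Inspect per-stage CPU and block-IO cost for, say, Query_ID 1
SHOW PROFILE CPU, BLOCK IO FOR QUERY 1;
```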

2.3 General query flow

MySQL's query flow:
        The MySQL client connects to the MySQL server over the wire protocol and sends the query statement. The server first checks the query cache: on a hit it returns the result directly; otherwise it parses the statement. In other words, before parsing a query, the server first consults the query cache (query cache), which stores SELECT statements together with their corresponding result sets. If a query's result is already in the cache, the server skips parsing, optimization, and execution entirely and simply returns the cached result to the user, which can greatly improve performance.
        Syntax parser and preprocessor: MySQL first parses the SQL statement by its keywords and builds the corresponding parse tree. The parser validates and parses the query against MySQL's grammar rules; the preprocessor then checks, according to further MySQL rules, whether the parse tree is legal.
        Query optimizer: once the parse tree is deemed legal, the optimizer converts it into an execution plan. A query can be executed in many ways that all return the same result; the optimizer's role is to find the best execution plan. In addition, MySQL uses BTree indexes by default, and one general rule holds: however the SQL is written, at least for now MySQL uses at most one index per table.

2.4 SQL execution order

Handwritten order:

Actual execution order:
        As MySQL versions evolve, the optimizer keeps improving; it analyzes the performance cost of different execution orders and adjusts the order dynamically. The commonly cited query order is:
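Since the two figures did not survive, the standard clause orders they illustrate can be summarized as:

```sql
-- Handwritten (lexical) order:
--   SELECT DISTINCT <columns>
--   FROM <left_table> <join_type> JOIN <right_table> ON <join_condition>
--   WHERE <where_condition>
--   GROUP BY <group_by_list>
--   HAVING <having_condition>
--   ORDER BY <order_by_list>
--   LIMIT <limit_number>

-- Machine (logical) execution order:
--   FROM -> ON -> JOIN -> WHERE -> GROUP BY -> HAVING
--   -> SELECT -> DISTINCT -> ORDER BY -> LIMIT
```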

 

2.5 MyISAM and InnoDB

Comparison           | MyISAM                                    | InnoDB
Foreign keys         | Not supported                             | Supported
Transactions         | Not supported                             | Supported
Row/table locking    | Table locks: even operating on a single   | Row locks: only the row being operated on
                     | record locks the whole table; not suited  | is locked, without affecting the others;
                     | to highly concurrent operations           | suited to highly concurrent operations
Caching              | Caches only the index, not the real data  | Caches both the index and the real data;
                     |                                           | memory requirements are high, and memory
                     |                                           | size has a decisive impact on performance
Focus                | Read performance                          | Concurrent writes, transactions, resources
Installed by default | Y                                         | Y
Default engine       | N                                         | Y
Used by the built-in | Y                                         | N
system tables        |                                           |

show engines; — view all storage engines

show variables like '%storage_engine%'; — view the default storage engine

 

Chapter 4: SQL Warm-up

1. Common join query diagrams

2. Join examples

2.1 Table creation statements

CREATE TABLE `t_dept` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`deptName` VARCHAR(30) DEFAULT NULL,
`address` VARCHAR(40) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
CREATE TABLE `t_emp` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`name` VARCHAR(20) DEFAULT NULL,
`age` INT(3) DEFAULT NULL,
`deptId` INT(11) DEFAULT NULL,
empno int not null,
PRIMARY KEY (`id`),
KEY `idx_dept_id` (`deptId`)
#CONSTRAINT `fk_dept_id` FOREIGN KEY (`deptId`) REFERENCES `t_dept` (`id`)
) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
INSERT INTO t_dept(deptName,address) VALUES(' Huashan Mountain ',' Huashan Mountain ');
INSERT INTO t_dept(deptName,address) VALUES(' Beggars' sect ',' luoyang ');
INSERT INTO t_dept(deptName,address) VALUES(' emei ',' Mount Emei ');
INSERT INTO t_dept(deptName,address) VALUES(' Wudang ',' Wudang Mountain ');
INSERT INTO t_dept(deptName,address) VALUES(' Mingjiao ',' Bright Summit ');
INSERT INTO t_dept(deptName,address) VALUES(' Shaolin ',' The shaolin temple ');
INSERT INTO t_emp(NAME,age,deptId,empno) VALUES(' The breeze pure Yang ',90,1,100001);
INSERT INTO t_emp(NAME,age,deptId,empno) VALUES(' yue buqun ',50,1,100002);
INSERT INTO t_emp(NAME,age,deptId,empno) VALUES(' linghu chong ',24,1,100003);
INSERT INTO t_emp(NAME,age,deptId,empno) VALUES(' master hongqi ',70,2,100004);
INSERT INTO t_emp(NAME,age,deptId,empno) VALUES(' Xiao feng ',35,2,100005);
INSERT INTO t_emp(NAME,age,deptId,empno) VALUES(' Extinction teacher ',70,3,100006);
INSERT INTO t_emp(NAME,age,deptId,empno) VALUES(' Zhou Zhiruo ',20,3,100007);
INSERT INTO t_emp(NAME,age,deptId,empno) VALUES(' Zhang Sanfeng ',100,4,100008);
INSERT INTO t_emp(NAME,age,deptId,empno) VALUES(' zhang wuji ',25,5,100009);
INSERT INTO t_emp(NAME,age,deptId,empno) VALUES(' Trinket ',18,null,100010);

2.2 Case studies

1. All information about employees who belong to a department (the department name is required)

select * from t_emp a inner join t_dept b on a.deptId = b.id

2. List all employees together with their department information

select * from t_emp a left join t_dept b on a.deptId = b.id

3. List all departments together with their employee information

select * from t_emp a right join t_dept b on a.deptId = b.id

4. List the employees who have no department

select * from t_emp a left join t_dept b on a.deptId = b.id where b.id is null

5. List the departments that have no employees

select * from t_emp a right join t_dept b on a.deptId = b.id where a.id is null

6. The full correspondence between all employees and all departments

select * from t_emp a left join t_dept b on a.deptId = b.id
union
select * from t_emp a right join t_dept b on a.deptId = b.id

7. All employees without a department and all departments without employees

select * from t_emp a left join t_dept b on a.deptId = b.id where b.id is null
union
select * from t_emp a right join t_dept b on a.deptId = b.id where a.id is null

 
Chapter 5: Index Optimization Analysis

1. The concept of index

1.1 What is an index?

MySQL's official definition of an index: an index (Index) is a data structure that helps MySQL retrieve data efficiently. From this you get the essence of an index: an index is a data structure, and it can be understood simply as a sorted data structure for fast lookup. Besides the data itself, the database system maintains data structures that satisfy particular search algorithms and that reference (point to) the data in some way, so that advanced search algorithms can be implemented on top of them. These data structures are the indexes. The figure below shows an example of one possible way of indexing:
On the left is a data table with two columns and seven records; the leftmost column is the physical address of each data record. To speed up lookups on Col2, a binary search tree like the one on the right can be maintained: each node holds an index key value and a pointer to the physical address of the corresponding data record, so that binary search can locate the matching data within a bounded number of steps and quickly retrieve the qualifying records. In general, an index is itself very large and cannot be stored entirely in memory, so it is stored on disk in the form of an index file.

1.2 Advantages and disadvantages

Advantages
  • Improves the efficiency of data retrieval and reduces the database's IO cost.
  • Sorting data through the index column reduces the cost of sorting and lowers CPU consumption.
Disadvantages
  • Although an index greatly improves query speed, it also slows down writes to the table, such as INSERT, UPDATE, and DELETE: when updating the table, MySQL must save not only the data but also the index file of every indexed column, adjusting the index information whenever the update changes the key values.
  • An index is in fact a table of its own, holding the primary key and the indexed fields and pointing to the records of the base table, so index columns also occupy space.

2. MySQL's indexes

2.1 BTree indexes

MySQL uses BTree indexes.
[Setup]
Consider a b-tree. The light-blue blocks are disk blocks; each disk block contains several data items (dark blue) and pointers (yellow). For example, disk block 1 contains the data items 17 and 35 and the pointers P1, P2, P3: P1 points to the disk block holding values less than 17, P2 to the block holding values between 17 and 35, and P3 to the block holding values greater than 35. The real data exists only in the leaf nodes: 3, 5, 9, 10, 13, 15, 28, 29, 36, 60, 75, 79, 90, 99. Non-leaf nodes store no real data, only the data items that guide the search, so 17 and 35 need not actually exist in the data table.
[Lookup process]
To look up the data item 29: first, disk block 1 is loaded from disk into memory (one IO); binary search in memory determines that 29 lies between 17 and 35, selecting pointer P2 of block 1 (in-memory time is negligible compared with a disk IO). Following the disk address in P2 loads disk block 3 into memory (a second IO); 29 lies between 26 and 30, selecting pointer P2 of block 3, whose disk address loads disk block 8 into memory (a third IO); binary search in memory then finds 29, and the query ends after three IOs in total. In reality, a 3-level b+tree can represent on the order of a million rows of data; if finding one of a million rows costs only three IOs, the performance gain is enormous. Without an index, each data item might cost one IO, i.e. up to a million IOs, which is obviously extremely expensive.

2.2 B+Tree indexes

Differences between B+Tree and B-Tree
1) In a B-tree, keys and records are stored together; leaf nodes can be regarded as external nodes containing no further information. In a B+tree, non-leaf nodes hold only keys and pointers to child nodes; records are stored only in leaf nodes.
2) In a B-tree, the closer a key is to the root, the faster it is found, and finding the key is enough to confirm that the record exists. In a B+tree, every lookup costs essentially the same: you must walk from the root down to a leaf and compare keys there. Seen this way the B-tree might look better, but in practice the B+tree performs better: because its non-leaf nodes store no actual data, each node can hold many more elements than a B-tree node, so the tree is shorter and fewer disk accesses are needed. Although a B+tree lookup makes more comparisons than a B-tree lookup, one disk access costs as much as hundreds of in-memory comparisons, so in practice the B+tree may still perform better. Moreover, B+tree leaf nodes are linked together with pointers, which makes in-order traversal easy (e.g. listing all files in a directory, or all records in a table). This is also why many databases and file systems use B+trees.
Question: why is the B+tree better suited than the B-tree for operating-system file indexes and database indexes in practice?
1) A B+tree has lower disk read/write cost. Its internal nodes do not point to the payload of the keys, so they are smaller than B-tree nodes. If all keys of one internal node are stored in the same disk block, the block can hold more keys, more of the needed keys are read into memory per IO, and the number of read/write IOs drops correspondingly.
2) B+tree query performance is more stable. Because internal nodes are only indexes over the keys in the leaves, not pointers to record content, every key lookup must take a path from the root to a leaf. All key lookups have the same path length, so the query cost of every record is comparable.

2.3 Clustered and non-clustered indexes

A clustered index is not a separate index type but a way of storing data. The term "clustered" means that data rows are stored together with adjacent key values. In the figure below, the index on the left is a clustered index, because the arrangement of the data rows on disk matches the index order.

Benefits of a clustered index:
When a query reads a range of data sorted by the clustered index, the rows are physically adjacent, so the database does not have to gather them from many data blocks, saving a lot of IO.
Limitations of a clustered index:
In MySQL, currently only the InnoDB engine supports clustered indexes; MyISAM does not.
Data can be physically stored in only one order, so each MySQL table can have only one clustered index, normally on the table's primary key.
To make full use of the clustering property, the primary key of an InnoDB table should be an ordered id wherever possible; an unordered id such as a UUID is not recommended.

2.4 Time complexity (extension)

The same problem can be solved by different algorithms, and the quality of an algorithm affects the efficiency of the algorithm and even of the program. The purpose of algorithm analysis is to choose a suitable algorithm and to improve algorithms.
Time complexity describes the amount of computation required to execute an algorithm; it is written in big-O notation as O(…).
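For intuition, the IO counts in the BTree section follow directly from the tree height; as a small worked sketch (m is the branching factor and N the number of keys — illustrative symbols not used elsewhere in the text):

```latex
h \approx \log_m N
\quad\Rightarrow\quad
\text{lookup cost} = O(\log_m N)
\qquad\text{e.g. } m = 100,\; N = 10^6 \;\Rightarrow\; h = \log_{100} 10^6 = 3
```

Three levels, hence the "three IOs for millions of rows" figure quoted earlier.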

3. MySQL index classification

3.1 Single-column indexes

Concept: an index that contains only a single column; a table can have multiple single-column indexes.
Syntax:
Create together with the table:
CREATE TABLE customer (
id INT(10) UNSIGNED AUTO_INCREMENT ,
customer_no VARCHAR(200),
customer_name VARCHAR(200),
PRIMARY KEY(id),
KEY (customer_name)
);
Create a single-column index separately:
CREATE INDEX idx_customer_name ON customer(customer_name);

3.2 Unique indexes

Concept: the values of the index column must be unique, but NULL values are allowed.

Create together with the table:
CREATE TABLE customer (
id INT(10) UNSIGNED AUTO_INCREMENT ,
customer_no VARCHAR(200),
customer_name VARCHAR(200),
PRIMARY KEY(id),
KEY (customer_name),
UNIQUE (customer_no)
);
Create a unique index separately:
CREATE UNIQUE INDEX idx_customer_no ON customer(customer_no);

3.3 Primary keys

Concept: once a column is set as the primary key, the database automatically creates an index on it; in InnoDB this is the clustered index.

Create together with the table:
CREATE TABLE customer (
id INT(10) UNSIGNED AUTO_INCREMENT ,
customer_no VARCHAR(200),
customer_name VARCHAR(200),
PRIMARY KEY(id)
);
Add a primary key index separately:
ALTER TABLE customer ADD PRIMARY KEY (customer_no);
Delete a primary key index:
ALTER TABLE customer DROP PRIMARY KEY;
Modify a primary key index:
you must first delete (drop) the original index and then add (add) the new one.

3.4 Composite indexes

Concept: an index that contains multiple columns.

Create together with the table:
CREATE TABLE customer (
id INT(10) UNSIGNED AUTO_INCREMENT ,
customer_no VARCHAR(200),
customer_name VARCHAR(200),
PRIMARY KEY(id),
KEY (customer_name),
UNIQUE (customer_name),
KEY (customer_no,customer_name)
);
Create a composite index separately:
CREATE INDEX idx_no_name ON customer(customer_no,customer_name);

3.5 Basic syntax

Create:
CREATE [UNIQUE] INDEX indexName ON table_name(column);
Delete:
DROP INDEX indexName ON table_name;
View:
SHOW INDEX FROM table_name\G

Using the ALTER command:
ALTER TABLE tbl_name ADD PRIMARY KEY (column_list): adds a primary key, which means the index values must be unique and cannot be NULL.
ALTER TABLE tbl_name ADD UNIQUE index_name (column_list): creates an index whose values must be unique (except for NULL, which may appear multiple times).
ALTER TABLE tbl_name ADD INDEX index_name (column_list): adds a normal index; index values may appear multiple times.
ALTER TABLE tbl_name ADD FULLTEXT index_name (column_list): creates a FULLTEXT index, used for full-text search.
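Applied to the customer table from above (the index names here are illustrative), the ALTER forms look like this:

```sql
-- Normal (non-unique) index on customer_name
ALTER TABLE customer ADD INDEX idx_customer_name (customer_name);

-- Unique index on customer_no (NULLs may still repeat)
ALTER TABLE customer ADD UNIQUE idx_customer_no (customer_no);

-- Composite index on (customer_no, customer_name)
ALTER TABLE customer ADD INDEX idx_no_name (customer_no, customer_name);

-- Inspect what exists, then drop one
SHOW INDEX FROM customer;
DROP INDEX idx_no_name ON customer;
```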

4. When to create indexes

4.1 Cases suited to index creation

  • The primary key automatically gets a unique index.
  • Fields frequently used as query conditions should be indexed.
  • Fields used to join to other tables in queries: index the foreign-key relationship.
  • For the single-column vs. composite choice, a composite index is usually more cost-effective.
  • Fields used for sorting in queries: sorting through an index greatly improves sort speed.
  • Fields used for statistics or grouping in queries.

4.2 Cases not suited to index creation

  • Tables with too few records.
  • Tables or fields that are frequently inserted, deleted, or updated.
  • Fields that never appear in WHERE conditions should not be indexed.
  • Fields with poor selectivity are not suited to indexing.

5. Performance analysis

5.1 MySQL Query Optimizer

1. MySQL has an optimizer module dedicated to optimizing SELECT statements. Its main function: using the statistics collected by the analysis system, provide what it considers the best execution plan for the Query requested by the client (the best way it can find to retrieve the data, which is not necessarily what the DBA would consider best; this is the most time-consuming part).

2. When the client sends MySQL a Query, the command parser module classifies the request, recognizes it as a SELECT, and forwards it to the MySQL Query Optimizer. The optimizer first optimizes the Query as a whole: it evaluates constant expressions and converts them directly into constant values; it also simplifies and transforms the query conditions, for example removing useless or obviously false conditions, and restructures the query according to any Hint information (if present); it then reads the statistics of the objects involved, performs the corresponding calculations and analysis for the Query, and finally produces the execution plan.

5.2 Common MySQL bottlenecks

  1. CPU: CPU saturation usually occurs while loading data into memory or reading it from disk.
  2. IO: disk I/O bottlenecks occur when the data to be loaded is far larger than memory.
  3. Server hardware bottlenecks: use top, free, iostat, and vmstat to check the performance state of the system.

Chapter 6: Explain Performance Analysis

1. Concept

Using the EXPLAIN keyword, you can simulate how the optimizer executes a SQL query and thus learn how MySQL handles your SQL statement, in order to analyze the performance bottlenecks of the query or of the table structure.

What it can tell you:

  • The read order of the tables
  • The operation type of each data-read operation
  • Which indexes could be used
  • Which indexes are actually used
  • The references between tables
  • How many rows of each table the optimizer examines
Usage: EXPLAIN + the SQL statement.
Information returned after EXPLAIN executes:
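For instance, run against the t_emp/t_dept tables from Chapter 4 (the output columns are id, select_type, table, type, possible_keys, key, key_len, ref, rows, and Extra; the exact values depend on version and data):

```sql
-- A simple plan: the primary-key lookup on t_emp should show type = const
EXPLAIN SELECT * FROM t_emp WHERE id = 1;

-- A join plan: shows the read order and index usage for both tables
EXPLAIN
SELECT e.name, d.deptName
FROM t_emp e
LEFT JOIN t_dept d ON e.deptId = d.id;
```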

2. Explain preparation

CREATE TABLE t1(
id INT(10) AUTO_INCREMENT,
content VARCHAR(100) NULL , PRIMARY KEY (id)
);
CREATE TABLE t2(
id INT(10) AUTO_INCREMENT,
content VARCHAR(100) NULL , PRIMARY KEY (id)
);
CREATE TABLE t3(
id INT(10) AUTO_INCREMENT,
content VARCHAR(100) NULL , PRIMARY KEY (id)
);
CREATE TABLE t4(
id INT(10) AUTO_INCREMENT,
content VARCHAR(100) NULL , PRIMARY KEY (id)
);
INSERT INTO t1(content) VALUES(CONCAT('t1_',FLOOR(1+RAND()*1000)));
INSERT INTO t2(content) VALUES(CONCAT('t2_',FLOOR(1+RAND()*1000)));
INSERT INTO t3(content) VALUES(CONCAT('t3_',FLOOR(1+RAND()*1000)));
INSERT INTO t4(content) VALUES(CONCAT('t4_',FLOOR(1+RAND()*1000)));

3. id

The serial number of the select query: a set of numbers indicating the order in which the select clauses are executed or the tables are operated on in the query.
explain select t2.*
from t1,t2,t3
where t1.id = t2.id and t1.id = t3.id
and t1.content = ''
① Identical ids: the execution order is from top to bottom.

explain select t2.*
from t2
where id = (select id 
                                from t1
                                where id =(select t3.id
                                                        from t3 
                                                        where t3.content = ''))

② Different ids: for subqueries, the id number increments; the larger the id value, the higher the priority and the earlier it is executed.

explain select t2.* from (
    select t3.id
    from t3
    where t3.content = '') s1, t2
    where s1.id = t2.id; 

③ Identical and different ids together:
Identical ids can be regarded as one group, executed from top to bottom; across all groups, the larger the id value, the higher the priority and the earlier it executes.
DERIVED = derived table.
Note: each distinct id number represents an independent query; the fewer query passes one SQL statement needs, the better.

4. select_type

select_type indicates the type of query; it is mainly used to distinguish ordinary queries, union queries, subqueries, and other complex queries.

4.1 SIMPLE

SIMPLE denotes a simple single-table query that contains no subqueries and no UNION.

4.2 PRIMARY

If the query contains any complex subparts, the outermost query is marked PRIMARY.

4.3 DERIVED

Subqueries contained in the FROM list are marked DERIVED (derived table); MySQL executes these subqueries recursively and places the results in a temporary table.

4.4 SUBQUERY

A subquery is contained in the SELECT or WHERE list.

4.5 DEPENDENT SUBQUERY

A subquery is contained in the SELECT or WHERE list, and the subquery depends on the outer query.

Both appear in WHERE conditions: a SUBQUERY produces a single value, while a DEPENDENT SUBQUERY produces a set of values.

4.6 UNCACHEABLE SUBQUERY

When @@ is used to reference a system variable, the subquery result cannot be cached.

4.7 UNION

If the second SELECT appears after UNION, it is marked UNION; if the UNION is contained in a FROM-clause subquery, the outer SELECT is marked DERIVED.

4.8 UNION RESULT

The SELECT that retrieves the result from the UNION table.

5. table

Which table this row of output is about.

6. type

type is the access type of the query and one of the more important indicators. From best to worst, the values are:
system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL. In general, make sure a query reaches at least the range level, and preferably ref.

6.1 system

The table has only one row (it is a system table). This is a special case of const; it rarely occurs and can be ignored.

6.2 const

Found by a single index lookup; const is used when comparing against a PRIMARY KEY or unique index. Because only one row of data matches, it is very fast.

For example, placing the primary key in the WHERE list lets MySQL convert the query into a constant.

6.3 eq_ref

Unique index scan: for each index key, exactly one record in the table matches it. Commonly seen with primary-key or unique-index scans.
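Using the t_emp/t_dept tables from Chapter 4, the driven table accessed through its primary key would typically show eq_ref (the actual plan depends on data volume and MySQL version):

```sql
-- For each row of t_emp, t_dept is probed by its PRIMARY KEY:
-- the t_dept row of the plan should show type = eq_ref
EXPLAIN
SELECT e.name, d.deptName
FROM t_emp e
JOIN t_dept d ON d.id = e.deptId;
```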

6.4 ref

Non-unique index scan: returns all rows matching a single value. It is essentially an index access that returns rows matching one value; however, it may find several qualifying rows, so it is a mixture of lookup and scan.
Before the index is used:

After the index is added:

6.5 range

Retrieves only rows in a given range, using an index to select them. The key column shows which index is used. range generally appears when BETWEEN, <, >, or IN occurs in the WHERE clause. A range scan over the index is better than a full-table scan, because it only needs to start at one point of the index and stop at another, rather than scanning the entire index.

6.6 index

index appears when the SQL statement uses an index, but not for filtering; typically the index is used as a covering index or for sorting and grouping.

 

6.7 all 

Full Table Scan: the entire table is traversed to find the matching rows.

6.8 index_merge

Several indexes need to be combined during the query; this usually occurs in SQL statements containing OR.

6.9 ref_or_null

A field is needed both for the join condition and for NULL values; the query optimizer then chooses a ref_or_null access for the join query.

6.10 index_subquery 

An index is used for the subquery join instead of a full table scan.

 6.11 unique_subquery

This join type is similar to index_subquery, but with a unique index in the subquery.

7. possible_keys

Shows the index or indexes that might apply to this table, one or more. If an index exists on a field involved in the query, it is listed, but it is not necessarily used by the actual query.

8. key

The index actually used. If NULL, no index was used.

9. key_len

Indicates the number of bytes used in the index; this column lets you calculate the length of the index used in the query. key_len helps you check whether you are making full use of the index: the longer key_len is, the more fully the index is used.

 

How to calculate:
① First look at the type and length of each indexed field, e.g. int = 4; varchar(20) = 20; char(20) = 20.
② For string fields (varchar or char), multiply by the character set's bytes per character: utf-8 multiplies by 3, GBK by 2.
③ varchar, being a dynamic-length string, adds 2 extra bytes for the length prefix.
④ A field that allows NULL adds 1 extra byte.
First group: key_len = bytes of age + bytes of name = (4+1) + (20*3+2) = 5 + 62 = 67
Second group: key_len = bytes of age = 4+1 = 5
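The two groups above appear to assume a composite index like the one below. The DDL is a sketch (the exact definition is not shown in these notes): age is an INT that allows NULL, and name a NOT NULL VARCHAR(20) in a utf8 table.

```sql
-- Assumed index behind the key_len examples above (hypothetical DDL):
CREATE INDEX idx_age_name ON emp(age, name);

-- Uses both columns: key_len = (4+1) + (20*3+2) = 67
EXPLAIN SELECT * FROM emp WHERE age = 30 AND name = 'abcd';

-- Uses only age: key_len = 4+1 = 5
EXPLAIN SELECT * FROM emp WHERE age = 30;
```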

 

10. ref

Shows which column of the index is used; a constant, if possible. That is, which columns or constants are used to look up values on the index columns.

11. rows

The rows column shows the number of rows MySQL estimates it must examine to execute the query. The fewer, the better!

12. Extra

Other extra important information .

12.1 Using filesort

Indicates that MySQL uses an external sort on the data instead of reading rows in the table's index order. A sort operation that MySQL cannot complete with an index is called a "filesort".
A case where filesort appears:

After optimization, filesort no longer appears:

For the fields sorted in a query: if the sort field can be accessed through the index, sorting speed improves greatly.
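A sketch of both outcomes, assuming the composite index idx_age_deptid_name on emp(age, deptid, name) used later in these notes:

```sql
-- Sort column not in the index after the filter: Extra shows Using filesort.
EXPLAIN SELECT * FROM emp WHERE age = 30 ORDER BY empno;

-- Sort follows the index column order: the sort is done via the index.
EXPLAIN SELECT * FROM emp WHERE age = 30 ORDER BY deptid, name;
```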

12.2 Using temporary

Uses a temporary table to hold intermediate results; MySQL uses a temporary table when sorting the query results. Commonly seen with order by sorting and group by grouping.
Before optimization :

  After optimization :

 12.3 Using index

Using index means the corresponding select uses a covering index (Covering Index), avoiding access to the table's data rows, which is efficient! If Using where appears at the same time, the index is used to perform lookups of index key values; if Using where does not appear, the index is only used to read data, not to perform lookups.
Sorting or grouping is done via the index.

12.4 Using where

Indicates filtering with a where clause.

12.5 Using join buffer

A join buffer is used for the connection.

12.6 impossible where

The where clause is always false and cannot select any rows.
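A minimal example: a condition that can never be true.

```sql
-- Extra shows "Impossible WHERE"; MySQL does not read the table at all.
EXPLAIN SELECT * FROM emp WHERE 1 = 2;
```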

 12.7 select tables optimized away

Without a GROUP BY clause, MIN/MAX operations optimized through an index, or a COUNT(*) optimized by the MyISAM storage engine, do not need to wait for the execution phase: the optimization is completed while the query execution plan is generated.
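A sketch of queries that can show "Select tables optimized away" (engine behavior assumed as described above):

```sql
-- On a MyISAM table, the total row count is kept in table metadata:
EXPLAIN SELECT COUNT(*) FROM emp;

-- MIN on an indexed column only needs the first index entry:
EXPLAIN SELECT MIN(id) FROM emp;
```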
In InnoDB:

In MyISAM:

  The first 7 Chapter Batch data script

1. insert data

1.1 Create table statement

CREATE TABLE `dept` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`deptName` VARCHAR(30) DEFAULT NULL,
`address` VARCHAR(40) DEFAULT NULL,
ceo INT NULL ,
PRIMARY KEY (`id`)
) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
CREATE TABLE `emp` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`empno` INT NOT NULL ,
`name` VARCHAR(20) DEFAULT NULL,
`age` INT(3) DEFAULT NULL,
`deptId` INT(11) DEFAULT NULL,
PRIMARY KEY (`id`)
#CONSTRAINT `fk_dept_id` FOREIGN KEY (`deptId`) REFERENCES `t_dept` (`id`)
) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

1.2 Set parameters

Before creating the functions, make sure the log_bin_trust_function_creators parameter is 1, i.e. ON; otherwise an error will be reported.
Check: show variables like 'log_bin_trust_function_creators';
Set: set global log_bin_trust_function_creators=1;
Of course, this setting only lasts until the server restarts. For a permanent effect, write it into the configuration file:
add log_bin_trust_function_creators=1 under [mysqld]

1.3 Write random functions

Create functions so that every row of data is different.

1.3.1 Randomly generate strings

DELIMITER $$
CREATE FUNCTION rand_string(n INT) RETURNS VARCHAR(255)
BEGIN
DECLARE chars_str VARCHAR(100) DEFAULT 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
DECLARE return_str VARCHAR(255) DEFAULT '';
DECLARE i INT DEFAULT 0;
WHILE i < n DO
SET return_str =CONCAT(return_str,SUBSTRING(chars_str,FLOOR(1+RAND()*52),1));
SET i = i + 1;
END WHILE;
RETURN return_str;
END $$
If you want to delete a function , execute :drop function rand_string;
# Randomly generates an integer between from_num and to_num
DELIMITER $$
CREATE FUNCTION rand_num (from_num INT ,to_num INT) RETURNS INT(11)
BEGIN
DECLARE i INT DEFAULT 0;
SET i = FLOOR(from_num + RAND() * (to_num - from_num + 1));
RETURN i;
END$$
If you want to delete a function :drop function rand_num;

1.4 Create stored procedure

1.4.1 To create emp A stored procedure that inserts data into a table

DELIMITER $$
CREATE PROCEDURE insert_emp( START INT , max_num INT )
BEGIN
DECLARE i INT DEFAULT 0;
# set autocommit = 0: turn off autocommit so all inserts are committed in one batch
SET autocommit = 0;
REPEAT
SET i = i + 1;
INSERT INTO emp (empno, NAME ,age ,deptid ) VALUES ((START+i) ,rand_string(6) ,
rand_num(30,50),rand_num(1,10000));
UNTIL i = max_num
END REPEAT;
COMMIT;
END$$
# Delete
# DELIMITER ;
# drop PROCEDURE insert_emp;

1.4.2 To create dept A stored procedure that inserts data into a table

# Stored procedure: add random data to the dept table
DELIMITER $$
CREATE PROCEDURE `insert_dept`( max_num INT )
BEGIN
DECLARE i INT DEFAULT 0;
SET autocommit = 0;
REPEAT
SET i = i + 1;
INSERT INTO dept ( deptname,address,ceo ) VALUES (rand_string(8),rand_string(10),rand_num(1,500000));
UNTIL i = max_num
END REPEAT;
COMMIT;
END$$

1.5 Calling stored procedure

1.5.1 Add data to department table

# Execute the stored procedure: add 10,000 rows to the dept table
DELIMITER ;
CALL insert_dept(10000);

1.5.2 Add data to employee table

# Execute the stored procedure: add 500,000 rows to the emp table
DELIMITER ;
CALL insert_emp(100000,500000);

1.6 Batch delete all indexes on a table

1.6.1 Delete the stored procedure of the index

DELIMITER $$
CREATE PROCEDURE `proc_drop_index`(dbname VARCHAR(200),tablename VARCHAR(200))
BEGIN
DECLARE done INT DEFAULT 0;
DECLARE ct INT DEFAULT 0;
DECLARE _index VARCHAR(200) DEFAULT '';
DECLARE _cur CURSOR FOR
SELECT
index_name
FROM information_schema.STATISTICS
WHERE
table_schema=dbname AND table_name=tablename AND seq_in_index=1 AND
index_name <>'PRIMARY' ;
DECLARE CONTINUE HANDLER FOR NOT FOUND set done=2 ;
OPEN _cur;
FETCH _cur INTO _index;
WHILE _index<>'' DO
SET @str = CONCAT("drop index ",_index," on ",tablename );
PREPARE sql_str FROM @str ;
EXECUTE sql_str;
DEALLOCATE PREPARE sql_str;
SET _index='';
FETCH _cur INTO _index;
END WHILE;
CLOSE _cur;
END$$

1.6.2 Execute stored procedures

Call: CALL proc_drop_index('dbname', 'tablename');

The first 8 Chapter Common index failures in single table indexes

1. Full-value matching (my favorite)

1.1 Example SQL statements

EXPLAIN SELECT SQL_NO_CACHE * FROM emp WHERE emp.age=30;
EXPLAIN SELECT SQL_NO_CACHE * FROM emp WHERE emp.age=30 AND deptid=4;
EXPLAIN SELECT SQL_NO_CACHE * FROM emp WHERE emp.age=30 AND deptid=4 AND emp.name = 'abcd';

1.2 Index

CREATE INDEX idx_age_deptid_name ON emp(age,deptid,NAME);

Conclusion: full-value matching means the query's filter fields can be matched against the index columns, in order!

The order of the fields in the SQL query does not have to match the order of the fields in the index; the optimizer automatically rearranges the conditions without affecting the SQL result.

 2. The best left prefix rule

A mismatch between the query fields and the index column order can cause the index to be under-used, or even fail entirely!
Reason: a composite index must follow the best-left-prefix rule: if multiple columns are indexed, the query starts from the leftmost column of the index and must not skip columns.
Conclusion: filter conditions must match the index columns in the order the index was created, satisfied one by one from the left; once a field is skipped, no field after it in the index can be used.
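A sketch against the composite index idx_age_deptid_name(age, deptid, name) created above:

```sql
-- Leading columns present: the index is used for age and deptid.
EXPLAIN SELECT * FROM emp WHERE age = 30 AND deptid = 4;

-- Leading column age skipped: the index cannot be used.
EXPLAIN SELECT * FROM emp WHERE deptid = 4 AND name = 'abcd';

-- deptid skipped: only the age part of the index is used.
EXPLAIN SELECT * FROM emp WHERE age = 30 AND name = 'abcd';
```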

3. Don't do any calculations on the index column

Do nothing on an index column (no calculation, no function, no automatic or manual type conversion), or the index fails and the query falls back to a full table scan.

3.1 A function is used on the query column

EXPLAIN SELECT SQL_NO_CACHE * FROM emp WHERE age=30;
EXPLAIN SELECT SQL_NO_CACHE * FROM emp WHERE LEFT(age,3)=30;

Conclusion: no calculation on the left side of the equal sign!

3.2 A transformation is made on the query column

create index idx_name on emp(name);
explain select sql_no_cache * from emp where name='30000';
explain select sql_no_cache * from emp where name=30000;
A string literal without single quotes causes a type conversion on the name column!

Conclusion: avoid implicit conversion from the right side of the equal sign (always quote string literals)!

4. Cannot have range query on index column

explain SELECT SQL_NO_CACHE * FROM emp WHERE emp.age=30 and deptid=5 AND emp.name = 'abcd';
explain SELECT SQL_NO_CACHE * FROM emp WHERE emp.age=30 and deptid<=5 AND emp.name = 'abcd';

Suggestion: place fields likely to be used in range queries last in the index column order.

5. Try to use a covering index

That is, keep the query columns consistent with the index columns; do not write SELECT *!
explain SELECT SQL_NO_CACHE * FROM emp WHERE emp.age=30 and deptId=4 and name='XamgXt';
explain SELECT SQL_NO_CACHE age,deptId,name FROM emp WHERE emp.age=30 and deptId=4 and name='XamgXt';

6. Using not-equal (!= or <>)

When MySQL uses not-equal (!= or <>), it sometimes cannot use the index, resulting in a full table scan.

7. Field is not null and is null 

When the field is allowed to be NULL:

IS NOT NULL cannot use the index, while IS NULL can.
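A sketch, assuming an index on emp(name) and that the name column allows NULL:

```sql
EXPLAIN SELECT * FROM emp WHERE name IS NULL;      -- can use the index
EXPLAIN SELECT * FROM emp WHERE name IS NOT NULL;  -- typically a full table scan
```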

8. like with leading/trailing wildcards

A leading wildcard (a pattern starting with %) cannot use the index!
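A sketch, assuming an index on emp(name):

```sql
EXPLAIN SELECT * FROM emp WHERE name LIKE 'abc%';  -- prefix match: index range scan
EXPLAIN SELECT * FROM emp WHERE name LIKE '%abc';  -- leading %: the index cannot be used
```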

9. Reduce use or

Replace it with union all or union:
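A sketch of the rewrite, assuming indexes exist on emp(age) and emp(deptid) so each branch can use its own index:

```sql
-- Instead of: SELECT * FROM emp WHERE age = 30 OR deptid = 4;
SELECT * FROM emp WHERE age = 30
UNION ALL
SELECT * FROM emp WHERE deptid = 4;
```

Note that UNION ALL keeps duplicates (a row matching both conditions appears twice); use UNION when duplicates must be removed.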

10. practice

Assume index(a,b,c):

where a = 3                                  Y, uses a
where a = 3 and b = 5                        Y, uses a, b
where a = 3 and b = 5 and c = 4              Y, uses a, b, c
where b = 3 / where b = 3 and c = 4 / where c = 4
                                             N
where a = 3 and c = 5                        uses a only; c cannot be used, b is broken in the middle
where a = 3 and b > 4 and c = 5              uses a and b; c cannot be used after the range, b breaks the chain
where a is null and b is not null            is null can use the index but is not null cannot, so a uses the index and b does not
where a <> 3                                 cannot use the index
where abs(a) = 3                             cannot use the index
where a = 3 and b like 'kk%' and c = 4       Y, uses a, b, c
where a = 3 and b like '%kk' and c = 4       Y, uses only a
where a = 3 and b like '%kk%' and c = 4      Y, uses only a
where a = 3 and b like 'k%kk%' and c = 4     Y, uses a, b, c

11. formula

Full-value matching is my favorite; the leftmost prefix must be followed;
The leading brother must not die; the middle brother must not break;
Less computation on index columns; after a range everything fails;
LIKE with the percent on the right only; for covering indexes don't write *;
Unequal tests, null values and OR: mind their effect on the index;
VARCHAR must not drop its quotes; SQL optimization has its knack.

The first 9 Chapter Association query optimization

1. Create table statement

CREATE TABLE IF NOT EXISTS `class` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`card` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE IF NOT EXISTS `book` (
`bookid` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`card` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`bookid`)
);
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));

INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));

2. Case study

2.1 left join

EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card;

② How to optimize ? On which table to index ?
ALTER TABLE `book` ADD INDEX idx_card( `card`);

③ Delete book Index of tables :drop index idx_card on book;
stay class Index tables :alter table class add index idx_card(card);

Conclusion :
① When optimizing join queries, an index is only effective when it is created on the driven table!
In a left join, the left table is the driving table and the right table is the driven table!

 2.2 inner join

EXPLAIN SELECT * FROM book inner join class on class.card=book.card;

② Exchange order of two query fields , It turns out the same thing !  

③ stay book In the table , Delete 9 Bar record  

④ Conclusion: with inner join, MySQL chooses the table with the smaller result set as the driving table.
straight_join: same effect as inner join, but forces the table on the left to be the driving table!

 2.3 Analysis of four related query cases

EXPLAIN SELECT ed.name ' figure ',c.name ' representative or leader in a certain field ' FROM
(SELECT e.name,d.ceo from t_emp e LEFT JOIN t_dept d on e.deptid=d.id) ed
LEFT JOIN t_emp c on ed.ceo= c.id;

EXPLAIN SELECT e.name ' figure ',tmp.name ' representative or leader in a certain field '
FROM t_emp e LEFT JOIN (SELECT d.id did,e.name FROM t_dept d LEFT JOIN t_emp e ON d.ceo=e.id)tmp
ON e.deptId=tmp.did;

Of these two cases, the first query is more efficient and still has room for optimization. In the second, the subquery acts as the driven table, and since a subquery is a virtual table that cannot be indexed, it cannot be optimized.
Conclusion:
Do not put a subquery in the driven-table position, as it may be unable to use an index;
in a left join, try to make a real table the driven table.
EXPLAIN SELECT e1.name ' figure ',e2.name ' representative or leader in a certain field '
FROM t_emp e1
LEFT JOIN t_dept d on e1.deptid = d.id
LEFT JOIN t_emp e2 on d.ceo = e2.id ;

Explain SELECT e2.name ' figure ',
(SELECT e1.name FROM t_emp e1 where e1.id= d.ceo) ' representative or leader in a certain field '
from t_emp e2 LEFT JOIN t_dept d on e2.deptid=d.id;

  Conclusion : If you can directly associate multiple tables, try to associate directly , No sub query !

  The first 10 Chapter Sub query optimization

1. Case study  

Find all employees who are not leaders, grouped by age!

select age as ' Age ', count(*) as ' The number of ' from t_emp where id not in

(select ceo from t_dept where ceo is not null) group by age;

How to optimize ?
① Solve the full table scan of the dept table by creating an index on the ceo field:

Then query again:

② Further optimization: replace not in.
The SQL above can be rewritten as:
select age as ' Age ',count(*) as ' The number of ' from emp e left join dept d on e.id=d.ceo where d.id is null group by age;

Conclusion: for this kind of membership test, try not to use not in or not exists; use left join ... on ... where xxx is null instead.

The first 11 Chapter Sorting and grouping optimization

The filter conditions in where and on are the top optimization priority and should be considered first! Next, if grouping and sorting are present, group by and order by should also be considered.

1. No filtering, no indexing

create index idx_age_deptid_name on emp (age,deptid,name);
explain select * from emp where age=40 order by deptid;
explain select * from emp order by age,deptid;
explain select * from emp order by age,deptid limit 10;

 

using filesort indicates a manual (non-index) sort! The reason is that there is no where filter!

Conclusion: no filtering, no index use. where and limit both count as filter conditions, so with them the index can be used!

2. Wrong order means a sort is required

explain select * from emp where age=45 order by deptid,name;

explain select * from emp where age=45 order by deptid,empno; 

The empno field is not in the index, so the index cannot cover it and this field must be sorted manually!
explain select * from emp where age=45 order by name,deptid;

The columns on either side of where can be swapped with the same effect, but the column order in order by cannot be changed arbitrarily!
explain select * from emp where deptid=45 order by age;

With deptid as the filter condition the index cannot be used (the leading column age is skipped), so the sort cannot use the index either.

3. Opposite directions mean a sort is required

explain select * from emp where age=45 order by deptid desc, name desc ;
If all the indexable sort fields use the same direction (all ascending or all descending), there is no impact; it merely reverses the order of the result set.
explain select * from emp where age=45 order by deptid asc, name desc ;

 

If the sort fields use different directions, the differing part would have to be reverse-ordered, so a manual sort is still required!

4. Index selection  

① First , eliminate emp All indexes above , Keep only the primary key index !

drop index idx_age_deptid_name on emp;

② Inquire about : Age is 30 Year old , And the employee number is less than 101000 Users of , Sort by user name
explain SELECT SQL_NO_CACHE * FROM emp WHERE age =30 AND empno <101000 ORDER BY NAME ;

③ Full table scanning is definitely not allowed , So we have to think about optimization .
Idea: first make the where filter conditions use the index. In this query, age and empno are the filter conditions and name is the sort field, so create a composite index on these three fields:
create index idx_age_empno_name on emp(age,empno,name);
Querying again, Using filesort still appears.
Reason: empno is a range query, which stops index use at that column, so the name field cannot be sorted via the index. A composite index on all three fields is therefore pointless, because only one of empno and name can actually benefit!
④ Solution: you can't have both, so choose either empno or name:
drop index idx_age_empno_name on emp;
create index idx_age_name on emp(age,name);
create index idx_age_empno on emp(age,empno);
With both indexes present, which one will MySQL choose?

explain SELECT SQL_NO_CACHE * FROM emp use index(idx_age_name) WHERE age =30 AND empno <101000 ORDER BY NAME ;

Reason: sorting happens after condition filtering, so if the condition filters out most of the data, sorting what remains is not very expensive; even index-optimized sorting brings limited real gain. By contrast, if the empno<101000 condition cannot use an index, tens of thousands of rows must be scanned, which is very costly; a range lookup on the empno field filters better (empno starts from 100000)!
Conclusion: when a range condition competes with a group by or order by field for the index, first look at how much the condition field filters. If it filters enough data and there is not much left to sort, put the range field first in the index. Otherwise, the reverse.

5. using filesort

5.1 mysql Sort algorithm

① Two-pass sort
Before MySQL 4.1 sorting was two-pass: literally, the disk is scanned twice to get the data. First the row pointer and the order by columns are read and sorted; then the sorted list is scanned and, using the values in it, the corresponding rows are read again for output.
In other words: read the sort field from disk, sort in the buffer, then read the other fields from disk.
In short, one batch of data requires two disk scans. Since I/O is notoriously expensive, after MySQL 4.1 an improved algorithm appeared: single-pass sort.
② Single-pass sort
Reads all columns needed by the query from disk once, sorts them in the buffer by the order by columns, then scans the sorted list for output. It is more efficient because it avoids reading the data a second time and turns random I/O into sequential I/O, but it uses more space, since it keeps every row in memory.
③ Problems with single-pass sort
Since single-pass came later, it is generally better than two-pass. But it has the following problem: in sort_buffer, method B (single-pass) takes much more space than method A (two-pass), because it fetches all fields. The fetched data may exceed the capacity of sort_buffer, in which case only a sort_buffer-sized chunk can be sorted at a time (creating tmp files and merge-sorting them), then the next chunk is fetched and sorted, and so on, leading to many I/Os.
Conclusion: it was meant to save one I/O operation but instead causes a large number of I/O operations, which is not worth it.

5.2 How to optimize

① Increase the sort_buffer_size parameter
Whichever algorithm is used, increasing this parameter improves efficiency. Of course, raise it according to the system's capability, since the parameter is adjusted between 1M and 8M per process.
② Increase the max_length_for_sort_data parameter
MySQL uses single-pass sort only when the size of the sorted row is less than max_length_for_sort_data.
Raising this parameter increases the probability of using the improved algorithm. But if it is set too high, the probability that the total data exceeds sort_buffer_size also rises; the typical symptom is high disk I/O activity and low processor utilization. (Adjust between 1024 and 8192.)
③ Query fewer fields after select.
When the total size of the queried fields is less than max_length_for_sort_data and the sort field is not of TEXT|BLOB type, the improved algorithm (single-pass sort) is used; otherwise the old algorithm (multi-pass sort) is used.
Either algorithm's data may exceed the capacity of sort_buffer; when it does, tmp files are created and merge-sorted, causing multiple I/Os. The risk is higher with the single-pass algorithm, so raise sort_buffer_size.

6. Use overlay index

Covering index: the SQL statement can return the data the query needs from the index alone, with no need to look up the row by primary key after the secondary-index lookup.

 7. group by

group by follows almost the same index-usage rules as order by; the only difference is that group by can use the index directly even when no filter condition is present.
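A sketch against the composite index idx_age_deptid_name(age, deptid, name) created earlier in this chapter:

```sql
-- group by can walk the index directly even with no WHERE filter:
EXPLAIN SELECT age, COUNT(*) FROM emp GROUP BY age;
```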

  The first 12 Chapter practice

1. Case a

List the people who are older than their own leader.
select e1.name empname,e1.age empage,e2.name ceoname,e2.age ceoage
from t_emp e1 inner join t_dept d on e1.deptid=d.id
inner join t_emp e2 on d.ceo=e2.id
where e1.age>e2.age;

Switch to the large tables and analyze:

explain select e1.name empname,e1.age empage,e2.name ceoname,e2.age ceoage from emp e1 inner join dept d on
e1.deptid=d.id
inner join emp e2 on d.ceo=e2.id
where e1.age>e2.age;

The driven tables of both inner joins are already indexed.

2. Case 2

List everyone who is younger than the average age of their department.
Idea: first compute each department's average age, then compare each person's age against it!
select e1.name from t_emp e1
inner join
(select deptid,AVG(age) avgage from t_emp

group by deptid) tmp
on e1.deptid=tmp.deptid
where e1.age<tmp.avgage;

Switch to the large tables:

explain select e1.name from emp e1
inner join
(select deptid,AVG(age) avgage from emp
group by deptid) tmp
on e1.deptid=tmp.deptid
where e1.age<tmp.avgage;
Without an index:
How to optimize?
① In the subquery we group by deptid, so create an index on deptid;
② since this is an inner join, the small table is automatically used as the driving table, i.e., the grouped tmp is the driving table and e1 is the driven table;
③ in e1, the deptid and age fields are both needed, so index those two fields as well.
Result: create a composite index on (deptid, age):
create index idx_deptid_age on emp(deptid,age);

3. Case three  

List the departments that have at least 2 members older than 40.
Idea: first find the members older than 40, then group them by department, then keep the departments with at least 2!
select d.deptName,count(*)
from t_emp e inner join t_dept d
on e.deptid=d.id
where e.age>40
group by d.id,d.deptName
having count(*)>=2

Optimization on the large tables:

explain select d.deptName,count(*)
from emp e inner join dept d
on e.deptid=d.id
where e.age>40
group by d.id,d.deptName
having count(*)>=2

Optimize:
① The two tables are joined; consider using the small table as the driving table.
② The group by fields id and deptName can be indexed: create index idx_id_deptName on dept(id,deptName);
③ The driven table's join field deptid can be indexed: create index idx_deptid on emp(deptid);
create index idx_id_deptname on dept(id,deptName);

 

4. Case four  

List the departments that have at least 2 members who are not the leader.
select d2.deptName from t_emp e inner join t_dept d2 on e.deptid=d2.id
left join t_dept d on e.id=d.ceo
where d.id is null and e.deptid is not null
group by d2.deptName,d2.id
having count(*)>=2;

Switch large tables :
explain select d2.deptName from emp e inner join dept d2 on e.deptid=d2.id
left join dept d on e.id=d.ceo
where d.id is null and e.deptid is not null
group by d2.deptName,d2.id
having count(*)>=2;
Without an index:
Optimization analysis: three tables are joined, then group by is applied!
① The group by fields can be indexed: create index idx_deptname_id on dept(deptName,id);
② The department table can be used as the driving table.
③ In the first join, table e is the driven table, so index deptid: create index idx_deptid on emp(deptid);
④ In the last join, the dept table is the driven table and the ceo field is queried, so index ceo:
create index idx_ceo on dept(ceo);

 

5. Case 5  

List all personnel and add a remarks column "Is it the leader": show Yes for leaders and No for everyone else.
select e.name,case when d.id is null then ' no ' else ' yes ' end ' Is it the leader ' from t_emp e
left join t_dept d
on e.id=d.ceo;

Join on the large tables:

explain select e.name,case when d.id is null then ' no ' else ' yes ' end ' Is it the leader ' from emp e
left join dept d
on e.id=d.ceo;

Optimize: index the ceo field of table d!

6. Case 6

List all departments and add a remarks column "veteran or rookie": if the department's average age > 40 show "veteran", otherwise show "rookie".
Idea: first compute each department's average age from the emp table with grouping, then join the dept table and use the IF function to decide!
select d.deptName,if(avg(age)>40,' veteran ',' rookie ') from t_emp e inner join t_dept d
on d.id=e.deptid group by d.deptName,d.id

Switch to the large tables:

explain select d.deptName,if(avg(age)>40,' veteran ',' rookie ') from dept d inner join emp e
on d.id=e.deptid
group by d.deptName,d.id

Optimize :
① Use dept As a driving table
② stay dept On the establishment of deptName and id The index of :create index idx_deptName_id on dept(deptName,id);
③ stay emp On the establishment of deptid Index of field : create index index_deptid on emp(deptid);

 

7. Case seven

Show the oldest person in each department.
Idea: first query the emp table to find the maximum age per department, grouping by deptid; then join emp again to attach the other columns!
select * from t_emp e
inner join
(select deptid,max(age) maxage
from t_emp
group by deptid) tmp
on tmp.deptid=e.deptid and tmp.maxage=e.age;

Optimization on the large tables:

explain select * from emp e
inner join
(select deptid,max(age) maxage
from emp
group by deptid) tmp
on tmp.deptid=e.deptid and tmp.maxage=e.age;
Before optimization :

Optimization ideas:
① The subquery groups emp by deptid, so an index on the deptid field helps;
② the inner join relates on age and deptid, so index the (deptid, age) fields:
create index idx_deptid_age on emp(deptid,age);

 

  The first 13 Chapter Intercept query analysis

1. Slow query log

1.1 What is it?

(1) MySQL's slow query log is a log type provided by MySQL that records statements whose response time exceeds a threshold.
(2) Specifically, any SQL that runs longer than long_query_time is recorded in the slow query log. long_query_time defaults to 10, meaning 10 seconds or longer.
(3) It lets you see which SQL exceeds your maximum tolerance. For example, if one SQL takes more than 5 seconds to run, we call it slow SQL; we collect the SQL that takes more than 5 seconds and analyze it comprehensively together with EXPLAIN.

1.2 How to use it?

By default, the slow query log is not enabled in MySQL; the parameter must be set manually.
Of course, if you are not tuning, enabling it is generally not recommended, because the slow query log has some performance cost. The slow query log supports writing records to a file.
(1) Enabling and settings

SHOW VARIABLES LIKE '%slow_query_log%';
    Checks whether the slow query log is on. By default slow_query_log is OFF, i.e. the slow query log is disabled.
set global slow_query_log=1;
    Enables the slow query log.
SHOW VARIABLES LIKE 'long_query_time%';
    Shows the slow query threshold, in seconds.
set long_query_time=1;
    Sets the slow query threshold, in seconds.
(2) For a permanent effect, modify the my.cnf configuration file under [mysqld]:
[mysqld]
slow_query_log=1
slow_query_log_file=/var/lib/mysql/atguigu-slow.log
long_query_time=3
log_output=FILE
(3) Run a long query, then open the slow query log to check it.

1.3 Log analysis tool mysqldumpslow

(1) see mysqldumpslow Help for
Command: mysqldumpslow --help

 (2) Common usage examples

Get the most returned recordset 10 individual SQL
mysqldumpslow -s r -t 10 /var/lib/mysql/atguigu-slow.log
The most visited 10 individual SQL
mysqldumpslow -s c -t 10 /var/lib/mysql/atguigu-slow.log
Get the top... In chronological order 10 There are left connected query statements in the bar
mysqldumpslow -s t -t 10 -g "left join" /var/lib/mysql/atguigu-slow.log
It is also recommended to combine these commands with | and more; otherwise a long result may flood the screen
mysqldumpslow -s r -t 10 /var/lib/mysql/atguigu-slow.log | more

2. SHOW PROCESSLIST

1.1 What is it?

Queries MySQL's process list; a problematic process can be killed from it

1.2 How to use it?
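The source leaves this subsection empty; a minimal sketch of typical usage follows (the id 4 passed to KILL is a made-up example — use the Id column from your own output):

```sql
-- List current connections/threads, the statement each is executing, and its state
SHOW PROCESSLIST;

-- Terminate a problematic connection by the Id shown in the first column
KILL 4;
```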

  Chapter 14 Views

1.1 What is it?

Encapsulates a query SQL statement as a virtual table.
The virtual table stores only the SQL logic; it does not store the query result.

1.2 Purpose

(1) Encapsulate complex SQL statements to improve reusability
(2) Keep the logic in the database, so updates do not require a new release; more flexible in the face of frequently changing requirements
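A minimal sketch of the idea (the table and view names below are made up for illustration): wrap a frequently used join in a view, which callers then query like an ordinary table.

```sql
-- Hypothetical example: encapsulate a join as a view; only the SQL logic is stored
CREATE VIEW v_emp_dept AS
SELECT e.id, e.name, d.deptName
FROM emp e
JOIN dept d ON e.deptId = d.id;

-- Callers query the view as if it were a table
SELECT * FROM v_emp_dept WHERE deptName = 'dev';
```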

Chapter 15 Master-Slave Replication

1.1 The fundamentals of replication

(1) The slave reads the master's binlog to synchronize data
(2) Three steps + schematic diagram

MySQL replication is a three-step process:
1. The master records changes in its binary log; these records are called binary log events.
2. The slave copies the master's binary log events to its relay log.
3. The slave replays the events in the relay log, applying the changes to its own database. MySQL replication is asynchronous and serialized.

1.2 The basic principle of reproduction

(1) Each slave has exactly one master
(2) Each slave must have a unique server ID
(3) Each master can have multiple slaves

1.3 The biggest problem with copying

Because replication involves multiple rounds of I/O, there is a replication delay

 1.4 Common one-master-one-slave configuration

(1) The MySQL versions must match, and MySQL must run as a background service
(2) Both master and slave are configured under the [mysqld] node, all lowercase
          On the (Windows) host, modify the my.ini configuration file

 

# The master server's unique id
server-id=1
# Enable binary logging
log-bin=<your own local path>/data/mysqlbin
log-bin=D:/devSoft/MySQLServer5.5/data/mysqlbin
# Databases NOT to replicate
binlog-ignore-db=mysql
# Databases to replicate
binlog-do-db=<name of the master database to replicate>
# binlog format
binlog_format=STATEMENT  (default)

When MySQL master-slave replication starts, the slave does not inherit the data already on the host.

(3) binlog formats
binlog_format=STATEMENT  (default; logs the SQL statements themselves)
binlog_format=ROW        (logs the changed rows)
binlog_format=MIXED      (MySQL switches between the two as appropriate)

 

(4) On the slave, modify the [mysqld] section of the my.cnf configuration file
# The slave's server id
server-id = 2
# Note: make sure it differs from the master's server-id = 1
# Configure the relay log
relay-log=mysql-relay

(5) Because the configuration files were modified, restart the background MySQL service on both the host and all slaves

(6) Turn off the firewall and security tools (Tencent PC Manager, etc.) on both host and slaves
(7) On the Windows host, create an account and grant it to the slave
# Create the user and grant replication privileges
GRANT REPLICATION SLAVE ON *.* TO '<backup account>'@'<slave database IP>' IDENTIFIED BY '123456';

(8) Query the master's status, and record the File and Position values

# Query the master's status
show master status;

  Do not run any further statements on the master after this step, to keep its status values from changing

(9) On the Linux slave, configure the master to replicate from
# Point the slave at the master
CHANGE MASTER TO MASTER_HOST='<master IP>',MASTER_USER='<created user name>',MASTER_PASSWORD='<created password>',
MASTER_LOG_FILE='<File value>',MASTER_LOG_POS=<Position value>;

 (10) Start replication on the slave

start slave;
show slave status\G

  When both of the following fields are Yes, the master-slave configuration succeeded!

Slave_IO_Running: Yes
Slave_SQL_Running: Yes

(11) Create a database, create a table, and insert records on the host; the slave replicates them

(12) To stop replication on the slave

stop slave;

Chapter 16 MYCAT

1.1 What is it?

A database middleware; its predecessor is Alibaba's Cobar

1.2 What it does

  1. Read/write splitting
  2. Data sharding: vertical splitting, horizontal splitting, and vertical + horizontal splitting
  3. Aggregating multiple data sources

1.3 MYCAT principle

"Intercept": the key verb in Mycat's principle is "intercept". It intercepts the SQL statement sent by the user, first performs some specific analysis on it (shard analysis, routing analysis, read/write-splitting analysis, cache analysis, etc.), then sends the SQL to the real backend database, post-processes the returned result, and finally returns it to the user.

  In this way the distributed database is decoupled from application code; the programmer cannot tell whether the backend is mycat or mysql.

1.4 Installation and startup

  1. Unzip the archive and copy it to /usr/local/ on Linux
  2. Three configuration files:

              schema.xml: defines logical databases, tables, shard nodes, etc.

              rule.xml: defines sharding rules

              server.xml: defines users and system variables such as the port

     3. Before starting, modify schema.xml

<?xml version="1.0"?>
<!DOCTYPE mycat:schema SYSTEM "schema.dtd">
<mycat:schema xmlns:mycat="http://io.mycat/">
	<!-- Logical schema: name; checkSQLschema; sqlMaxLimit: whether to append "limit xxx" at the end -->
<schema name="TESTDB" checkSQLschema="false" sqlMaxLimit="100" dataNode="dn1"> </schema>
	<!-- Data node: name; dataHost: which dataHost it references; database: the corresponding MySQL database -->
<dataNode name="dn1" dataHost="localhost1" database="db1" />
<dataHost name="localhost1" maxCon="1000" minCon="10" balance="0"
	writeType="0"	dbType="mysql"	dbDriver="native"	switchType="1"
slaveThreshold="100">
<heartbeat>select user()</heartbeat>
<!-- can have multi write hosts -->
	<writeHost host="hostM1" url="localhost:3306" user="root"	 
password="123456">
	</writeHost>	 
</dataHost>
</mycat:schema>

        4.  Revise server.xml

<user name="root">
<property name="password">654321</property>
<property name="schemas">TESTDB</property>
</user>

       5.  Start the program

Foreground (console) start: in the mycat/bin directory, run mycat console
Background start: in the mycat/bin directory, run mycat start

        7. An error may occur during startup: domain name resolution failure

Use vim to edit the /etc/hosts file and append your machine's hostname after the 127.0.0.1 entry.

  Restart the network service after the change.

 

        7. Background management window login  
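The source stops here; a minimal sketch of logging in, assuming the user/password/schema configured in server.xml above (root / 654321 / TESTDB) and Mycat's default ports (8066 for data, 9066 for management):

```shell
# Data port: query the logical schema TESTDB as if it were a normal MySQL server
mysql -uroot -p654321 -h127.0.0.1 -P8066 TESTDB

# Management port: monitoring/administration window
mysql -uroot -p654321 -h127.0.0.1 -P9066
```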


Copyright notice
This article was created by [Code knower]; please include the original link when reposting. Thanks.
https://yzsam.com/2022/02/202202221439119257.html