SQL Tuning Guide Notes 10: Optimizer Statistics Concepts
2022-06-12 21:39:00 【dingdingfish】
These are notes on Chapter 10, “Optimizer Statistics Concepts”, of the SQL Tuning Guide.
Important basic concepts
execution plan
The combination of steps used by the database to execute a SQL statement. Each step either retrieves rows of data physically from the database or prepares them for the session issuing the statement. You can override execution plans by using a hint.
extended statistics
A type of optimizer statistics that improves estimates for cardinality when multiple predicates exist or when predicates contain an expression.
cardinality
The number of rows that is expected to be or is returned by an operation in an execution plan.
synopsis
A set of auxiliary statistics gathered on a partitioned table when the INCREMENTAL value is set to true.
SQL compilation
In the context of Oracle SQL processing, this term refers collectively to the phases of parsing, optimization, and plan generation.
SQL profile
A set of auxiliary information built during automatic tuning of a SQL statement. A SQL profile is to a SQL statement what statistics are to a table. The optimizer can use SQL profiles to improve cardinality and selectivity estimates, which in turn leads the optimizer to select better plans.
automatic reoptimization
The ability of the optimizer to automatically change a plan on subsequent executions of a SQL statement. Automatic reoptimization can fix any suboptimal plan chosen due to incorrect optimizer estimates, from a suboptimal distribution method to an incorrect choice of degree of parallelism.
Oracle Database optimizer statistics describe details about the database and its objects.
10.1 Introduction to Optimizer Statistics
The optimizer cost model relies on statistics collected about the objects involved in a query and about the database and host where the query runs.
The optimizer uses statistics to estimate the number of rows (and the number of bytes) retrieved from a table, partition, or index. The optimizer estimates the cost of each access, determines the cost of the candidate plans, and then chooses the execution plan with the lowest cost.
Optimizer statistics include the following:
- Table statistics
  - Number of rows
  - Number of blocks
  - Average row length
- Column statistics
  - Number of distinct values (NDV) in a column
  - Number of nulls in a column
  - Data distribution (histogram)
  - Extended statistics
- Index statistics
  - Number of leaf blocks
  - Number of levels
  - Index clustering factor
- System statistics
  - I/O performance and utilization
  - CPU performance and utilization
As shown in Figure 10-1, the database stores optimizer statistics for tables, columns, indexes, and the system in the data dictionary. You can access these statistics using data dictionary views.
Note: Optimizer statistics are different from the performance statistics visible through V$ views.
10.2 About Optimizer Statistics Types
The optimizer collects statistics on different types of database objects and on characteristics of the database environment.
10.2.1 Table Statistics
Table statistics contain metadata that the optimizer uses when developing an execution plan.
10.2.1.1 Permanent Table Statistics
In Oracle Database, table statistics include information about rows and blocks.
The optimizer uses these statistics to determine the cost of table scans and table joins. The database tracks all relevant statistics about permanent tables. For example, table statistics stored in DBA_TAB_STATISTICS track the following:
- Number of rows
  The database uses the row count stored in DBA_TAB_STATISTICS when determining cardinality.
- Average row length
- Number of data blocks
  The optimizer uses the number of data blocks together with the DB_FILE_MULTIBLOCK_READ_COUNT initialization parameter to determine the base table access cost.
- Number of empty data blocks
DBMS_STATS.GATHER_TABLE_STATS commits before gathering statistics on permanent tables.
This example queries table statistics for the sh.customers table.
SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS,
EMPTY_BLOCKS, LAST_ANALYZED
FROM DBA_TAB_STATISTICS
WHERE OWNER='SH'
AND TABLE_NAME='CUSTOMERS';
NUM_ROWS AVG_ROW_LEN BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
55500 189 1551 0 28-MAY-22
10.2.1.2 Temporary Table Statistics
DBMS_STATS can gather statistics for both permanent and global temporary tables, but additional considerations apply to the latter.
10.2.1.2.1 Types of Temporary Tables
Temporary tables are classified as global, private, or cursor-duration.
In all types of temporary table, the data is visible only to the session that inserts it. The tables differ as follows:
- A global temporary table is an explicitly created persistent object that stores intermediate session-private data for a specific duration.
  The table is global because its definition is visible to all sessions. The ON COMMIT clause of CREATE GLOBAL TEMPORARY TABLE indicates whether the table is transaction-specific (DELETE ROWS) or session-specific (PRESERVE ROWS). Optimizer statistics for global temporary tables can be shared or session-specific.
- A private temporary table is an explicitly created object, defined by private memory-only metadata, that stores intermediate session-private data for a specific duration.
  The table is private because its definition is visible only to the session that created the table. The ON COMMIT clause of CREATE PRIVATE TEMPORARY TABLE indicates whether the table is transaction-specific (DROP DEFINITION) or session-specific (PRESERVE DEFINITION).
- A cursor-duration temporary table is an implicitly created memory-only object that is associated with a cursor.
  Unlike global and private temporary tables, cursor-duration temporary tables cannot have statistics gathered by DBMS_STATS.
The tables differ in where they store data, how they are created and dropped, and in the duration and visibility of their metadata. Note that the database allocates storage space when a session first inserts data into a global temporary table, not when the table is created.
The following table summarizes the important characteristics of temporary tables:
| Characteristic | Global Temporary Table | Private Temporary Table | Cursor-Duration Temporary Table |
|---|---|---|---|
| Visibility of data | Session that inserted the data | Session that inserted the data | Session that inserted the data |
| Storage of data | Persistent | Memory or temp files, but only for the duration of the session or transaction | Only in memory |
| Visibility of metadata | All sessions | Session that created the table (shown in the USER_PRIVATE_TEMP_TABLES view, which is based on a V$ view) | Session executing the cursor |
| Duration of metadata | Until the table is explicitly dropped | Until the table is explicitly dropped, or the end of the session (PRESERVE DEFINITION) or transaction (DROP DEFINITION) | Until the cursor ages out of the shared pool |
| Creation of table | CREATE GLOBAL TEMPORARY TABLE (supports AS SELECT) | CREATE PRIVATE TEMPORARY TABLE (supports AS SELECT) | Implicitly created when the optimizer considers it useful |
| Effect of creation on existing transactions | No implicit commit | No implicit commit | No implicit commit |
| Naming rules | Same as for permanent tables | Must begin with ORA$PTT_ | Internally generated unique name |
| Dropping of table | DROP GLOBAL TEMPORARY TABLE | DROP PRIVATE TEMPORARY TABLE, or implicitly dropped at the end of the session (PRESERVE DEFINITION) or transaction (DROP DEFINITION) | Implicitly dropped at the end of the session |
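As a minimal sketch of the two explicitly created table types, the statements below create one global and one private temporary table. The table and column names are hypothetical and chosen only for illustration; private temporary tables assume Oracle Database 18c or later and the default ORA$PTT_ prefix.

```sql
-- Hypothetical tables for illustration only.

-- Global temporary table: the definition is persistent and visible to all sessions.
-- ON COMMIT PRESERVE ROWS makes the data session-specific.
CREATE GLOBAL TEMPORARY TABLE gtt_order_stage (
  order_id NUMBER,
  amount   NUMBER
) ON COMMIT PRESERVE ROWS;

-- Private temporary table: the name must begin with the ORA$PTT_ prefix.
-- ON COMMIT DROP DEFINITION makes the table transaction-specific.
CREATE PRIVATE TEMPORARY TABLE ora$ptt_order_stage (
  order_id NUMBER,
  amount   NUMBER
) ON COMMIT DROP DEFINITION;

-- Metadata for private temporary tables created by the current session.
SELECT * FROM user_private_temp_tables;
```

Cursor-duration temporary tables have no DDL counterpart because the optimizer creates them implicitly.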
10.2.1.2.2 Statistics for Global Temporary Tables
DBMS_STATS gathers the same types of statistics for global temporary tables as for permanent tables.
Note: You cannot gather statistics for private temporary tables.
The following table shows how global temporary tables differ in the way the database gathers and stores optimizer statistics, depending on whether the table is scoped to a transaction or a session.
| Characteristic | Transaction-Specific | Session-Specific |
|---|---|---|
| Effect of DBMS_STATS gathering | Does not commit | Commits |
| Storage of statistics | Memory only | Dictionary tables |
| Histogram creation | Not supported | Supported |
The following procedures do not commit for transaction-specific temporary tables, so the rows in these tables are not deleted:
- GATHER_TABLE_STATS
- DELETE_obj_STATS, where obj is TABLE, COLUMN, or INDEX
- SET_obj_STATS, where obj is TABLE, COLUMN, or INDEX
- GET_obj_STATS, where obj is TABLE, COLUMN, or INDEX
The preceding program units observe the GLOBAL_TEMP_TABLE_STATS statistics preference (a DBMS_STATS preference rather than an initialization parameter; its default value is SESSION). For example, if the table preference is set to SESSION, then SET_TABLE_STATS sets the session statistics, and GATHER_TABLE_STATS preserves all rows in a transaction-specific temporary table. If the table preference is set to SHARED, however, then SET_TABLE_STATS sets the shared statistics, and GATHER_TABLE_STATS deletes all rows from a transaction-specific temporary table.
10.2.1.2.3 Shared and Session-Specific Statistics for Global Temporary Tables
Starting in Oracle Database 12c, you can set the table-level preference GLOBAL_TEMP_TABLE_STATS to make statistics on a global temporary table shared (SHARED) or session-specific (SESSION).
When GLOBAL_TEMP_TABLE_STATS is SESSION, you can gather optimizer statistics for a global temporary table in one session, and then use the statistics for that session only. Meanwhile, users can continue to maintain a shared version of the statistics. During optimization, the optimizer first checks whether the global temporary table has session-specific statistics. If so, the optimizer uses them. Otherwise, the optimizer uses shared statistics if they exist.
Note: In releases before Oracle Database 12c, the database did not maintain optimizer statistics for global temporary tables and non-global temporary tables differently. The database maintained one version of the statistics shared by all sessions, even though the data in different sessions could differ.
Session-specific optimizer statistics have the following characteristics:
- The dictionary views that track statistics show both the shared statistics and the session-specific statistics in the current session.
  The views are DBA_TAB_STATISTICS, DBA_IND_STATISTICS, DBA_TAB_HISTOGRAMS, and DBA_TAB_COL_STATISTICS (each view has a corresponding USER_ and ALL_ version). The SCOPE column shows whether statistics are session-specific or shared. Session-specific statistics must be stored in the data dictionary so that multiple processes can access them in Oracle RAC.
- CREATE ... AS SELECT automatically gathers optimizer statistics. When GLOBAL_TEMP_TABLE_STATS is set to SHARED, however, you must gather statistics manually using DBMS_STATS.
- Pending statistics are not supported.
- Other sessions do not share a cursor that uses session-specific statistics.
  Different sessions can share a cursor that uses shared statistics, as in releases earlier than Oracle Database 12c. The same session can share a cursor that uses session-specific statistics.
- By default, GATHER_TABLE_STATS for a temporary table immediately invalidates previous cursors compiled in the same session. However, this procedure does not invalidate cursors compiled in other sessions.
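The preference and the SCOPE column described above can be exercised roughly as follows. This is a sketch only: the table gtt_cust_stage is hypothetical, the EXEC lines assume SQL*Plus or a compatible client, and the INSERT assumes read access to the sh sample schema.

```sql
-- Hypothetical session-specific global temporary table.
CREATE GLOBAL TEMPORARY TABLE gtt_cust_stage (
  cust_id        NUMBER,
  cust_last_name VARCHAR2(40)
) ON COMMIT PRESERVE ROWS;

-- Show the preference currently in effect for this table.
SELECT DBMS_STATS.GET_PREFS('GLOBAL_TEMP_TABLE_STATS', USER, 'GTT_CUST_STAGE') AS gtt_pref
FROM   dual;

-- Request session-specific statistics explicitly.
EXEC DBMS_STATS.SET_TABLE_PREFS(USER, 'GTT_CUST_STAGE', 'GLOBAL_TEMP_TABLE_STATS', 'SESSION');

-- Load rows that are visible only to this session, then gather statistics.
INSERT INTO gtt_cust_stage
SELECT cust_id, cust_last_name FROM sh.customers;

EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'GTT_CUST_STAGE');

-- SCOPE reports whether each statistics row is SESSION or SHARED.
SELECT table_name, num_rows, scope
FROM   user_tab_statistics
WHERE  table_name = 'GTT_CUST_STAGE';
```

Setting the preference to SHARED instead would make the same GATHER_TABLE_STATS call maintain the single shared version of the statistics.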
10.2.2 Column Statistics
Column statistics track information about column values and data distribution.
The optimizer uses column statistics to generate accurate cardinality estimates and to make better decisions about index usage, join order, join methods, and so on. For example, statistics in DBA_TAB_COL_STATISTICS track the following:
- Number of distinct values
- Number of nulls
- High and low values
- Histogram-related information
The optimizer can also use extended statistics, which are a special type of column statistics. These statistics are useful for informing the optimizer of logical relationships among columns.
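A short sketch of how these statistics might be inspected, and how a column group (one form of extended statistics) could be created. The column pair used for the group is an illustrative assumption, not something these notes prescribe, and the calls assume sufficient privileges on the sh sample schema.

```sql
-- Basic column statistics for sh.customers.
SELECT column_name, num_distinct, num_nulls, histogram, num_buckets
FROM   dba_tab_col_statistics
WHERE  owner = 'SH'
AND    table_name = 'CUSTOMERS'
ORDER  BY column_name;

-- Create a column group on two columns assumed to be correlated (illustrative choice).
-- The function returns the system-generated extension name.
SELECT DBMS_STATS.CREATE_EXTENDED_STATS('SH', 'CUSTOMERS',
         '(CUST_STATE_PROVINCE, COUNTRY_ID)') AS extension_name
FROM   dual;

-- Regather table statistics so that the new column group has statistics.
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH', 'CUSTOMERS');

-- List the extensions defined on the table.
SELECT extension_name, extension
FROM   dba_stat_extensions
WHERE  owner = 'SH'
AND    table_name = 'CUSTOMERS';
```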
10.2.3 Index Statistics
Index statistics include information such as the number of index levels, the number of index blocks, and the relationship between the index and the data blocks. The optimizer uses these statistics to determine the cost of index scans.
10.2.3.1 Types of Index Statistics
The DBA_IND_STATISTICS view tracks index statistics.
The statistics include the following:
- Levels
  The BLEVEL column shows the number of blocks required to go from the root block to a leaf block. A B-tree index has two types of blocks: branch blocks for searching and leaf blocks that store values.
- Distinct keys
  This column tracks the number of distinct indexed values. If a unique constraint is defined, and if no NOT NULL constraint is defined, then this value equals the number of non-null values.
- Average number of leaf blocks for each distinct indexed key
- Average number of data blocks pointed to by each distinct indexed key
The following example queries statistics for two indexes on the CUSTOMERS table, which is known to have 1664 blocks.
SELECT INDEX_NAME, BLEVEL, LEAF_BLOCKS AS "LEAFBLK", DISTINCT_KEYS AS "DIST_KEY",
AVG_LEAF_BLOCKS_PER_KEY AS "LEAFBLK_PER_KEY",
AVG_DATA_BLOCKS_PER_KEY AS "DATABLK_PER_KEY"
FROM DBA_IND_STATISTICS
WHERE OWNER = 'SH'
AND INDEX_NAME IN ('CUST_LNAME_IX','CUSTOMERS_PK');
INDEX_NAME BLEVEL LEAFBLK DIST_KEY LEAFBLK_PER_KEY DATABLK_PER_KEY
-------------- ------ ------- -------- --------------- ---------------
CUSTOMERS_PK 1 115 55500 1 1
CUST_LNAME_IX 1 141 908 1 10
10.2.3.2 Index Clustering Factor
For a B-tree index, the index clustering factor measures the physical grouping of rows in relation to an index value, such as last name.
The index clustering factor helps the optimizer decide whether an index scan or a full table scan is more efficient for certain queries. A low clustering factor indicates an efficient index scan.
A clustering factor close to the number of blocks in the table indicates that the rows are physically ordered in the table blocks by the index key. If the database performs a full table scan, then it tends to retrieve the rows as they are stored on disk, sorted by the index key. A clustering factor close to the number of rows indicates that the rows are scattered randomly across the database blocks in relation to the index key. If the database performs a full table scan, then it does not retrieve the rows in any sorted order by this index key.
The clustering factor is a property of a specific index, not of a table. If a table has multiple indexes, then the clustering factor may be low for one index and high for another. An attempt to reorganize the table to improve the clustering factor of one index may degrade the clustering factor of another.
This example shows how the optimizer uses the index clustering factor to determine if using an index is more efficient than a full table scan .
SELECT table_name, num_rows, blocks
FROM user_tables
WHERE table_name='CUSTOMERS';
TABLE_NAME NUM_ROWS BLOCKS
------------------------------ ---------- ----------
CUSTOMERS 55500 1551
-- Create an index on the customers.cust_last_name column
CREATE INDEX CUSTOMERS_LAST_NAME_IDX ON customers(cust_last_name);
-- Query the index clustering factor of the new index
SELECT index_name, blevel, leaf_blocks, clustering_factor
FROM user_indexes
WHERE table_name='CUSTOMERS'
AND index_name= 'CUSTOMERS_LAST_NAME_IDX';
INDEX_NAME BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ ---------- ----------- -----------------
CUSTOMERS_LAST_NAME_IDX 1 141 9936
-- Create a new copy of the customers table, with rows ordered by cust_last_name
CREATE TABLE customers3 AS
SELECT *
FROM customers
ORDER BY cust_last_name;
-- Gather statistics on the customers3 table
EXEC DBMS_STATS.GATHER_TABLE_STATS(null,'CUSTOMERS3');
-- Query the number of rows and blocks in the customers3 table
SELECT TABLE_NAME, NUM_ROWS, BLOCKS
FROM USER_TABLES
WHERE TABLE_NAME='CUSTOMERS3';
TABLE_NAME NUM_ROWS BLOCKS
------------------------------ ---------- ----------
CUSTOMERS3 55500 1550
-- Create an index on the cust_last_name column of customers3
CREATE INDEX CUSTOMERS3_LAST_NAME_IDX ON customers3(cust_last_name);
-- Query the index clustering factor of the customers3_last_name_idx index
SELECT INDEX_NAME, BLEVEL, LEAF_BLOCKS, CLUSTERING_FACTOR
FROM USER_INDEXES
WHERE TABLE_NAME = 'CUSTOMERS3'
AND INDEX_NAME = 'CUSTOMERS3_LAST_NAME_IDX';
INDEX_NAME BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ ---------- ----------- -----------------
CUSTOMERS3_LAST_NAME_IDX 1 141 1516
-- Query the customers table and display the execution plan: the optimizer chooses a full table scan
SELECT cust_first_name, cust_last_name
FROM customers
WHERE cust_last_name BETWEEN 'Puleo' AND 'Quinn';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR());
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 423 (100)| |
|* 1 | TABLE ACCESS FULL| CUSTOMERS | 2335 | 35025 | 423 (1)| 00:00:01 |
-------------------------------------------------------------------------------
-- Query the customers3 table and display the execution plan: the optimizer uses the index
SELECT cust_first_name, cust_last_name
FROM customers3
WHERE cust_last_name BETWEEN 'Puleo' AND 'Quinn';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR());
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 71 (100)| |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| CUSTOMERS3 | 2335 | 35025 | 71 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | CUSTOMERS3_LAST_NAME_IDX | 2335 | | 7 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------
-- Query customers with a hint that forces the optimizer to use the index. The cost (425) is higher than that of the full table scan (423).
SELECT /*+ index (Customers CUSTOMERS_LAST_NAME_IDX) */ cust_first_name,
cust_last_name
FROM customers
WHERE cust_last_name BETWEEN 'Puleo' and 'Quinn';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR());
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 425 (100)| |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| CUSTOMERS | 2335 | 35025 | 425 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | CUSTOMERS_LAST_NAME_IDX | 2335 | | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------
-- Clean up
DROP TABLE customers3 PURGE;
DROP INDEX CUSTOMERS_LAST_NAME_IDX;
The preceding plans show that, for customers, the cost of using the index (425) is higher than the cost of the full table scan (423). Using an index therefore does not necessarily improve performance. The index clustering factor is a measure of whether an index scan is more efficient than a full table scan.
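As a rough cross-check that these notes do not themselves state, a widely quoted approximation for the I/O cost of an index range scan is blevel + leaf_blocks × selectivity + clustering_factor × selectivity. The sketch below plugs in the figures from the example, taking the selectivity as 2335/55500 from the plans. The rounding of each term is an assumption made for illustration, and the real cost model has more inputs (CPU cost and system statistics, among others).

```sql
-- Approximate index range scan cost, assuming:
--   cost ~ blevel + ROUND(leaf_blocks * sel) + ROUND(clustering_factor * sel)
-- with sel = estimated rows / table rows = 2335 / 55500.
SELECT 1 + ROUND(141 * 2335 / 55500) + ROUND(9936 * 2335 / 55500) AS est_cost_customers,
       1 + ROUND(141 * 2335 / 55500) + ROUND(1516 * 2335 / 55500) AS est_cost_customers3
FROM   dual;
```

With these inputs the two expressions evaluate to 425 and 71, in line with the plan costs shown above. The two indexes have the same BLEVEL and LEAF_BLOCKS, so the entire difference comes from the clustering factor term.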
10.2.3.3 Effect of Index Clustering Factor on Cost: Example
This example illustrates how the index clustering factor can influence the cost of table access.
Consider the following scenario:
- A table contains 9 rows that are stored in 3 data blocks.
- The col1 column currently stores the values A, B, and C.
- A nonunique index named col1_idx exists on col1 for this table.
Assume that the rows are stored in the data blocks as follows:
Block 1       Block 2       Block 3
-------       -------       -------
A  A  A       B  B  B       C  C  C
In this example, the index clustering factor for col1_idx is low. The rows that have the same indexed column value for col1 are in the same data block in the table. The cost of using an index range scan to return all rows with the value A is therefore low, because only one block in the table must be read.
Assume that the same rows are scattered across the data blocks as follows:
Block 1       Block 2       Block 3
-------       -------       -------
A  B  C       A  C  B       B  A  C
In this example, the index clustering factor for col1_idx is higher. The database must read all three blocks in the table to retrieve all rows with the value A in col1.
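The two layouts can be compared numerically with a simplified model of the clustering factor: walk the index entries in key order and count how often the table block changes from one entry to the next. The query below is a sketch of that model for the scattered layout; the inline data and the names scattered and index_walk are made up for the illustration, and the real computation works on rowids rather than on a blk column.

```sql
-- Simplified clustering factor for the scattered layout:
-- one (col1, blk) pair per row, blk being the data block that holds the row.
WITH scattered (col1, blk) AS (
  SELECT 'A', 1 FROM dual UNION ALL
  SELECT 'B', 1 FROM dual UNION ALL
  SELECT 'C', 1 FROM dual UNION ALL
  SELECT 'A', 2 FROM dual UNION ALL
  SELECT 'C', 2 FROM dual UNION ALL
  SELECT 'B', 2 FROM dual UNION ALL
  SELECT 'B', 3 FROM dual UNION ALL
  SELECT 'A', 3 FROM dual UNION ALL
  SELECT 'C', 3 FROM dual
),
index_walk AS (
  -- Index order: by key, then by block as a stand-in for rowid order.
  SELECT blk,
         LAG(blk) OVER (ORDER BY col1, blk) AS prev_blk
  FROM   scattered
)
-- Count the first entry plus every entry whose block differs from the previous one.
SELECT COUNT(*) AS clustering_factor
FROM   index_walk
WHERE  prev_blk IS NULL
OR     blk <> prev_blk;
```

For the scattered layout this returns 9, equal to the number of rows. Substituting the ordered layout (three A rows in block 1, three B rows in block 2, three C rows in block 3) gives 3, equal to the number of blocks.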
10.2.4 System Statistics
System statistics describe hardware characteristics such as I/O and CPU performance and utilization.
System statistics enable the query optimizer to estimate I/O and CPU costs more accurately when choosing an execution plan. When the system statistics are updated, the database does not invalidate previously parsed SQL statements. The database parses all new SQL statements using the new statistics.
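One way to look at the current system statistics is to query the dictionary table that stores them. This is a sketch that assumes a suitably privileged account; the notes themselves do not show this query.

```sql
-- Current system statistics: CPU speed, I/O seek time, transfer speed, and so on.
-- Workload values such as SREADTIM and MREADTIM are null until they are gathered.
SELECT pname, pval1
FROM   sys.aux_stats$
WHERE  sname = 'SYSSTATS_MAIN'
ORDER  BY pname;
```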
10.2.5 User-Defined Optimizer Statistics
The extensible optimizer enables authors of user-defined functions and indexes to create statistics collection, selectivity, and cost functions.
The optimizer cost model is extended to integrate information supplied by the user to assess CPU and I/O cost. Statistics types act as interfaces for user-defined functions that influence the choice of an execution plan. However, to use a statistics type, the optimizer requires a mechanism that binds the type to a database object such as a column, standalone function, object type, index, indextype, or package. The SQL statement ASSOCIATE STATISTICS provides this binding.
Functions for user-defined statistics are relevant for columns that use standard SQL data types and object types, and for domain indexes. When you associate a statistics type with a column or domain index, the database calls the statistics collection method in the statistics type whenever DBMS_STATS gathers statistics.
10.3 How the Database Gathers Optimizer Statistics
Oracle Database provides several mechanisms for gathering statistics.
10.3.1 DBMS_STATS Package
The DBMS_STATS PL/SQL package collects and manages optimizer statistics.
This package enables you to control what statistics are collected and how, including the degree of parallelism, sampling methods, and granularity of statistics collection in partitioned tables.
Note: Do not use the COMPUTE and ESTIMATE clauses of the ANALYZE statement to collect optimizer statistics. These clauses are deprecated. Use DBMS_STATS instead.
Creating accurate execution plans requires statistics gathered with the DBMS_STATS package. For example, the table statistics gathered by DBMS_STATS include the number of rows, the number of blocks, and the average row length.
By default, Oracle Database uses automatic optimizer statistics collection. In this case, the database automatically runs DBMS_STATS to collect optimizer statistics for all schema objects for which statistics are missing or stale. The process eliminates many manual tasks associated with managing the optimizer, and significantly reduces the risk of generating suboptimal execution plans because of missing or stale statistics. You can also update and manage optimizer statistics by running DBMS_STATS manually.
Oracle Database 19c introduces high-frequency automatic optimizer statistics collection. This lightweight task periodically gathers statistics for stale objects. The default interval is 15 minutes. In contrast to the automated statistics collection job, the high-frequency task does not perform actions such as purging statistics for nonexistent objects or invoking Optimizer Statistics Advisor. You can set preferences for the high-frequency task using the DBMS_STATS.SET_GLOBAL_PREFS procedure, and view metadata using DBA_AUTO_STAT_EXECUTIONS.
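The following sketch shows how the high-frequency task might be configured and monitored. The preference values are illustrative (900 seconds matches the 15-minute default interval mentioned above), the EXEC lines assume SQL*Plus, and whether the feature is available depends on the release and platform.

```sql
-- Enable the high-frequency task.
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_STATUS', 'ON');

-- Maximum run time of one execution, in seconds (illustrative value).
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_MAX_RUN_TIME', '600');

-- Interval between executions, in seconds (900 seconds = 15 minutes).
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_INTERVAL', '900');

-- Review recent automatic statistics executions.
SELECT opid, origin, status, start_time, end_time, completed, failed
FROM   dba_auto_stat_executions
ORDER  BY opid;
```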
10.3.2 Supplemental Dynamic Statistics
By default, when optimizer statistics are missing, stale, or insufficient, the database automatically gathers dynamic statistics during a parse. The database uses recursive SQL to scan a small random sample of table blocks.
Note: Dynamic statistics augment statistics rather than providing an alternative to them.
Dynamic statistics supplement optimizer statistics such as table and index block counts, table and join cardinalities (estimated number of rows), join column statistics, and GROUP BY statistics. This information helps the optimizer improve plans by making better estimates of predicate cardinality.
Dynamic statistics are beneficial in the following situations:
- An execution plan is suboptimal because of complex predicates.
- The sampling time is a small fraction of the total execution time for the query.
- The query executes many times, so that the sampling time is amortized.
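These notes do not cover how the sampling level is controlled. As a hedged sketch, the level can be raised for a session with the OPTIMIZER_DYNAMIC_SAMPLING parameter or for a single statement with the DYNAMIC_SAMPLING hint. The level 4 and the predicate values below are arbitrary choices for illustration.

```sql
-- Raise the dynamic statistics level for this session (illustrative level).
ALTER SESSION SET OPTIMIZER_DYNAMIC_SAMPLING = 4;

-- Alternatively, request dynamic statistics for one table in one statement.
SELECT /*+ DYNAMIC_SAMPLING(c 4) */ COUNT(*)
FROM   sh.customers c
WHERE  c.cust_city = 'Los Angeles'
AND    c.cust_state_province = 'CA';

-- The Note section of the plan reports whether dynamic statistics were used.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR());
```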
10.3.3 Online Statistics Gathering
In some circumstances, DDL and DML operations automatically trigger online statistics gathering.
10.3.3.1 Online Statistics Gathering for Bulk Loads
The database can gather table statistics automatically during the following types of bulk loads: INSERT INTO ... SELECT using a direct path insert, and CREATE TABLE AS SELECT.
By default, a parallel insert uses a direct path insert. You can force a direct path insert by using the /*+APPEND*/ hint.
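A minimal sketch of a direct path load into an empty table, followed by a check of whether statistics were gathered online. The table sales_stage is hypothetical and the example assumes read access to sh.sales.

```sql
-- Hypothetical empty staging table with the same shape as sh.sales.
CREATE TABLE sales_stage AS
SELECT * FROM sh.sales WHERE 1 = 0;

-- Direct path insert into the empty table.
INSERT /*+ APPEND */ INTO sales_stage
SELECT * FROM sh.sales;

COMMIT;

-- Table statistics should already be populated without a DBMS_STATS call.
SELECT table_name, num_rows, blocks, last_analyzed
FROM   user_tab_statistics
WHERE  table_name = 'SALES_STAGE';

-- NOTES shows STATS_ON_LOAD for column statistics gathered during the load.
SELECT column_name, num_distinct, notes
FROM   user_tab_col_statistics
WHERE  table_name = 'SALES_STAGE';
```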
10.3.3.1.1 Purpose of Online Statistics Gathering for Bulk Loads
Data warehouse applications typically load large amounts of data into the database. For example, a sales data warehouse might load data every day, week, or month.
In releases earlier than Oracle Database 12c, the best practice was to gather statistics manually after a bulk load. However, many applications did not gather statistics after the load, either through oversight or because they waited for the maintenance window to initiate collection. Missing statistics are a leading cause of suboptimal execution plans.
Automatic statistics gathering during bulk loads has the following benefits:
- Improved performance
  Gathering statistics during the load avoids an additional table scan to gather table statistics.
- Improved manageability
  No user intervention is required to gather statistics after a bulk load.
10.3.3.1.2 Global Statistics During Inserts into Partitioned Tables
When inserting rows into a partitioned table, the database gathers global statistics during the insert.
For example, if sales is a partitioned table, and if you run INSERT INTO sales SELECT, then the database gathers global statistics. However, the database does not gather partition-level statistics.
Assume instead that you use partition-extended syntax to insert rows into a particular partition or subpartition. The database gathers statistics on the partition during the insert. However, the database does not gather global statistics.
Assume that you run INSERT INTO sales PARTITION (sales_q4_2000) SELECT. The database gathers statistics during the insert. If the INCREMENTAL preference is enabled for sales, then the database also gathers a synopsis for sales_q4_2000. Statistics are immediately available after the insert. However, if you roll back the transaction, then the database automatically deletes the statistics gathered during the bulk load.
10.3.3.1.3 Histogram Creation After Bulk Loads
After gathering online statistics, the database does not automatically create histograms.
If histograms are required, then after the bulk load Oracle recommends running DBMS_STATS.GATHER_TABLE_STATS with options=>'GATHER AUTO'. For example:
EXEC DBMS_STATS.GATHER_TABLE_STATS(user, 'MYT', options=>'GATHER AUTO');
The preceding PL/SQL program gathers only missing or stale statistics. The database does not regather the table and basic column statistics that it collected during the bulk load.
Note: You can set the table preference OPTIONS to GATHER AUTO on the tables that you plan to bulk load. In this way, you do not need to set the options parameter explicitly when running GATHER_TABLE_STATS.
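The table preference mentioned in the note might be set as follows. This sketch reuses the MYT table name from the example above and assumes SQL*Plus for the EXEC lines.

```sql
-- Make GATHER AUTO the default gathering option for this table.
EXEC DBMS_STATS.SET_TABLE_PREFS(USER, 'MYT', 'OPTIONS', 'GATHER AUTO');

-- The call no longer needs an explicit options parameter.
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'MYT');

-- Check which columns now have histograms.
SELECT column_name, histogram, notes
FROM   user_tab_col_statistics
WHERE  table_name = 'MYT';
```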
10.3.3.1.4 Restrictions for Online Statistics Gathering for Bulk Loads
In certain cases, bulk loads do not automatically gather optimizer statistics.
Specifically, bulk loads do not gather statistics automatically when any of the following conditions applies to the target table, partition, or subpartition:
- The object contains data. Bulk loads gather online statistics automatically only when the object is empty.
- It is in an Oracle-owned schema such as SYS.
- It is one of the following types of tables: nested table, index-organized table (IOT), external table, or global temporary table defined as ON COMMIT DELETE ROWS.
  Note: The database does gather online statistics automatically for the internal partitions of a hybrid partitioned table.
- Its PUBLISH preference is set to FALSE.
- Its statistics are locked.
- It is loaded using a multitable INSERT statement.
10.3.3.1.5 User Interface for Online Statistics Gathering for Bulk Loads
By default, the database gathers statistics during bulk loads.
You can enable the feature at the statement level by using the GATHER_OPTIMIZER_STATISTICS hint, and disable it at the statement level by using the NO_GATHER_OPTIMIZER_STATISTICS hint. For example, the following statement disables online statistics gathering for bulk loads:
CREATE TABLE employees2 AS
SELECT /*+NO_GATHER_OPTIMIZER_STATISTICS*/ * FROM employees;
10.3.3.2 Online Statistics Gathering for Partition Maintenance Operations
Oracle Database provides analogous support for online statistics during specific partition maintenance operations.
For MOVE, COALESCE, and MERGE, the database maintains global and partition-level statistics as follows:
- Whether the partition uses incremental or non-incremental statistics, the database directly updates the BLOCKS value in the global table statistics. Note that this update is not a statistics gathering operation.
- The database generates fresh statistics for the resulting partition. If incremental statistics are enabled, then the database maintains partition synopses.
For TRUNCATE or DROP PARTITION, the database updates the BLOCKS and NUM_ROWS values in the global table statistics. The update does not require a statistics gathering operation. The statistics update occurs whether incremental or non-incremental statistics are used.
Note: The database does not maintain partition-level statistics for maintenance operations that have multiple destination segments.
10.3.3.3 Real-Time Statistics
Oracle Database can automatically gather real-time statistics during conventional DML operations.
10.3.3.3.1 Purpose of Real-Time Statistics
Online statistics, whether for bulk loads or conventional DML, are designed to reduce the likelihood that the optimizer is misled by stale statistics.
Oracle Database 12c introduced online statistics gathering for CREATE TABLE AS SELECT statements and direct-path inserts. Oracle Database 19c introduced real-time statistics, which extend online support to conventional DML statements. Because statistics can go stale between DBMS_STATS jobs, real-time statistics help the optimizer generate better plans.
Whereas bulk load operations gather all necessary statistics, real-time statistics augment rather than replace traditional statistics. You must therefore continue to gather statistics regularly with DBMS_STATS, preferably using the AutoTask job.
10.3.3.3.2 How Real-Time Statistics Work
Oracle Database dynamically computes the values of the most essential statistics during DML operations.
Consider a transaction that is currently adding tens of thousands of rows to the oe.orders table. Real-time statistics keep track of significant statistical changes, such as the maximum column value, which enables the optimizer to obtain more accurate cost estimates.
When real-time statistics change, existing cursors are not marked invalid.
10.3.3.3.2.1 Regression Models for Real-Time Statistics
Starting with release 21c, Oracle Database automatically builds regression models to predict the number of distinct values (NDV) for volatile tables. The models enable the optimizer to produce accurate NDV estimates at low cost.
Note: The time needed to build a regression model can vary. The first step in the process is to model how NDV changes over time, which depends on information about NDV changes derived from the statistics history. If the immediately available information is insufficient, model building stays pending until enough historical information has been collected.
Using DBMS_STATS to delete, export, and import regression models
The database builds regression models automatically as needed, with no DBA intervention. However, you can use DBMS_STATS to delete, import, or export regression models. The default stat_category includes the value MODELS in addition to the previously supported values OBJECT_STATS, SYNOPSES, and REALTIME_STATS. The relevant APIs are:
DBMS_STATS.DELETE_*_STATS
DBMS_STATS.EXPORT_*_STATS
DBMS_STATS.IMPORT_*_STATS
Dictionary views for inspecting real-time statistics models
Starting with Oracle Database 21c, the following dictionary views can be used to inspect saved real-time statistics models:
- ALL_TAB_COL_STAT_MODELS
- DBA_TAB_COL_STAT_MODELS
- USER_TAB_COL_STAT_MODELS
10.3.3.3.3 User Interface for Real-Time Statistics
You can manage and access real-time statistics using PL/SQL packages, data dictionary views, and hints.
OPTIMIZER_REAL_TIME_STATISTICS initialization parameter
When the OPTIMIZER_REAL_TIME_STATISTICS initialization parameter is set to TRUE, Oracle Database automatically gathers real-time statistics during conventional DML operations. The default is FALSE, which means that real-time statistics are disabled.
By default, DBMS_STATS subprograms include real-time statistics. You can also specify parameters to include only these statistics.
| Subprogram | Description |
|---|---|
| EXPORT_TABLE_STATS and EXPORT_SCHEMA_STATS | These subprograms enable you to export statistics. By default, the stat_category parameter includes real-time statistics. The REALTIME_STATS value specifies only real-time statistics. |
| IMPORT_TABLE_STATS and IMPORT_SCHEMA_STATS | These subprograms enable you to import statistics. By default, the stat_category parameter includes real-time statistics. The REALTIME_STATS value specifies only real-time statistics. |
| DELETE_TABLE_STATS and DELETE_SCHEMA_STATS | These subprograms enable you to delete statistics. By default, the stat_category parameter includes real-time statistics. The REALTIME_STATS value specifies only real-time statistics. |
| DIFF_TABLE_STATS_IN_STATTAB | This function compares table statistics from two sources. The statistics always include real-time statistics. |
| DIFF_TABLE_STATS_IN_HISTORY | This function compares table statistics as of two specified timestamps. The statistics always include real-time statistics. |
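As a rough illustration of the stat_category parameter described in the table, the following sketch exports and then deletes only the real-time statistics for SH.SALES. The staging table name RT_STATS_STAGE is a hypothetical name chosen for this example, and the session is assumed to have privileges on the SH schema.

```sql
-- Create a staging table to hold exported statistics.
-- RT_STATS_STAGE is a hypothetical name used only in this sketch.
EXEC DBMS_STATS.CREATE_STAT_TABLE(ownname => 'SH', stattab => 'RT_STATS_STAGE');

-- Export only the real-time statistics for SH.SALES.
BEGIN
  DBMS_STATS.EXPORT_TABLE_STATS(
    ownname       => 'SH',
    tabname       => 'SALES',
    stattab       => 'RT_STATS_STAGE',
    stat_category => 'REALTIME_STATS');
END;
/

-- Delete only the real-time statistics; the regular statistics stay in place.
BEGIN
  DBMS_STATS.DELETE_TABLE_STATS(
    ownname       => 'SH',
    tabname       => 'SALES',
    stat_category => 'REALTIME_STATS');
END;
/
```

Omitting stat_category leaves the default in effect, which, as the table notes, also covers real-time statistics.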
You can view real-time statistics in data dictionary views such as USER_TAB_STATISTICS and USER_TAB_COL_STATISTICS, where the NOTES column shows STATS_ON_CONVENTIONAL_DML, as described in the following table.
The DBA_* views have ALL_* and USER_* versions.
| View | Description |
|---|---|
| DBA_TAB_COL_STATISTICS | This view displays column statistics and histogram information extracted from DBA_TAB_COLUMNS. Real-time statistics are indicated by STATS_ON_CONVENTIONAL_DML in the NOTES column. |
| DBA_TAB_STATISTICS | This view displays optimizer statistics for the tables accessible to the current user. Real-time statistics are indicated by STATS_ON_CONVENTIONAL_DML in the NOTES column. |
The NO_GATHER_OPTIMIZER_STATISTICS hint prevents the collection of real-time statistics.
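A minimal sketch of the hint on a conventional insert follows. It assumes the sales table and the sales_orig backup copy created in the example later in this section; the ROWNUM filter is only there to keep the test load small.

```sql
-- Conventional insert that skips real-time statistics for this statement only.
-- sales_orig is the backup copy created in the example that follows.
INSERT /*+ NO_GATHER_OPTIMIZER_STATISTICS */ INTO sales
SELECT * FROM sales_orig WHERE ROWNUM <= 1000;

COMMIT;
```

With the hint in place, the plan for the statement would not be expected to contain an OPTIMIZER STATISTICS GATHERING row source.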
10.3.3.3.4 Real-Time Statistics: Example
In this example, a conventional INSERT statement triggers the collection of real-time statistics.
Before running this experiment, back up the sales table in the SH schema so that it can be restored afterward.
create table sales_orig as select * from sales;
Real-time statistics is an Exadata-only feature. On a non-Exadata test system, enable it with the following hidden parameter:
alter system set "_exadata_feature_on"=true scope=spfile;
shutdown immediate;
startup;
Also set the following parameter:
alter session set OPTIMIZER_REAL_TIME_STATISTICS=TRUE;
This example assumes that the sh user has been granted the DBA role and that you are logged in to the database as sh. Perform the following steps:
- Gather statistics for the sales table:
exec DBMS_STATS.GATHER_TABLE_STATS('SH', 'SALES');
- Query the column-level statistics for the sales table:
SET PAGESIZE 5000
SET LINESIZE 200
COL COLUMN_NAME FORMAT a13
COL LOW_VALUE FORMAT a14
COL HIGH_VALUE FORMAT a14
COL NOTES FORMAT a5
COL PARTITION_NAME FORMAT a13
-- The NOTES column is empty, which indicates that real-time statistics have not been gathered yet
SELECT COLUMN_NAME, LOW_VALUE, HIGH_VALUE, SAMPLE_SIZE, NOTES
FROM USER_TAB_COL_STATISTICS
WHERE TABLE_NAME = 'SALES'
ORDER BY 1, 5;
COLUMN_NAME LOW_VALUE HIGH_VALUE SAMPLE_SIZE NOTES
------------- -------------- -------------- ----------- -----
AMOUNT_SOLD C10729 C2125349 918843
CHANNEL_ID C103 C10A 918843
CUST_ID C103 C30B0B 918843
PROD_ID C10E C20231 918843
PROMO_ID C122 C20A64 918843
QUANTITY_SOLD C102 C102 918843
TIME_ID 77C60101010101 78650C1F010101 918843
- Query the table-level statistics for the sales table:
SELECT NVL(PARTITION_NAME, 'GLOBAL') PARTITION_NAME, NUM_ROWS, BLOCKS, NOTES
FROM USER_TAB_STATISTICS
WHERE TABLE_NAME = 'SALES'
ORDER BY 1, 4;
-- The NOTES column is empty, which indicates that real-time statistics have not been gathered yet
PARTITION_NAM NUM_ROWS BLOCKS NOTES
------------- ---------- ---------- -----
GLOBAL 918843 1874
SALES_1995 0 0
SALES_1996 0 0
SALES_H1_1997 0 0
SALES_H2_1997 0 0
SALES_Q1_1998 43687 97
SALES_Q1_1999 64186 126
SALES_Q1_2000 62197 125
SALES_Q1_2001 60608 124
SALES_Q1_2002 0 0
SALES_Q1_2003 0 0
SALES_Q2_1998 35758 86
SALES_Q2_1999 54233 110
SALES_Q2_2000 55515 114
SALES_Q2_2001 63292 124
SALES_Q2_2002 0 0
SALES_Q2_2003 0 0
SALES_Q3_1998 50515 103
SALES_Q3_1999 67138 128
SALES_Q3_2000 58950 118
SALES_Q3_2001 65769 130
SALES_Q3_2002 0 0
SALES_Q3_2003 0 0
SALES_Q4_1998 48874 116
SALES_Q4_1999 62388 121
SALES_Q4_2000 55984 112
SALES_Q4_2001 69749 140
SALES_Q4_2002 0 0
SALES_Q4_2003 0 0
29 rows selected.
- Use a conventional INSERT statement to load 918,843 rows into sales:
INSERT INTO sales(prod_id, cust_id, time_id, channel_id, promo_id,
quantity_sold, amount_sold)
SELECT prod_id, cust_id, time_id, channel_id, promo_id,
quantity_sold * 2, amount_sold * 2
FROM sales;
COMMIT;
- Get the execution plan from the cursor:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format=>'TYPICAL'));
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | 4381 (100)| | | |
| 1 | LOAD TABLE CONVENTIONAL | SALES | | | | | | |
| 2 | OPTIMIZER STATISTICS GATHERING | | 918K| 25M| 4381 (1)| 00:00:01 | | |
| 3 | PARTITION RANGE ALL | | 918K| 25M| 4381 (1)| 00:00:01 | 1 | 28 |
| 4 | TABLE ACCESS FULL | SALES | 918K| 25M| 4381 (1)| 00:00:01 | 1 | 28 |
----------------------------------------------------------------------------------------------------------
Note the OPTIMIZER STATISTICS GATHERING operation in the plan.
- For testing purposes, force the database to write optimizer statistics to the data dictionary immediately:
EXEC DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;
- Query the column-level statistics for the sales table again. The NOTES column now contains information:
COLUMN_NAME LOW_VALUE HIGH_VALUE SAMPLE_SIZE NOTES
------------- -------------- -------------- ----------- -------------------------
AMOUNT_SOLD C10729 C224422D 9070 STATS_ON_CONVENTIONAL_DML
AMOUNT_SOLD C10729 C2125349 918843
CHANNEL_ID C103 C10A 9070 STATS_ON_CONVENTIONAL_DML
CHANNEL_ID C103 C10A 918843
CUST_ID C103 C30B0B 9070 STATS_ON_CONVENTIONAL_DML
CUST_ID C103 C30B0B 918843
PROD_ID C10E C20231 9070 STATS_ON_CONVENTIONAL_DML
PROD_ID C10E C20231 918843
PROMO_ID C122 C20A64 9070 STATS_ON_CONVENTIONAL_DML
PROMO_ID C122 C20A64 918843
QUANTITY_SOLD C102 C103 9070 STATS_ON_CONVENTIONAL_DML
QUANTITY_SOLD C102 C102 918843
TIME_ID 77C60101010101 78650C1F010101 9070 STATS_ON_CONVENTIONAL_DML
TIME_ID 77C60101010101 78650C1F010101 918843
14 rows selected.
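The LOW_VALUE and HIGH_VALUE columns are stored as RAW, which makes the output above hard to read. For the NUMBER columns, a query along the following lines could decode them; it is a sketch that uses UTL_RAW.CAST_TO_NUMBER and is restricted to two numeric columns, because TIME_ID is a DATE and would need different handling.

```sql
-- Decode the raw low/high values of two NUMBER columns into readable numbers.
-- TIME_ID is a DATE column and is deliberately excluded.
SELECT column_name,
       UTL_RAW.CAST_TO_NUMBER(low_value)  AS low_num,
       UTL_RAW.CAST_TO_NUMBER(high_value) AS high_num,
       sample_size,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'SALES'
AND    column_name IN ('QUANTITY_SOLD', 'AMOUNT_SOLD')
ORDER  BY column_name, notes;
```

In Oracle's internal NUMBER encoding, C102 is 1 and C103 is 2, so the real-time row for QUANTITY_SOLD reflects the doubled values inserted by the quantity_sold * 2 expression.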
- Query the table-level statistics for the sales table again. The NOTES column now contains information for some rows:
PARTITION_NAM NUM_ROWS BLOCKS NOTES
------------- ---------- ---------- -------------------------
GLOBAL 1837686 16096 STATS_ON_CONVENTIONAL_DML
GLOBAL 918843 16096
SALES_1995 0 0
SALES_1996 0 0
SALES_H1_1997 0 0
SALES_H2_1997 0 0
SALES_Q1_1998 43687 1006
SALES_Q1_1999 64186 1006
SALES_Q1_2000 62197 1006
SALES_Q1_2001 60608 1006
SALES_Q1_2002 0 0
SALES_Q1_2003 0 0
SALES_Q2_1998 35758 1006
SALES_Q2_1999 54233 1006
SALES_Q2_2000 55515 1006
SALES_Q2_2001 63292 1006
SALES_Q2_2002 0 0
SALES_Q2_2003 0 0
SALES_Q3_1998 50515 1006
SALES_Q3_1999 67138 1006
SALES_Q3_2000 58950 1006
SALES_Q3_2001 65769 1006
SALES_Q3_2002 0 0
SALES_Q3_2003 0 0
SALES_Q4_1998 48874 1006
SALES_Q4_1999 62388 1006
SALES_Q4_2000 55984 1006
SALES_Q4_2001 69749 1006
SALES_Q4_2002 0 0
SALES_Q4_2003 0 0
30 rows selected.
- Execute the sample query
SELECT COUNT(*) FROM sales WHERE quantity_sold > 50;
- View the execution plan and pay attention to the Note section:
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 4398 (100)| | | |
| 1 | SORT AGGREGATE | | 1 | 3 | | | | |
| 2 | PARTITION RANGE ALL| | 1 | 3 | 4398 (1)| 00:00:01 | 1 | 28 |
|* 3 | TABLE ACCESS FULL | SALES | 1 | 3 | 4398 (1)| 00:00:01 | 1 | 28 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("QUANTITY_SOLD">50)
Note
-----
- dynamic statistics used: statistics for conventional DML
- Restore the environment:
truncate table sales;
insert into sales select * from sales_orig;
commit;
alter session set OPTIMIZER_REAL_TIME_STATISTICS=FALSE;
alter system set "_exadata_feature_on"=false scope=spfile;
shutdown immediate;
startup;
This experiment is based on the following links:
- https://blogs.oracle.com/optimizer/post/optimizer-real-time-statistics-parameter-in-ru-1910-onwards
- https://oracle-base.com/articles/19c/real-time-statistics-19c
10.4 When the Database Gathers Optimizer Statistics
The database gathers optimizer statistics from different sources at different times.
10.4.1 Sources for Optimizer Statistics
The optimizer uses several different sources of optimizer statistics.
The sources are as follows:
- DBMS_STATS execution, automatic or manual
This PL/SQL package is the primary means of gathering optimizer statistics.
- SQL compilation
During SQL compilation, the database can augment the statistics previously gathered by DBMS_STATS. In this stage, the database runs additional queries to obtain more accurate information about how many rows in the tables satisfy the WHERE clause predicates in the SQL statement.
- SQL execution
During execution, the database can further augment previously gathered statistics. In this stage, Oracle Database collects the number of rows produced by each row source during the execution of a SQL statement. At the end of execution, the optimizer determines whether the estimated number of rows is inaccurate enough to warrant reparsing at the next statement execution. If the cursor is marked for reparsing, then the optimizer uses actual row counts from the previous execution instead of estimates.
- SQL profiles
A SQL profile is a collection of auxiliary statistics on a query. The profile stores these supplemental statistics in the data dictionary. The optimizer uses SQL profiles during optimization to determine the most optimal plan.
10.4.2 SQL Plan Directives
A SQL plan directive is additional information and instructions that the optimizer can use to generate a more optimal plan.
The directive is a "note to self" by the optimizer that it is misestimating the cardinality of certain types of predicates, and also a reminder to DBMS_STATS to gather the statistics needed to correct the misestimates in the future. For example, when joining two tables that have a data skew in their join columns, a SQL plan directive can direct the optimizer to use dynamic statistics to obtain a more accurate join cardinality estimate.
10.4.2.1 When the Database Creates SQL Plan Directives
The database creates SQL plan directives automatically based on information learned during automatic reoptimization. If a cardinality misestimate occurs during SQL execution, then the database creates SQL plan directives.
For each new directive, the DBA_SQL_PLAN_DIRECTIVES.STATE column shows the value USABLE. This value indicates that the database can use the directive to correct misestimates.
The optimizer defines a SQL plan directive on a query expression, for example, filter predicates on two columns that are used together. A directive is not tied to a specific SQL statement or SQL ID, so the optimizer can use directives for statements that are not identical. For example, directives can help the optimizer with queries that use similar patterns, such as queries that are identical except for a select list item.
The Note section of the execution plan indicates the number of SQL plan directives used for a statement. Obtain more information about the directives by querying the DBA_SQL_PLAN_DIRECTIVES and DBA_SQL_PLAN_DIR_OBJECTS views.
10.4.2.2 How the Database Uses SQL Plan Directives
When compiling a SQL statement, if the optimizer sees a directive, then it obeys the directive by gathering additional information.
The optimizer uses directives in the following ways:
- Dynamic statistics
The optimizer uses dynamic statistics whenever it does not have sufficient statistics corresponding to the directive. For example, cardinality estimates for a query whose predicate contains a specific pair of columns may be significantly wrong. A SQL plan directive indicates that whenever a query containing these columns is parsed, the optimizer needs to use dynamic sampling to avoid a serious cardinality misestimate.
Dynamic statistics have some performance overhead. Every time the optimizer hard parses a query to which a dynamic statistics directive applies, the database must perform the extra sampling.
Starting with Oracle Database 12c Release 2 (12.2), the database writes statistics from adaptive dynamic sampling to the SQL plan directives store, making them available to other queries.
- Column groups
The optimizer examines the query corresponding to the directive. If a column group is missing, and if the DBMS_STATS preference AUTO_STAT_EXTENSIONS is set to ON for the affected table (the default is OFF), then the optimizer automatically creates this column group the next time DBMS_STATS gathers statistics on the table. Otherwise, the optimizer does not automatically create the column group.
If a column group exists, then the next time this statement executes, the optimizer uses the column group statistics in place of the SQL plan directive when possible (equality predicates, GROUP BY, and so on). In subsequent executions, the optimizer may create additional SQL plan directives to address other problems in the plan, such as join or GROUP BY cardinality misestimates.
Note: Currently, the optimizer monitors only column groups. The optimizer does not create an extension on expressions.
When the problem that occasioned a directive is solved, either because a better directive exists or because a histogram or extension exists, the DBA_SQL_PLAN_DIRECTIVES.STATE value changes from USABLE to SUPERSEDED. More information about the directive state is exposed in the DBA_SQL_PLAN_DIRECTIVES.NOTES column.
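The AUTO_STAT_EXTENSIONS preference mentioned above could be checked and switched on for a single table roughly as follows. This is a sketch that assumes the SH.CUSTOMERS table used in the later example and a session connected as sh.

```sql
-- Check the current preference for SH.CUSTOMERS (OFF by default).
SELECT DBMS_STATS.GET_PREFS('AUTO_STAT_EXTENSIONS', 'SH', 'CUSTOMERS') AS auto_stat_ext
FROM   dual;

-- Allow SQL plan directives to trigger automatic column group creation
-- for this table only.
EXEC DBMS_STATS.SET_TABLE_PREFS('SH', 'CUSTOMERS', 'AUTO_STAT_EXTENSIONS', 'ON');

-- Column groups requested by directives are created during the next gather.
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH', 'CUSTOMERS');

-- List the extensions that now exist on the table.
SELECT extension_name, extension, creator
FROM   user_stat_extensions
WHERE  table_name = 'CUSTOMERS';
```

Setting the preference at table level rather than globally keeps the change limited to the table whose directives are being investigated.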
10.4.2.3 SQL Plan Directive Maintenance
The database creates SQL plan directives automatically. You cannot create them manually.
The database initially creates directives in the shared pool and periodically writes them to the SYSAUX tablespace. The database automatically purges any SQL plan directive that has not been used within the specified number of weeks (SPD_RETENTION_WEEKS), which is 53 by default.
You can manage directives by using the DBMS_SPD package. For example, you can:
- Enable and disable SQL plan directives (ALTER_SQL_PLAN_DIRECTIVE)
- Change the retention period for SQL plan directives (SET_PREFS)
- Export directives to a staging table (PACK_STGTAB_DIRECTIVE)
- Drop a directive (DROP_SQL_PLAN_DIRECTIVE)
- Force the database to write directives to disk (FLUSH_SQL_PLAN_DIRECTIVE)
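Several of the DBMS_SPD operations listed above can be sketched in SQL*Plus as follows. The retention value of 26 weeks is an arbitrary illustration, and &directive_id is a SQL*Plus substitution variable standing in for a DIRECTIVE_ID taken from DBA_SQL_PLAN_DIRECTIVES.

```sql
-- Show the current retention period, in weeks.
SELECT DBMS_SPD.GET_PREFS('SPD_RETENTION_WEEKS') AS retention_weeks FROM dual;

-- Change the retention period (26 is an arbitrary example value).
EXEC DBMS_SPD.SET_PREFS('SPD_RETENTION_WEEKS', '26');

-- Force directives held in memory to be written to the SYSAUX tablespace.
EXEC DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE;

-- Disable one directive, then drop it.
-- &directive_id is a placeholder supplied at run time.
EXEC DBMS_SPD.ALTER_SQL_PLAN_DIRECTIVE(&directive_id, 'ENABLED', 'NO');
EXEC DBMS_SPD.DROP_SQL_PLAN_DIRECTIVE(&directive_id);
```

Disabling a directive before dropping it makes it possible to check the effect on plans while the directive can still be re-enabled.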
10.4.2.4 How the Optimizer Uses SQL Plan Directives: Example
This example shows how the database automatically creates and uses SQL plan directives for SQL statements.
Assumptions: You plan to run queries against the sh schema, and you have privileges on this schema and on the data dictionary and V$ views.
- Query the sh.customers table:
SELECT /*+gather_plan_statistics*/ *
FROM customers
WHERE cust_state_province='CA'
AND country_id=52790;
The gather_plan_statistics hint shows the actual number of rows returned from each operation in the plan. Thus, you can compare the optimizer estimates with the actual number of rows returned.
- Query the plan for the preceding query with DBMS_XPLAN.DISPLAY_CURSOR (FORMAT=>'ALLSTATS LAST'):
SQL_ID ayd76b1zycdwr, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * FROM customers WHERE
cust_state_province = 'CA' AND country_id = 52790
Plan hash value: 2008213504
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 3341 |00:00:00.01 | 1529 | 1512 |
|* 1 | TABLE ACCESS FULL| CUSTOMERS | 1 | 20 | 3341 |00:00:00.01 | 1529 | 1512 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790))
19 rows selected.
The actual number of rows (A-Rows) returned by each operation in the plan differs greatly from the estimate (E-Rows). This statement is a candidate for automatic reoptimization.
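One common reason for such a gap is that, without column group statistics, the optimizer tends to treat the two predicates as independent and multiplies their individual selectivities. The following sketch reproduces that arithmetic by hand; it assumes each column's selectivity is simply 1/NDV (no histograms), which may not match the optimizer's exact calculation.

```sql
-- Estimate the row count as if the two columns were independent:
--   num_rows * (1 / ndv_state) * (1 / ndv_country)
SELECT COUNT(*)                                   AS num_rows,
       COUNT(DISTINCT cust_state_province)        AS ndv_state,
       COUNT(DISTINCT country_id)                 AS ndv_country,
       ROUND(COUNT(*) /
             (COUNT(DISTINCT cust_state_province) *
              COUNT(DISTINCT country_id)))        AS est_rows_if_independent
FROM   customers;
```

If est_rows_if_independent comes out close to the E-Rows value in the plan, the misestimate is likely explained by the correlation between CUST_STATE_PROVINCE and COUNTRY_ID, because every 'CA' row belongs to the same country.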
- Check whether the customers query can be reoptimized.
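One way to perform this check, reusing the V$SQL query that appears later in this example, is sketched below. The LIKE pattern assumes the statement text begins exactly as written in the first step.

```sql
-- A value of Y in IS_REOPTIMIZABLE marks the cursor as a candidate
-- for automatic reoptimization on its next execution.
SELECT SQL_ID, CHILD_NUMBER, SQL_TEXT, IS_REOPTIMIZABLE
FROM   V$SQL
WHERE  SQL_TEXT LIKE 'SELECT /*+gather_plan_statistics*/%';
```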
- Display the directives for the sh schema.
EXEC DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE;
SELECT TO_CHAR(d.DIRECTIVE_ID) dir_id, o.OWNER AS "OWN", o.OBJECT_NAME AS "OBJECT",
o.SUBOBJECT_NAME col_name, o.OBJECT_TYPE, d.TYPE, d.STATE, d.REASON
FROM DBA_SQL_PLAN_DIRECTIVES d, DBA_SQL_PLAN_DIR_OBJECTS o
WHERE d.DIRECTIVE_ID=o.DIRECTIVE_ID
AND o.OWNER IN ('SH')
ORDER BY 1,2,3,4,5;
DIR_ID OWN OBJECT COL_NAME OBJECT_TYPE TYPE STATE REASON
14911729199042574566 SH CUSTOMERS COUNTRY_ID COLUMN DYNAMIC_SAMPLING USABLE SINGLE TABLE CARDINALITY MISESTIMATE
14911729199042574566 SH CUSTOMERS CUST_STATE_PROVINCE COLUMN DYNAMIC_SAMPLING USABLE SINGLE TABLE CARDINALITY MISESTIMATE
14911729199042574566 SH CUSTOMERS TABLE DYNAMIC_SAMPLING USABLE SINGLE TABLE CARDINALITY MISESTIMATE
Initially, the database stores SQL plan directives in memory and then writes them to disk every 15 minutes. For this reason, the preceding example calls DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE to force the database to write the directives to the SYSAUX tablespace.
Monitor directives using the views DBA_SQL_PLAN_DIRECTIVES and DBA_SQL_PLAN_DIR_OBJECTS. Three entries appear in the views: one for the customers table itself and one for each of the correlated columns. Because the customers query has an IS_REOPTIMIZABLE value of Y, if you reexecute the statement, then the database hard parses it again and generates a plan based on the previous execution statistics.
- Execute the query again:
SELECT /*+gather_plan_statistics*/ *
FROM customers
WHERE cust_state_province='CA'
AND country_id=52790;
- View the execution plan:
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 3341 |00:00:00.01 | 1528 |
|* 1 | TABLE ACCESS FULL| CUSTOMERS | 1 | 3341 | 3341 |00:00:00.01 | 1528 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790))
Note
-----
- statistics feedback used for this statement
The Note section indicates that the database used reoptimization for this statement. The estimated number of rows (E-Rows) is now correct. The SQL plan directive has not been used yet.
- Query the cursors for the customers query:
SELECT SQL_ID, CHILD_NUMBER, SQL_TEXT, IS_REOPTIMIZABLE
FROM V$SQL
WHERE SQL_TEXT LIKE 'SELECT /*+gather_plan_statistics*/%';
SQL_ID CHILD_NUMBER SQL_TEXT IS_REOPTIMIZABLE
3q5u7q9vq52xm 0 SELECT /*+gather_plan_statistics*/ * FROM customers WHERE cust_state_province='CA' AND country_id='US' N
3q5u7q9vq52xm 1 SELECT /*+gather_plan_statistics*/ * FROM customers WHERE cust_state_province='CA' AND country_id='US' N
A new plan exists for the customers query, and also a new child cursor.
- Confirm that a SQL plan directive exists and can be used for other statements. The following query is similar to the original but filters on a different state:
SELECT /*+gather_plan_statistics*/ CUST_EMAIL
FROM CUSTOMERS
WHERE CUST_STATE_PROVINCE='MA'
AND COUNTRY_ID=52790;
- Query the plan in the cursor .
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(FORMAT=>'ALLSTATS LAST'));
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 181 |00:00:00.01 | 1522 |
|* 1 | TABLE ACCESS FULL| CUSTOMERS | 1 | 20 | 181 |00:00:00.01 | 1522 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(("CUST_STATE_PROVINCE"='MA' AND "COUNTRY_ID"=52790))
19 rows selected.
10.4.2.5 How the Optimizer Uses Extensions and SQL Plan Directives: Example
This example shows how the database uses a SQL plan directive until the optimizer verifies that an extension exists and that the statistics are applicable.
At this point, the directive changes its state to SUPERSEDED. Subsequent compilations use the statistics instead of the directive.
This experiment builds on the preceding one, so only a brief overview is given here.
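The steps are only summarized above, so the following is a rough sketch of how the state change could be observed. It creates the column group manually with DBMS_STATS.CREATE_EXTENDED_STATS, which is a swapped-in shortcut; the automatic path relies on SQL plan directives and the AUTO_STAT_EXTENSIONS preference. The sketch assumes the SH.CUSTOMERS table and the directive from the previous example.

```sql
-- Manually create a column group on the two correlated columns.
SELECT DBMS_STATS.CREATE_EXTENDED_STATS('SH', 'CUSTOMERS',
         '(CUST_STATE_PROVINCE, COUNTRY_ID)') AS extension_name
FROM   dual;

-- Gather statistics so that the new column group has statistics.
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH', 'CUSTOMERS');

-- After the customers queries have been executed again, flush the
-- directives to disk and check their state.
EXEC DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE;

SELECT TO_CHAR(d.directive_id) AS dir_id, d.state, d.reason
FROM   dba_sql_plan_directives d
WHERE  d.directive_id IN (SELECT o.directive_id
                          FROM   dba_sql_plan_dir_objects o
                          WHERE  o.owner = 'SH'
                          AND    o.object_name = 'CUSTOMERS');
```

The STATE column would be expected to move from USABLE to SUPERSEDED only after the optimizer has recompiled a matching statement and verified that the column group statistics apply.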
10.4.3 When the Database Samples Data
Starting with Oracle Database 12c, the optimizer automatically decides whether dynamic statistics are useful and which sample size to use for all SQL statements.
Note: In earlier releases, dynamic statistics were called dynamic sampling.
The primary factor in deciding whether to use dynamic statistics is whether the available statistics are sufficient to generate an optimal plan. If the statistics are insufficient, then the optimizer uses dynamic statistics.
Automatic dynamic statistics are enabled when the OPTIMIZER_DYNAMIC_SAMPLING initialization parameter is not set to 0. By default, the dynamic statistics level is set to 2.
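The parameter can be inspected and changed at session level, and a statement-level hint is also available. The following SQL*Plus sketch assumes the sh.customers table from the earlier example; the hint level of 4 is an arbitrary illustration.

```sql
-- SQL*Plus command: show the current dynamic statistics level.
SHOW PARAMETER optimizer_dynamic_sampling

-- Disable dynamic statistics for the current session.
ALTER SESSION SET OPTIMIZER_DYNAMIC_SAMPLING = 0;

-- Restore the default level.
ALTER SESSION SET OPTIMIZER_DYNAMIC_SAMPLING = 2;

-- Request dynamic statistics at level 4 for one table in one statement.
SELECT /*+ DYNAMIC_SAMPLING(c 4) */ COUNT(*)
FROM   customers c
WHERE  cust_state_province = 'CA'
AND    country_id = 52790;
```

Using the hint limits the sampling overhead to the statements that need it instead of raising the level for the whole session.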
In general, the optimizer uses default statistics rather than dynamic statistics to compute the statistics needed during optimization on tables, indexes, and columns. The optimizer decides whether to use dynamic statistics based on several factors, including:
- The SQL statement uses parallel execution.
- A SQL plan directive exists.
The following figure illustrates the process of gathering dynamic statistics.
As shown in the figure, the optimizer automatically gathers dynamic statistics in the following cases:
- Missing statistics
When tables in a query have no statistics, the optimizer gathers basic statistics on these tables before optimization. Statistics can be missing because the application creates new objects without a follow-up call to DBMS_STATS to gather statistics, or because statistics were locked on an object before statistics were gathered.
In this case, the statistics are not as high-quality or as complete as the statistics gathered using the DBMS_STATS package. This trade-off is made to limit the impact on the compile time of the statement.
- Insufficient statistics
Statistics can be insufficient whenever the optimizer estimates the selectivity of predicates (filter or join) or the GROUP BY clause without taking into account correlation between columns, skew in the column data distribution, statistics on expressions, and so on.
Extended statistics help the optimizer obtain accurate cardinality estimates for complex predicate expressions. The optimizer can use dynamic statistics to compensate for the lack of extended statistics, or when it cannot use extended statistics, for example, for non-equality predicates.
Note: The database does not use dynamic statistics for queries that contain the AS OF clause.
10.4.4 How the Database Samples Data
At the beginning of optimization, when deciding whether a table is a candidate for dynamic statistics, the optimizer checks for the existence of persistent SQL plan directives on the table.
For each directive, the optimizer registers a statistics expression that it computes when determining the cardinality of a predicate involving the table. In Figure 10-2, the database issues a recursive SQL statement to scan a small random sample of the table blocks. The database applies the relevant single-table predicates and joins to estimate the predicate cardinality.
The database persists the results of dynamic statistics as sharable statistics. The database can share the results gathered during the SQL compilation of one query with recompilations of the same query. The database can also reuse the results for queries that have the same patterns.