SQL Tuning Guide Notes 10: Optimizer Statistics Concepts

2022-06-12 21:39:00 【dingdingfish】

These are notes on Chapter 10, "Optimizer Statistics Concepts", of the SQL Tuning Guide.

Important basic concepts

execution plan
The combination of steps used by the database to execute a SQL statement. Each step either retrieves rows of data physically from the database or prepares them for the session issuing the statement. You can override execution plans by using a hint.
extended statistics
A type of optimizer statistics that improves estimates for cardinality when multiple predicates exist or when predicates contain an expression.
cardinality
The number of rows that is expected to be or is returned by an operation in an execution plan.
synopsis
A set of auxiliary statistics gathered on a partitioned table when the INCREMENTAL value is set to true.
SQL compilation
In the context of Oracle SQL processing, this term refers collectively to the phases of parsing, optimization, and plan generation.
SQL profile
A set of auxiliary information built during automatic tuning of a SQL statement. A SQL profile is to a SQL statement what statistics are to a table. The optimizer can use SQL profiles to improve cardinality and selectivity estimates, which in turn leads the optimizer to select better plans.
automatic reoptimization
The ability of the optimizer to automatically change a plan on subsequent executions of a SQL statement. Automatic reoptimization can fix any suboptimal plan chosen due to incorrect optimizer estimates, from a suboptimal distribution method to an incorrect choice of degree of parallelism.

Oracle Database optimizer statistics describe details about the database and its objects.

10.1 Introduction to Optimizer Statistics

The optimizer cost model relies on statistics collected about the objects involved in a query, and about the database and host where the query runs.

The optimizer uses statistics to estimate the number of rows (and number of bytes) retrieved from a table, partition, or index. The optimizer estimates the cost of each access, determines the cost of the possible plans, and then chooses the execution plan with the lowest cost.

Optimizer statistics include the following:

Table statistics
- Number of rows
- Number of blocks
- Average row length
Column statistics
- Number of distinct values (NDV) in the column
- Number of nulls in the column
- Data distribution (histogram)
- Extended statistics
Index statistics
- Number of leaf blocks
- Number of levels
- Index clustering factor
System statistics
- I/O performance and utilization
- CPU performance and utilization

As shown in Figure 10-1, the database stores optimizer statistics for tables, columns, indexes, and the system in the data dictionary. You can access these statistics using data dictionary views.

Note: Optimizer statistics are different from the performance statistics visible through V$ views.
[Figure 10-1: optimizer statistics stored in the data dictionary]

10.2 About Optimizer Statistics Types

The optimizer collects statistics on different types of database objects and on characteristics of the database environment.

10.2.1 Table Statistics

Table statistics contain metadata that the optimizer uses when developing an execution plan.

10.2.1.1 Permanent Table Statistics

In Oracle Database, table statistics include information about rows and blocks.

The optimizer uses these statistics to determine the cost of table scans and table joins. The database tracks all relevant statistics about permanent tables. For example, the table statistics stored in DBA_TAB_STATISTICS track the following:

Number of rows
The database uses the row count stored in DBA_TAB_STATISTICS when determining cardinality.
Average row length
Number of data blocks
The optimizer uses the number of data blocks together with the DB_FILE_MULTIBLOCK_READ_COUNT initialization parameter to determine the base table access cost.
Number of empty data blocks

DBMS_STATS.GATHER_TABLE_STATS commits before gathering statistics on permanent tables.

The following example queries the table statistics for the sh.customers table.

SELECT NUM_ROWS, AVG_ROW_LEN, BLOCKS, 
       EMPTY_BLOCKS, LAST_ANALYZED
FROM   DBA_TAB_STATISTICS
WHERE  OWNER='SH'
AND    TABLE_NAME='CUSTOMERS';

  NUM_ROWS AVG_ROW_LEN     BLOCKS EMPTY_BLOCKS LAST_ANAL
---------- ----------- ---------- ------------ ---------
     55500         189       1551            0 28-MAY-22

10.2.1.2 Temporary Table Statistics

DBMS_STATS can gather statistics for both permanent and global temporary tables, but additional considerations apply to the latter.

10.2.1.2.1 Types of Temporary Tables

Temporary tables are classified as global, private, or cursor-duration.

In all types of temporary table, the data is visible only to the session that inserts it. The tables differ as follows:

A global temporary table is an explicitly created persistent object that stores intermediate session-private data for a specific duration.
The table is global because the definition is visible to all sessions. The ON COMMIT clause of CREATE GLOBAL TEMPORARY TABLE indicates whether the table is transaction-specific (DELETE ROWS) or session-specific (PRESERVE ROWS). Optimizer statistics for global temporary tables can be shared or session-specific.
A private temporary table is an explicitly created object, defined by private memory-only metadata, that stores intermediate session-private data for a specific duration.
The table is private because the definition is visible only to the session that created the table. The ON COMMIT clause of CREATE PRIVATE TEMPORARY TABLE indicates whether the table is transaction-specific (DROP DEFINITION) or session-specific (PRESERVE DEFINITION).
A cursor-duration temporary table is an implicitly created memory-only object that is associated with a cursor.
Unlike global and private temporary tables, DBMS_STATS cannot gather statistics for cursor-duration temporary tables.

The tables differ in where they store data, how they are created and dropped, and in the duration and visibility of metadata. Note that the database allocates storage space when a session first inserts data into a global temporary table, not at table creation.

The following table summarizes the important characteristics of temporary tables:

Characteristic	Global Temporary Table	Private Temporary Table	Cursor-Duration Temporary Table
Visibility of data	Session inserting the data	Session inserting the data	Session inserting the data
Storage of data	Persistent	Memory or temp files, but only for the duration of the session or transaction	In memory only
Visibility of metadata	All sessions	Session that created the table (in the USER_PRIVATE_TEMP_TABLES view, which is based on a V$ view)	Session executing the cursor
Duration of metadata	Until the table is explicitly dropped	Until the table is explicitly dropped, or until the end of the session (PRESERVE DEFINITION) or transaction (DROP DEFINITION)	Until the cursor ages out of the shared pool
Creation of table	CREATE GLOBAL TEMPORARY TABLE (supports AS SELECT)	CREATE PRIVATE TEMPORARY TABLE (supports AS SELECT)	Implicitly created when the optimizer considers it useful
Effect of creation on existing transactions	No implicit commit	No implicit commit	No implicit commit
Naming rules	Same as for permanent tables	Must begin with ORA$PTT_	Internally generated unique name
Dropping of table	DROP GLOBAL TEMPORARY TABLE	DROP PRIVATE TEMPORARY TABLE, or implicitly dropped at the end of the session (PRESERVE DEFINITION) or transaction (DROP DEFINITION)	Implicitly dropped at the end of the session
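
As a minimal sketch of the two explicitly created kinds, the statements below create one global and one private temporary table. The table and column names (gtt_orders_stage, ora$ptt_orders_stage) are hypothetical, and the private temporary table assumes Oracle Database 18c or later with the default name prefix.

```sql
-- Global temporary table: the definition is visible to all sessions,
-- the rows only to the session that inserted them.
-- ON COMMIT DELETE ROWS makes it transaction-specific.
CREATE GLOBAL TEMPORARY TABLE gtt_orders_stage (
  order_id NUMBER,
  amount   NUMBER
) ON COMMIT DELETE ROWS;

-- Private temporary table: the definition exists only in this session.
-- The name must start with the ORA$PTT_ prefix (the default prefix).
CREATE PRIVATE TEMPORARY TABLE ora$ptt_orders_stage (
  order_id NUMBER,
  amount   NUMBER
) ON COMMIT DROP DEFINITION;

-- The private table is listed only for the session that created it.
SELECT table_name, duration
FROM   user_private_temp_tables;
```

Cursor-duration temporary tables cannot be created this way, because the optimizer creates them implicitly.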

10.2.1.2.2 Statistics for Global Temporary Tables

DBMS_STATS gathers the same types of statistics for global temporary tables as for permanent tables.

Note: You cannot gather statistics for private temporary tables.
The following table shows how global temporary tables differ in the way optimizer statistics are gathered and stored, depending on whether the table is scoped to a transaction or a session.

Characteristic	Transaction-specific	Session-specific
Effect of DBMS_STATS collection	Does not commit	Commits
Storage of statistics	Memory only	Dictionary tables
Histogram creation	Not supported	Supported

The following procedures do not commit for transaction-specific temporary tables, so the rows in these tables are not deleted:

GATHER_TABLE_STATS
DELETE_obj_STATS, where obj is TABLE, COLUMN, or INDEX
SET_obj_STATS, where obj is TABLE, COLUMN, or INDEX
GET_obj_STATS, where obj is TABLE, COLUMN, or INDEX

The preceding program units observe the GLOBAL_TEMP_TABLE_STATS statistics preference (a DBMS_STATS preference rather than an initialization parameter; its default value is SESSION). For example, if the table preference is set to SESSION, then SET_TABLE_STATS sets the session statistics, and GATHER_TABLE_STATS preserves all rows in a transaction-specific temporary table. If the table preference is set to SHARED, however, then SET_TABLE_STATS sets the shared statistics, and GATHER_TABLE_STATS deletes all rows from a transaction-specific temporary table.

10.2.1.2.3 Shared and Session-Specific Statistics for Global Temporary Tables

Starting in Oracle Database 12c, you can set the table-level preference GLOBAL_TEMP_TABLE_STATS to make statistics on a global temporary table shared (SHARED) or session-specific (SESSION).

When GLOBAL_TEMP_TABLE_STATS is SESSION, you can gather optimizer statistics for a global temporary table in one session, and then use the statistics for this session only. Meanwhile, users can continue to maintain a shared version of the statistics. During optimization, the optimizer first checks whether the global temporary table has session-specific statistics. If so, then the optimizer uses them. Otherwise, the optimizer uses shared statistics if they exist.

Note: In releases before Oracle Database 12c, the database maintained optimizer statistics for global temporary tables and non-global temporary tables in the same way. The database maintained one version of the statistics shared by all sessions, even though the data in different sessions could differ.

Session-specific optimizer statistics have the following characteristics:

Dictionary views that track statistics show both the shared statistics and the session-specific statistics in the current session.
The views are DBA_TAB_STATISTICS, DBA_IND_STATISTICS, DBA_TAB_HISTOGRAMS, and DBA_TAB_COL_STATISTICS (each view has corresponding USER_ and ALL_ versions). The SCOPE column shows whether statistics are session-specific or shared. Session-specific statistics must be stored in the data dictionary so that multiple processes can access them in Oracle RAC.
CREATE … AS SELECT automatically gathers optimizer statistics. When GLOBAL_TEMP_TABLE_STATS is set to SHARED, however, you must gather statistics manually using DBMS_STATS.
Pending statistics are not supported.
Other sessions do not share a cursor that uses the session-specific statistics.
Different sessions can share a cursor that uses shared statistics, as in releases earlier than Oracle Database 12c. The same session can share a cursor that uses session-specific statistics.
By default, GATHER_TABLE_STATS for the temporary table immediately invalidates previous cursors compiled in the same session. However, this procedure does not invalidate cursors compiled in other sessions.
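
The preference and the SCOPE column described in this section can be exercised roughly as follows. This is only a sketch: MY_GTT is a hypothetical global temporary table assumed to exist in the current schema.

```sql
-- Assumes a global temporary table MY_GTT exists in the current schema
-- (hypothetical name).
BEGIN
  -- Keep statistics private to each session; use 'SHARED' for one shared set.
  DBMS_STATS.SET_TABLE_PREFS(
    ownname => USER,
    tabname => 'MY_GTT',
    pname   => 'GLOBAL_TEMP_TABLE_STATS',
    pvalue  => 'SESSION');

  -- Gather statistics on the rows this session has inserted.
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'MY_GTT');
END;
/

-- Confirm the preference currently in effect for the table.
SELECT DBMS_STATS.GET_PREFS('GLOBAL_TEMP_TABLE_STATS', USER, 'MY_GTT') AS gtt_stats_pref
FROM   dual;

-- SCOPE shows whether each set of statistics is SESSION or SHARED.
SELECT table_name, num_rows, scope
FROM   user_tab_statistics
WHERE  table_name = 'MY_GTT';
```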

10.2.2 Column Statistics

Column statistics track information about column values and data distribution.

The optimizer uses column statistics to generate accurate cardinality estimates and to make better decisions about index usage, join orders, join methods, and so on. For example, the statistics in DBA_TAB_COL_STATISTICS track the following:

Number of distinct values
Number of nulls
High and low values
Histogram-related information

The optimizer can use extended statistics, which are a special type of column statistics. These statistics are useful for informing the optimizer of logical relationships among columns.
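
One common form of extended statistics is a column group. The sketch below assumes the SH sample schema and picks two CUSTOMERS columns that are likely to be correlated. The choice of columns is illustrative and does not come from these notes.

```sql
-- Create a column group so the optimizer can see the correlation
-- between the two columns (assumes the SH sample schema).
SELECT DBMS_STATS.CREATE_EXTENDED_STATS(
         ownname   => 'SH',
         tabname   => 'CUSTOMERS',
         extension => '(CUST_STATE_PROVINCE, COUNTRY_ID)') AS extension_name
FROM   dual;

-- Regather so that statistics exist for the new column group.
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH', 'CUSTOMERS');

-- List the extensions defined on the table.
SELECT extension_name, extension
FROM   dba_stat_extensions
WHERE  owner = 'SH'
AND    table_name = 'CUSTOMERS';
```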

10.2.3 Index Statistics

Index statistics include information such as the number of index levels, the number of index blocks, and the relationship between the index and the data blocks. The optimizer uses these statistics to determine the cost of index scans.

10.2.3.1 Types of Index Statistics

The DBA_IND_STATISTICS view tracks index statistics.

The statistics include the following:

Levels
The BLEVEL column shows the number of blocks required to go from the root block to a leaf block. A B-tree index has two types of blocks: branch blocks for searching and leaf blocks that store values.
Distinct keys
This column tracks the number of distinct indexed values. If a unique constraint is defined and no NOT NULL constraint is defined, then this value equals the number of non-null values.
Average number of leaf blocks for each distinct indexed key
Average number of data blocks pointed to by each distinct indexed key

The CUSTOMERS table is known to have 1664 blocks.

SELECT INDEX_NAME, BLEVEL, LEAF_BLOCKS AS "LEAFBLK", DISTINCT_KEYS AS "DIST_KEY",
       AVG_LEAF_BLOCKS_PER_KEY AS "LEAFBLK_PER_KEY",
       AVG_DATA_BLOCKS_PER_KEY AS "DATABLK_PER_KEY"
FROM   DBA_IND_STATISTICS
WHERE  OWNER = 'SH'
AND    INDEX_NAME IN ('CUST_LNAME_IX','CUSTOMERS_PK');

INDEX_NAME     BLEVEL LEAFBLK DIST_KEY LEAFBLK_PER_KEY DATABLK_PER_KEY
-------------- ------ ------- -------- --------------- ---------------
CUSTOMERS_PK        1     115    55500               1               1
CUST_LNAME_IX       1     141      908               1              10

10.2.3.2 Index Clustering Factor

For a B-tree index, the index clustering factor measures the physical grouping of rows in relation to an index value, such as last name.

The index clustering factor helps the optimizer decide whether an index scan or a full table scan is more efficient for certain queries. A low clustering factor indicates that an index scan is efficient.

A clustering factor close to the number of blocks in the table indicates that the rows are physically ordered in the table blocks by the index key. If the database performs a full table scan, then it tends to retrieve the rows in index key order, because that is how they are stored on disk. A clustering factor close to the number of rows indicates that the rows are scattered randomly across the table blocks in relation to the index key. If the database performs a full table scan, then it does not retrieve the rows in any order sorted by this index key.

The clustering factor is a property of a specific index, not of a table. If multiple indexes exist on a table, then the clustering factor of one index may be low while that of another is high. An attempt to reorganize the table to improve the clustering factor of one index may degrade the clustering factor of another.

This example shows how the optimizer uses the index clustering factor to determine whether using an index is more efficient than a full table scan.

SELECT  table_name, num_rows, blocks
FROM    user_tables
WHERE   table_name='CUSTOMERS';
 
TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ---------- ----------
CUSTOMERS                           55500       1551

-- Create an index on the customers.cust_last_name column.
CREATE INDEX CUSTOMERS_LAST_NAME_IDX ON customers(cust_last_name);

-- Query the clustering factor of the new index.
SELECT index_name, blevel, leaf_blocks, clustering_factor
FROM   user_indexes
WHERE  table_name='CUSTOMERS'
AND    index_name= 'CUSTOMERS_LAST_NAME_IDX';
 
INDEX_NAME                         BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ ---------- ----------- -----------------
CUSTOMERS_LAST_NAME_IDX                 1         141              9936

-- Create a new copy of the customers table, with rows ordered by cust_last_name.

CREATE TABLE customers3 AS 
  SELECT * 
  FROM   customers 
  ORDER BY cust_last_name;
  
-- Gather statistics on the customers3 table.
EXEC DBMS_STATS.GATHER_TABLE_STATS(null,'CUSTOMERS3');

-- Query the number of rows and blocks in the customers3 table.
SELECT    TABLE_NAME, NUM_ROWS, BLOCKS
FROM      USER_TABLES
WHERE     TABLE_NAME='CUSTOMERS3';
 
TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ---------- ----------
CUSTOMERS3                          55500       1550

-- Create an index on the cust_last_name column of customers3.
CREATE INDEX CUSTOMERS3_LAST_NAME_IDX ON customers3(cust_last_name);
 
-- Query the clustering factor of the customers3_last_name_idx index.
SELECT INDEX_NAME, BLEVEL, LEAF_BLOCKS, CLUSTERING_FACTOR
FROM   USER_INDEXES
WHERE  TABLE_NAME = 'CUSTOMERS3'
AND    INDEX_NAME = 'CUSTOMERS3_LAST_NAME_IDX';
 
INDEX_NAME                         BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ ---------- ----------- -----------------
CUSTOMERS3_LAST_NAME_IDX                1         141              1516

-- Query the customers table and display the execution plan: the optimizer chooses a full table scan.
SELECT cust_first_name, cust_last_name
FROM   customers
WHERE  cust_last_name BETWEEN 'Puleo' AND 'Quinn';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR());

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |       |       |   423 (100)|          |
|*  1 |  TABLE ACCESS FULL| CUSTOMERS |  2335 | 35025 |   423   (1)| 00:00:01 |
-------------------------------------------------------------------------------

-- Query the customers3 table and display the execution plan: the optimizer uses the index.
SELECT cust_first_name, cust_last_name
FROM   customers3
WHERE  cust_last_name BETWEEN 'Puleo' AND 'Quinn';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR());

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                          |       |       |    71 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| CUSTOMERS3               |  2335 | 35025 |    71   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | CUSTOMERS3_LAST_NAME_IDX |  2335 |       |     7   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------

-- Query customers with a hint that forces the optimizer to use the index. The cost is higher than for the full table scan.
SELECT /*+ index (Customers CUSTOMERS_LAST_NAME_IDX) */ cust_first_name, 
       cust_last_name 
FROM   customers 
WHERE  cust_last_name BETWEEN 'Puleo' and 'Quinn';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR());

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                         |       |       |   425 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| CUSTOMERS               |  2335 | 35025 |   425   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | CUSTOMERS_LAST_NAME_IDX |  2335 |       |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------

-- Clean up.
DROP TABLE customers3 PURGE;
DROP INDEX CUSTOMERS_LAST_NAME_IDX;

The preceding plans show that using the index on customers costs more than a full table scan (425 versus 423), so using an index does not necessarily improve performance. The index clustering factor is a measure of whether an index scan is more efficient than a full table scan.

10.2.3.3 Effect of Index Clustering Factor on Cost: Example

This example illustrates how the index clustering factor can influence the cost of table access.

Consider the following scenario:

A table contains 9 rows that are stored in 3 data blocks.
The col1 column currently stores the values A, B, and C.
A nonunique index named col1_idx exists on col1 for this table.

Assume that the rows are stored in the data blocks as follows:

Block 1       Block 2        Block 3
-------       -------        -------
A  A  A       B  B  B        C  C  C

In this example, the index clustering factor for col1_idx is low. The rows that have the same indexed column values for col1 are in the same data blocks in the table. Thus, the cost of using an index range scan to return all rows with the value A is low, because only one block in the table must be read.

Assume that the same rows are scattered across the data blocks as follows:

Block 1       Block 2        Block 3
-------       -------        -------
A  B  C       A  C  B        B  A  C

In this example, the index clustering factor for col1_idx is high. The database must read all three blocks in the table to retrieve all rows with the value A in col1.
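
The clustering factor can be thought of as a walk through the index in key order that counts how often the next entry points to a different table block. The query below is a sketch of that idea for a hypothetical table t with an index on col1. It approximates the statistic rather than reproducing Oracle's internal computation; the stored value is in USER_INDEXES.CLUSTERING_FACTOR.

```sql
-- Approximate the clustering factor of an index on col1 of a
-- hypothetical table t: visit rows in index order (key, then rowid)
-- and count how often the table block changes.
SELECT SUM(CASE WHEN blk = prev_blk AND fno = prev_fno THEN 0 ELSE 1 END)
         AS approx_clustering_factor
FROM (
  SELECT DBMS_ROWID.ROWID_RELATIVE_FNO(rowid) AS fno,
         DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) AS blk,
         LAG(DBMS_ROWID.ROWID_RELATIVE_FNO(rowid))
           OVER (ORDER BY col1, rowid)        AS prev_fno,
         LAG(DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid))
           OVER (ORDER BY col1, rowid)        AS prev_blk
  FROM   t
  WHERE  col1 IS NOT NULL  -- NULL keys have no entry in a single-column B-tree index
);
```

Applied by hand to the two layouts above, the walk counts 3 block changes for the ordered layout (equal to the number of blocks) and 9 for the scattered layout (equal to the number of rows).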

10.2.4 System Statistics

System statistics describe hardware characteristics, such as I/O and CPU performance and utilization.

System statistics enable the query optimizer to estimate I/O and CPU costs more accurately when choosing an execution plan. When the database updates system statistics, it does not invalidate previously parsed SQL statements. The database parses all new SQL statements using the new statistics.
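
To see which system statistics are currently in place, one option is to query the dictionary table that stores them. This sketch assumes the session has access to SYS-owned objects.

```sql
-- System statistics are stored in SYS.AUX_STATS$.
-- Rows under SYSSTATS_MAIN hold the values the optimizer uses.
SELECT pname, pval1
FROM   sys.aux_stats$
WHERE  sname = 'SYSSTATS_MAIN';
```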

10.2.5 User-Defined Optimizer Statistics

The extensible optimizer enables authors of user-defined functions and indexes to create statistics collection, selectivity, and cost functions.

The optimizer cost model is extended to integrate user-supplied information for assessing CPU and I/O cost. Statistics types act as interfaces for user-defined functions that influence the choice of execution plan. However, to use a statistics type, the optimizer requires a mechanism to bind the type to a database object such as a column, standalone function, object type, index, indextype, or package. The SQL statement ASSOCIATE STATISTICS allows this binding to occur.

Functions for user-defined statistics are relevant for columns that use both standard SQL data types and object types, and for domain indexes. When you associate a statistics type with a column or domain index, the database calls the statistics collection method in the statistics type whenever DBMS_STATS gathers statistics.

10.3 How the Database Gathers Optimizer Statistics

Oracle Database provides several mechanisms for gathering statistics.

10.3.1 DBMS_STATS Package

The DBMS_STATS PL/SQL package collects and manages optimizer statistics.

This package enables you to control what statistics are collected and how, including the degree of parallelism, the sampling method, and the granularity of statistics collection in partitioned tables.

Note: Do not use the COMPUTE and ESTIMATE clauses of the ANALYZE statement to collect optimizer statistics. These clauses have been deprecated. Use DBMS_STATS instead.

Statistics gathered with the DBMS_STATS package are required for the creation of accurate execution plans. For example, table statistics gathered by DBMS_STATS include the number of rows, number of blocks, and average row length.

By default, Oracle Database uses automatic optimizer statistics collection. In this case, the database automatically runs DBMS_STATS to collect optimizer statistics for all schema objects for which statistics are missing or stale. The process eliminates many manual tasks associated with managing the optimizer, and significantly reduces the risk of generating suboptimal execution plans because of missing or stale statistics. You can also update and manage optimizer statistics by running DBMS_STATS manually.

Oracle Database 19c introduces high-frequency automatic optimizer statistics collection. This lightweight task periodically gathers statistics for stale objects. The default interval is 15 minutes. In contrast to the automated statistics collection job, the high-frequency task does not perform actions such as purging statistics for nonexistent objects or invoking Optimizer Statistics Advisor. You can set preferences for the high-frequency task using the DBMS_STATS.SET_GLOBAL_PREFS procedure, and view metadata using DBA_AUTO_STAT_EXECUTIONS.
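
A sketch of configuring the high-frequency task follows. It assumes the feature is available on your platform and release, and the numeric values are illustrative only (both are in seconds).

```sql
-- Enable the high-frequency task, cap each run at 10 minutes,
-- and run it every 4 minutes (illustrative values, in seconds).
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_STATUS', 'ON');
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_MAX_RUN_TIME', '600');
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_INTERVAL', '240');

-- Review recent executions of the statistics tasks.
SELECT opid, origin, status, start_time, end_time
FROM   dba_auto_stat_executions
ORDER  BY opid;
```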

10.3.2 Supplemental Dynamic Statistics

By default, when optimizer statistics are missing, stale, or insufficient, the database automatically gathers dynamic statistics during a parse. The database uses recursive SQL to scan a small random sample of table blocks.

Note: Dynamic statistics augment statistics rather than providing an alternative to them.

Dynamic statistics supplement optimizer statistics such as table and index block counts, table and join cardinalities (estimated number of rows), join column statistics, and GROUP BY statistics. This information helps the optimizer improve plans by making better estimates for predicate cardinality.

Dynamic statistics are beneficial in the following situations:

An execution plan is suboptimal because of complex predicates.
The sampling time is a small fraction of the total execution time for the query.
The query executes many times, so that the sampling time is amortized.
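
The sampling level can be raised for a session or for a single statement. The sketch below assumes the SH sample schema. The level (4) and the predicate values are arbitrary choices for illustration.

```sql
-- Session-level setting; 2 is the default level.
ALTER SESSION SET optimizer_dynamic_sampling = 4;

-- Or request sampling for one table in one statement with a hint.
SELECT /*+ DYNAMIC_SAMPLING(c 4) */ COUNT(*)
FROM   sh.customers c
WHERE  cust_city = 'Los Angeles'
AND    cust_state_province = 'CA';

-- The Note section of the plan reports whether dynamic statistics were used.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR());
```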

10.3.3 Online Statistics Gathering

In some circumstances, DDL and DML operations automatically trigger online statistics gathering.

10.3.3.1 Online Statistics Gathering for Bulk Loads

The database can gather table statistics automatically during the following types of bulk loads: INSERT INTO … SELECT using a direct path insert, and CREATE TABLE AS SELECT.

By default, a parallel insert uses a direct path insert. You can force a direct path insert by using the /*+APPEND*/ hint.
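
A small sketch of a direct path load follows. It assumes the SH sample schema, and sales_copy is a hypothetical table name. After the load, the NOTES column is where statistics gathered online would normally be marked.

```sql
-- sales_copy is a hypothetical, initially empty copy of sh.sales.
CREATE TABLE sales_copy AS SELECT * FROM sh.sales WHERE 1 = 0;

-- Direct path insert into the empty table.
INSERT /*+ APPEND */ INTO sales_copy
SELECT * FROM sh.sales;
COMMIT;

-- Columns whose statistics came from the load are marked STATS_ON_LOAD.
SELECT column_name, num_distinct, notes
FROM   user_tab_col_statistics
WHERE  table_name = 'SALES_COPY';
```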

10.3.3.1.1 Purpose of Online Statistics Gathering for Bulk Loads

Data warehouse applications typically load large amounts of data into the database. For example, a sales data warehouse might load data daily, weekly, or monthly.

In releases before Oracle Database 12c, the best practice was to gather statistics manually after a bulk load. However, many applications did not gather statistics after the load, either through negligence or because they waited for the maintenance window to initiate collection. Missing statistics are the leading cause of suboptimal execution plans.

Automatic statistics gathering during bulk loads has the following benefits:

Improved performance
Gathering statistics during the load avoids an additional table scan to gather table statistics.
Improved manageability
No user intervention is required to gather statistics after a bulk load.

10.3.3.1.2 Global Statistics During Inserts into Partitioned Tables

While inserting rows into a partitioned table, the database gathers global statistics during the insert.

For example, if sales is a partitioned table, and if you run INSERT INTO sales SELECT, then the database gathers global statistics. However, the database does not gather partition-level statistics.

Assume instead that you use partition-extended syntax to insert rows into a particular partition or subpartition. The database gathers statistics on the partition during the insert. However, the database does not gather global statistics.

Assume that you run INSERT INTO sales PARTITION (sales_q4_2000) SELECT. The database gathers statistics during the insert. If the INCREMENTAL preference is enabled for sales, then the database also gathers a synopsis for sales_q4_2000. Statistics are immediately available after the insert. However, if you roll back the transaction, then the database automatically deletes the statistics gathered during the bulk load.
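
For reference, the INCREMENTAL preference mentioned above is a table-level DBMS_STATS preference. A minimal sketch, assuming the SH sample schema:

```sql
-- Enable incremental statistics for the partitioned table.
EXEC DBMS_STATS.SET_TABLE_PREFS('SH', 'SALES', 'INCREMENTAL', 'TRUE');

-- Check the value currently in effect.
SELECT DBMS_STATS.GET_PREFS('INCREMENTAL', 'SH', 'SALES') AS incremental_pref
FROM   dual;
```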

10.3.3.1.3 Histogram Creation After Bulk Loads

After gathering online statistics, the database does not automatically create histograms.

If histograms are required, then after the bulk load Oracle recommends running DBMS_STATS.GATHER_TABLE_STATS with options=>GATHER AUTO.

EXEC DBMS_STATS.GATHER_TABLE_STATS(user, 'MYT', options=>'GATHER AUTO');

The preceding PL/SQL program gathers only missing or stale statistics. The database does not regather the table and basic column statistics that were collected during the bulk load.

Note: You can set the table preference options to GATHER AUTO on the tables that you plan to bulk load. In this way, you need not explicitly set the options parameter when running GATHER_TABLE_STATS.
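
Setting that preference might look like the following sketch, which reuses the table name MYT from the GATHER_TABLE_STATS example above.

```sql
-- Make GATHER AUTO the default options value for this table.
EXEC DBMS_STATS.SET_TABLE_PREFS(user, 'MYT', 'OPTIONS', 'GATHER AUTO');

-- With the preference set, the options parameter can be omitted.
EXEC DBMS_STATS.GATHER_TABLE_STATS(user, 'MYT');
```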

10.3.3.1.4 Restrictions for Online Statistics Gathering for Bulk Loads

In certain cases, bulk loads do not automatically gather optimizer statistics.

Specifically, bulk loads do not gather statistics automatically when any of the following conditions applies to the target table, partition, or subpartition:

The object contains data. Bulk loads gather online statistics automatically only when the object is empty.
It is in an Oracle-owned schema such as SYS.
It is one of the following types of tables: nested table, index-organized table (IOT), external table, or global temporary table defined as ON COMMIT DELETE ROWS.
Note: The database does gather online statistics automatically for the internal partitions of a hybrid partitioned table.
Its PUBLISH preference is set to FALSE.
Its statistics are locked.
It is loaded using a multitable INSERT statement.

10.3.3.1.5 User Interface for Online Statistics Gathering for Bulk Loads

By default, the database gathers statistics during bulk loads.

You can enable the feature at the statement level by using the GATHER_OPTIMIZER_STATISTICS hint, and disable it at the statement level by using the NO_GATHER_OPTIMIZER_STATISTICS hint. For example, the following statement disables online statistics gathering for the bulk load:

CREATE TABLE employees2 AS
  SELECT /*+NO_GATHER_OPTIMIZER_STATISTICS*/ * FROM employees;

10.3.3.2 Online Statistics Gathering for Partition Maintenance Operations

Oracle Database provides analogous support for online statistics during specific partition maintenance operations.

For MOVE, COALESCE, and MERGE, the database maintains global and partition-level statistics as follows:

Whether the partition uses incremental or non-incremental statistics, the database updates the BLOCKS value in the global table statistics directly. Note that this update is not a statistics gathering operation.
The database generates fresh statistics for the resulting partition. If incremental statistics are enabled, then the database maintains the partition synopses.

For TRUNCATE or DROP PARTITION, the database updates the BLOCKS and NUM_ROWS values in the global table statistics. This update does not require a statistics gathering operation. The update occurs whether incremental or non-incremental statistics are used.

Note: The database does not maintain partition-level statistics for maintenance operations that have multiple destination segments.
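One way to observe this behavior is to compare the global statistics before and after a maintenance operation. This is a sketch only: the choice of the SALES table and the SALES_Q4_2003 partition is an assumption made for illustration, and the TRUNCATE is destructive, so it belongs on a test copy.

```sql
-- Global (table-level) statistics before the maintenance operation
SELECT num_rows, blocks
FROM   user_tab_statistics
WHERE  table_name = 'SALES'
AND    partition_name IS NULL;

-- Assumed example operation; run only against test data
ALTER TABLE sales TRUNCATE PARTITION sales_q4_2003;

-- Rerun the same query and compare NUM_ROWS and BLOCKS;
-- no DBMS_STATS call has been made in between
SELECT num_rows, blocks
FROM   user_tab_statistics
WHERE  table_name = 'SALES'
AND    partition_name IS NULL;
```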

10.3.3.3 Real-Time Statistics

Oracle Database can automatically gather real-time statistics during conventional DML operations.

10.3.3.3.1 Purpose of Real-Time Statistics

Online statistics, whether for bulk loads or conventional DML, aim to reduce the possibility of the optimizer being misled by stale statistics.

Oracle Database 12c introduced online statistics gathering for CREATE TABLE AS SELECT statements and direct-path inserts. Oracle Database 19c introduces real-time statistics, which extend online support to conventional DML statements. Because statistics can go stale between DBMS_STATS jobs, real-time statistics help the optimizer generate more optimal plans.

Whereas bulk load operations gather all necessary statistics, real-time statistics augment rather than replace traditional statistics. For this reason, you must continue to gather statistics regularly using DBMS_STATS, preferably using the AutoTask job.

10.3.3.3.2 How Real-Time Statistics Work

Oracle Database dynamically computes values for the most essential statistics during DML operations.

Consider a transaction that is currently adding tens of thousands of rows to the oe.orders table. Real-time statistics record the important statistical changes, such as the maximum column value. This enables the optimizer to obtain more accurate cost estimates.

Existing cursors are not marked invalid when real-time statistics change.

10.3.3.3.2.1 Regression Models for Real-Time Statistics

Starting in release 21c, Oracle Database automatically builds regression models to predict the number of distinct values (NDV) for volatile tables. The use of models enables the optimizer to produce accurate NDV estimates at low cost.

Note: The time required to build a regression model can vary. The first step in the process is to model how NDV changes over time. This depends on information about NDV changes that is derived from the statistics history. If the immediately available information is insufficient, model building remains pending until enough historical information has been collected.

Using DBMS_STATS to Delete, Export, and Import Regression Models

The database builds regression models automatically as needed, without DBA intervention. However, you can use DBMS_STATS to delete, import, or export regression models. The default stat_category includes the value MODELS in addition to the previously supported values OBJECT_STATS, SYNOPSES, and REALTIME_STATS. The relevant APIs are:

DBMS_STATS.DELETE_*_STATS
DBMS_STATS.EXPORT_*_STATS
DBMS_STATS.IMPORT_*_STATS

Dictionary Views for Inspecting Real-Time Statistics Models

Starting in Oracle Database 21c, the following new dictionary views can be used to inspect saved real-time statistics models.

ALL_TAB_COL_STAT_MODELS
DBA_TAB_COL_STAT_MODELS
USER_TAB_COL_STAT_MODELS

10.3.3.3.3 User Interface for Real-Time Statistics

You can use PL/SQL packages, data dictionary views, and hints to manage and access real-time statistics.

OPTIMIZER_REAL_TIME_STATISTICS Initialization Parameter

When the OPTIMIZER_REAL_TIME_STATISTICS initialization parameter is set to TRUE, Oracle Database automatically gathers real-time statistics during conventional DML operations. The default setting is FALSE, which means that real-time statistics are disabled.

By default, DBMS_STATS subprograms include real-time statistics. You can also specify parameters to include only these statistics.

Subprogram	Description
EXPORT_TABLE_STATS and EXPORT_SCHEMA_STATS	These subprograms enable you to export statistics. By default, the stat_category parameter includes real-time statistics. The REALTIME_STATS value specifies only real-time statistics.
IMPORT_TABLE_STATS and IMPORT_SCHEMA_STATS	These subprograms enable you to import statistics. By default, the stat_category parameter includes real-time statistics. The REALTIME_STATS value specifies only real-time statistics.
DELETE_TABLE_STATS and DELETE_SCHEMA_STATS	These subprograms enable you to delete statistics. By default, the stat_category parameter includes real-time statistics. The REALTIME_STATS value specifies only real-time statistics.
DIFF_TABLE_STATS_IN_STATTAB	This function compares table statistics from two sources. The statistics always include real-time statistics.
DIFF_TABLE_STATS_IN_HISTORY	This function compares table statistics as of two specified timestamps. The statistics always include real-time statistics.
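The stat_category parameter described in this table could be used as in the following sketch. The staging table name RT_STATS_STG is made up for illustration, and SH.SALES is an assumed example table.

```sql
-- Create a staging table for exported statistics (RT_STATS_STG is a hypothetical name)
EXEC DBMS_STATS.CREATE_STAT_TABLE(ownname => 'SH', stattab => 'RT_STATS_STG');

-- Export only the real-time statistics of SH.SALES
BEGIN
  DBMS_STATS.EXPORT_TABLE_STATS(
    ownname       => 'SH',
    tabname       => 'SALES',
    stattab       => 'RT_STATS_STG',
    stat_category => 'REALTIME_STATS');
END;
/

-- Delete only the real-time statistics; regular statistics stay in place
BEGIN
  DBMS_STATS.DELETE_TABLE_STATS(
    ownname       => 'SH',
    tabname       => 'SALES',
    stat_category => 'REALTIME_STATS');
END;
/
```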

Real-time statistics appear in data dictionary views such as USER_TAB_STATISTICS and USER_TAB_COL_STATISTICS in rows where the NOTES column is STATS_ON_CONVENTIONAL_DML, as shown in the following table.

The DBA_* views have ALL_* and USER_* versions.

View	Description
DBA_TAB_COL_STATISTICS	This view displays column statistics and histogram information extracted from DBA_TAB_COLUMNS. Real-time statistics are indicated by STATS_ON_CONVENTIONAL_DML in the NOTES column.
DBA_TAB_STATISTICS	This view displays optimizer statistics for the tables accessible to the current user. Real-time statistics are indicated by STATS_ON_CONVENTIONAL_DML in the NOTES column.

The NO_GATHER_OPTIMIZER_STATISTICS hint prevents the gathering of real-time statistics.
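A hedged sketch of the hint on a conventional insert follows. It assumes that a copy table named sales_orig with the same structure as sales exists; the statement is rolled back so that no data changes.

```sql
-- Conventional insert that asks the database not to gather real-time statistics
-- (sales_orig is an assumed copy of the sales table)
INSERT /*+ NO_GATHER_OPTIMIZER_STATISTICS */ INTO sales_orig
SELECT * FROM sales;

-- Undo the illustrative insert
ROLLBACK;
```

With the hint in place, the plan for the insert would not be expected to contain an OPTIMIZER STATISTICS GATHERING step.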

10.3.3.3.4 Real-Time Statistics: Example

In this example, a conventional INSERT statement triggers the gathering of real-time statistics.

Before the experiment, back up the sales table in the SH schema so that it can be restored afterward.

create table sales_orig as select * from sales;

Real-time statistics is an Exadata-only feature, so this test first sets the following hidden parameter:

alter system set "_exadata_feature_on"=true scope=spfile;
shutdown immediate;
startup;

Also set the following parameter:

alter session set OPTIMIZER_REAL_TIME_STATISTICS=TRUE;

This example assumes that the sh user has been granted the DBA role and that you are logged in to the database as sh. Perform the following steps:

Gather statistics for the sales table:

exec DBMS_STATS.GATHER_TABLE_STATS('SH', 'SALES');

Query the column-level statistics for the sales table:

SET PAGESIZE 5000
SET LINESIZE 200
COL COLUMN_NAME FORMAT a13 
COL LOW_VALUE FORMAT a14
COL HIGH_VALUE FORMAT a14
COL NOTES FORMAT a5
COL PARTITION_NAME FORMAT a13

-- The NOTES column is empty, which indicates that real-time statistics have not been gathered
SELECT COLUMN_NAME, LOW_VALUE, HIGH_VALUE, SAMPLE_SIZE, NOTES
FROM   USER_TAB_COL_STATISTICS
WHERE  TABLE_NAME = 'SALES'
ORDER BY 1, 5;

COLUMN_NAME   LOW_VALUE      HIGH_VALUE     SAMPLE_SIZE NOTES
------------- -------------- -------------- ----------- -----
AMOUNT_SOLD   C10729         C2125349            918843      
CHANNEL_ID    C103           C10A                918843      
CUST_ID       C103           C30B0B              918843      
PROD_ID       C10E           C20231              918843      
PROMO_ID      C122           C20A64              918843      
QUANTITY_SOLD C102           C102                918843      
TIME_ID       77C60101010101 78650C1F010101      918843

Query the table-level statistics for the sales table:

SELECT NVL(PARTITION_NAME, 'GLOBAL') PARTITION_NAME, NUM_ROWS, BLOCKS, NOTES 
FROM   USER_TAB_STATISTICS
WHERE  TABLE_NAME = 'SALES'
ORDER BY 1, 4;

-- The NOTES column is empty, which indicates that real-time statistics have not been gathered
PARTITION_NAM   NUM_ROWS     BLOCKS NOTES
------------- ---------- ---------- -----
GLOBAL            918843       1874      
SALES_1995             0          0      
SALES_1996             0          0      
SALES_H1_1997          0          0      
SALES_H2_1997          0          0      
SALES_Q1_1998      43687         97      
SALES_Q1_1999      64186        126      
SALES_Q1_2000      62197        125      
SALES_Q1_2001      60608        124      
SALES_Q1_2002          0          0      
SALES_Q1_2003          0          0      
SALES_Q2_1998      35758         86      
SALES_Q2_1999      54233        110      
SALES_Q2_2000      55515        114      
SALES_Q2_2001      63292        124      
SALES_Q2_2002          0          0      
SALES_Q2_2003          0          0      
SALES_Q3_1998      50515        103      
SALES_Q3_1999      67138        128      
SALES_Q3_2000      58950        118      
SALES_Q3_2001      65769        130      
SALES_Q3_2002          0          0      
SALES_Q3_2003          0          0      
SALES_Q4_1998      48874        116      
SALES_Q4_1999      62388        121      
SALES_Q4_2000      55984        112      
SALES_Q4_2001      69749        140      
SALES_Q4_2002          0          0      
SALES_Q4_2003          0          0      

29 rows selected.

Use a conventional INSERT statement to load 918,843 rows into sales:

INSERT INTO sales(prod_id, cust_id, time_id, channel_id, promo_id, 
                  quantity_sold, amount_sold)
  SELECT prod_id, cust_id, time_id, channel_id, promo_id, 
         quantity_sold * 2, amount_sold * 2 
  FROM   sales;
COMMIT;

Get the execution plan from the cursor

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format=>'TYPICAL'));

----------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                 |       |       |       |  4381 (100)|          |       |       |
|   1 |  LOAD TABLE CONVENTIONAL         | SALES |       |       |            |          |       |       |
|   2 |   OPTIMIZER STATISTICS GATHERING |       |   918K|    25M|  4381   (1)| 00:00:01 |       |       |
|   3 |    PARTITION RANGE ALL           |       |   918K|    25M|  4381   (1)| 00:00:01 |     1 |    28 |
|   4 |     TABLE ACCESS FULL            | SALES |   918K|    25M|  4381   (1)| 00:00:01 |     1 |    28 |
----------------------------------------------------------------------------------------------------------

Note the OPTIMIZER STATISTICS GATHERING operation in the plan.

For testing purposes, force the database to write the optimizer statistics to the data dictionary immediately.

EXEC DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;

Query the column-level statistics for the sales table again. The NOTES column now contains information:

COLUMN_NAME   LOW_VALUE      HIGH_VALUE     SAMPLE_SIZE NOTES                    
------------- -------------- -------------- ----------- -------------------------
AMOUNT_SOLD   C10729         C224422D              9070 STATS_ON_CONVENTIONAL_DML
AMOUNT_SOLD   C10729         C2125349            918843                          
CHANNEL_ID    C103           C10A                  9070 STATS_ON_CONVENTIONAL_DML
CHANNEL_ID    C103           C10A                918843                          
CUST_ID       C103           C30B0B                9070 STATS_ON_CONVENTIONAL_DML
CUST_ID       C103           C30B0B              918843                          
PROD_ID       C10E           C20231                9070 STATS_ON_CONVENTIONAL_DML
PROD_ID       C10E           C20231              918843                          
PROMO_ID      C122           C20A64                9070 STATS_ON_CONVENTIONAL_DML
PROMO_ID      C122           C20A64              918843                          
QUANTITY_SOLD C102           C103                  9070 STATS_ON_CONVENTIONAL_DML
QUANTITY_SOLD C102           C102                918843                          
TIME_ID       77C60101010101 78650C1F010101        9070 STATS_ON_CONVENTIONAL_DML
TIME_ID       77C60101010101 78650C1F010101      918843                          

14 rows selected.
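The LOW_VALUE and HIGH_VALUE columns hold raw internal values, which makes the change hard to read. As a sketch, numeric columns can be decoded with UTL_RAW.CAST_TO_NUMBER; the column aliases below are illustrative.

```sql
-- Decode the raw HIGH_VALUE of QUANTITY_SOLD before (C102) and after (C103) the insert
SELECT UTL_RAW.CAST_TO_NUMBER(HEXTORAW('C102')) AS high_before,
       UTL_RAW.CAST_TO_NUMBER(HEXTORAW('C103')) AS high_after
FROM   dual;

-- The same decoding applied directly to the dictionary view
SELECT column_name,
       UTL_RAW.CAST_TO_NUMBER(low_value)  AS low_num,
       UTL_RAW.CAST_TO_NUMBER(high_value) AS high_num,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'SALES'
AND    column_name <> 'TIME_ID'   -- TIME_ID is a DATE, so it is not decoded as a number
ORDER BY column_name, notes;
```

C102 decodes to 1 and C103 decodes to 2, which is consistent with the insert having loaded quantity_sold * 2.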

Query the table-level statistics for the sales table again. Some rows now have NOTES information:

PARTITION_NAM   NUM_ROWS     BLOCKS NOTES                    
------------- ---------- ---------- -------------------------
GLOBAL           1837686      16096 STATS_ON_CONVENTIONAL_DML
GLOBAL            918843      16096                          
SALES_1995             0          0                          
SALES_1996             0          0                          
SALES_H1_1997          0          0                          
SALES_H2_1997          0          0                          
SALES_Q1_1998      43687       1006                          
SALES_Q1_1999      64186       1006                          
SALES_Q1_2000      62197       1006                          
SALES_Q1_2001      60608       1006                          
SALES_Q1_2002          0          0                          
SALES_Q1_2003          0          0                          
SALES_Q2_1998      35758       1006                          
SALES_Q2_1999      54233       1006                          
SALES_Q2_2000      55515       1006                          
SALES_Q2_2001      63292       1006                          
SALES_Q2_2002          0          0                          
SALES_Q2_2003          0          0                          
SALES_Q3_1998      50515       1006                          
SALES_Q3_1999      67138       1006                          
SALES_Q3_2000      58950       1006                          
SALES_Q3_2001      65769       1006                          
SALES_Q3_2002          0          0                          
SALES_Q3_2003          0          0                          
SALES_Q4_1998      48874       1006                          
SALES_Q4_1999      62388       1006                          
SALES_Q4_2000      55984       1006                          
SALES_Q4_2001      69749       1006                          
SALES_Q4_2002          0          0                          
SALES_Q4_2003          0          0                          

30 rows selected.

Execute the sample query

SELECT COUNT(*) FROM sales WHERE quantity_sold > 50;

View the execution plan, paying attention to the Note section:

----------------------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |       |       |  4398 (100)|          |       |       |
|   1 |  SORT AGGREGATE      |       |     1 |     3 |            |          |       |       |
|   2 |   PARTITION RANGE ALL|       |     1 |     3 |  4398   (1)| 00:00:01 |     1 |    28 |
|*  3 |    TABLE ACCESS FULL | SALES |     1 |     3 |  4398   (1)| 00:00:01 |     1 |    28 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter("QUANTITY_SOLD">50)
 
Note
-----
   - dynamic statistics used: statistics for conventional DML

Restore the environment:

truncate table sales;
insert into sales select * from sales_orig;
commit;

alter session set OPTIMIZER_REAL_TIME_STATISTICS=FALSE;

alter system set "_exadata_feature_on"=false scope=spfile;
shutdown immediate;
startup;

This experiment is based on the following links:

https://blogs.oracle.com/optimizer/post/optimizer-real-time-statistics-parameter-in-ru-1910-onwards
https://oracle-base.com/articles/19c/real-time-statistics-19c

10.4 When the Database Gathers Optimizer Statistics

The database gathers optimizer statistics at various times and from various sources.

10.4.1 Sources for Optimizer Statistics

The optimizer uses several different sources for optimizer statistics.

The sources are as follows:

DBMS_STATS execution, automatic or manual
This PL/SQL package is the primary means of gathering optimizer statistics.
SQL compilation
During SQL compilation, the database can augment the statistics previously gathered by DBMS_STATS. In this stage, the database runs additional queries to obtain more accurate information about how many rows in the tables satisfy the WHERE clause predicates in the SQL statement.
SQL execution
During execution, the database can further augment previously gathered statistics. In this stage, Oracle Database collects the number of rows produced by every row source during the execution of a SQL statement. At the end of execution, the optimizer determines whether the estimated number of rows is inaccurate enough to warrant reparsing at the next statement execution. If the cursor is marked for reparsing, then the optimizer uses the actual row counts from the previous execution instead of estimates.
SQL profiles
A SQL profile is a collection of auxiliary statistics on a query. The profile stores these supplemental statistics in the data dictionary. The optimizer uses SQL profiles during optimization to determine the most optimal plan.

10.4.2 SQL Plan Directives

A SQL plan directive is additional information and instructions that the optimizer can use to generate a more optimal plan.

The directive is a "note to self" by the optimizer that it is misestimating cardinalities of certain types of predicates, and also a reminder to DBMS_STATS to gather the statistics needed to correct the misestimates in the future. For example, when joining two tables that have a data skew in their join columns, a SQL plan directive can direct the optimizer to use dynamic statistics to obtain a more accurate join cardinality estimate.

10.4.2.1 When the Database Creates SQL Plan Directives

The database creates SQL plan directives automatically based on information learned during automatic reoptimization. If a cardinality misestimate occurs during SQL execution, then the database creates SQL plan directives.

For each new directive, the DBA_SQL_PLAN_DIRECTIVES.STATE column shows the value USABLE. This value indicates that the database can use the directive to correct misestimates.

The optimizer defines a SQL plan directive on a query expression, for example, filter predicates on two columns that are used together. A directive is not tied to a specific SQL statement or SQL ID. For this reason, the optimizer can use directives for statements that are not identical. For example, directives can help the optimizer with queries that use similar patterns, such as queries that are identical except for a select list item.

The Note section of the execution plan indicates the number of SQL plan directives used for a statement. Obtain more information about the directives by querying the DBA_SQL_PLAN_DIRECTIVES and DBA_SQL_PLAN_DIR_OBJECTS views.

10.4.2.2 How the Database Uses SQL Plan Directives

When compiling a SQL statement, if the optimizer sees a directive, then it obeys the directive by gathering additional information.

The optimizer uses directives in the following ways:

Dynamic statistics
The optimizer uses dynamic statistics whenever it does not have sufficient statistics corresponding to the directive. For example, the cardinality estimates for a query whose predicate contains a specific pair of columns may be significantly wrong. A SQL plan directive indicates that whenever a query that contains these columns is parsed, the optimizer needs to use dynamic sampling to avoid a serious cardinality misestimate.
Dynamic statistics have some performance overhead. Every time the optimizer hard parses a query to which a dynamic statistics directive applies, the database must perform the extra sampling.
Starting in Oracle Database 12c Release 2 (12.2), the database writes statistics from adaptive dynamic sampling to the SQL plan directive store, making them available to other queries.
Column groups
The optimizer examines the query corresponding to the directive. If a column group is missing, and if the DBMS_STATS preference AUTO_STAT_EXTENSIONS is set to ON (the default is OFF) for the affected table, then the optimizer automatically creates this column group the next time DBMS_STATS gathers statistics on the table. Otherwise, the optimizer does not automatically create the column group.
If a column group exists, then the next time this statement executes, the optimizer uses the column group statistics in place of the SQL plan directive when possible (equality predicates, GROUP BY, and so on). In subsequent executions, the optimizer may create additional SQL plan directives to address other problems in the plan, such as join or GROUP BY cardinality misestimates.
Note: Currently, the optimizer monitors only column groups. The optimizer does not create extensions on expressions.
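The column group behavior described above could be set up as in the following sketch. The SH.CUSTOMERS table and the column pair are assumptions borrowed from the example later in these notes; setting the preference at the table level is also an assumption about how one might choose to scope it.

```sql
-- Allow DBMS_STATS to create column groups that directives call for
BEGIN
  DBMS_STATS.SET_TABLE_PREFS('SH', 'CUSTOMERS', 'AUTO_STAT_EXTENSIONS', 'ON');
END;
/

-- Alternatively, create the column group manually;
-- the function returns the system-generated extension name
SELECT DBMS_STATS.CREATE_EXTENDED_STATS(
         'SH', 'CUSTOMERS', '(CUST_STATE_PROVINCE, COUNTRY_ID)') AS ext_name
FROM   dual;

-- Column group statistics exist only after the next gather
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH', 'CUSTOMERS');
```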

When the problem that gave rise to a directive is solved, either because a better directive exists or because a histogram or extension exists, the DBA_SQL_PLAN_DIRECTIVES.STATE value changes from USABLE to SUPERSEDED. More information about the directive state is available in the DBA_SQL_PLAN_DIRECTIVES.NOTES column.

10.4.2.3 SQL Plan Directive Maintenance

The database creates SQL plan directives automatically. You cannot create them manually.

The database initially creates directives in the shared pool. The database periodically writes the directives to the SYSAUX tablespace. The database automatically purges any SQL plan directive that has not been used within the specified number of weeks (SPD_RETENTION_WEEKS), which is 53 by default.

You can manage directives by using the DBMS_SPD package. For example, you can:

Enable and disable SQL plan directives (ALTER_SQL_PLAN_DIRECTIVE)
Change the retention period for SQL plan directives (SET_PREFS)
Export directives to a staging table (PACK_STGTAB_DIRECTIVE)
Drop a directive (DROP_SQL_PLAN_DIRECTIVE)
Force the database to write directives to disk (FLUSH_SQL_PLAN_DIRECTIVE)
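A few of these DBMS_SPD operations are sketched below. The retention value of 10 weeks and the choice of SH.CUSTOMERS as the table whose directives are disabled are illustrative assumptions.

```sql
-- Current retention period, in weeks, for unused directives
SELECT DBMS_SPD.GET_PREFS('SPD_RETENTION_WEEKS') AS retention_weeks
FROM   dual;

-- Keep unused directives for 10 weeks (illustrative value)
EXEC DBMS_SPD.SET_PREFS('SPD_RETENTION_WEEKS', '10');

-- Write in-memory directives to the SYSAUX tablespace now
EXEC DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE;

-- Disable every directive that references SH.CUSTOMERS (assumed example table)
BEGIN
  FOR r IN (SELECT DISTINCT directive_id
            FROM   dba_sql_plan_dir_objects
            WHERE  owner = 'SH'
            AND    object_name = 'CUSTOMERS')
  LOOP
    DBMS_SPD.ALTER_SQL_PLAN_DIRECTIVE(r.directive_id, 'ENABLED', 'NO');
  END LOOP;
END;
/
```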

10.4.2.4 How the Optimizer Uses SQL Plan Directives: Example

This example shows how the database automatically creates and uses SQL plan directives for SQL statements.

Assumptions: You plan to run queries against the sh schema, and you have privileges on this schema and on the data dictionary and V$ views.

Query the sh.customers table:

SELECT /*+gather_plan_statistics*/ * 
FROM   customers 
WHERE  cust_state_province='CA' 
AND    country_id=52790;

The gather_plan_statistics hint shows the actual number of rows returned from each operation in the plan. Thus, you can compare the optimizer estimates with the actual number of rows returned.

Query the plan for the preceding query.
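The statement for this step does not appear in these notes. A call of the following form, which matches the one used later in this example, would be expected to produce a listing like the one shown here:

```sql
-- Show the plan of the last statement run in this session,
-- including estimated (E-Rows) and actual (A-Rows) row counts
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(FORMAT=>'ALLSTATS LAST'));
```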


SQL_ID  ayd76b1zycdwr, child number 0
-------------------------------------
select /*+ gather_plan_statistics */     * FROM     customers WHERE     
    cust_state_province = 'CA'     AND country_id = 52790
 
Plan hash value: 2008213504
 
--------------------------------------------------------------------------------------------------
| Id  | Operation         | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |      1 |        |   3341 |00:00:00.01 |    1529 |   1512 |
|*  1 |  TABLE ACCESS FULL| CUSTOMERS |      1 |     20 |   3341 |00:00:00.01 |    1529 |   1512 |
--------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790))
 

19 rows selected.

The actual number of rows (A-Rows) returned by each operation in the plan varies greatly from the estimates (E-Rows). This statement is a candidate for automatic reoptimization.

Check whether the customers query can be reoptimized.
Display the directives for the sh schema.

EXEC DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE;
 
SELECT TO_CHAR(d.DIRECTIVE_ID) dir_id, o.OWNER AS "OWN", o.OBJECT_NAME AS "OBJECT", 
       o.SUBOBJECT_NAME col_name, o.OBJECT_TYPE, d.TYPE, d.STATE, d.REASON
FROM   DBA_SQL_PLAN_DIRECTIVES d, DBA_SQL_PLAN_DIR_OBJECTS o
WHERE  d.DIRECTIVE_ID=o.DIRECTIVE_ID
AND    o.OWNER IN ('SH')
ORDER BY 1,2,3,4,5;

DIR_ID                  OWN    OBJECT       COL_NAME               OBJECT_TYPE    TYPE                STATE     REASON                                  
14911729199042574566    SH     CUSTOMERS    COUNTRY_ID             COLUMN         DYNAMIC_SAMPLING    USABLE    SINGLE TABLE CARDINALITY MISESTIMATE    
14911729199042574566    SH     CUSTOMERS    CUST_STATE_PROVINCE    COLUMN         DYNAMIC_SAMPLING    USABLE    SINGLE TABLE CARDINALITY MISESTIMATE    
14911729199042574566    SH     CUSTOMERS                           TABLE          DYNAMIC_SAMPLING    USABLE    SINGLE TABLE CARDINALITY MISESTIMATE

Initially, the database stores SQL plan directives in memory, and then writes them to disk every 15 minutes. Thus, the preceding example calls DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE to force the database to write the directives to the SYSAUX tablespace.

Monitor directives by using the DBA_SQL_PLAN_DIRECTIVES and DBA_SQL_PLAN_DIR_OBJECTS views. Three entries appear in the views: one for the customers table itself, and one for each of the correlated columns. Because the customers query has an IS_REOPTIMIZABLE value of Y, if you reexecute the statement, then the database hard parses it again and generates a plan based on the statistics from the previous execution.

Execute the query again

SELECT /*+gather_plan_statistics*/ * 
FROM   customers 
WHERE  cust_state_province='CA' 
AND    country_id=52790;

View execution plan

----------------------------------------------------------------------------------------- 
| Id  | Operation         | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |    
----------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT  |           |      1 |        |   3341 |00:00:00.01 |    1528 |    
|*  1 |  TABLE ACCESS FULL| CUSTOMERS |      1 |   3341 |   3341 |00:00:00.01 |    1528 |    
----------------------------------------------------------------------------------------- 
                                                                                             
Predicate Information (identified by operation id):                                          
--------------------------------------------------- 
                                                                                             
   1 - filter(("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790))                           
                                                                                             
Note                                                                                         
----- 
   - statistics feedback used for this statement

The Note section indicates that the database used reoptimization for this statement. The estimated number of rows (E-Rows) is now correct. The SQL plan directive has not been used yet.

SELECT SQL_ID, CHILD_NUMBER, SQL_TEXT, IS_REOPTIMIZABLE
FROM   V$SQL
WHERE  SQL_TEXT LIKE 'SELECT /*+gather_plan_statistics*/%';

SQL_ID           CHILD_NUMBER    SQL_TEXT                                                                                                           IS_REOPTIMIZABLE    
3q5u7q9vq52xm                  0 SELECT /*+gather_plan_statistics*/ *  FROM   customers  WHERE  cust_state_province='CA'  AND    country_id='US'    N                   
3q5u7q9vq52xm                  1 SELECT /*+gather_plan_statistics*/ *  FROM   customers  WHERE  cust_state_province='CA'  AND    country_id='US'    N

A new plan exists for the customers query, and also a new child cursor.

Confirm that a SQL plan directive exists and is usable for other statements.

SELECT /*+gather_plan_statistics*/ CUST_EMAIL
FROM   CUSTOMERS
WHERE  CUST_STATE_PROVINCE='MA'
AND    COUNTRY_ID=52790;

Query the plan in the cursor .

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(FORMAT=>'ALLSTATS LAST'));

----------------------------------------------------------------------------------------- 
| Id  | Operation         | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |    
----------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT  |           |      1 |        |    181 |00:00:00.01 |    1522 |    
|*  1 |  TABLE ACCESS FULL| CUSTOMERS |      1 |     20 |    181 |00:00:00.01 |    1522 |    
----------------------------------------------------------------------------------------- 
                                                                                             
Predicate Information (identified by operation id):                                          
--------------------------------------------------- 
                                                                                             
   1 - filter(("CUST_STATE_PROVINCE"='MA' AND "COUNTRY_ID"=52790))                           
                                                                                             

19 rows selected.

10.4.2.5 How the Optimizer Uses Extensions and SQL Plan Directives: Example

This example shows how the database uses a SQL plan directive until the optimizer verifies that an extension exists and that the statistics are applicable.

At that point, the directive changes its state to SUPERSEDED. Subsequent compilations use the statistics instead of the directive.

This experiment builds on the preceding one, so it is only summarized here.
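Since the full walkthrough is not reproduced, the following sketch shows how the outcome could be checked after the preceding example. It assumes that the directives on SH.CUSTOMERS from that example exist, and that a column group has been created, either manually or through the AUTO_STAT_EXTENSIONS preference.

```sql
-- Gather statistics so that column group statistics are populated
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH', 'CUSTOMERS');

-- List the extensions (column groups) defined on the table
SELECT table_name, extension_name, extension
FROM   dba_stat_extensions
WHERE  owner = 'SH'
AND    table_name = 'CUSTOMERS';

-- After rerunning the customers query, flush and inspect the directive state
EXEC DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE;

SELECT TO_CHAR(d.directive_id) AS dir_id, d.type, d.state, d.reason
FROM   dba_sql_plan_directives d
WHERE  d.directive_id IN (SELECT o.directive_id
                          FROM   dba_sql_plan_dir_objects o
                          WHERE  o.owner = 'SH'
                          AND    o.object_name = 'CUSTOMERS');
```

Once the optimizer has verified that the column group statistics apply, the STATE column would be expected to change from USABLE to SUPERSEDED.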

10.4.3 When the Database Samples Data

Starting in Oracle Database 12c, the optimizer automatically decides whether dynamic statistics are useful and which sample size to use for all SQL statements.

Note: In earlier releases, dynamic statistics were called dynamic sampling.

The primary factor in the decision to use dynamic statistics is whether the available statistics are sufficient to generate an optimal plan. If the statistics are insufficient, then the optimizer uses dynamic statistics.

Automatic dynamic statistics are enabled when the OPTIMIZER_DYNAMIC_SAMPLING initialization parameter is not set to 0. By default, the dynamic statistics level is set to 2.

In general, the optimizer uses default statistics rather than dynamic statistics to compute the statistics needed during optimization on tables, indexes, and columns. The optimizer decides whether to use dynamic statistics based on several factors, including the following:

The SQL statement uses parallel execution.
A SQL plan directive exists.
The following figure illustrates the process of gathering dynamic statistics.
[Figure: process of gathering dynamic statistics]

As shown in the preceding figure, the optimizer automatically gathers dynamic statistics in the following cases:

Missing statistics
When tables in a query have no statistics, the optimizer gathers basic statistics on these tables before optimization. Statistics can be missing because the application creates new objects without a follow-up call to DBMS_STATS to gather statistics, or because statistics were locked on an object before statistics were gathered.
In this case, the statistics are not as high-quality or as complete as the statistics gathered using the DBMS_STATS package. This trade-off is made to limit the impact on the compile time of the statement.
Insufficient statistics
Statistics can be insufficient whenever the optimizer estimates the selectivity of predicates (filter or join) or the GROUP BY clause without taking into account correlation between columns, skew in the column data distribution, statistics on expressions, and so on.
Extended statistics help the optimizer obtain accurate cardinality estimates for complex predicate expressions. The optimizer can use dynamic statistics to compensate for the lack of extended statistics, or when it cannot use extended statistics, for example, for non-equality predicates.

Note: The database does not use dynamic statistics for queries that contain the AS OF clause.

10.4.4 How the Database Samples Data

At the beginning of optimization, when deciding whether a table is a candidate for dynamic statistics, the optimizer checks for the existence of persistent SQL plan directives on the table.

For each directive, the optimizer registers a statistics expression that it computes when determining the cardinality of a predicate involving the table. In Figure 10-2, the database issues a recursive SQL statement to scan a small random sample of the table blocks. The database applies the relevant single-table predicates and joins to estimate the predicate cardinality.

The database persists the results of dynamic statistics as sharable statistics. The database can share the results gathered during the SQL compilation of one query with recompilations of the same query. The database can also reuse the results for queries that have the same patterns.
