9
PROFESSIONAL DBA
Oracle12c Database Administration
Performance Tuning
SQL TUNING
SQL Statement Processing
Processing Phases
The four most important phases in SQL statement processing are
• Parsing
• Binding
• Executing and
• Fetching
Note
The Fetch phase is only to queries.
DataSphere
Parse Phase
• Checks shared pool for the statement.
• Checks the syntax, specifications of the statement.
• Checks the semantics of the statement and ensures that objects required are
valid.
• Check for the privileges to process the statement.
• If required transforms the statement into an equivalent statements, if the
statement is on a view, a subquery, rewriting the statement.
• Builds the Execution plan if required.
• Stores the execution plan or uses an existing execution plan.
Bind Phase
• Checks for references of bind variables in the statement.
• A value to each variable is assigned or reassigned.
Consideration of Binding
The bind variable values are not knows to optimizer while building the execution
plan.
Advantage
Rebind & Execute without reparsing is done faster, thus saves time and memory.
Disadvantage
Cannot estimate predicate selectivity.
DataSphere
Execute Phase
• The parsing tree is used to access the data buffers.
• Same parse tree is used by multiple users.
• Physical reads, logical reads, writes for DML statements is performed.
Fetch Phase
If statement is SELECT then fetch the data.
Typically multiple rows are retrieved using an array fetch.
V$LIBRARYCACHE view
The V$LIBRARYCACHE view the amount of caching of SQL statements.
SQL>select gethitratio, pinhitratio
from v$librarycache
where namespace = ’SQL AREA’;
V$SQLAREA view
The V$SQLAREA view provides the SQL statements and their usage.
SQL>select sql_text, version_count, loads, invalidations, parse_calls, sorts
from v$sqlarea
where parsing_user_id > 0
and command_type = 3
Order by sql_text;
DataSphere
Cursor Sharing
Whenever a statement is issued, optimizer checks the shared SQL area for the
statement.
If the statement is present in the SQL area then parsing is avoided by using the
existing cursor.
Cursor Sharing Benefits
• Avoids parsing and thus saves time.
• Based on the statement executed, the memory is dynamically adjusted.
• Memory usage performance is improve dramatically.
Which statement can share cursor
The SQL statements those having identical elements such as
• Text
▪ Uppercase and lowercase
▪ White space like spaces, tabs, carriage returns
▪ Comments
• Referenced objects
◦ The SQL statements must resolve to the same objects to which they are
referenced.
• Data types of the bind variables
◦ SQL statements must usue the same type of bind variables.
DataSphere
Cursor_Sharing parameter
Cursor_Sharing parameter values are
• Exact
• Simlar
• Force
EXACT
Is the default value.
This value force the parser to use a statement in the cursor only if it is identical is all
aspects to use the existing cursor.
SIMILAR
The parser is allowed to use a statement in the cursor that is identical except literal
values.
After the statement is identified the parsing check that the execution plan is
applicable for the statement.
FORCE
Is same as SIMILAR, except that the execution plan is always used, regardless of its
applicability.
Note
CURSOR_SHARING = SIMILAR or FORCE is not recommended for Complex
Quiries, Decision Support Systems, Data Warehousing environments.
DataSphere
Histograms
The optimizer must estimate the number of rows processed by a given query.
By default optimizer assumes that the data is evenly distributed.
The accuracy is estimated by the optimizer based on its knowledge on the data
distribution.
Histograms provides the proper data distribution information to the optimizer.
When to Use Histograms
• When the data distribution is not even and it is skewed then use histograms.
• If the column is used in WHERE clause.
• If the column is using non unique and using other than equality predicates
Avoid using histograms whenever
• All predicates on the column use bind variables
• The column data is uniformly distributed
Popular and Nonpopular Values
Popular Value
For a bucket endpoint value, the Popular values appear multiple times.
Popular Value
If the values appear only once or does not appear at all, then it is considered as the
NonPopular value.
Density
Is calculated based on 1 divided by NDV - Number of Distinct Values.
This avoids bais of optimizer it causes due to repetition of popular values.
DataSphere
Types of Histograms
Frequency Histogram
In Frequency Histogram, each buckets stores a distinct column value.
As the number of values vary for each distinct value, the buckets will contain
different number of values.
The Frequency Histogram is used when the NDV – Number of Distinct Values is
less than or equal to the n - number of buckets.
Default and maximum number of buckets is 254.
Note
Default Frequency Histogram is captured from Oracle12c if the
AUTO_SAMPLE_SIZE is used while gathering statistics.
Top Frequency Histogram
Is same as Frequency Histogram except that it ignores the Non Popular values those
are insignificant in numbers.
The Top Fequency Histogram is used when
1. NDV - Number of Distinct values are greater than n – number of buckets.
2. The percentage of rows occupied by top n frequent values is equal to greater
than p - the threshold. Where p is (1 - ( 1/n))*100
3. If AUTO_SAMPLE_SIZE is used for Estimate_Percent while gathering
statistics.
DataSphere
Height Balanced Histogram
Is a legacy method of histogram.
In height balanced histograms, each bucket will more or less contains same number
of rows.
The rows are distributed across the available number of buckets.
If many rows contains the same value then it may be spread to multiple buckets to
balance the height of the bucket.
The Height Balance Histogram is used when
• NDV - Number of Distinct values are greater than n – number of buckets.
• If AUTO_SAMPLE_SIZE is NOT used for Estimate_Percent while gathering
statistics.
Gathering Histogram
Using Analyze
• Generate statistics for the CUSTOMERS table and for the STATE_ID column
by specifying maximum 50 buckets.
SQL>analyze table customers compute statistics
for table for columns STATE_ID
size 50 ;
• Compute the statistics in Column Level for a column without specifying the
number of buckets.
SQL> analyze table customers compute statistics
for columns STATE_ID ;
DataSphere
Using DBMS_STATS
• Generate statistics for the CUSTOMERS table and for the STATE_ID column
by specifying maximum 50 buckets.
SQL> EXECUTE DBMS_STATS.GATHER_TABLE_STATS
('RAM','CUSTOMERS', METHOD_OPT =>
'FOR COLUMNS SIZE 50 STATE_ID') ;
• The SIZE specifies the maximum number of buckets for the histogram.
or
SQL> EXECUTE DBMS_STATS.GATHER_TABLE_STATS
(ownname => 'RAM',
tabname => 'CUSTOMERS',
METHOD_OPT => 'FOR COLUMNS STATE_ID SIZE 50') ;
or
SQL> EXECUTE DBMS_STATS.GATHER_TABLE_STATS
(ownname => 'RAM',
tabname => 'CUSTOMERS',
METHOD_OPT => 'FOR COLUMNS STATE_ID SIZE 10',
estimate_percent => AUTO_SAMPLE_SIZE ) ;
DataSphere
Histogram Consideration
• Use the FOR ALL INDEXED COLUMNS option.
• If data distribution is not static then gather histograms frequently.
• Do not gather histograms for columns using bind variables in WHERE clauses.
• Unless improvement in performance avoid using histograms.
• Histograms are stored in data dictionary and needs substantial additional
storage.
VIEWS
SQL> Select table_name, column_name, num_distinct, histogram
From User_Tab_Col_Statistics
Where table_name = 'CUSTOMERS'
and column_name = 'STATE_ID';
SQL> Select endpoint_number, endpoint_value
From user_histograms
Where table_name = 'CUSTOMERS'
and column_name = 'STATE_ID';
DataSphere
Eg.
exec dbms_stats.gather_dictionary_stats;
exec dbms_stats.gather_fixed_objects_stats;
exec dbms_stats.gather_database_stats(cascade => TRUE, method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY', degree => 18 );
No comments:
Post a Comment