Carrie Ballinger, Senior database analyst, Teradata Certified Master
Tech2Tech
Tech Support
May I Lend
a Hand?
Teradata Certified Master assists readers by answering their technical questions.
by Carrie Ballinger
Volunteering has always had a place in my life-from working the phones at a community help line to teaching English as a second language at a night school to organizing bingo games at my neighborhood nursing home.
Numerous opportunities exist to help others, no matter where you live and what skills and gifts you have to offer. For example, writing this column is one way for me to keep the spirit of volunteerism alive while providing a forum for recycling interesting technical tidbits.
Cache flag confusion
Dear Carrie:
I've gotten into the habit of keeping my eye on the Database Query Log (DBQL) cache flag to make sure my tactical query plans are still being cached. Now that we've upgraded to Teradata 12, I've noticed that instead of the expected "T" in the cache flag column for my parameterized queries, I sometimes see an "S" or a "G." What's up with that? -Caching Out
Dear Caching Out:
What you are seeing reflects enhancements made to the cached plan functionality in Teradata 12. The Optimizer may peek at the values carried in the USING clause during parsing. This enables it to produce better plans for moderate or complex parameterized requests, which are eligible for having their plans cached in the request cache.
Sometimes, by incorporating specific values into the production of a parameterized query plan, better plan decisions can be made, such as which of several tables to join first, or what type of access to use for a table.
By the way, this peeking approach applies only to parameterized requests of some complexity. Simple requests that have an equality condition on a single primary index, unique primary index or partitioning column value will have the same cache flag values ("T") as they always have after the plan has been cached.
So, assuming your parameterized request is not such a simple request, here's how peeking works: The first time a parameterized request is seen, the parser peeks at the values in the USING clause and a plan specific to those values is produced. This will set the cache flag for that query's execution to "S" (specific plan). Since the value-specific plan is just now produced, no plan is actually cached for reuse at this time. A value-specific plan may be different for each execution of the same parameterized request, which means it will be cached only when it is seen for the second time, containing the identical values.
After this first specific-plan execution is complete, metrics and estimates produced from the execution are saved in the request cache in preparation for additional decisions that will be made should this request be seen again. If the parsing engine (PE) time that resulted from the specific plan was a very small percentage of the total query execution time, then the Optimizer may set a flag to always use a specific plan for this query. In that case, the cache flag going forward will contain an "A" (always specific) and no generic plans will be generated for this request as long as its metrics remain in the request cache. (See figure 1 to the right.)
On the other hand, if the PE time is non-trivial, the second time that parameterized request is seen a generic plan is produced, resulting in a cache flag "G." Once a specific and a generic plan have been generated, their estimates and runtime metrics can be compared side by side and a decision can be made as to which plan to always produce going forward.
A common pattern I've observed, assuming the parameterized request is repetitively executed on the same PE, is a cache flag sequence of "S," "G," "T," "T," "T" and so on. The first time the query is seen, a specific plan is produced ("S"). The second time, a generic plan ("G") is executed. From that point onward, the same generic plan ("T") is produced until that plan is flushed from the cache.
Two examples of how cache flag values can change in Teradata 12 are described below and shown in tables 1 and 2.
Parameterized SQL
This first example of selected DBQLog-Tbl output illustrates rows executing the same parameterized request, sorted by start date and start time. Notice the different values for the cache flag, and the different reported start date and times. (See table 1.)
The first row represents a parameterized query. Because it is seen for the first time by this PE, the Optimizer peeks at the values in the USING clause, a specific plan is built and an "S" appears in the cache flag column.
The second row displays information for the second time the same parameterized request is seen. Because the parsing time used to generate the specific plan in the first row was not considered low enough to discount producing a generic plan going forward, the generic plan is now produced and results in a cache flag of "G." Note that this results in a parsing time (0.06 seconds) similar to that shown in the first row.
In the third row, the query is seen for the third time within a short period of time on the same PE. In this case, the "T" in the cache flag indicates that this query used the generic cached plan that had been put in the cache by the second query's execution. Because the plan came from the request cache, the parsing time displays zero.
The fourth row is identical to the third row with a cache flag of "T"-again, to show that it used the plan already in cache.
The same query could have continued to use the cached plan in the fifth row, except that so much time has passed (16 days) that the cached plan on the PE has been flushed. The request cache on each AMP and its metrics are flushed every four hours, causing generic plans to be removed from the cache.
The process, therefore, will begin anew with a specific plan. The sixth row is the same as the fifth row, with the passage of time causing the metrics about the previous specific plan to be flushed from the cache.
NON-Parameterized SQL
In the following example, a non-parameterized request is repeated five times, with only three to four seconds between each execution. For this query either nothing or a "T" shows up in the cache flag column. (See table 2.)
Because the request does not contain parameterized SQL and there is no USING clause, no peeking can take place and an "S" or a "G" in the cache flag column will never appear. A decision to use a specific or a generic plan is never made under these conditions. However, this example illustrates a non-parameterized query that will have its plan cached when it is seen the second time with identical SQL and identical values. This means that the third execution of the same query can take advantage of a cached plan even though the query does not contain parameterized SQL.
In the table's first two entries, the query is parsed. The "T" in the cache plan column for the remaining three executions indicates that the cached plan is used.
Query session states
Dear Carrie:
I understand the individual descriptions of query session states under normal circumstances (e.g., IDLE, ACTIVE), but how should I interpret the particular combination states that I've been seeing recently when our platform is congested?
For instance, what's the difference between IDLE/DISPATCHING and ACTIVE/DISPATCHING? -Stuck on States
Dear States:
In Teradata Manager, the first part of a combined query session state, such as ACTIVE/DISPATCHING, refers to the state of the AMP; the second part refers to the state of the PE.
The PE state of DISPATCHING does not exactly mean the steps are being dispatched right at that time. It means the steps have been generated and sent to the dispatcher module from the parser module, and the dispatcher has started sending them to the AMPs. Normally, a query will remain in this state until all the steps are sent.
The AMP state of being ACTIVE or IDLE indicates whether steps have been sent to the AMPs: ACTIVE means steps are processing; IDLE means that no processing is under way, possibly because a step is waiting for an AMP worker task. ACTIVE/DISPATCHING is a normal, combined state when AMP steps are executing and when the request contains multiple steps.
Session states are ordered (from low to high) as: IDLE, ACTIVE, BLOCKED and ABORT, with the highest state value across all AMPs being reported. So, if all of the AMPs are idle, the session state will appear as IDLE. If at least one AMP is active, even if the rest are idle, the state is ACTIVE. If at least one is blocked and the rest are idle or active, it reports BLOCKED.
ALTER TABLE and CPU limits
Dear Carrie:
I've set system-level CPU limits on our recently expanded system to hold back some of the processing power until we grow into it. One of my DBAs ran an ALTER TABLE command yesterday, and I was surprised to see that it caused the CPU limit to be exceeded by 20%. I thought a system-level limit held down CPU usage to the set level. What gives? -Beyond my Limits
Dear Beyond:
By design, several database activities do not honor CPU limits-the most obvious are rollbacks. It was determined several releases ago that rollbacks were too critical to keep under the control of a CPU limit, and that it is always desirable to have a rollback complete as soon as possible.
In addition to rollbacks, some other internal activities occur with which CPU limits are not enforced. Most of these are short-lived and use little CPU. For example, highly critical database code running in the system performance group will never be held back.
An ALTER TABLE is an example of a user-submitted request that will not conform to a CPU limit. Some of the time during the execution of an ALTER TABLE is spent changing the underlying data in the base table. (See figure 2.) This type of highly critical work is so important to complete once it begins that it is exempted from interruptions, even at the expense of violating a CPU limit. CHECK TABLE and TABLE REBUILD have similar but less severe characteristics.
An alternative is to run lengthy ALTER TABLE jobs outside of the main processing hours when over-consumption of resources is less disruptive. Another option is to create a new table with the new definitions, and then use the INSERT/SELECT statement to move the data from the old table to the new one. An INSERT/SELECT statement will respect the CPU limit and, in some cases, may be a better-performing alternative.
As mentioned earlier, a CPU limit will never hold back a rollback or an ALTER TABLE. However, when Priority Scheduler calculates total CPU usage for the purposes of enforcing a system-level limit, the amount of CPU used by these activities is counted.
For example, in an attempt to enforce the CPU limit, if a rollback requires a large amount of CPU and runs for a long time, then other work will be throttled back more than it would if the rollback were not happening. In these cases, other active work will simply have less CPU available. However, even during such a rollback or ALTER TABLE, anti-starvation mechanisms will make sure that all active work will get some level of CPU.
Collecting statistics on PARTITION
Dear Carrie:
We're on Teradata Database V2R6.2, and every time data is added to any partition-typically daily-we collect statistics on PARTITION. On a weekly basis, we also collect statistics on the partitioning column, which is usually a date. After performing some simple tests with no joins, it seems that collecting statistics on PARTITION has little or no effect on the Optimizer when dealing with single-table estimates. When we fill a new partition, do we need to change our approach to get more reliable statistics? I'm particularly concerned because when a query has a BETWEEN clause on the partitioning column, it gives an estimate of "1 row." --One Day at a Time
Dear One Day:
Statistics on the system-derived column PARTITION are not used for single-table cardinality estimates. Therefore, I would not expect estimates in the plan to change based on whether PARTITION statistics were refreshed after filling up a new partition. Only after the data is loaded into an empty partition and statistics are collected on the partitioning column (date) do the single-table estimates change.
You are getting an estimate of "1 row" because the statistics on the partitioning column (date) have not been refreshed since the empty partition was created. My advice is to continue collecting daily statistics on PARTITION because it is an inexpensive operation to repeat. Since you are on Teradata Database V2R6.2, if you are experiencing sub-optimal plans, and consequently, badly running queries, think about re-collecting statistics on the partitioning column more frequently than weekly.
In Teradata 12, statistics extrapolation was developed to avoid having to re-collect on partitioned primary index (PPI) tables when only one or two new partitions are affected. But until you upgrade your system, you won't get reliable statistics against the new partition without re-collecting on the partitioning column.
This example of statistics extrapolation uses a 20-million-row table with a PPI that is partitioned by orderdate. Below is the table definition, a query showing the table count, minimum and maximum values for the partitioning column, followed by a display of the currently collected statistics:
CREATE MULTISET TABLE CAB.txnppi
(
ORDERKEY INTEGER NOT NULL,
CUSTKEY INTEGER NOT NULL,
TOTALPRICE DECIMAL(15,2) NOT NULL,
ORDERDATE DATE FORMAT 'YY/MM/DD' NOT NULL)
PRIMARY INDEX ( ORDERKEY )
PARTITION BY RANGE_N (orderdate BETWEEN DATE '1992-01-01'
AND DATE '1998-12-01' EACH INTERVAL '1' MONTH);
SELECT COUNT(*), MIN(orderdate), MAX(orderdate) FROM txnppi;
| Count(*) | Minimum(ORDERDATE) | Maximum(ORDERDATE) |
| 20495830 | 01/01/1992 | 12/31/1994 |
HELP STATISTICS txnppi;
| Date | Time | Unique Values | Column Names |
| 08/09/03 | 09:16:03 | 20,495,830 | ORDERKEY |
| 08/09/03 | 09:15:36 | 2,985,070 | CUSTKEY |
| 08/09/03 | 09:15:11 | 1,096 | ORDERDATE |
| 08/09/03 | 09:16:19 | 36 | PARTITION |
The txnppi table has been loaded with rows that have dates up to the end of 1994. The statistics are current up to that date. Next, a staging table is loaded with 3.4 million rows that reflect transactions from the first six months of 1995. These rows will be inserted into the txnppi table. Below is a query that shows the count of rows in the staging table, as well as the minimum and maximum orderdate values.
SELECT COUNT(*), MIN(orderdate), MAX(orderdate) FROM txnstage;
| Count(*) | Minimum(ORDERDATE) | Maximum(ORDERDATE) |
| 3407807 | 01/01/1995 | 06/30/1995 |
All of the rows in the staging table are then inserted into the txnppi table.
INSERT INTO txnppi
SELECT * FROM txnstage;
This INSERT from the staging table has caused the txnppi table to increase its number of rows by more than 10% (3,407,807 are added to a table with 20,495,830 rows), as seen in the count below. However, statistics have not been re-collected, so the original statistics histograms are still intact.
SELECT COUNT(*), MIN(orderdate), MAX(orderdate) FROM txnppi;
| Count(*) | Minimum(ORDERDATE) | Maximum(ORDERDATE) |
| 23903637 | 01/01/1992 | 07/01/1995 |
HELP STATISTICS txnppi;
| Date | Time | Unique Values | Column Names |
| 08/09/03 | 09:16:03 | 20,495,830 | ORDERKEY |
| 08/09/03 | 09:15:36 | 2,985,070 | CUSTKEY |
| 08/09/03 | 09:15:11 | 1,096 | ORDERDATE |
| 08/09/03 | 09:16:19 | 36 | PARTITION |
Extrapolation can be illustrated in the following query explain. The query that accesses the txnppi table contains selection criteria for a range of dates. The selection criteria in the query asks for a range of dates that represent rows which have recently been added to the table but are not known to the statistics histogram. In the query associated with the explain below, rows with order dates in February 2005 are requested.
EXPLAIN SELECT * FROM txnppi
WHERE orderdate BETWEEN 950201 AND 950228;
Explanation
- First, we lock a distinct CAB."pseudo table" for read on a RowHash to prevent global deadlock for CAB.txnppi.
- Next, we lock CAB.txnppi for read.
- We do an all-AMPs RETRIEVE step from 2 partitions of CAB.txnppi with a condition of ("(CAB.txnppi.ORDERDATE >= DATE '1995-02-01') AND (CAB.txnppi.ORDERDATE >= DATE '1995-02-28')") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with high confidence to be 561,457 rows (23,019,737 bytes). The estimated time for this step is 0.14 seconds.
-
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.14 seconds.
The explain estimates in step 3 show that the Optimizer is estimating 561,457 rows will be returned by this query. The following SQL determines the exact number of rows that will be returned.
SELECT COUNT(*) FROM txnppi
WHERE orderdate BETWEEN 950201 AND 950228;
Count(*)
543053
Instead of getting 1 as the estimated number of rows, as the questioner above did, the estimated number of rows here is within 3% of the actual number of rows returned when a count with the same criteria is executed. This illustrates the effectiveness of statistics extrapolation under these conditions.