Loading...
Carrie Ballinger, Senior technical consultant, Teradata Certified Master

Carrie Ballinger, Senior technical consultant, Teradata Certified Master

Tech2Tech

Tech Support

Communication
is key

A Teradata Certified Master answers readers’ technical questions.

Most of us like to watch people. Whether in an airport, a doctor’s office or a restaurant, there’s a natural curiosity about the people sitting near you, who they are and what they might be up to. While the growing popularity of e-mails, texting and blogging offers us more avenues for fast communication, they also take us a step further away from those insightful and interesting impressions that we all get when experiencing other people face-to-face.

Although I use electronic communica­tions regularly, I most enjoy the occasions when I get a chance to exchange ideas the old-fashioned way. The fact is, I’d much rather be talking to you in person.

This column is a collection of questions I’ve been asked and have answered that do not rely exclusively on electronic exchanges.

Moving through data blocks with a table scan

When I was at the 2009 Teradata PARTNERS Conference in Washington, D.C., I had a conversation with an attendee about the way data placement is handled in the Teradata system. He expressed a concern that rows are initially stored on disk in row hash order, but over time, some data blocks are re-sized and moved to make room for new rows. Consequently, these data blocks end up out of order.

I pointed out that, no matter where the rows had been placed, the master and cylin­der indexes on each AMP are used to access the data in row hash order during a table scan. Still concerned, he then asked me the following question: "Wouldn’t this still be inefficient because the database would have to jump back and forth on the disk, instead of reading the data blocks serially?”

Figure 1: Random updates

Click to enlarge

Here’s my response:

The Teradata system maintains a flexible approach to data growth. Rows are stored in variable-length blocks that grow or shrink as updates occur. Data blocks that grow beyond a DBA-specified maximum size automati­cally split to make room for more rows. If a particular block needs to grow beyond the space it has on its cylinder, that block can be moved to a different location on the same cylinder, or to a different cylinder. The appropriate cylinder index is updated to reflect the new physical location. When blocks have to be shifted, the logical and physical row hash order of the rows may be slightly different. (See figure 1.)

This adaptable structure in which random growth is accommodated at the time it happens is an advantage. Rows can easily be moved from one location to another without affecting in-flight work or any other data objects that reference that row. Transparent to the user, the database continuously performs cleanup tasks, such as new alloca­tions and space de-fragmentation. This background housekeeping work is done only when required and at a low-priority setting so as not to affect other active work. Consequently, DBA intervention or system downtime is typically not required to keep the rows organized.

In terms of table scan efficiency, several internal techniques keep rows from the same table physically co-located and in row hash sequence whenever possible. For example, a strict ordering of rows across cylinders is based on these enforcements:

  • A data block can only contain rows from one sub-table.
  • Row hash values present in a data block are either all greater than or all less than the row hashes maintained on any other data block of the same table.
  • Key values for the internal index present on any cylinder are either all greater than or all less than the values maintained on any other cylinder within an AMP.

As a result, when a table is scanned in row hash order, all of the rows in one data block can be processed before moving on to the next data block. While that data block may not be adjacent to the current one within the cylinder, all data blocks on one cylinder can be read before moving on to the next cylinder. There is a never a need to go back and reread part of a cylinder. This also means that when using a cylinder read, all data blocks that are on the cylinder can be processed using a single I/O.

When the clock starts ticking on qualification time

Dear Carrie:

I’m setting up Teradata Active System Management exception rules and am considering using an exception on query skew percent that requires a qualification time. When does the clock start ticking on a 180-second qualification time? It’s pos­sible that the query being considered for an exception has been demoted from a higher-priority workload, so it may have been showing skew before it got demoted.

—Rules Follower

Dear Follower:

Exceptions are optional rules to identify and manage queries that are running in a work­load but exhibiting problematic behavior. Qualification time is a method of making sure that a particular exception condition, such as skewed processing, continues for a reasonable amount of time before it is treated as an exception. Monitoring for the exception will not begin until the request has been moved to the workload that has the skew exception rule defined.

When a skew is initially detected, qualifica­tion time starts to accumulate on the next exception interval—a DBA-defined global parameter that controls how frequently the system checks for exceptions. Note that for an exception to take place, its condition must be continuous across the qualification time. If a skew is not detected during one of the excep­tion intervals that fall within the qualification time, the qualification time is reset and another skew must occur before the clock begins again.

Figure 2: Exception rules and qualification time

Click to enlarge

In general, when a request is moved to a different workload by means of a demotion, all exceptions of the original workload and their associated qualification times are cleared unless a global exception is applied to the original and new workloads. If two local skew exceptions are defined for two workloads in succession, qualification time will start anew after the demotion. (See figure 2.)

Improving request cache hits

On a recent conference call, a Teradata DBA asked me about the request cache. He specifically wanted to know how to keep query plans in the cache so the query does not have to undergo parsing and opti­mizing as often. He believes that time spent in the parsing engine (PE) sometimes slows down Web queries.

I reviewed some of the ways they could improve request cache hits, starting with the basics. A request cache exists on each PE and holds generic plans that have been identified as potentially reusable—for example, plans produced for queries that use parameterized SQL. Taking advantage of reusable plans already in the request cache can reduce your elapsed time for short queries and provide more consistent response time. So the longer the reusable plans stay in the cache, the better.

Skew percent exception rules

Skew is a processing imbalance across the AMPs in a configuration, usually due to an uneven distribution of data values. The following shows how skew is measured within Teradata Active System Management:

  • Skew is determined based on the highest AMP value and the average value across all active AMPs for that request.
  • Skew is calculated at the end of each exception interval, not at the end of each query step.
  • Skew percent only considers CPU or I/O consumed during the recent exception interval.
  • When using CPU skew percent, a value of 0 means no skew, and the larger the number, the greater the skew. A skew percent of 25 means the work is taking 1.33 times longer than if there were no skew. The formula used is:

(HighAMPCPU – AvgAMPCPU) / HighAMPCPU * 100

These are some of the reasons that entries are removed from the request cache:

  • Periodic cache flush. The diction­ary and request cache is periodically flushed, or emptied out, on each PE to ensure that demographic information and query plans are never stale. This event occurs every four hours and is staggered so that no two PEs undergo flushing at the same time. The diction­ary cache is always completely flushed first; next, the request cache is flushed, but only partially. In-use plans or plans not dependent on statistics (such as pri­mary index plans) are kept in the cache.
  • Operations performed on objects. Collecting statistics, ALTER or DROP commands, or INSERT/SELECTs that change more than 10% of the rows will cause specific objects to be removed from the request cache. To avoid this, restrict these operations against objects used by your Web queries, or other tactical appli­cations, when they are running.
  • MaxRequestsSaved too low. Request cache entries per PEare not allowed to exceed the number set in the perfor­mance field No. 16 of the DBS Control record. Changing the setting from the default of 600 to the maximum of 2,000 is often helpful for keeping more plans in cache.
  • Very large plans. The request cache is limited to 100MB. If caching a new plan would cause the cache to exceed that size, then smaller plans are some­times removed to make room.
Table 1: MaxRequestSaved = 600 (the default)

Click to enlarge

  • Available swap space on the node has fallen. When the system is tight on memory, a minimal number of entries will be purged from the request cache to conserve memory for other, more criti­cal needs. In these cases, the older plans are purged first.
  • Utility locks in MultiLoad jobs. Utility locks cause the table header to change, resulting in all plans related to that table getting flushed from the request cache. This phenomenon was observed during an exploratory tactical query test in which short mini-batch loads were exe­cuted against the same table that tactical queries were running against. Increasing the MaxRequestsSaved parameter reduced this impact significantly.

To evaluate how effectively caching works for you, turn on default logging for the tacti­cal application in DBQLogTbl. After a day, examine the CacheFlag value. A value of “T” is an indication that the plan used by that query came from the request cache.

Below are results from a three-hour mixed workload test showing the num­ber of tactical queries, listed by query name, that were not cached (CacheFlag in DBQLogTbl was blank). The total num­ber of tactical queries executed in this test was 14,284. For the first part of this test, MaxRequestSaved was set at the default of 600, and more than 1,000 iterations of each query were executed. (See table 1.)

During the test, Q8 and Q9 were accessing the Orders table, which was also undergoing frequent mini-batch MultiLoads. Note that these two queries have cache-miss counts in the hundreds—16 times more cache misses than the other queries.

When MaxRequestSaved was increased to 900, the numbers decreased, including those for Q8 and Q9. The results were shown in table 2.

Table 2: MaxRequestSaved = 900

Click to enlarge

While all queries had a reduced number of cache-misses with this tuning step, the change did not completely eliminate cache misses for Q8 and Q9. This is because the plans for Q8 and Q9 were removed from the request cache each time a MultiLoad job began or ended.

This example illustrates how mini-batch loading into a table that is being accessed by a tactical application can cause more frequent flushing of cached plans. Additionally, such queries will exhibit, on average, more PE over­head as a result. Longer-running, less-frequent mini-batch jobs may reduce this overhead. Increasing the MaxRequestsSaved parameter may also provide a benefit, as shown in table 2.

INSERT/SELECT processing with compression

Dear Carrie:

I’m performing an INSERT/SELECT into an empty table on our Teradata 12 system. The source and target table columns and primary index (PI) definitions are identi­cal, but the explain plan shows that a spool file is being generated. I didn’t expect any spooling to occur because the PI of both tables is identical. Is there something I’m missing? These are multi-terabyte tables, so the impact is significant. Any help would be appreciated.

—Thinking Big

Table 3: INSERT/SELECT processing differences

Click to enlarge

Dear Big:

Because the Teradata Optimizer does a compression match check when producing the INSERT/SELECT plan, the first thing to check is whether compression is identically defined on both tables. If it is the same on both tables and other definitions are also the same, rows are inserted directly into the empty table without spooling.

If compression is defined differently, then the data blocks must be reformatted. This requires spooling the rows, followed by the insert.

However, even if compression is differ­ent and spooling is required, you are still getting block-at-a-time processing. Transient journals will be written for each block insert, not for each row, so there are some optimi­zations being performed. Table 3 illustrates four different types of INSERT/SELECT statements, and the transient journal and spool requirements for each.

Below is an explain showing an INSERT/SELECT in which both the source and the target tables have identical compression defined. The table definitions follow the explain. The target table is empty.

EXPLAIN
INSERT INTO Filled_Orders_New
SELECT * FROM Filled_Orders;


Explanation
1) First, we lock a distinct CAB.”pseudo table” for write on a RowHash to prevent
global deadlock for CAB.Filled_Orders_New.
2) Next, we lock a distinct CAB.”pseudo table” for read on a RowHash to prevent
global deadlock for CAB.Filled_Orders.
3) We lock CAB.Filled_Orders_New for write, and we lock CAB.Filled_Orders for read.
4) We do an all-AMPs MERGE into CAB.Filled_Orders_New from CAB.Filled_Orders.
The size is estimated with no confidence to be 45,028,360 rows. The estimated time
for this step is 2 minutes and 15 seconds.
5) We spoil the parser’s dictionary cache for the table.
6) Finally, we send out an END TRANSACTION step to all AMPs involved in processing
the request.

CREATE MULTISET TABLE CAB.Filled_Orders ,NO FALLBACK,
    NO BEFORE JOURNAL,
    NO AFTER JOURNAL,
    CHECKSUM = DEFAULT
    (
     O_ORDERKEY INTEGER NOT NULL,
     O_CUSTKEY INTEGER NOT NULL,
     O_ORDERSTATUS CHAR(1) CHARACTER SET LATIN CASESPECIFIC NOT NULL
     COMPRESS (‘F’,’O’,’P’),
     O_TOTALPRICE DECIMAL(15,2) NOT NULL,
     O_ORDERDATE DATE FORMAT ‘yyyy-mm-dd’ NOT NULL,
     O_ORDERPRIORITY CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT
     NULL
    COMPRESS (‘1-URGENT ‘,’2-HIGH ‘,’3-MEDIUM ‘,’4-NOT SPECIFIED’,
     ‘5-LOW ‘),
    O_CLERK CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
    O_SHIPPRIORITY INTEGER NOT NULL COMPRESS 0 ,
    O_COMMENT VARCHAR(79) CHARACTER SET LATIN CASESPECIFIC NOT NULL)
UNIQUE PRIMARY INDEX ( O_ORDERKEY )
INDEX ( O_ORDERDATE );

CREATE MULTISET TABLE CAB.Filled_Orders_New ,NO FALLBACK ,
    NO BEFORE JOURNAL,
    NO AFTER JOURNAL,
    CHECKSUM = DEFAULT
(
    O_ORDERKEY INTEGER NOT NULL,
    O_CUSTKEY INTEGER NOT NULL,
    O_ORDERSTATUS CHAR(1) CHARACTER SET LATIN CASESPECIFIC NOT NULL
    COMPRESS (‘F’,’O’,’P’),
    O_TOTALPRICE DECIMAL(15,2) NOT NULL,
    O_ORDERDATE DATE FORMAT ‘yyyy-mm-dd’ NOT NULL,
    O_ORDERPRIORITY CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT
    NULL
    COMPRESS (‘1-URGENT ‘,’2-HIGH ‘,’3-MEDIUM ‘,’4-NOT SPECIFIED’,
    ‘5-LOW ‘),
    O_CLERK CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
    O_SHIPPRIORITY INTEGER NOT NULL COMPRESS 0 ,
    O_COMMENT VARCHAR(79) CHARACTER SET LATIN CASESPECIFIC NOT NULL)
UNIQUE PRIMARY INDEX ( O_ORDERKEY )
INDEX ( O_ORDERDATE );

The next explain shows an INSERT/SELECT in which the source has compression defined differently compared with the target table. The table definitions follow the explain. The target table is empty.

EXPLAIN INSERT INTO Filled_Orders_New SELECT * FROM Filled_Orders; 

Explanation
1) First, we lock a distinct CAB.”pseudo table” for write on a RowHash to prevent
global deadlock for CAB.Filled_Orders_New.
2) Next, we lock a distinct CAB.”pseudo table” for read on a RowHash to prevent
global deadlock for CAB.Filled_Orders.
3) We lock CAB.Filled_Orders_New for write, and we lock CAB.Filled_Orders for read.
4) We do an all-AMPs RETRIEVE step from CAB.Filled_Orders by way of an all-rows scan
with no residual conditions into Spool 1 (all_amps) (compressed columns allowed), which is built locally on the AMPs. The input table will not be cached in mem­ory, but it is eligible for synchronized scanning. The result spool file will not be cached in memory. The size of Spool 1 is esti­mated with low confidence to be 45,028,360 rows (4,322,722,560 bytes). The estimated time for this step is 51.75 seconds.
5) We do an all-AMPs MERGE into CAB.Filled_Orders_New from Spool 1 (Last Use).
The size is estimated with low confidence to be 45,028,360 rows. The estimated time for this step is 2 minutes and 15 seconds.
6) We spoil the parser’s dictionary cache for the table.
7) Finally, we send out an END TRANSACTION step to all AMPs involved in processing
the request.
-> No rows are returned to the user as the result of statement 1.

CREATE MULTISET TABLE CAB.Filled_Orders_New ,NO FALLBACK,
    NO BEFORE JOURNAL,
    NO AFTER JOURNAL,
    CHECKSUM = DEFAULT
(
    O_ORDERKEY INTEGER NOT NULL,
    O_CUSTKEY INTEGER NOT NULL,
    O_ORDERSTATUS CHAR(1) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
    O_TOTALPRICE DECIMAL(15,2) NOT NULL,
    O_ORDERDATE DATE FORMAT ‘yyyy-mm-dd’ NOT NULL,
    O_ORDERPRIORITY CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT
     NULL
    COMPRESS (‘1-URGENT ‘,’2-HIGH ‘,’3-MEDIUM ‘,’4-NOT SPECIFIED’,
     ‘5-LOW ‘),
    O_CLERK CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
    O_SHIPPRIORITY INTEGER NOT NULL COMPRESS 0 ,
    O_COMMENT VARCHAR(79) CHARACTER SET LATIN CASESPECIFIC NOT NULL)
UNIQUE PRIMARY INDEX ( O_ORDERKEY )
INDEX ( O_ORDERDATE );
CREATE MULTISET TABLE CAB.Filled_Orders ,NO FALLBACK,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
O_ORDERKEY INTEGER NOT NULL,
O_CUSTKEY INTEGER NOT NULL,
O_ORDERSTATUS CHAR(1) CHARACTER SET LATIN CASESPECIFIC NOT NULL
COMPRESS (‘F’,’O’,’P’),
O_TOTALPRICE DECIMAL(15,2) NOT NULL,
O_ORDERDATE DATE FORMAT ‘yyyy-mm-dd’ NOT NULL,
O_ORDERPRIORITY CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT
NULL
COMPRESS (‘1-URGENT ‘,’2-HIGH ‘,’3-MEDIUM ‘,’4-NOT SPECIFIED’,
‘5-LOW ‘),
O_CLERK CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
O_SHIPPRIORITY INTEGER NOT NULL COMPRESS 0,
O_COMMENT VARCHAR(79) CHARACTER SET LATIN CASESPECIFIC NOT NULL)
UNIQUE PRIMARY INDEX ( O_ORDERKEY )
INDEX ( O_ORDERDATE );

In the above example, the O_Orderstatus column has compression in the Filled_Orders table (the source table), but not in the Filled_Orders_New table (the target table). That difference required the creation of spool 1 to reformat the data blocks.


Your Comment:
  
Your Rating:

Comments