Loading...
Carrie Ballinger, senior technical consultant, Teradata Certified Master

Carrie Ballinger, senior technical consultant, Teradata Certified Master

Tech2Tech

Tech Support

Straight Talk

A Teradata Certified Master answers readers’ technical questions.

I admit it. I’m a sloucher. I’ve had bad posture most of my life. Then last year, a healthcare provider made me face my bad habit and convinced me that slouching negatively affected my health and comfort. Within a month, I had adapted to a new, more natural way of standing, walking and sitting. This health tip led to big improvements in how I look and feel.

Sometimes, small words of advice can have a large impact, especially when taken seriously. I hope you’ll seriously consider my responses to the questions in this article, as perhaps they will help you improve your own job skills. Just make sure you’re sitting up straight!

TPump Inserts with Fallback

Dear Carrie:

Could you explain how AMP worker tasks (AWTs) are used and held when a row is inserted into a table with fallback? We’re doing TPump loads into tables with fallback, and from time to time we run out of AWTs. During those times, our TPump throughput goes way down.

—Keep on Pumping

Dear Pumping:

When an insert is performed on a table that has fallback, a point-to-point message is spawned from the primary AMP to the fallback AMP. The fallback AMP then inserts the fallback copy. So the entire operation will require two AWTs, twice the number needed for an insert into a non-fallback table.

Initially, an AWT is acquired on the AMP that will hold the primary copy of the row. Only after the primary copy of the row has been successfully inserted will the message to the fallback AMP be spawned. The step that inserts the primary copy will hold on to its AWT and wait until the fallback copy of the row has been inserted. The primary AMP will free up its AWT only after the fallback activity is complete.

Because of this sequential approach, you can expect a small delay in releasing the primary copy’s AWT, even if there are no out-of-AWT conditions. This delay could be even longer if the message spawned to the fallback AMP has to wait for an AWT. (See figure 1.)

Figure 1: TPump Loads into Tables with Fallback

Click to enlarge

You also have a couple of options to better manage your TPump jobs. First, you can specify reserved AWTs and then expedite the TPump workload so it will draw from those reserves.

Second, you can use object and/or workload throttles to reduce the con­currency of the less critical work, which should allow the TPump inserts to continue to have good throughput.

Overview of TOP N

Dear Carrie:

We have several queries that are part of our Web appli­cation that use TOP 5000 to limit the rows that the Web tool returns to the screen. Would you expect such a query to run slower than the same query without the TOP 5000? The TOP 5000 query has an extra step in the Explain, but it’s run­ning in two seconds, while the standard query runs in 10.

—Top of My Game

Dear Game:

The TOP 5000 is faster because it reads only the first 5,000 rows then stops. Since there is no need to do random access when you use TOP N, the query can go to one AMP to do this read, if the number of rows selected is a small part of the total table. The TOP N feature will be more efficient than using the retlimit option because limiting the result happens inside the database with TOP N, while a retlimit has to build up the entire final spool file with the limit being applied to the rows to be returned.

Behind the scenes, TOP N is implemented as either a QUALIFY RANK or QUALIFY ROWNUMBER ordered analytic function. Within this function, the single-AMP optimization chooses an AMP based on session number, so each session will select different rows.

If there were a sort in your query, the TOP 5000 version might be longer, because it would have to scan the entire 52 million-row table, sort the rows, then do the stat function step to pick out the top 5,000 rows. So you can’t generalize that TOP N will always be faster, although it usually will be without ORDER BY in the query.

Even when ORDER BY is specified in the query, additional optimi­zations may be taking place internally. For example, selecting sorted TOP N rows may be performed without sorting the entire table, if the value of N is small (less than 1,000 rows). A running selection of the TOP N rows is maintained in memory while the table scan is taking place. The row lowest in the sort order is replaced with a more current row as the scan proceeds, if the more current row value is higher than the ones being held in memory.

Figure 2 illustrates different ways that TOP N may be executed. See “Different Options for TOP N” farther down the page for Explain text that illustrates some of these differences.

Figure 2: TOP N Functionality

Click to enlarge

How Long Utilities are Delayed

Dear Carrie:

We have utility throttles enabled. I’m trying to identify how many jobs got delayed by utility throttles, and for how long, from historical data. Delay time columns in the DBQLogTbl apply only to queries. Any ideas?

—Identifying Delays

Dear Delays

One approach is based on the LogTime and LogonTime columns in the DBC.LogOnOff view. The difference between those columns can be used to deduce whether a utility was delayed because of utility throttles.

In most cases the difference between these two times is very small (less than one second), so a somewhat higher difference (say, five or 10 seconds) could be used to identify a delay. This threshold may need to be adjusted for each site. One way to do that is to consider what the usual difference between those fields is for non-delayed utilities, and set the threshold somewhat greater than this norm.

The SQL I have used (see “SQL for Determining Load Utility Delay Times” farther down the page) joins LogOnOff with DBQLogTbl, based on match­ing session numbers. The query looks only at logon activity whose session has a valid LSN number in the Database Query Log (DBQL) (which identifies it as a utility session rather than a query).

This approach will work well in Teradata 12.0 and Teradata 13.0 because in those releases a utility job is subject to delay during the logon of the first session. But with Teradata 13.10, utilities will be considered for delay based on an internal SQL request. So you will want to discon­tinue this technique when you get to Teradata 13.10.

Enforcing a Spool Limit

Dear Carrie:

I had a table that was performing a large product join, and we decided to control it by lowering the spool allocation for that user while the query was running. It took longer than I expected to take effect. Does either spool limit or the enforce­ment of a change to a spool limit have to wait until the end of the currently running query step to have an impact?

—Spool Police

Dear Police:

Spool limits are allocated for each user either in the User record or in the user’s Profile. This limit provides a means to restrict how much spool space each user can utilize at any point in time. When a user’s query causes his spool allocation to be exceeded, the query responsible for reaching that limit is aborted.

Spool limits are useful because they can reduce the impact of poorly written queries on other active work, or they can keep users from being too adventuresome in what they ask of the database.

The number of bytes of spool that is currently being used by each user on each AMP is recorded in a cache in AMP memory. A column in that cache, CurrentSpool, is updated every time the spool’s size changes.

The comparison of the spool size against the limit set in the User/Profile occurs whenever the CurrentSpool column in the cache is updated. When this increase (or decrease) in spool size has been detected, the comparison is made against the limit, and the query is aborted if the limit is found to have been exceeded.

If you reduce the spool limit for a user while that user’s query is running, and that query has already exceeded the new spool limit before the change, that state will not be detected until the query performs an additional write to that spool. If there is a lot of reading going on, but it takes some number of seconds before any writes to spool are issued, then that’s how long you will have to wait for the query to be aborted.

It’s also possible that the Modify User statement might not execute instantaneously, as it needs an exclusive lock on the target database. However, once that change to the spool limit has taken place, it is not required that the query step be completed before the comparison to CurrentSpool is made, and the query is aborted. A change to the spool limit is effective immediately.

TDWMEstTotalTime Column

Dear Carrie:

I’m really digging deep into Teradata Active System Management. Can you tell me the difference between the DBQLogTbl TDWMEstTotalTime and total estimated time in the Explain text? I’m researching how some of our queries get classified to workloads, and I noticed those numbers are often different.

—Digging Deep

Dear Deep:

Teradata Dynamic Workload Manager adds up all of the step estimates to determine a total estimated processing time of a query. This total is compared against the value given in a workload classification parameter, if estimated time is used for this purpose. Sometimes these totaled estimates do not align perfectly with the total estimate you see at the bottom of an Explain. This is due to a difference in how parallel steps are considered. In my experience, this difference is generally slight.

In building the query plan, the optimizer determines that because some of the steps are done in parallel, the total estimated time in the plan should not include the time of each parallel step, but rather only one of them. However, the Teradata Dynamic Workload Manager definition of total estimated processing time, and the one used for query classification, is the total time across all steps, whether or not they run in parallel.

By the way, these new Teradata Dynamic Workload Manager col­umns, of which TDWMEstTotalTime is one, appear only if workloads are enabled. There is no benefit in calculating and logging these values unless classification to a workload is going to take place. And you won’t see them in QryLog view, only in TDWMQryLog.

See “Estimated Time Differences” farther down the page for an example that illus­trates the sometimes small discrepancy between these types of estimates.

Different Options for TOP N

The query below accesses all rows without the TOP N function. The estimated number of rows in the Explain reflect the size of the table in its entirety. Only one step is required to perform the database activity involved in such a full table scan.

EXPLAIN 
SELECT *
FROM Orders_Filled;
 
1) First, we lock a distinct CAB."pseudo table" for read on a RowHash to prevent
   global deadlock for CAB.Orders_Filled.

2) Next, we lock CAB.Orders_Filled for read.
3) We do an all-AMPs RETRIEVE step from CAB.Orders_Filled by way of an all-rows scan
   with no residual conditions into Spool 1 (group_amps), which is built locally on
   the AMPs. The input table will not be cached in memory, but it is eligible for
   synchronized scanning. The result spool file will not be cached in memory.
   The size of Spool 1 is estimated with high confidence to be 150,085,000 rows
   (13,357,565,000 bytes). The estimated time for this step is 34.62 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing
   the request. The contents of Spool 1 are sent back to the user as the result of
   statement 1. The total estimated time is 34.62 seconds.

This second query accesses the same table but now specifies TOP 5000. There are now two stat function steps in the plan replacing the single step in the previous plan that simply scanned the table. Notice in step 3 that the Explain text says one AMP will be selected to access those 5,000 rows. The estimated row count for that step is 5,000 rows, which is what TOP 5000 specified. Step 4 is executed only if the requested number of rows could not be retrieved from step 3.

EXPLAIN 
SELECT TOP 5000 *
FROM Orders_Filled;
 
1) First, we lock a distinct CAB."pseudo table" for read on a RowHash to prevent
   global deadlock for CAB.Orders_Filled.
2) Next, we lock CAB.Orders_Filled for read.
3) Then we do an all-AMPs STAT FUNCTION step from CAB.Orders_Filled by way of an all-
   rows scan with no residual conditions into Spool 5, which is redistributed by hash
   code to all AMPs. The result rows are put into Spool 1 (group_amps), which is built
   locally on the AMPs. This step is used to retrieve the TOP 5000 rows. One AMP is
   randomly selected to retrieve 5000 rows. If this step retrieves less than 5000 rows,
   then execute step 4. The size is estimated with high confidence to be 5,000 rows
   (445,000 bytes).
4) After that we do an all-AMPs STAT FUNCTION step from CAB.Orders_Filled by way of
   an all-rows scan with no residual conditions into Spool 5 (Last Use), which is
   redistributed by hash code to all AMPs. The result rows are put into Spool 1
   (group_amps), which is built locally on the AMPs. This step is used to retrieve the
   TOP 5000 rows. The size is estimated with high confidence to be 5,000 rows (445,000
   bytes).
5) Finally, we send out an END TRANSACTION step to all AMPs involved in processing
   the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1.

Now, with this third query, we will show what happens if TOP N requests more than 5,000 rows. In this case, the query requests the top 50,000 rows. Notice that in step 3 that “load distribution optimization” is used, instead of single-AMP access, to get those 50,000 rows. This is because for higher numbers of rows it becomes more efficient to spread the work across all AMPs.

EXPLAIN 
SELECT TOP 50000 *
FROM Orders_Filled;
 
1) First, we lock a distinct CAB."pseudo table" for read on a RowHash to prevent
   global deadlock for CAB.Orders_Filled.
2) Next, we lock CAB.Orders_Filled for read.
3) Then we do an all-AMPs STAT FUNCTION step from CAB.Orders_Filled by way of an all-
   rows scan with no residual conditions into Spool 5, which is redistributed by hash
   code to all AMPs. The result rows are put into Spool 1 (group_amps), which is built
   locally on the AMPs. This step is used to retrieve the TOP 50000 rows. Load
   distribution optimization is used. If this step retrieves less than 50000 rows, then
   execute step 4. The size is estimated with high confidence to be 50,000 rows
   (4,450,000 bytes).
4) After that we do an all-AMPs STAT FUNCTION step from CAB.Orders_Filled by way of
   an all-rows scan with no residual conditions into Spool 5 (Last Use), which is
   redistributed by hash code to all AMPs. The result rows are put into Spool 1
   (group_amps), which is built locally on the AMPs. This step is used to retrieve the
   TOP 50000 rows. The size is estimated with high confidence to be 50,000 rows
   (4,450,000 bytes).
5) Finally, we send out an END TRANSACTION step to all AMPs involved in processing
   the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1.

SQL for Determining Load Utility Delay Times

Here is SQL for determining the number and the time delayed for utility jobs. In this example, the threshold for determining what is a delay is 10 seconds. You may want to alter that threshold, based on specifics at your site.

SELECT DISTINCT 
    L.SessionNo, L.LogonDate, CAST(L.LogTime AS Time ), CAST(L.LogonTime AS time),
        L.username(char(15)),
((extract(HOUR FROM L.LogTime ) * 3600)
+(extract(MINUTE FROM L.Logtime ) * 0060)
+(extract(SECOND FROM L.LogTime ) * 0001))
-((extract(HOUR FROM L.LogonTime) * 3600)
+(extract(MINUTE FROM L.LogonTime) * 0060)
+(extract(SECOND FROM L.LogonTime) * 0001))
(dec(8,2)) AS SesDelayTime, L.Event,
     Q.LSN
FROM dbc.logonoff L, dbc.dbqlogtbl Q
WHERE L.Event = 'Logon'
    AND (L.LogTime - L.LogonTime > 10.0) /* logon time exceeds threshold */
    AND L.LogonDate >= DATE '2010-03-23' /* from date */
    AND L.LogonDate <= DATE '2010-03-24' /* to date */
    AND (Q.LSN <> 0)
    AND (Q.SessionID = L.SessionNo)
ORDER by 2,4;

 
SessionNo LogonDate LogTime LogonTime UserName SesDelayTime Event LSN

1,054

3/23/2010 07:28:01 07:27:29 DBC 31.85 Logon 1,021
1,145 3/23/2010 10:58:06 10:57:32 DBC 33.86 Logon 680
1,145 3/23/2010 10:58:06 10:57:32 DBC 33.86 Logon 740
2,212 3/23/2010 23:24:42 22:47:23 CRASHDUMPS 2,238.57 Logon 835
1,003 3/23/2010 23:24:42 22:57:29 DBC 1,632.66 Logon 705



Estimated Time Differences

When the optimizer calculates the total estimated time for the plan, which appears at the bottom of the Explain, only the estimated time of the longer of the two parallel steps is added in. In the example below, there is an Explain with two parallel steps, each using 0.02 seconds, plus a few other steps. At the bottom of the Explain, you can see a total of 0.14 seconds for all steps. Only one of the parallel steps is included in this total. The Teradata Dynamic Workload Manager total, as it appears in the Database Query Log (DBQL) column TDWMEstTotalTime, is 0.16, which adds in the 0.02 seconds from the parallel step that was omitted in the Explain total. You can see the DBQL output further below.

SELECT explaintext 
FROM DBQLExplainTbl
WHERE QueryID=110940;
 
1) First, we lock a distinct dbcmngr."pseudo table" for read on a RowHash to prevent
   global deadlock for dbcmngr.alertcontrol.
2) Next, we lock dbcmngr.alertcontrol for read.
3) Then we execute the following steps in parallel:

   1) We do a single-AMP RETRIEVE step from dbcmngr.alertcontrol by way of the
      unique primary index into Spool 2 (all_amps), which is duplicated on
      all AMPs. Exit this retrieve step after the first row is found.The size
      of Spool 2 is estimated with high confidence to be two rows. The
      estimated time for this step is 0.02 seconds.

   2) We do a single-AMP RETRIEVE step from dbcmngr.alertcontrol by way of the
      unique primary index with no residual conditions into Spool 3 (all_amps),
      which is built locally on that AMP. The size of Spool 3 is estimated with
      high confidence to be one row. The estimated time for this step is 0.02
      seconds.

4) We do an all-AMPs JOIN step from dbcmngr.alertcontrol by way of an all-rows scan,
   which is joined to Spool 2 (Last Use) by way of an all-rows scan.
   dbcmngr.alertcontrol and Spool 2 are joined using a Not Exists join, with a join
   condition of ("(1=1)"). The result goes into Spool 3 (all_amps), which is built
   locally on the AMPs. The size of Spool 3 is estimated with no confidence to be 2
   rows. The estimated time for this step is 0.03 seconds.
5) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows
   scan into Spool 1 (group_amps), which is redistributed by hash code to all AMPs. The
   size of Spool 1 is estimated with no confidence to be two rows. The estimated time
   for this step is 0.04 seconds.
6) We do an all-AMPs RETRIEVE step from dbcmngr.alertcontrol by way of an ll-rows
   scan with no residual conditions into Spool 1 (group_amps), which is redistributed
   by hash code to all AMPs. The size of Spool 1 is estimated with low confidence
   to be four rows. The estimated time for this step is 0.03 seconds.
7) We do an all-AMPs RETRIEVE step from dbcmngr.alertcontrol by way of an all-rows
   scan with no residual conditions into Spool 1 (group_amps), which is redistributed
   by hash code to all AMPs. Then we do a SORT to order Spool 1 by the sort key in
   spool field 1 eliminating duplicate rows. The size of Spool 1 is estimated with low
   confidence to be three rows. The estimated time for this step is 0.02 seconds.
8) Finally, we send out an END TRANSACTION step to all AMPs involved in processing
   the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The
   total estimated time is 0.14 seconds.

Below is the output from the view QryLogTDWM. You must use this view to see the new TDWM columns contained in the DBQLogTbl, rather than the QryLog view.

SELECT * 
FROM QryLogTDWM
WHERE queryid=110940;
 
*** Query completed. One row found. 21 columns returned.
*** Total elapsed time was 1 second.
 
ProcID                 16383
CollectTimeStamp       2005-07-14 10:10:37
QueryID                110940
UserID                 00000100
UserName               DBC
DefaultDatabase        DBC
AcctString             DBC
DelayTime              ?
WDID                   73
WDPeriodID             8
LSN                    ?
NoClassification
WDOverride
SLGMet                 T
ExceptionValue         ?
FinalWDID              73
TDWMEstMaxRows         4
TDWMEstLastRows        3
TDWMesTotalTime        .164438
TDWMAllAmpFlag         T
TDWMConfLevelUsed      N

The reason that the TDWMEstTotalTime column produced by the QryLogTDWM view has multiple decimal places is that milliseconds are used internally in building up the estimate. In fact, milliseconds are reported in the base DBQLogTbl table in the TDWMEstTotalTime column. However, in the QryLogTDWM view, used in the example above, the number of milliseconds is divided by 1,000 as part of the view definition.


Your Comment:
  
Your Rating:

Comments
 
can any one please tell me where should i place my technical questions

1/6/2011 6:27:56 AM
— Anonymous
 
You are absolutely correct! I appreciate your catching this and letting us know. It's fixed now. Thanks so very much, -Carrie

9/16/2010 6:03:10 PM
— Anonymous
 
Carrie I think there is a typo in Figure 1 of the Tpump article ---- step 7 should be AMP0 not AMP2

8/26/2010 12:28:42 PM
— Anonymous