Loading...
Carrie Ballinger, senior technical consultant,
Teradata Certified Master

Carrie Ballinger, senior technical consultant, Teradata Certified Master

Tech2Tech

Tech Support

Simple Pleasures

A Teradata Certified Master provides solutions to technical challenges.

“Simplicity is the ultimate sophistication.” I’m reading the quotation attributed to Leonardo Da Vinci that is hanging on my cubicle wall. Those words have always resonated with me. Simple approaches are easier to understand, faster to implement and deliver broader satisfaction.

Doing things the simple way is also more comfortable, less stressful and a lot easier to change or build upon if you need to. So enjoy these questions that I’ve received and answered as the simple pleasures they are intended to be.

AMP-less Nodes and Message Queue Length

Dear Carrie:

I am working on a system that has 25 nodes total. Twenty-one have AMPs, and four are parsing engines (PEs) only with the mainframe channel connec­tions. Since I have 21 with AMPs, and I know that the maximum message queue length per work type is calculated as the number of nodes plus five, is that maximum going to be 21 + 5 = 26? Or will all nodes, with or without AMPs, be considered, which would make the maximum 25 + 5 = 30?

—Queue Quizzer

Dear Quizzer:

Each AMP has a queue that temporar­ily stores messages that require, but are unable to get, an AMP worker task (AWT) when the messages arrive on the AMP. On systems with fewer than 16 nodes, the message queue has a limit of 20 for the number of messages of each work type that it will hold. On a large system, the message queue length (or depth) limit equals the number of nodes, plus five, for each work type.

An enhancement made in Teradata 12.0 accounts for nodes with no AMPs. From that release on, only nodes that have AMPs are included in the maximum message queue depth calculation.

This is an approximation of the current formula used to determine the maximum number of messages that can be contained within an AMP’s message queue for each work type: Message queue length limit = MAX (20, 5 + number of AMP-full non-HSN nodes). As you can see, both AMP-less nodes and hot standby nodes are excluded from the calcula­tion. Figure 1 makes this point based on the situation that you described.

Figure 1: Calculating the Message Queue Length on an AMP

Click to enlarge

For more information on what happens when the message queue reaches its limit, see my blog “Controlling the Flow of Work in Teradata” on Developer.Teradata.com.

Maximum Data Block Size

Dear Carrie:

In your Orange Book “Compression in Teradata 13.10,” you recommend setting the data block size to the maximum of 127.5KB or 255 sectors for tables that are going to be compressed. I’ve been told that with a 255-sector data block, a cylinder can only hold 14 blocks, but it can hold 15 with a data block size of 254 sectors. I’m concerned about wast­ing disk space if I go with the largest possible data block size, as you suggest.

—Block Buster

Dear Buster:

Your point is generally correct, but it applies only to uncompressed data blocks as they are written to a cylinder. With block-level compressed data blocks, the recommendation is to set the parameter to the largest possible size just before the table is compressed. The expectation is that the data blocks will be smaller at the time they are written to the cylinder.

During the compression process, a temporary block is built in memory that reflects the specified maximum block size.

As one block is built, it is immediately compressed before being placed into a cylinder alongside other already-compressed blocks. When the compressed data blocks are placed in the cylinder, they are usually smaller than the 255 sectors specified as the maximum uncompressed block size.

What Is a Sector?

The Teradata file system allocates space in sectors. Each sector is equivalent to 512 bytes (by default). In a given data block, no more than 256 bytes, on average, will ever be wasted because of this sector-based allocation. The minimum data block size is 9KB or 18 sectors, while the largest is 127.5KB or 255 sectors.

Additional header data is added to each compressed block, and none of the block headers and trailers is ever compressed. One reason maximum data block size is recom­mended is to maximize the percent of user data versus block overhead being allocated. This improves your compression ratio.

A standard cylinder consists of 3,872 sectors. Forty-eight sectors are set aside for the two-cylinder indexes supporting that cylinder, leaving 3,824 sectors available for data blocks.

This is the cylinder utilization if you define 254 sectors as your maximum data block size:

  • 3824÷ 254 = 15.055 or 15 blocks
  • 15 × 254 = 3,810 sectors of a cylinder being utilized, or 99.6%

Cylinder utilization is slightly less for data blocks not undergoing compression if you define 255 sectors as your maximum data block size:

  • 3824 ÷ 255 = 14.996 or 14 blocks
  • 14 × 255 = 3,570 sectors of a cylinder being utilized, or only 93.4%
Table: Cylinder Utilization

Click to enlarge

Figure 2: Cylinder Utilization

Click to enlarge

The table below shows another way to look at these numbers. With block-level compres­sion, the data blocks are compressed before being written to disk, and the final block size is not known ahead of time.

Look at figure 2 for an illustration of the point you were making.

Changing Partitioning Column Values

Dear Carrie:

What will happen if we run a massive update against a large partitioned pri­mary index (PPI) table and we update a date column that is used in the PPI partitioning definition? What would be the overall impact on performance?

—Pondering Partitioning

Dear Pondering:

I don’t encourage you to change partition­ing column values, although the database will allow you to do so. If you do, that action will be treated as an unreasonable update. All the rows to be changed would have to be written to the transient journal. In addition, the changes will be written to a spool file and then deleted from the base table. This will be followed by a sort of the spool, and finally, the rows will be re-inserted into the base table from the spool.

Updating the partitioning column can cause the location of a row to change because it will have to be moved to a new partition, so changing that date value is considered an unreasonable update. This is true even if the partitioning column is not one of the columns with the primary index (PI) definition. If partitioning columns are updated without updating any PI columns, it is still an unreasonable update and each affected row must be deleted from one partition, then reinserted into another, even though the row will remain on the same AMP.

If there is a need to update partitioning column values, a better choice is to create a new table, do an insert into that table with the changed data, then drop the old table, rename the new table to the desired name and grant privileges as needed.

Being Unreasonable

An unreasonable update is one that causes the location of the row to change and is most commonly thought of as an update that changes the pri­mary index value. This type of update involves a lot of extra work compared to a simple update. This is because the primary index value of a row deter­mines the physical location of that row on disk. When you change a primary index value, it requires removing the row from its current location and then inserting it into a new location, usually on a different AMP.

Passing Query Bands in Macros

Dear Carrie:

I am trying to pass two input parameters into my macro and use them for setting a query band for the transaction that executes within the macro. One param­eter is for the query band name, and the other is for the value. However, I am get­ting a syntax error. I’m on Teradata 13.0. Any idea what I am doing wrong?

—Band Breaker

New Ways to Utilize Query Bands in Teradata 13.0

  • With the Trusted Sessions feature, the middle-tier application can set the active session identity and role for the end user using query bands.
  • Block-level compression uses them to indicate whether a table should be compressed as it is loaded.
  • Teradata Virtual Storage uses TVSTemperature query bands to indicate the preferred initial tem­perature of the data being loaded.
  • Utility throttle enhancements allow for reserved query band names to be set by the client utility, passing information such as utility name and data set size that can help the database select the appropriate number of AMP sessions for the job.

Dear Breaker:

Unfortunately, you are attempting to do something that is not supported. You can indeed set a query band from within a macro and pass it as a parameter. This is a new feature in Teradata 13.0. However, you must send the entire string—the name and the value together—in a single parameter. You received an error because you broke it into two pieces.

This is the macro as originally defined, passing the Query Band name and value as separate parameters. This statement will not parse:

REPLACE MACRO MyMacro (Input_Field1 Decimal (12,0) 
, Input_Field2 Char(20)
)
AS
(
SET QUERY_BAND ='''First_Input='||:Input_Field1||'Second_Input='||:Input_Field2||';'''
FOR TRANSACTION;
SELECT MyColumns
FROM MyTable1;
INSERT INTO MyTable2;);

This is an example of what a macro that is passing a Query Band should look like. This syntax parses and executes:

REPLACE MACRO MyMacro (QBAND VARCHAR(2000))
AS
(
SET QUERY_BAND= :QBAND FOR TRANSACTION;
SELECT MyColumns FROM MyTable1;);










EXEC MyMacro('ClientUser=WCS47;Group=Sales;' );

*** Set QUERY_BAND accepted.
*** Total elapsed time was 1 second.

*** Query completed. One row found. One column returned.
  mycolumns
-----------       
        
100


Your Comment:
  
Your Rating:

Comments
 
Carrie it is always a pleasure to read your articles. I've learned so much from you. Thanks.

2/25/2012 1:50:03 AM
— Anonymous
 
The following that pertains to Teradata is part of a letter we are sending out to gather support for an Occupy Dental Office protest in Las Vegas. ...We further clarify the cause of the Rush Limbaugh meanness in today's America with the story on our website of a sadistic fraternity brother I knew (now a senior manager at Teradata) who got down on his knees to suck for his position when he was young and never got off them. Michael is your typical power mad corporate conservative, closet homosexual bully who gives a clear picture of what Newt Gingrich and his ilk are all about. You have to have known one of these jerks on the right close up to make sense of their vicious idiocy. The key to these double-talking, ever sneaky robots is their inability to get excited about anything but the cruelty they impose on others that partially relieves their humiliation in life from having had to submit to get power in the slave colony America horror movie all of us are forced to live in or run like hell from. We

12/10/2011 12:41:40 PM
— Anonymous
 
please forgive me if it is not the right place to post this type of questions if so please let me know where i can post my questions. which is the best option for the question What is a use of the Primary Index in the Teradata Database ? 1.)To support efficient data access 2.)To ensure enough AMPs are involved to reduce query processing time

9/5/2011 8:30:34 AM
— Anonymous