Tech2Tech
Hands On
Extend the possibilities
Choosing the extensibility feature allows for database behavior change and customization.
by Lorenzo Danesi
Businesses are facing challenges with increased computing requirements and data volume. It is difficult to meet service level agreements (SLAs) with certain computing methods—particularly those that require a lot of maintenance, which leads to supportability challenges. Database extensibility can replace these methods and offer scalability and parallelism, producing better performance.
Extensibility
Database extensibility is a mechanism by which you, as the user or developer, can extend the functionality of the database. Database extensibility allows you to change the behavior of the database and customize it to fit your needs.
User-defined functions
Structured query language (SQL) provides a set of useful functions, but they might not satisfy every requirement you have while dealing with your data. User-defined functions (UDFs) allow you to extend SQL by writing functions in the C or C++ programming language, installing them on the database, and then using them like standard SQL functions.
The Teradata Database supports three types of UDFs:
- Scalar, allows one row of input columns and returns a single output column. Scalar UDFs do not maintain context memory, which means once a row is processed, the next row does not have any recollection of the previous row’s information or calculations. The following example executes the CreditLimit scalar function for each row. The CreditLimit function uses the CustSales information to calculate a credit limit for the customer. Every row is independently processed.
SELECT CustNo, CustName, CreditLimit(CustSales) FROM Customer;
- Aggregate, also allows one row of input columns and returns one output column, this type using interim memory to bring together groups of rows according to the GROUP BY statement. The GROUP BY statement is used by the aggregate function to group the result-set by one or more columns. The RegressionUplift aggregate function, as shown in the example below, groups all of the weekly sales rows together by customer. The UDF applies custom logic to each group, producing one aggregated outcome.
SELECT c.CustNo, c.CustName, RegressionUplift(s.WeeklySales)
FROM Customer c, Sales s
WHERE c.CustNo = s.CustNo
GROUP BY CustNo, CustName;
- Table, allows one row of input columns but returns a table row at a time in a loop. The table function is defined in the FROM clause of the SQL statement and is prefixed by the TABLE keyword. This type of UDF can hold context memory for the entire result-set. The SalesFcst function example uses sales information to calculate the sales forecast for a stock keeping unit (SKU).
SELECT sku, forecast FROM TABLE (SalesFcst(sales.amount, sales.rules,
sales.promo)) AS t1;
Teradata UDFs have many benefits. They execute in the database distributed on all AMPs, thus automatically inheriting the scalability and parallelism of the Teradata system. In addition, Teradata UDFs simplify SQL, which provides better maintenance and supportability. The ability to customize SQL offers flexibility and allows encapsulation of proprietary business logic. And because the UDFs execute in the database, they can take advantage of workload management.
Stored procedure
A stored procedure (SP) is a combination of SQL statements and control along with condition handling statements that provides an interface to the Teradata Database. An SP is written by a developer and is compiled as a database object, which is executed on the Teradata Database. Typically, an SP consists of a procedure name, input and output parameters, and a procedure body. Teradata Database extensibility offers two types of SP, both invoked by the SQL CALL statement:
- Stored procedure language (SPL) combines SQL, control and condition handling statements.
- External stored procedure (XSP) offers the C/C++ or Java programming language.
The benefits in using SPs include performance by reducing network traffic between the client and server. Supportability is much easier because logic is encapsulated and enforced on the server. Security is increased by restricting user access to procedures rather than requiring them to access data tables directly. SPs support transactions, allowing better control. SQL language statements are embedded in an SP to be executed on the server through one CALL statement, making the user interface (UI) tier less complex.
Processing methodologies
Extract, process and load/extract, transform and load (EPL/ETL) is a common methodology used for dealing with large volumes of data. Data is extracted out of the database using SQL-based extract utility and managed on one or more external application nodes. The data is then loaded back into the database using SQL-based load utility. This is done over a high-speed network, which introduces an extra, complex layer to support and maintain.
Database extensibility technology can easily replace this methodology and produce better performance. Executing in the database eliminates a lot of I/O by distributing the function on all AMPs. Each AMP works on its own slice of information, resulting in increased throughput. As data volume increases, adding nodes automatically scales, ensuring consistent SLA.
The combination of SPs and UDFs keeps the processing within the database. This allows the client to make only one call to the database for the entire processing requirement. For example, a sales forecasting scenario could look like this:
Prepare data
1. Insert summarized point-of-sale data into a temporary table.
2. Identify store locations to process.
Condition handling
3. If no locations are found, exit.
Main
4. Execute the sales forecasting table UDF including the temporary table of prepared data as a part of the input.
Post-process
5. SQL statement(s) to roll up forecasting UDF results by location and merchandise hierarchies.
Maintenance
6. Collect statistics.
7. Clear temporary table.

Click to enlarge
The figure shows the outcome of a performance characterization between an EPL/ETL application that was ported to an extensibility application as conducted on the same hardware and data.
The outcome from the performance characterization confirms a dramatic decrease in I/O and wall time. Central processing unit (CPU) usage can fluctuate depending on how CPU-intensive the function is. External application server CPU usage was not included in the result. There was also a significant performance improvement executing the UDFs in unprotected mode versus protected mode.
When to use extensibility
When business requirements become too complex and SQL cannot meet performance requirements, database extensibility should be leveraged to meet performance goals. Using Teradata Database extensibility features can reduce complexity and increase performance, but like any other tool, you must know when and how to use it. Developing best practices, including determining the type of UDF and SP to use, will help you make the decision and ultimately satisfy your requirements.
Lorenzo Danesi is the manager of Database Architecture for Teradata Business Application including Teradata Solution for Teradata Demand Chain Management Forecasting and Replenishment.