Teradata 13.10 heals the rift in the space-time data continuum.

Tech2Tech

Ask the Experts

Where No Release Has Gone Before

Every new release of the Teradata Database offers improved functionality and performance enhancements. But some releases go beyond form and function to deliver game-changing capabilities that promise greater enterprise agility.

That’s certainly the case with Teradata 13.10, affectionately referred to by Teradata insiders as the "space-time" release. With temporal and geospatial processing front and center, this new release gives organizations the ability to better analyze geographic and datetime data in context with other reference data, such as customer records and inventory.

But that’s not all. Teradata 13.10 offers new data compression options, improved workload management, increased availability and better performance. Todd Walter, Chief Technologist, Americas, sat down with Teradata Magazine to answer some questions and explain the most significant enhancements in this release.

What innovations are in the new Teradata 13.10 release?

The ability to access and analyze temporal data is a marquee capability, representing a fundamental innovation in data warehousing technology. The idea behind the use of temporal data is to better understand the time dimension of data that exists within the warehouse. Many companies already do this through "effective date" processing, but that method is complex, and some special skills are required to do it correctly, efficiently and effectively. The optional temporal implementation in Teradata 13.10 makes it easier and, therefore, more practical.

Do I have to learn a whole new area of SQL semantics or write very complex queries to take advantage of the temporal nature of my data?

This release takes away a lot of complexity from the end user. You don’t have to understand a lot of detail about dates or know an entire time series of data or even understand what data represents the current state of the organization. Instead, you have to know only what question you need answered, like "Who are my current customers?" or "Who were my customers on a particular date in the past?" or "How has this particular customer changed over the past five years?"

Those types of questions can be difficult to answer using effective dates—you’d have to be a SQL expert to write such a query. And those questions are nearly impossible to answer if you don’t have effective dates. Temporal processing means any user can simply ask the question and get the answer needed.
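To make the contrast concrete, here is a minimal Python sketch of the bookkeeping that effective-date processing pushes onto the person writing the query. It is not Teradata syntax. The `CustomerRow` layout, the `customers_as_of` helper and the sample data are all hypothetical, chosen only to illustrate the kind of "as of" question described above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


# Hypothetical effective-dated row; this is not Teradata's storage format.
@dataclass(frozen=True)
class CustomerRow:
    customer_id: int
    status: str
    valid_from: date           # first day the row is valid (inclusive)
    valid_to: Optional[date]   # first day it is no longer valid; None = open-ended


# Made-up sample history, one row per customer for brevity.
HISTORY = [
    CustomerRow(1, "active", date(2005, 1, 1), None),
    CustomerRow(2, "active", date(2006, 3, 1), date(2009, 7, 1)),
    CustomerRow(3, "active", date(2010, 2, 1), None),
]


def customers_as_of(rows: List[CustomerRow], as_of: date) -> List[int]:
    """Return the ids of customers whose validity period contains as_of."""
    return sorted(
        row.customer_id
        for row in rows
        if row.valid_from <= as_of
        and (row.valid_to is None or as_of < row.valid_to)
    )
```

With this sample data, `customers_as_of(HISTORY, date(2008, 1, 1))` returns `[1, 2]`, and the same call for `date(2010, 6, 1)` returns `[1, 3]`. Every query against effective-dated tables has to repeat a period test like the one in `customers_as_of`; the point made in the interview is that the temporal option takes that burden off the user.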

Trying to capture this changing data sometimes highlights extract, transform and load (ETL) complexities and data quality issues. How does Teradata 13.10 address these concerns?

It does so by allowing automatic processing of bi-temporal updates. So it’s a similar story here; it would take a real specialist in effective date processing to be able to do even simple updates and loads, and if you have two different dates—the transaction date and the date it’s actually valid—it becomes extraordinarily complex and consumes a lot of time and resources. With the temporal processing functionality in Teradata 13.10, all the user has to do is make the SQL update; the system takes care of all other table and row updates automatically.
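The following Python sketch shows why a single logical update fans out into several row changes once both dates are tracked. It is an illustration under simplifying assumptions, not Teradata's implementation: the `BiTemporalRow` layout and the `apply_update` and `lookup` helpers are hypothetical, periods are half-open, and an update is assumed to apply from its effective date onward.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


# Hypothetical bi-temporal row: a valid period plus a transaction period.
@dataclass(frozen=True)
class BiTemporalRow:
    key: int
    value: str
    valid_from: date
    valid_to: Optional[date]   # None = valid until further notice
    tx_from: date              # when the database learned this version
    tx_to: Optional[date]      # None = still the current belief


def apply_update(rows: List[BiTemporalRow], key: int, new_value: str,
                 effective: date, recorded: date) -> List[BiTemporalRow]:
    """Record on `recorded` that `key` has `new_value` from `effective` onward."""
    result = []
    for row in rows:
        is_current = row.key == key and row.tx_to is None
        overlaps = row.valid_to is None or effective < row.valid_to
        if not (is_current and overlaps):
            result.append(row)
            continue
        # Keep the superseded version, closed in transaction time.
        result.append(replace(row, tx_to=recorded))
        # Re-assert the part of the old value that is still true.
        if row.valid_from < effective:
            result.append(replace(row, valid_to=effective, tx_from=recorded))
    # Add the new value, valid from the effective date onward.
    result.append(BiTemporalRow(key, new_value, effective, None, recorded, None))
    return result


def lookup(rows: List[BiTemporalRow], key: int,
           valid_on: date, known_on: date) -> Optional[str]:
    """What did we believe on `known_on` about the value on `valid_on`?"""
    for row in rows:
        if (row.key == key
                and row.valid_from <= valid_on
                and (row.valid_to is None or valid_on < row.valid_to)
                and row.tx_from <= known_on
                and (row.tx_to is None or known_on < row.tx_to)):
            return row.value
    return None
```

Starting from one open-ended row, a single call to `apply_update` leaves three rows behind: the original version closed in transaction time, the old value trimmed to end at the effective date, and the new value. That fan-out is the work the interview describes the system doing automatically after the user issues one SQL update.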

Top 8 Features and Options in Teradata 13.10

  1. Temporal option
  2. Geospatial enrichments
  3. Compression
    • Multi-value compression of VARCHAR columns
    • Algorithmic compression
    • Block level compression
  4. Performance
    • Partitioned Primary Index (PPI) character and timestamp column support
    • FastExport without spool
    • Merge data blocks during full table modify operations
  5. Workload management
    • Teradata Active System Management moves to Teradata Viewpoint portlets
    • Teradata Active System Management manages utilities
  6. Quality
    • Fault isolation
    • Read from fallback
    • Automatic cylinder packing
  7. Extensibility
    • User-defined SQL operators
  8. Other enhancements
    • Large cylinder support
    • New hashing function
    • Security
    • Built-in arithmetic calculations and byte/bit manipulation
    • And more

What other major new features are in Teradata 13.10?

Geospatial data processing has been significantly improved. People are beginning to recognize that everything represented in the data warehouse has a location and that the location needs to be part of the overall analysis. For example, you can determine the location of a customer relative to the location of a store or to a cell tower coverage area. The spatial question doesn’t stand alone, either; it needs to be answered in the context of everything else you know about the customer—whether he is a high-value customer or she has a history of service issues.

Interestingly, the data already exists, and most customers know, organizationally speaking, that it’s there. But too often it’s stored in some separate system isolated from everything else. For example, the system might have customer demographic data, which you can use when you’re deciding whether to open a new store, but you can’t analyze that same information in context to determine whether those customers are high-value and can support a new store. Conversely, the data warehouse might have an address, but the system doesn’t know anything but a string of characters if it isn’t mapped or translated into distance between points A and B, and it’s not connected to customer purchases.

Enhanced geospatial data processing capabilities add another dimension to the questions and answers that are already available within the data warehouse.
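As a rough illustration of answering a spatial question in context, the Python sketch below combines a distance test with a customer attribute. It does not use Teradata's geospatial types or functions. The haversine great-circle formula stands in for whatever distance calculation the database performs, and the customer records, coordinates and `high_value` flag are invented for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical customer records: location plus a business attribute.
CUSTOMERS = [
    {"id": 1, "lat": 32.7157, "lon": -117.1611, "high_value": True},
    {"id": 2, "lat": 32.7500, "lon": -117.1300, "high_value": False},
    {"id": 3, "lat": 34.0522, "lon": -118.2437, "high_value": True},
]


def high_value_customers_near(customers, site_lat, site_lon, radius_km):
    """Ids of high-value customers within radius_km of a candidate store site."""
    return [
        c["id"]
        for c in customers
        if c["high_value"]
        and haversine_km(c["lat"], c["lon"], site_lat, site_lon) <= radius_km
    ]
```

For a candidate site at latitude 32.72, longitude -117.16 and a 25 km radius, `high_value_customers_near` returns `[1]`: customer 2 is nearby but not high-value, and customer 3 is high-value but too far away. Neither the location nor the customer value answers the question alone.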

How does the system cost-effectively manage the additional data that today’s business needs require?

The short, and obvious, answer is data compression. CPU power has been growing exponentially for a while now, so those leaps in power are becoming huge. At the same time, I/O capability hasn’t increased at even remotely the same rate, so I/O rates are falling way behind available CPU power. We’ve gotten to the point where CPUs have to wait for I/O, and that means we can no longer use all of the CPU power for processing applications. Compression takes advantage of available CPU power to reduce the size of the data and thereby reduce the amount of I/O necessary to read it.

For a long time, we’ve offered multi-value compression, which is great because it doesn’t require any additional CPU power. The tradeoff, however, is that multi-value compression doesn’t compress as much as some other methods. With this release, we added multi-value compression of VARCHAR columns, which is something users have requested, but we also added two new types of compression:

  • Algorithmic compression allows people to apply a specific algorithm to a column within a table. One algorithm built into Teradata 13.10 makes this method particularly effective for long character string columns. A second built-in algorithm makes it really effective for global companies that have to use Unicode storage, which requires twice as much space. When much of the data is traditional Latin characters, you see lots of empty space. This type of compression fixes that. Customers can also install additional compression algorithms tailored to company- or industry-specific data.
  • Block level compression operates at a low level and takes all of the data in a storage block and compresses it before storage, then decompresses the entire block before the data is used again. This costs more in terms of CPU time, but it offers a greater reduction in data size and, therefore, it reduces the total I/O requirement, which ultimately can improve performance.
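The three ideas above can be sketched in a few lines of Python. None of this reflects Teradata's actual algorithms or storage layout: the helper names are hypothetical, UTF-8 is used only as a stand-in for a compact encoding of Latin text, and `zlib` is used only as a stand-in for a block compressor.

```python
import zlib


def multi_value_encode(values, frequent):
    """Replace frequent values with small integer codes; others stay inline."""
    codes = {v: i for i, v in enumerate(frequent)}
    return [("code", codes[v]) if v in codes else ("raw", v) for v in values]


def multi_value_decode(encoded, frequent):
    """Reverse multi_value_encode with a plain table lookup."""
    return [frequent[x] if kind == "code" else x for kind, x in encoded]


def unicode_sizes(text):
    """Bytes needed for text as 2-byte Unicode (UTF-16) versus UTF-8."""
    return len(text.encode("utf-16-le")), len(text.encode("utf-8"))


def compress_block(rows):
    """Compress a whole block of rows at once (rows must not contain newlines)."""
    return zlib.compress("\n".join(rows).encode("utf-8"))


def decompress_block(blob):
    """Decompress the entire block before any row in it can be used."""
    return zlib.decompress(blob).decode("utf-8").split("\n")
```

Decoding a multi-value code is a table lookup, which fits the remark that this method needs essentially no extra CPU. `unicode_sizes("Teradata")` returns `(16, 8)`, the "twice as much space" case for Latin characters. `compress_block` and `decompress_block` spend CPU on every write and read of the block, which is the tradeoff described for block level compression.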

Many organizations store a lot of historical data because of legal or compliance requirements, or because they have some interesting business problems that could be solved by looking at historical detail. Using compression means that data can be stored on fewer devices or disks, saving money on data storage. DBAs can now make informed choices when balancing performance with storage and CPU time.

What performance enhancements are included in Teradata 13.10?

Teradata works on performance features for every new release. In Teradata 13.10, we’ve added some Partitioned Primary Index (PPI) improvements that allow character columns and timestamp columns to serve as partitioning columns. This release also allows users to manage a current partition by moving a current date on a PPI table.
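The benefit of partitioning on a timestamp column can be illustrated with a small Python sketch. It is a simplified model, not how Teradata stores PPI tables: it ignores the distribution of rows across AMPs, and the monthly partitioning scheme, the `sold_at` column and the sample rows are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def partition_key(ts):
    """Map a timestamp to its (year, month) partition."""
    return (ts.year, ts.month)


def build_partitions(rows):
    """Group rows by the partition that their timestamp column falls into."""
    parts = defaultdict(list)
    for row in rows:
        parts[partition_key(row["sold_at"])].append(row)
    return dict(parts)


def rows_in_partition(parts, year, month):
    """Read only the one partition a query needs, skipping all the others."""
    return parts.get((year, month), [])


# Made-up sample rows with a timestamp partitioning column.
SALES = [
    {"id": 1, "sold_at": datetime(2010, 1, 15, 9, 30)},
    {"id": 2, "sold_at": datetime(2010, 1, 28, 17, 5)},
    {"id": 3, "sold_at": datetime(2010, 2, 3, 11, 45)},
]
```

A query for February 2010 touches only the `(2010, 2)` partition and never reads the January rows. Allowing character and timestamp columns as partitioning columns extends that kind of partition elimination to more tables.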

We’ve also added a new mode for FastExport. When exporting a larger table, you can now experience much better performance by extracting the data without having to spool or pre-process it. This uses fewer system resources and so offers a significant performance improvement. Three client products take advantage of this enhancement: Teradata Parallel Transporter, FastExport and JDBC Driver.

Another enhancement is that the system can now merge data blocks automatically during full-table modify operations. Before this release, a system administrator or DBA would have had to recognize the situation, make a decision to do something about it and run that operation. Now it’s done automatically.

The functional improvements enabled by this release will probably lead to greater use of the data warehouse. What about workload management?

As workloads continue to become more and more complex, we’ll continue to introduce new ways to manage workload complexity. In Teradata 13.10, we’ve enhanced Teradata Active System Management so that utility management has access to all of the options available to query management; that is, you’ll be able to classify utilities, throttle them, filter them, decide which users can run them at what time of day, how many can be run by a particular user or application—just as you can with queries.

At the same time, we moved session management into the database, increased the maximum number of workload definitions from 40 to 250 and moved all of workload management and monitoring into Teradata Viewpoint portlets. These changes provide greater control and flexibility, and they offer a more intuitive and efficient user interface for workload management.

As workloads increase, so do availability requirements. What does Teradata 13.10 offer in this area?

For several releases, we’ve been addressing fault isolation to minimize the impact of internal faults or data issues. The new model is to identify which session, user or group of users is affected by the fault and repair the problem in a way that affects only those users instead of shutting down and restarting the entire system. Teradata 13.10 has several fault isolation features, including AMP, expression evaluation subsystem (EVL), unprotected user-defined function (UDF) and dictionary cache.

If data in the file storage for the database is damaged in some way (e.g., a disk starts reading a sector or track badly), then we can read data from the Fallback copy to complete the query without aborting the query or initiating a reset. Perhaps applications run a little slower while the Fallback data is used, but we avoid a total shutdown and allow repair of the primary data table to be scheduled.

We added Auto Cylinder Packing for more efficient system operation and less obtrusive maintenance. In the past, if the system was running tight on space, the DBA had to run regular pack disk operations. Today it’s an automatic function that happens behind the scenes.

What’s new in the area of extensibility?

In this release, we offer new extensibility capabilities with user-defined SQL operators, which round out the ways companies can bring their own functionality into the database. This enhancement allows DBAs and application developers to take a complex SQL expression and package it in a function that is available to their users. Now, users don’t have to know that complex expression or reproduce it in their queries. They can just use the name of the operator written and installed by the DBA or application developer.

Teradata 13.10 makes it possible to code user-defined ordered analytics, in which the person writing the function can have access to an ordered row set, just like they do with built-in ordered analytics functions.

What else can we expect with Teradata 13.10?

This version of the database provides large cylinder support, a new hashing function and security improvements, plus many new built-in functions for uses such as arithmetic calculations, byte/bit manipulation and Oracle compatibility. We are addressing a lot in this "minor" release. We believe the impact of those enhancements will be anything but minor.


Comments
 


9/7/2011 6:17:39 AM
— Anonymous
 
Temporal option: Does it mean the ability to create snapshots (both periodic and cumulative) on the fly from transaction data? Something like GFI in version 4 of Vertica database?

10/5/2010 8:42:13 AM
— Anonymous