Tech2Tech

Applied Solutions 3

System evolution

Design enables technological enhancements in the Teradata Purpose-Built Platform Family.

As competitive businesses demand ever greater analytical agility, data warehouse technology must advance to keep pace. A basic principle of data warehouse system design, and a key challenge, is to adopt new technology as it becomes available without a significant re-engineering effort that might raise customer costs or delay the benefits.

Teradata has developed a modular approach that encapsulates system components into common functional areas such as CPU nodes, storage and system interconnect. The collection of components—or blocks—can be independently enhanced to accommodate each changing technology area while minimizing time-consuming and expensive system re-engineering.

[Figure]

Built for technology advances

Based on its intended use, every member of the Teradata Purpose-Built Platform Family is built from a unique, integrated combination of these blocks. This modular approach lets Teradata incorporate key industry-leading technologies so that each platform family member can be tailored to specific functions. (See table.)

Features and performance are optimized for individual platform application spaces by engineering an appropriate balance of system resources, such as CPU power, I/O throughput, storage capacity and functionality. And platform management is greatly simplified by Teradata tools and processes.

The system’s unique technology blocks are the massively parallel processing (MPP) architecture, nodes and CPU, enterprise storage and infrastructure, the BYNET interconnect, systems management, availability features, and virtualization. (See figure.)

[Table]

Massively parallel processing

All platform family members use the MPP architecture, which is implemented with virtual and physical hardware and software techniques. This means the system’s multiple computer nodes and their associated storage and infrastructure act independently in a shared-nothing configuration.

The value of the shared-nothing MPP design is that the system scales in linear increments. This matters because it fully supports business growth and data warehouse maturation: expanding or replacing the system delivers efficient, predictable performance along with increased data storage.
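As a rough illustration of the shared-nothing idea, the Python sketch below gives every row to exactly one node, lets each node total only its own rows, and then combines the partial results. The hash-based placement and the names owner_node, distribute and parallel_total are assumptions made for this example; they are not taken from the Teradata Database.

```python
import zlib
from collections import defaultdict


def owner_node(key: str, node_count: int) -> int:
    """Map a row key to one node with a stable hash (an illustrative choice)."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def distribute(rows, node_count):
    """Place every row on exactly one node; nodes share no storage."""
    partitions = defaultdict(list)
    for key, amount in rows:
        partitions[owner_node(key, node_count)].append((key, amount))
    return partitions


def parallel_total(rows, node_count):
    """Each node sums only its own rows; the partial sums are then combined."""
    partitions = distribute(rows, node_count)
    partial_sums = [sum(amount for _, amount in part) for part in partitions.values()]
    return sum(partial_sums)


# Hypothetical data: 1,000 rows keyed "order-0" .. "order-999".
rows = [(f"order-{i}", i) for i in range(1000)]
total_on_4_nodes = parallel_total(rows, 4)
total_on_8_nodes = parallel_total(rows, 8)
```

The total is the same whether 4 or 8 nodes are used. Adding nodes changes only how many rows each node must handle, which is the property that makes near-linear scaling possible in this simplified picture.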

The multiple-node architecture provides built-in availability capabilities. For instance, when a node—the basic processor block—is taken out of operation, other nodes take over the processing load, minimizing or eliminating any impact to the user community.

Nodes and CPU

The heart of the node technology is the processor component. Teradata has long utilized Intel Xeon processors, which have proven to be the highest-performing and most rapidly evolving CPU technology available for enterprise computing.

Starting with the single-core technology and moving to the dual-core and then the quad-core CPU, development of Teradata nodes has gracefully followed Intel’s technological advances. The Teradata MPP architecture enables each node evolution to fully deliver the performance potential of Intel’s evolving processor technology to the data warehouse workload.

Enterprise storage and I/O

A key to harvesting the full power of the node technology is to provide data to the node quickly enough to keep the processor fully busy without waiting for data storage to be available. To achieve this level of productivity, each node in the Teradata system is equipped with multiple high-speed storage adapters that operate in parallel to provide the necessary data bandwidth to the attached disk storage array subsystems.

This subsystem can scale as needs change and is configured with a sufficient number of high-performance disk drives to meet the requirements of the node. As an example, to meet the performance needs of some members of the platform family, large numbers of enterprise-class 15,000 rpm drives are necessary to satisfy the data I/O requirements of the node in those platforms. As these needs change, the drives and storage arrays are enhanced to maintain compatibility.
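The sizing idea can be sketched as simple arithmetic in Python: divide the bandwidth a node needs by what one drive can sustain and round up. The function name and the throughput figures below are hypothetical values chosen for illustration, not Teradata specifications.

```python
import math


def drives_needed(node_demand_mb_s: float, drive_throughput_mb_s: float) -> int:
    """Smallest number of drives whose combined throughput meets the node's demand."""
    if node_demand_mb_s < 0 or drive_throughput_mb_s <= 0:
        raise ValueError("demand must be non-negative and drive throughput positive")
    return math.ceil(node_demand_mb_s / drive_throughput_mb_s)


# Hypothetical figures: a node that consumes 1,000 MB/s, drives that sustain 80 MB/s each.
example_drive_count = drives_needed(1000, 80)  # 12.5 rounds up to 13
```

In this simplified model, a faster node raises the demand figure, so the drive count must rise with it unless the drives themselves become faster.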

BYNET interconnect

Unlike a general-purpose network, the BYNET interconnect subsystem, the heart of the MPP system, intelligently links the nodes together to enhance the performance of the basic parallel database operations.

For instance, because BYNET executes the final merge/sort of the answer set for a query in parallel, the potential performance bottleneck of a single resource carrying out this key operation is eliminated. BYNET provides guaranteed query message delivery to each unit of MPP parallelism, freeing the database from the costly overhead of implementing this check. And because it scales proportionally with the size of the system, BYNET supports full system performance.
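The merge step itself can be sketched in a few lines of Python. This is only the logic of combining partial results that each unit of parallelism has already sorted; it runs sequentially in one process and does not model how BYNET performs the work in parallel. The function name and sample values are assumptions for illustration.

```python
import heapq


def merge_answer_sets(per_unit_results):
    """Merge already-sorted partial results into one sorted answer set."""
    return list(heapq.merge(*per_unit_results))


# Hypothetical partial results, each sorted locally by the unit that produced it.
unit_results = [
    sorted([17, 3, 42]),
    sorted([8, 25]),
    sorted([1, 30, 12]),
]
final = merge_answer_sets(unit_results)
print(final)  # [1, 3, 8, 12, 17, 25, 30, 42]
```

Because each input is already sorted, the merge only compares the next candidate from each unit and never re-sorts the whole answer set.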

A key value of the BYNET subsystem is its ability to leverage the hardware interconnect technology that best meets the scalability needs of each platform family member. As a result, the scaling capabilities range from tens of nodes with 1-gigabit or 10-gigabit Ethernet technology to 1,024 nodes with the BYNET switch infrastructure.

Systems management

The Teradata integrated systems management infrastructure provides a single operational view to simplify platform command, control and monitoring. Independent management computers are embedded within the platform system cabinets and connected to a dedicated, parallel management system network. Intelligent software agents communicate to all of the platform system components via their standard interfaces for management and event notification.

Systems management performs routine tasks, such as orderly start-ups and shutdowns, and protects the platform from potential disruptive failure due to power losses or dangerous temperature conditions. It also scales to keep pace with the platform’s growth and can easily accommodate the thousands of components and subsystems in large system configurations.

Availability features

To provide the varied levels of high availability required across the Teradata platform family, individual members leverage built-in features such as redundancy and failover. Redundancy provides multiple components that take over full operation if one component fails. For nodes and cabinets, this includes power and cooling elements and network interface cards. Storage subsystems are built with mirrored redundant arrays of independent disks, along with redundant disk array controllers and power/cooling components. System-level components, such as BYNET and systems management, are also fully redundant to ensure complete system availability in case of a failure.

As a multi-node system, every MPP-based family member has a built-in automatic capability for node failover. When a node fails, its workload is migrated to a functioning node, increasing the workload of that node, which necessarily reduces total system throughput. However, to eliminate the performance impact of a failed node, a hot standby node capability can be provided in the Teradata Active Enterprise Data Warehouse platform for mission-critical workloads.
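The effect of a failed node, with and without a hot standby, can be approximated with a simplified capacity model in Python. It assumes identical nodes and throughput proportional to the number of nodes doing work; both are assumptions for illustration, and the function name is hypothetical.

```python
def relative_throughput(active_nodes: int, failed_nodes: int, hot_standbys: int = 0) -> float:
    """Fraction of original throughput left after failures, under the stated assumptions."""
    if active_nodes <= 0:
        raise ValueError("active_nodes must be positive")
    if not 0 <= failed_nodes <= active_nodes:
        raise ValueError("failed_nodes must be between 0 and active_nodes")
    if hot_standbys < 0:
        raise ValueError("hot_standbys must be non-negative")
    # Each hot standby takes over for one failed node.
    replaced = min(failed_nodes, hot_standbys)
    working = active_nodes - failed_nodes + replaced
    return working / active_nodes


without_standby = relative_throughput(4, 1)               # 0.75
with_standby = relative_throughput(4, 1, hot_standbys=1)  # 1.0
```

Under these assumptions a four-node system that loses one node keeps three-quarters of its throughput, while the same system with one hot standby keeps all of it.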

Innovation marches on

Rapid innovation in two key areas of computer technology has significantly affected the Teradata architecture and modular approach. The first is the gain in processor performance delivered by multi-core CPU design. This approach places multiple processing engines, or CPUs, on a single processor chip by exploiting the growing number of transistors per chip, which, as Moore’s law states, doubles roughly every one-and-a-half to two years.
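The doubling rule is easy to put into numbers. The short Python sketch below computes how many times the transistor count multiplies over a given span, assuming steady doubling at the stated period; the function name is hypothetical.

```python
def transistor_growth(years: float, doubling_period_years: float) -> float:
    """Growth factor after `years`, assuming the count doubles every doubling period."""
    if doubling_period_years <= 0:
        raise ValueError("doubling_period_years must be positive")
    return 2.0 ** (years / doubling_period_years)


# Over six years, the two ends of the quoted range give quite different results.
slow_case = transistor_growth(6, 2.0)  # 8.0: three doublings
fast_case = transistor_growth(6, 1.5)  # 16.0: four doublings
```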

Chip designers initially created two CPUs on a chip where one used to be. These dual-core chips were soon followed by quad-core chips with four CPUs. Regular increases in the number of cores contained on a processor chip will continue to enhance performance, with six- and eight-core chips in sight. Unlike the gradual performance increases formerly realized from higher clock cycle speeds, the multi-core architecture delivers dramatic gains at reduced power consumption.

By virtualizing system resources into parallel work units called AMPs, the Teradata platform can scale up the work assigned to each multi-core chip to the chip’s full potential. Also, as the latest processor chips are developed, they can be quickly introduced into the Teradata modular system design.

The second key development in data warehouse technology is high-performance disk storage. Unlike processors, mechanical hard disk drives (HDDs) have nearly reached their physical performance limits, with minimal potential for improvement going forward.

A new, rapidly maturing storage technology, the solid-state disk (SSD), uses flash memory (the basic technology used in camera memory cards and USB drives) to deliver breakthrough performance, with the promise of continuing dramatic increases in the future. Performance gains for the latest SSD over HDD technology range from more than 20 times faster on random data operations to 150 times faster for basic I/O throughput. Gains at this level provide a path for matching the rapid performance advancement of multi-core processor technology. The drawbacks of SSD are lower storage capacity and a higher price per unit of storage compared with HDD, but both are expected to improve over time.

The modular approach to the Teradata Purpose-Built Platform Family design provided the means for Teradata to bring to market the industry’s first SSD-based data warehouse appliance.

As CPU and disk storage technology continue on a steep innovation ramp, the massively parallel Teradata Database easily accommodates these advancements while delivering ultimate performance.

—K.R. and J.D.

Virtualization

A key component of the platform design is the virtualization of system resources. Virtualization is a software or firmware layer that can define and translate the characteristics of an underlying hardware component or subsystem to a standard set of functions. This approach isolates the physical characteristics of a component from the services using it, thus allowing for flexible system enhancements and configurations while optimizing the performance of a component. The Teradata MPP architecture is based on the virtualization of the processor node resources with the assignment of parallel work units, called AMPs, to each node as determined by the capability of the node and platform.

Likewise, by eliminating the dependence on direct linkage to each physical disk drive, Teradata Virtual Storage enables the system to contain disk drives of different capacity and performance characteristics.

Investment protection

Data warehouses are seldom static. They must be able to grow as technology advances. The ability to easily blend new technology—and its improved price, performance and functionality—with the existing technology eliminates the need to replace complete systems. It also avoids the risk of expanding a system with out-of-date technology. Using two approaches, the MPP platform members can individually grow by adding nodes or cabinets of the next generation to a system that has current technology:

  • Coexistence. For platforms with flexible configurations, this approach balances the data storage capacity and database workload proportionally to achieve the full performance capabilities of each node generation.
  • Coresidence. In systems that use the fixed appliance packaging approach, the added-on, new-generation nodes will perform at the same level as the existing nodes, thereby protecting the original investment.
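The proportional balancing described under coexistence can be sketched in Python. The example splits a fixed number of parallel work units across nodes in proportion to each node's relative processing power, using a largest-remainder rule to keep the counts whole. The node names, power ratings and function name are hypothetical, and the rule is one reasonable choice rather than Teradata's documented method.

```python
def allocate_work_units(node_power: dict[str, float], total_units: int) -> dict[str, int]:
    """Split total_units across nodes in proportion to their relative power."""
    total_power = sum(node_power.values())
    if total_power <= 0 or total_units < 0:
        raise ValueError("need positive total power and a non-negative unit count")
    # Exact (possibly fractional) share for each node.
    exact = {node: total_units * power / total_power for node, power in node_power.items()}
    # Start from the whole-number part of each share.
    allocation = {node: int(share) for node, share in exact.items()}
    leftover = total_units - sum(allocation.values())
    # Hand the remaining units to the nodes with the largest fractional parts.
    by_fraction = sorted(exact, key=lambda node: exact[node] - allocation[node], reverse=True)
    for node in by_fraction[:leftover]:
        allocation[node] += 1
    return allocation


# Hypothetical mixed system: two older nodes and one newer node rated twice as powerful.
mixed_system = {"gen1-a": 1.0, "gen1-b": 1.0, "gen2-a": 2.0}
plan = allocate_work_units(mixed_system, 40)
print(plan)  # {'gen1-a': 10, 'gen1-b': 10, 'gen2-a': 20}
```

In this sketch the newer node receives twice the work of each older node, so all three would finish their share at about the same time instead of the faster node sitting idle.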

The Teradata Purpose-Built Platform Family is designed to be continuously refined as technology responds to business growth needs. This adaptability enables each family member to deliver maximum investment protection.

