The Grande-Duchesse Josephine-Charlotte Concert Hall in Luxembourg has 823 facade columns made of white steel.
Pillar of Performance
Teradata Columnar offers a hybrid row and column database to deliver extreme performance along with dramatic compression.
Teradata Columnar offers DBAs a new physical database design option to achieve extreme performance and dramatic compression. While delivering the performance and data compression promise of columnar technology, Teradata Columnar is easy to use and integrated into the Teradata Database. No longer do you have to sacrifice other database attributes to take advantage of column-oriented database tables to reduce I/O and improve performance.
Extreme Performance by Design
Performance is always an underlying goal in the data warehouse environment. Teradata Columnar provides a physical database design technique that allows DBAs to specify that a table should be vertically partitioned so that the data for a column is stored together in a columnar manner. This allows the physical storage to be matched to the query patterns, eliminating unnecessary I/O when queries require only some of the columns in a table. Removing the I/O from these common queries increases query performance and system throughput.
Teradata Columnar is built upon the existing no primary index (NoPI) table and row partitioning capabilities of a multi-level partitioned primary index (ML-PPI) of the Teradata Database. Now a NoPI table can be vertically partitioned by column and horizontally partitioned by row.
Column-partitioned data can be stored in a new structure called a “container.” A container has many data values packed into it with a single header. The physical space is allocated and managed by the Teradata file system, along with all other space in the Teradata Database. By filling a container with values from a single column and applying partition elimination logic, only the columns referenced in a query are read.
If row partitioning is also defined on the table, the multi-level partitioning capabilities of the Teradata Database combine the column and row partitioning definitions to make the partitions defined on each level. Both row and column partition elimination reduce the partitions that are accessed for each level so that only the individual column data for the proper row ranges is scanned for any query. I/O can be reduced to a fraction of a nonpartitioned and noncolumn table design.
Dramatic Data Compression
In addition to changing the internal data organization for the table, Teradata Columnar includes automatic dynamic compression and new compression mechanisms. Because all the data in a column storage container is for the same column, it is more consistent and lends itself to more compression algorithms than row-oriented data. To take advantage of the consistency within the data, Teradata Columnar includes various compression mechanisms, such as:
- Run length encoding
- Trim leading or trailing bytes or characters
- Delta from mean
- UNICODE to UTF8
Teradata Columnar can apply multiple compression mechanisms to a column partition. Applying this broad set of mechanisms to homogeneous data in columnar storage achieves dramatic compression rates.
Ease of Use
Teradata Columnar is automated and easy to use. Column partitioning is actually simpler to specify than row partitioning. Because column partitioning separates the storage for the columns from each other, the boundaries are defined by the normal column definitions. Therefore, all that’s required is to use the COLUMN keyword in the PARTITION BY clause.
Once a container is filled during the load process, the data is automatically analyzed to determine the best compression mechanisms and dictionary compression values. Auto-compression then applies the identified compression unless the data definition language (DDL) turned the auto-compression off. The resulting space is filled and compressed with the identified mechanisms until the container is filled with compressed data.
This process is repeated for each container representing the column, so if the data evolves over time the best compression mechanisms are dynamically chosen for each container of data. No DBA analysis is required to ensure that the applied compression mechanisms remain optimal over time and, if dictionary compression is used, dictionary compression values cannot become stale over time.
Once PARTITION BY COLUMN is specified in a CREATE TABLE statement, SELECT statements do not need to be changed in any way. The database knows how to access the data and the optimizer creates the most efficient query execution plan. But, if the data in a column partition is not referenced (e.g., isn’t a return value or in a predicate), it is not read from the disk. It’s that easy!
Integrated in the Teradata Database
Teradata Columnar is integrated in the Teradata Database so it works with all of the other features of the database. While many columnar databases are targeted for limited-use data marts, Teradata Columnar enables the integrated data warehouse to benefit from the extreme performance and dramatic compression resulting from storing the appropriate portion of the data with a column orientation.
By building Teradata Columnar on the existing partitioning infrastructures and integrating it into the Teradata architecture, column orientation becomes an integral part of the database. Capabilities valued by Teradata users aren’t compromised when taking advantage of Teradata Columnar:
- Rich SQL language
- Performance features such as indexes
The most efficient plan for each query is built whether the tables (or their join indexes) are row-oriented, column-oriented or a mix of both.
Each member of the Teradata purpose-built platform family takes unique advantage of Teradata Columnar, just as they take advantage of other features. For example, the Active Enterprise Data Warehouse platform features Teradata Virtual Storage and hybrid storage work with Teradata Columnar to improve multi-temperature data management.
Isolating the storage for each column makes data temperature measurements from Teradata Virtual Storage more targeted by applying them to the data in a column rather than a mix of data from all columns in a table. The data in the most-used columns naturally becomes hot and migrates to the fastest available storage. And dramatically compressed columnar data increases the amount of raw user data that fits into the fast storage.
Columns of Power
The Teradata Database now offers the benefits of row and column orientation. Users will benefit from extreme performance and dramatic compression as a result of this new, easy-to-use capability integrated into the Teradata Database.