Tech2Tech
Applied Solutions
Simple-to-use tools empower data stewards to measure and improve data quality.
by Danny Maddox and Dr. Frank Capobianco
Everyone, especially senior managers, understands the cost of poor data quality. In its report “Data Quality and the Bottom Line,” The Data Warehousing Institute (TDWI) estimated that data quality issues cost US businesses more than $600 billion per year.
It doesn’t matter how much information your organization captures if it’s not accurate. Poor data quality creates inaccurate and inconsistent reporting, which leads to uninformed or bad decision making, the inability to set and execute strategies, increased operating costs, lower customer satisfaction and, most likely, lost business and customer attrition. As a result, decision makers lose confidence in the data warehouse, limiting its potential for impact and benefit.
Quality Rules

Data quality refers to fitness for use, i.e., the suitability of the data for its intended purpose. Information used for marketing, for instance, does not need the same level of accuracy (to x decimal places) as information used for financial reporting.
Rules for data quality fit into the quality assurance cycle. (See figure 1.) A well-designed scorecard, based on the quality rules, provides the insight for data stewards and governance bodies to:
- Understand quality issues
- Evaluate improvement opportunities
- Measure improvement over time
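As an illustration of the third point, the sketch below turns rule results into a per-rule pass rate for each reporting period and reports the change between two periods. It is a minimal example, not the scorecard the Teradata service produces; the function names, the tuple layout and the sample figures in the usage note are all assumptions made for this example.

```python
from collections import defaultdict


def build_scorecard(results):
    """Build {rule: {period: pass_rate}} from rule results.

    `results` is an iterable of (period, rule_name, rows_checked, rows_failed).
    This layout is hypothetical, chosen only for the example.
    """
    card = defaultdict(dict)
    for period, rule, checked, failed in results:
        # Treat an empty check as fully passing to avoid dividing by zero.
        card[rule][period] = 1.0 if checked == 0 else (checked - failed) / checked
    return dict(card)


def improvement(card, rule, earlier, later):
    """Change in pass rate for one rule between two periods (positive = better)."""
    return card[rule][later] - card[rule][earlier]
```

For example, feeding in invented results such as `("Q1", "valid_email", 1000, 100)` and `("Q2", "valid_email", 1000, 50)` gives pass rates of 0.90 and 0.95, so `improvement(card, "valid_email", "Q1", "Q2")` is about 0.05.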
Exploring Data
Critical elements in the quality assurance process are the ability to explore data and ask both open-ended and closed-ended questions. As part of the Teradata Data Quality Scorecard Service, specific tools are provided to the data steward to enable these capabilities in a simple and cost-effective manner:
- Teradata Warehouse Miner—Teradata Analytic Data Set (ADS) Generator and Teradata Profiler
- Teradata Professional Services—Teradata Data Quality Rules Manager (DQRM)
The Teradata Warehouse Miner tools offer a graphic, point-and-click interface that transparently generates and executes SQL to perform the data steward’s bidding. Teradata Profiler does not require any knowledge of SQL for data exploration. To ask specific questions of the information, the data steward might need only to learn the syntax of a SQL Where clause. The ADS Generator contains all of the functionality of Teradata Profiler but extends it with the ability to address data quality questions requiring complex joins or transformations. Knowledge of SQL structure—but not the details of the syntax—is required to use it.
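To give a feel for the kind of SQL a profiling tool might generate on the steward's behalf, here is a small sketch. It is not the SQL that Teradata Profiler actually produces: it uses Python's built-in sqlite3 module as a stand-in for a Teradata system, and the `profile_column` helper and the `customer` table are invented for illustration. The optional `where` argument plays the role of the Where clause a steward would supply to ask a more specific question.

```python
import sqlite3


def profile_column(conn, table, column, where="1=1"):
    """Return basic profile counts for one column, optionally filtered.

    Table, column and filter text are interpolated directly into the SQL,
    so this sketch assumes they come from a trusted user.
    """
    sql = (
        f"SELECT COUNT(*), COUNT({column}), COUNT(DISTINCT {column}), "
        f"MIN({column}), MAX({column}) "
        f"FROM {table} WHERE {where}"
    )
    row = conn.execute(sql).fetchone()
    keys = ("row_count", "non_null_count", "distinct_count",
            "min_value", "max_value")
    return dict(zip(keys, row))


def demo_connection():
    """Create an in-memory database with a small, made-up customer table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER, state TEXT, balance REAL)")
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?, ?)",
        [(1, "CA", 120.50), (2, "CA", None), (3, "NY", -15.00), (4, None, 40.00)],
    )
    return conn
```

Calling `profile_column(demo_connection(), "customer", "balance")` surfaces two things a steward might want to investigate: a missing balance (four rows but only three non-null values) and a negative minimum. Passing `where="state = 'CA'"` narrows the same profile to one state.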
These tools derive their simplicity from their architecture and ease of use. Unlike other data profiling options that require a three-tier architecture, Teradata Warehouse Miner tools are two-tier: The first tier is the Windows client on which they run, and the second is the Teradata system containing the data. Microsoft’s Open Database Connectivity (ODBC) interface connects the tiers.
The Teradata Warehouse Miner tools also allow the data steward to view the SQL they generate behind the scenes. This makes DQRM easier to use, because the steward can copy relevant clauses from the generated SQL and paste them into DQRM.
Teradata DQRM is also a two-tier application, with Java Database Connectivity (JDBC) connecting the tiers. It transparently generates and executes SQL optimized for Teradata systems, and it requires no knowledge of SQL to define, implement, execute and maintain data quality rules. To ask questions of the data, the data steward might need only to know the syntax of a SQL From and Where clause.
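The idea that a rule can be described by little more than a From clause and a Where clause can be sketched as follows. This is an assumption about the general pattern, not DQRM's actual design or generated SQL; the `QualityRule` class, the `run_rule` function and the `orders` table are hypothetical, and sqlite3 again stands in for a Teradata system.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class QualityRule:
    name: str
    from_clause: str   # table (or joined tables) the rule inspects
    where_clause: str  # condition that identifies a row violating the rule


def run_rule(conn, rule):
    """Count total and violating rows for a rule and derive its pass rate.

    The clauses are interpolated as written, so they must come from a
    trusted steward; this is a sketch, not hardened code.
    """
    total = conn.execute(
        f"SELECT COUNT(*) FROM {rule.from_clause}"
    ).fetchone()[0]
    failed = conn.execute(
        f"SELECT COUNT(*) FROM {rule.from_clause} WHERE {rule.where_clause}"
    ).fetchone()[0]
    pass_rate = 1.0 if total == 0 else (total - failed) / total
    return {"rule": rule.name, "total": total,
            "failed": failed, "pass_rate": pass_rate}


def demo_orders():
    """Create an in-memory database with a small, made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 10.0), (2, None), (3, -5.0), (4, 20.0)],
    )
    return conn
```

A rule such as `QualityRule("amount_present_and_positive", "orders", "amount IS NULL OR amount < 0")` flags two of the four sample rows, for a pass rate of 0.5. The steward writes only the two clauses; the surrounding SQL is generated.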
Plays Well With Others

Teradata Profiler, ADS Generator and DQRM can be used with third-party data integration and data quality tools to deliver a quality solution. (See figure 2.) Teradata Profiler and ADS Generator help discover data anomalies that can be evaluated for perpetual quality rules and entered into DQRM. Using Teradata DQRM Utility Pak, data rules can be triggered by third-party integration tools, which can then trigger integration workflows if data errors are discovered.
Simple Approach to Quality
The combination of Teradata Warehouse Miner tools and DQRM provides data stewards with a self-service, in-database data quality solution that is optimized for the Teradata Database. These tools offer a simple approach to check and monitor data quality, ensuring the data has the accuracy, reliability and consistency that decision makers need. Any organization can dramatically improve its data quality by utilizing an innovative quality solution that combines the speed, simplicity, self-service and low cost that these tools offer.
Danny Maddox, a solution architect, has been with Teradata for more than 20 years.
Dr. Frank Capobianco, an advanced analytics/data mining consultant for Teradata, has more than 20 years of data warehousing experience.