Tech2Tech

Applied Solutions

Simple-to-use tools empower data stewards to measure and improve data quality.

Everyone, especially senior managers, understands the cost of poor data quality. In its report “Data Quality and the Bottom Line,” The Data Warehousing Institute (TDWI) estimated that data quality issues cost US businesses more than $600 billion per year.

It doesn’t matter how much information your organization captures if it’s not accurate. Poor data quality creates inaccurate and inconsistent reporting, which leads to uninformed or bad decision making, the inability to set and execute strategies, increased operating costs, lower customer satisfaction, and probably lost business and higher customer attrition. As a result, decision makers lose confidence in the data warehouse, limiting its potential for impact and benefit.

Quality Rules

Figure 1: Data Quality Improvement Model

Data quality refers to fitness for use, i.e., the suitability of the data for its intended use. Information used for marketing, for instance, does not need the same level of accuracy (to x decimal places) as that used for financial reporting.

Rules for data quality fit into the quality assurance cycle. (See figure 1.) A well-designed scorecard, based on the quality rules, provides the insight for data stewards and governance bodies to:

  • Understand quality issues
  • Evaluate improvement opportunities
  • Measure improvement over time
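
As a rough illustration of what such a scorecard measures, the sketch below turns per-rule violation counts into pass percentages and compares two snapshots to show improvement over time. The rule names, row counts and the RuleResult class are hypothetical sample values chosen for this example; they are not output from, or part of, any Teradata tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleResult:
    """Outcome of one quality rule for one measurement period (hypothetical)."""
    rule: str
    rows_checked: int
    rows_failed: int

    def pass_pct(self) -> float:
        # An empty table has no violations, so treat it as fully passing.
        if self.rows_checked == 0:
            return 100.0
        return 100 * (self.rows_checked - self.rows_failed) / self.rows_checked


def scorecard(results):
    """Map each rule name to the percentage of rows that satisfy it."""
    return {r.rule: r.pass_pct() for r in results}


# Made-up counts for two measurement periods.
week_1 = scorecard([
    RuleResult("age_not_negative", 1000, 50),
    RuleResult("email_present", 1000, 120),
])
week_4 = scorecard([
    RuleResult("age_not_negative", 1000, 5),
    RuleResult("email_present", 1000, 120),
])

# Change in pass percentage per rule between the two periods.
improvement = {rule: week_4[rule] - week_1[rule] for rule in week_1}

for rule, delta in improvement.items():
    print(f"{rule}: {week_1[rule]}% -> {week_4[rule]}% ({delta:+.1f} points)")
```

A rule whose percentage does not move between periods, like email_present here, is a natural candidate for the "evaluate improvement opportunities" discussion.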

Exploring Data

Critical elements in the quality assurance process are the ability to explore data and ask both open-ended and closed-ended questions. As part of the Teradata Data Quality Scorecard Service, specific tools are provided to the data steward to enable these capabilities in a simple and cost-effective manner:

  • Teradata Warehouse Miner—Teradata Analytic Data Set (ADS) Generator and Teradata Profiler
  • Teradata Professional Services—Teradata Data Quality Rules Manager (DQRM)

7 Steps to Data Quality Compliance

Follow these seven steps to explore the business rule “the value of age should never be negative” with Teradata Warehouse Miner’s Teradata Profiler and Teradata Professional Services Teradata Data Quality Rules Manager (DQRM):

  1. Connect to the Teradata system containing the data.
  2. Create a new (or open an existing) project to hold the analyses that the data steward wishes to create for data exploration.
  3. Add at least one analysis to the project. For example, pick a Teradata Profiler Frequency Analysis.
  4. Configure the analysis by picking the tables and column of interest—age or date of birth—from the drop-down menu.
  5. Set any non-default output options or configure a Where clause, such as "age < 0".
  6. Execute the analysis using the run icon.
  7. Examine, interpret and use the results.
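
For readers who want to see what a frequency analysis with a Where clause boils down to, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for the Teradata system, a hypothetical customer table, and a hypothetical frequency helper; the SQL that Teradata Profiler actually generates may look quite different.

```python
import sqlite3

# In-memory SQLite database standing in for the Teradata system (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, 34), (2, -1), (3, 52), (4, -1), (5, -7), (6, 34)],
)


def frequency(conn, table, column, where=None):
    """Count how often each value of `column` occurs, optionally filtered.

    Table, column and filter are interpolated directly into the SQL, so this
    helper is only suitable for trusted input typed by the data steward.
    """
    sql = f"SELECT {column}, COUNT(*) AS freq FROM {table}"
    if where:
        sql += f" WHERE {where}"
    sql += f" GROUP BY {column} ORDER BY freq DESC, {column}"
    return conn.execute(sql).fetchall()


# The business rule: the value of age should never be negative.
violations = frequency(conn, "customer", "age", where="age < 0")
for value, freq in violations:
    print(f"age={value} occurs {freq} time(s)")
```

Repeated values such as -1 often point to a default or placeholder written by a source system, which is the kind of finding step 7 is meant to surface.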

The data steward can repeat steps 3-7 for any data quality question he or she wishes to ask, either as a prelude to entry in DQRM or as a follow-up to rules violations reported by that tool.

The Teradata Warehouse Miner tools offer a graphic, point-and-click interface that transparently generates and executes SQL to perform the data steward’s bidding. Teradata Profiler does not require any knowledge of SQL for data exploration. To ask specific questions of the information, the data steward might need only to learn the syntax of a SQL Where clause. The ADS Generator contains all of the functionality of Teradata Profiler but extends it with the ability to address data quality questions requiring complex joins or transformations. Knowledge of SQL structure—but not the details of the syntax—is required to use it.
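
One example of a data quality question that needs a join is "which orders refer to a customer that does not exist?" The sketch below answers it with a left join, again using an in-memory SQLite database in place of Teradata. The customer and orders tables and their contents are invented for illustration, and the query is hand-written rather than produced by the ADS Generator.

```python
import sqlite3

# In-memory SQLite database standing in for the Teradata system (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount REAL
    );
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES
        (10, 1, 25.0), (11, 3, 40.0), (12, 2, 15.5), (13, 9, 5.0);
    """
)

# Keep every order, attach the matching customer if there is one, then keep
# only the orders for which no customer row was found.
orphans = conn.execute(
    """
    SELECT o.order_id, o.customer_id
    FROM orders AS o
    LEFT JOIN customer AS c ON c.customer_id = o.customer_id
    WHERE c.customer_id IS NULL
    ORDER BY o.order_id
    """
).fetchall()

for order_id, customer_id in orphans:
    print(f"order {order_id} references missing customer {customer_id}")
```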

Create and Try Out Your Own Solution

Here’s how to set up a data quality solution in a four-week proof of concept (POC):

Establish the POC data quality business rules:

  • Identify key data stewards and IT users
  • Document 10 representative data quality business rules
  • Implement the rules
  • Populate the data quality rules data model with all 10 rules
  • Test the rules

Create a POC environment:

  • Acquire Teradata Data Quality Rules Manager (DQRM) and Teradata Warehouse Miner’s Teradata Profiler
  • Install the software

Produce data quality reports and scorecard:

  • Identify and design 10 data quality reports and a scorecard
  • Configure the reporting tool to produce the reports and scorecard
  • Implement and test them

Implement a knowledge transfer:

  • Develop documentation on the rules, reports and scorecard
  • Deliver knowledge transfer on Teradata Profiler and DQRM to data stewards and IT users

These tools derive their simplicity from their architecture and ease of use. Unlike other data profiling options that require a three-tier architecture, Teradata Warehouse Miner tools are two-tier: The first tier is the Windows client on which they run, and the second is the Teradata system containing the data. Microsoft’s open database connectivity (ODBC) protocol connects the tiers.

The Teradata Warehouse Miner tools allow the data steward to see the SQL that is invisibly generated. This facilitates the data steward’s use of DQRM by allowing the user to easily cut and paste relevant clauses from the SQL into the tool.

Teradata DQRM is also a two-tier application, with the Java database connectivity (JDBC) protocol connecting the tiers. It transparently generates and executes SQL optimized for Teradata systems, and no knowledge of SQL is required to define, implement, execute and maintain data quality rules. To ask specific questions of the data, the data steward might need only to know the syntax of a SQL From and Where clause.
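
To make the idea of a rule expressed as a From and a Where clause concrete, the sketch below stores each rule as exactly those two fragments and wraps them in a generated counting query. The Rule class, its field names and the generated SQL are assumptions made for this illustration; they do not reflect DQRM's actual data model or the SQL it produces, and SQLite again stands in for Teradata.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A quality rule given as a From clause and a Where clause (hypothetical)."""
    name: str
    from_clause: str
    where_clause: str  # condition that is true for a row violating the rule

    def violation_sql(self) -> str:
        # The steward supplies only the two clauses; the rest is generated.
        return f"SELECT COUNT(*) FROM {self.from_clause} WHERE {self.where_clause}"


# In-memory SQLite database standing in for the Teradata system (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, 34), (2, -1), (3, None), (4, -7)],
)

rules = [
    Rule("age_not_negative", "customer", "age < 0"),
    Rule("age_present", "customer", "age IS NULL"),
]

# Run every rule and record how many rows violate it.
counts = {
    rule.name: conn.execute(rule.violation_sql()).fetchone()[0]
    for rule in rules
}

for name, count in counts.items():
    print(f"{name}: {count} violating row(s)")
```

Clauses copied from a profiling query can be dropped into such a rule unchanged, which is what makes the copy-and-paste workflow between the two kinds of tool practical.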

Plays Well With Others

Figure 2: Enterprise Data Management Architecture

Teradata Profiler, ADS Generator and DQRM can be used with third-party data integration and data quality tools to deliver a quality solution. (See figure 2.) Teradata Profiler and ADS Generator help discover data anomalies, which can then be evaluated as candidates for ongoing quality rules and entered into DQRM. With the Teradata DQRM Utility Pak, third-party integration tools can trigger data rules and then launch integration workflows if data errors are discovered.
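
The hand-off between a rule check and an integration workflow can be pictured as a simple callback pattern, sketched below. The function check_and_dispatch, its tolerance parameter and the quarantine_workflow callback are hypothetical names invented for this example; the real Utility Pak interface is not described here and may work differently.

```python
from typing import Callable, List


def check_and_dispatch(
    violation_count: int,
    on_errors: Callable[[int], None],
    tolerance: int = 0,
) -> bool:
    """Launch the follow-up workflow when violations exceed the tolerance.

    Returns True if the workflow was launched, False otherwise.
    """
    if violation_count > tolerance:
        on_errors(violation_count)
        return True
    return False


# Stand-in for an integration workflow: it only records that it was called.
launched: List[str] = []


def quarantine_workflow(violations: int) -> None:
    launched.append(f"quarantine {violations} rows")


clean_run = check_and_dispatch(0, quarantine_workflow)  # nothing to do
dirty_run = check_and_dispatch(3, quarantine_workflow)  # workflow launched

print(launched)
```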

Simple Approach to Quality

The combination of Teradata Warehouse Miner tools and DQRM provides data stewards with a self-service, in-database data quality solution that is optimized for the Teradata Database. These tools offer a simple approach to check and monitor data quality, ensuring the data has the accuracy, reliability and consistency that decision makers need. Any organization can dramatically improve its data quality by utilizing an innovative quality solution that combines the speed, simplicity, self-service and low cost that these tools offer.

