Loading...

Tech2Tech

Applied Solutions 3

Maintaining relationships

Teradata Master Data Management eases hierarchy and
cross-reference management.

Organizing and understanding key reference and relationship data are essential in business analysis. But as an organization evolves, this information will change, posing a challenge to keep the data clean, organized and consistent. To ensure a complete and accurate view of the business, these changes to the relationship data must be managed with the same rigor and process as is applied to master reference data.

Key terms

Reference data: Uniquely identifies a product, customer or other business entity.

Relationship data: Describes how an entity relates to other entities.

Hierarchy: Relationship where logical groupings of objects or components are organized into a tree-like structure with levels arranged above (parent), below (child) and at the same level.

Cross-reference: Relationship of one object to another, typically applied to link a single golden record to one or more source records.

Master data design patterns

Teradata Master Data Management provides a framework to more easily manage relationships among reference data elements. This framework includes an internal metadata model that is based on the master data design patterns and Web-based user interfaces (UIs) to define and manage relationships and to view and interact with the relationships and the referenced master data.

As reference data is updated in the Teradata system, its relationship data can be simultaneously maintained and managed by the Teradata Master Data Management solution. This process offers a single view of an organization’s master reference and relationship data.

The same business rules, process flows, approvals and publishing capabilities that are applied to the master data are also applied to the relationship data. Thus, as changes occur, the relationship data can be maintained cleanly and accurately to ensure that it meets the organization’s quality and change-management requirements. Business intelligence (BI) applications and tools can then confidently leverage the data to produce accurate reports at any time.

Manage and view hierarchies

The Teradata Master Data Management solution can work with any type of hierarchy structure, including balanced (meaningful levels/consistent depth), unbalanced (inconsistent levels/inconsistent depth) and ragged (consistent levels/inconsistent depth). Furthermore, as data changes over time, the user can create numerous hierarchies as well as different versions of the hierarchies. This allows users to not only view the data’s current structure but also its past and future versions.

The solution’s viewing and manipulation capabilities, based on Adobe Flex technology, allow users to easily search the hierarchies, zoom in and out on their structures and manipulate them through a drag-and-drop interface. By clicking on an element in the hierarchy, users are directed to a table editor, or other custom function, where they can manage their underlying master data. (See figure 1.)

image

Click to enlarge

The user can perform various tasks from the Teradata Master Data Management UI, such as cross-reference relationship management, which identifies relationships between the data’s source and its target. A common pattern is to map multiple source systems to a pool of target master entities. Within this pool, a relationship can be defined to a standard entity, known as a golden master. In this way, duplicate entities can be related to one golden master. For example, if two source master records are linked to a common golden master record using the underlying relationship management model, the user can navigate from the golden master record to view any related cross-referenced records. The reverse is also possible because the model supports bi-directional navigation.

Another core service provided through Teradata Master Data Management is cross-reference matching and linking. This allows the user to organize records using specified parameters, such as Soundex-based matching, defined fields or characters. The service can be leveraged when a master data management (MDM) solution is being built or purchased as an out-of-the-box dashboard with the Teradata Customer Data Integration application accelerator.

Teradata is also a partner of numerous companies that provide best-of-breed matching and cleansing engines, such as Trillium, Informatica, SAP BusinessObjects and SAS DataFlux. These can be seamlessly leveraged within the Teradata Master Data Management framework.

In action

To answer fundamental business questions, a complete and accurate view of the organization’s information is required, making clean and integrated relationship data necessary. Simple questions such as number of customers, number of products sold, most profitable channel and which suppliers to conduct business with cannot be easily answered if the organization of that data (hierarchy), or the definition of product or supplier (cross-reference of multiple identities), changes frequently and is not properly managed and maintained.

BI tools can dynamically leverage the hierarchy and cross-reference implementations of managed relationship data to reflect a total and accurate single view of the enterprise through hierarchy or cross-reference management.

Hierarchy Management

The relationship data in MDM (hierarchy) can be populated and maintained to show typical product and brand relationships for any standard product hierarchy. Although the example diagrammed in figure 2 covers product relationships, the same concept can be applied to any domain of managed master reference data, such as customer, supplier and so on. With this hierarchy pattern in place, a BI tool can easily aggregate transactions for all products and report sales by brand against the current view of the hierarchy.

Changes to the product hierarchy can be made simply by performing a “drag and drop” operation, through either the stand-alone hierarchy viewer or a hierarchy viewer/report combination UI. Dragging an element from one parent to another—for example, moving the product Tie from the brand Gucci to the brand Allen Solly—affects only the relationship data and not the product or brand reference data. The move can be seen immediately in the UI viewers and in any subsequent BI reports. Without modifying the report definition, the new aggregated data—in this case, newly aggregated sales amounts for both brands—will be promptly updated.

Altering the hierarchy, however, can automatically invoke a workflow rule where an approval is required before the change can become active. For this reason, all modifications have date and time auditing so that when an analysis is performed, it can be applied to the data at a particular point in time or to a revised version of the hierarchy.

image

Click to enlarge

Product hierarchy example

The data constituting the Category -> Brand -> Product Hierarchy is shown in table 1.

Product master tables and views that constitute the product hierarchy for this example are shown in code sample 1.

Cross-Reference Management

Multiple product records can be related to different brands but linked to a common golden master. (See figure 3.) With a golden master cross-reference design pattern in place, a BI tool can easily aggregate transactions of all product duplicates and report it against the golden master entity, using the related cross-reference entries.

image

Click to enlarge

Some products may be the same but are sourced from multiple suppliers and exist under different, brand-specific product IDs. In these cases, as new product brand records are loaded, matched and linked, the cross-reference table can be updated manually or automatically in Teradata Master Data Management with the results of a golden master product record search displaying the relationship of all of the related product records.

When a new product is introduced into the workflow process and linked to a brand, the product cross-reference function will publish event or SQL updates that are tied to the product. Consequently, as in the hierarchy management function, any subsequent execution of the BI report will automatically show the new aggregated data. For example, as new products by brand are introduced, a BI report can show consolidated orders by golden master product ID, organized by supplier instead of by brand.

Product cross-reference example

The data constituting the “Golden” product master cross-reference to the “Source” product SX master and the “Source” product SY master is shown in table 2 and table 3.

Product master tables, views and sample data that constitute the product cross-reference for this example are shown in code sample 2.

Secure relationships

With the process-driven framework and the modeling design patterns provided through the Teradata Master Data Management solution, businesses have a powerful and easy-to-use platform to manage relationship and reference data.

Additionally, having the relationship data managed in Teradata Database tables makes it easy to map BI tools directly to the relationship data, enabling companies to gain complete and accurate insight into their business events and transactions.

image

Click to enlarge

image

Click to enlarge

image

Click to enlarge

Code sample 1

CREATE SET TABLE MST_PRODUCT 
(
PRODUCT_ID INTEGER NOT NULL,
PRODUCT_DESCRIPTION VARCHAR(100) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_TARGET_ID INTEGER,
SYS_AUTH_ID VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_SOURCE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC DEFAULT ‘BackEnd’,
SYS_CREATED_BY VARCHAR(70) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_CREATION_DATE TIMESTAMP(0),
SYS_ENT_STATE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC CHECK (
SYS_ENT_STATE IN (‘ACTIVE’,’INACTIVE’,’NEW’,’DELETED’) ),
SYS_LAST_MODIFIED_BY VARCHAR(70) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_LAST_MODIFIED_DATE TIMESTAMP(0),
SYS_NC_TYPE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_ERR_CODE VARCHAR(10000) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_ERR_SVRTY VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC CHECK (
SYS_ERR_SVRTY IN (‘SEVERE_ERROR’,’ERROR’,’WARNING’) ))
UNIQUE PRIMARY INDEX MST_PRODUCT_PK ( PRODUCT_ID );

CREATE SET TABLE MST_BRAND
(
BRAND_ID INTEGER NOT NULL,
BRAND_DESCRIPTION VARCHAR(100) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_TARGET_ID INTEGER,
SYS_AUTH_ID VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_SOURCE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC DEFAULT ‘BackEnd’,
SYS_CREATED_BY VARCHAR(70) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_CREATION_DATE TIMESTAMP(0),
SYS_ENT_STATE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC CHECK (
SYS_ENT_STATE IN (‘ACTIVE’,’INACTIVE’,’NEW’,’DELETED’) ),
SYS_LAST_MODIFIED_BY VARCHAR(70) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_LAST_MODIFIED_DATE TIMESTAMP(0),
SYS_NC_TYPE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_ERR_CODE VARCHAR(10000) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_ERR_SVRTY VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC CHECK (
SYS_ERR_SVRTY IN (‘SEVERE_ERROR’,’ERROR’,’WARNING’) ))
UNIQUE PRIMARY INDEX MST_BRAND_PK ( BRAND_ID );

CREATE SET TABLE MST_CATEGORY
(
CATEGORY_ID INTEGER NOT NULL,
CATEGORY_DESCRIPTION VARCHAR(100) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_TARGET_ID INTEGER,
SYS_AUTH_ID VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_SOURCE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC DEFAULT ‘BackEnd’,
SYS_CREATED_BY VARCHAR(70) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_CREATION_DATE TIMESTAMP(0),
SYS_ENT_STATE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC CHECK (
SYS_ENT_STATE IN (‘ACTIVE’,’INACTIVE’,’NEW’,’DELETED’) ),
SYS_LAST_MODIFIED_BY VARCHAR(70) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_LAST_MODIFIED_DATE TIMESTAMP(0),
SYS_NC_TYPE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_ERR_CODE VARCHAR(10000) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_ERR_SVRTY VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC CHECK (
SYS_ERR_SVRTY IN (‘SEVERE_ERROR’,’ERROR’,’WARNING’) ))
UNIQUE PRIMARY INDEX MST_CATEGORY_PK ( CATEGORY_ID );

CREATE SET TABLE Product_Day_Sales
(
SALE_DATE DATE FORMAT ‘YY/MM/DD’,
PRODUCT_ID INTEGER,
SALES DECIMAL(18,4))
PRIMARY INDEX ( PRODUCT_ID );

The following is a view on top of the MST_REL_OBJ_DATA_VAL table, which is essentially a table consisting of the known relationships (and levels). For our example, the relationships that were created for Brand and Product was “1301” and for the Brand and Category was “1302.” (See figure 2.)

REPLACE	VIEW  PROD_HIER 
(ID, DESCRIPTION, PARENT_ID, PARENT_DESCRIPTION)
AS
SELECT DISTINCT CHILD_ROKD_VAL1 (INTEGER) AS ID, PRODUCT_DESCRIPTION AS DESCRIPTION,

PARENT_ROKD_VAL1 (INTEGER) AS PARENT_ID, BRAND_DESCRIPTION AS PARENT_DESCRIPTION
FROM MST_REL_OBJ_DATA_VAL REL, MST_PRODUCT PRD, MST_BRAND BRD
WHERE REL. CHILD_ROKD_VAL1 = PRD.PRODUCT_ID
AND REL.PARENT_ROKD_VAL1 = BRD.BRAND_ID
AND REL.ROM_ID = 1301
UNION
SELECT DISTINCT CHILD_ROKD_VAL1 (INTEGER) AS ID, BRAND_DESCRIPTION AS DESCRIPTION,

PARENT_ROKD_VAL1(INTEGER) AS PARENT_ID , CATEGORY_DESCRIPTION AS PARENT_DESCRIPTION
FROM MST_REL_OBJ_DATA_VAL REL, MST_CATEGORY CAT, MST_BRAND BRD
WHERE REL. CHILD_ROKD_VAL1 = BRD.BRAND_ID
AND REL.PARENT_ROKD_VAL1 = CAT.CATEGORY_ID
AND REL.ROM_ID = 1302
UNION
SELECT CATEGORY_ID(INTEGER) AS ID, CATEGORY_DESCRIPTION AS DESCRIPTION,

NULL (INTEGER) AS PARENT_ID, NULL (VARCHAR(100)) AS PARENT_DESCRIPTION
FROM MST_CATEGORY;

The view depicting the product hierarchy recursively is shown below. There could be many different ways and formats in which a report could be developed depending on the features supported by the particular reporting tool. The main obejctive of this illustration is to emphasize the depth provided by the Teradata Master Data Relationship Management data model in supporting BI reporting:

REPLACE	RECURSIVE VIEW PH (LVL,ID,DESCRIPTION,PARENT_ID, PARENT_DESCRIPTION) AS
(
SELECT 1 AS LVL, ID, DESCRIPTION, PARENT_ID, PARENT_DESCRIPTION
FROM PROD_HIER
WHERE
PARENT_ID IS NULL
UNION
ALL
(
SELECT LVL+1,PROD_HIER. ID, PROD_HIER.DESCRIPTION, PROD_HIER.PARENT_ID,
PROD_HIER.PARENT_DESCRIPTION
FROM
PROD_HIER, PH
WHERE
PROD_HIER.PARENT_ID=PH.ID
)
);

Product master tables, views and sample data that constitute the product cross-reference for this example are:

Code sample 2

CREATE SET TABLE MST_PRODUCT_MASTER 
(
PRODUCT_ID INTEGER NOT NULL,
PRODUCT_DESCRIPTION VARCHAR(100) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_TARGET_ID INTEGER,
SYS_AUTH_ID VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_SOURCE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC DEFAULT ‘BackEnd’,
SYS_CREATED_BY VARCHAR(70) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_CREATION_DATE TIMESTAMP(0),
SYS_ENT_STATE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC CHECK (
SYS_ENT_STATE IN (‘ACTIVE’,’INACTIVE’,’NEW’,’DELETED’) ),
SYS_LAST_MODIFIED_BY VARCHAR(70) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_LAST_MODIFIED_DATE TIMESTAMP(0),
SYS_NC_TYPE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_ERR_CODE VARCHAR(10000) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_ERR_SVRTY VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC CHECK (
SYS_ERR_SVRTY IN (‘SEVERE_ERROR’,’ERROR’,’WARNING’) ))
UNIQUE PRIMARY INDEX MST_PROD1_PK ( PRODUCT_ID );

CREATE SET TABLE MST_PROD_SX_MASTER
PRODUCT_NUMBER INTEGER NOT NULL,
PRODUCT_DETAILS VARCHAR(100) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_TARGET_ID INTEGER,
SYS_AUTH_ID VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_SOURCE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC DEFAULT ‘BackEnd’,
SYS_CREATED_BY VARCHAR(70) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_CREATION_DATE TIMESTAMP(0),
SYS_ENT_STATE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC CHECK (
SYS_ENT_STATE IN (‘ACTIVE’,’INACTIVE’,’NEW’,’DELETED’) ),
SYS_LAST_MODIFIED_BY VARCHAR(70) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_LAST_MODIFIED_DATE TIMESTAMP(0),
SYS_NC_TYPE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_ERR_CODE VARCHAR(10000) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_ERR_SVRTY VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC CHECK (
SYS_ERR_SVRTY IN (‘SEVERE_ERROR’,’ERROR’,’WARNING’) ))
UNIQUE PRIMARY INDEX MST_PROD2_PK ( PRODUCT_NUMBER );

CREATE SET TABLE MST_PROD_SY_MASTER
(
PRODUCT_IDENTITY INTEGER NOT NULL,
PRODUCT_NAME VARCHAR(100) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_TARGET_ID INTEGER,
SYS_AUTH_ID VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_SOURCE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC DEFAULT ‘BackEnd’,
SYS_CREATED_BY VARCHAR(70) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_CREATION_DATE TIMESTAMP(0),
SYS_ENT_STATE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC CHECK (
SYS_ENT_STATE IN (‘ACTIVE’,’INACTIVE’,’NEW’,’DELETED’) ),
SYS_LAST_MODIFIED_BY VARCHAR(70) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_LAST_MODIFIED_DATE TIMESTAMP(0),
SYS_NC_TYPE VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_ERR_CODE VARCHAR(10000) CHARACTER SET LATIN NOT CASESPECIFIC,
SYS_ERR_SVRTY VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC CHECK (
SYS_ERR_SVRTY IN (‘SEVERE_ERROR’,’ERROR’,’WARNING’) ))
UNIQUE PRIMARY INDEX MST_PROD3_PK ( PRODUCT_IDENTITY );

The SQL below creates an operational data store table for the consolidated transactions (this could also have been developed in a view):

CREATE TABLE PRODUCT_DAY_ORDER_ODS
(PRODUCT_ID INTEGER,
PRODUCT_DESCRIPTION VARCHAR(100),
CONS_ORD_DATE DATE,
ORD_QUANTITY INTEGER,
ORDER_AMOUNT DECIMAL (18,4)
)
PRIMARY INDEX ( PRODUCT_ID)

CT DATE_TAB
( DATE_COL DATE);

INS DATE_TAB
SEL CURRENT_DATE
;

In the next set of SQL, RANDOM is used to simulate transactions amounts directly in the source master tables. This data would normally be collected from actual transaction tables and then joined with the source master tables to produce data as shown below:

INS  PRODUCT_DAY_ORDER_ODS 
SELECT
PRODUCT_NUMBER AS PRODUCT_ID ,
PRODUCT_DETAILS AS PRODUCT_DESCRIPTION,
DT.DATE_COL AS CONS_ORD_DATE,
1.5* RANDOM ( 1,100) AS ORD_QUANTITY,
423* RANDOM ( 1,100) AS ORDER_AMOUNT
FROM
MST_PROD_SX_MASTER, DATE_TAB DT
UNION
SELECT
PRODUCT_IDENTITY ,
PRODUCT_NAME,
DT.DATE_COL AS CONS_ORD_DATE,
1.5* RANDOM ( 23,77) AS ORD_QUANTITY,
423* RANDOM ( 23,77) AS ORDER_AMOUNT
FROM
MST_PROD_SY_MASTER, DATE_TAB DT;

The subsequent view generates the aggregate order quantities as it relates to the single golden product master from the cross-referenced source product master tables. Note that when recreating this view in your environment, you will replace the ROM_ID’s with the values generated in your system:

REPLACE VIEW  AGG_CUST_ORD
AS
(
SELECT CONS_ORD_DATE,PRODUCT_ID, PRODUCT_DESCRIPTION,
SUM( ORD_QUANTITY) AS SUM_ORD_QTY,
SUM( ORDER_AMOUNT) AS SUM_ORD_AMNT FROM
(
SELECT ODS.CONS_ORD_DATE AS CONS_ORD_DATE,
COALESCE ( G. PRODUCT_ID, ODS.CUSTOMER_ID) AS PRODUCT_ID ,
COALESCE (G.PRODUCT_DESCRIPTION,
ODS.PRODUCT_DESCRIPTION) AS PRODUCT_DESCRIPTION,
RODV. PARENT_ROKD_VAL1 AS PARENT_ROKD_VAL1,
SUM( ODS.ORD_QUANTITY) AS ORD_QUANTITY, SUM( ODS.ORDER_AMOUNT) AS ORDER_AMOUNT
FROM CUSTOMER_DAY_ORDER_ODS ODS INNER JOIN MST_REL_OBJ_DATA_VAL RODV
ON ODS. PRODUCT_ID = RODV. CHILD_ROKD_VAL1
INNER JOIN MST_PRODUCT_MASTER G
ON G. CUSTOMER_ID = RODV. PARENT_ROKD_VAL1
WHERE RODV.ROM_ID IN ( 1601, 1602)
GROUP BY 1,2,3,4) A
GROUP BY 1, 2,3 );


image

Click to enlarge


Your Comment:
  
Your Rating:

Comments
 


9/16/2009 10:00:54 PM
— Anonymous