Tech2Tech
Data Governance Glossary
Important terms and definitions to know and understand.
by Larissa T. Moss
Here are some basic definitions of data governance and the principal responsibilities that business and IT people must assume in order to manage information as a business asset:
Data Governance
Data governance means having control over information: restraining it, having influence over it, or being the law for data.
Data Ownership
Data ownership means having authority over data assets. Data assets are owned by the organization, but certain senior executives who represent the organization in a business process capacity inherently have authority to define rules and policies for the data under their control.
Data Stewardship
Business analysts and subject matter experts, who are accountable for the quality of the information under their responsibility, are data stewards. They must also ensure that their information is managed from an enterprise perspective so that it can be used and shared by all business units.
Enterprise Information Management
Enterprise information management (EIM) combines business intelligence (BI) and enterprise content management (ECM) to optimize the use of information within organizations, for example in decision making and day-to-day operations. The term often refers to the treatment of information as a business asset to be valued and managed just like any other investment. A number of other terms also relate to EIM:
Single View of the Business
Every unique piece of data in the organization has only one identifier, one definition and one set of approved values. This cannot be achieved through a single database or system because, by necessity, data must be stored redundantly in multiple files and databases for performance reasons. Therefore, data analysis techniques, such as enterprise data modeling, normalization, data rationalization and business metadata, are applied to redundant and inconsistent source data to distill a single version of every unique business entity and attribute.
Normalization
Normalization ensures that each attribute remains unique within the data universe of the organization. The six normalization rules (1NF, 2NF, 3NF, BCNF, 4NF, 5NF) put one fact in one place. This means that every attribute must be unique (it must have only one semantic meaning), and it can be assigned (or placed) into only one entity as either an identifier (primary key) of that entity or as a descriptive attribute of that and no other entity.
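The "one fact in one place" idea can be checked mechanically against a model inventory. The following is a minimal sketch, not a normalization tool: it assumes a logical model is available as a plain mapping of entity names to attribute names (a hypothetical structure) and that foreign keys are modeled as relationships rather than listed as attributes. The function name and sample data are illustrative only.

```python
from collections import defaultdict


def attributes_in_multiple_entities(model):
    """Return attributes that are assigned to more than one entity.

    `model` maps an entity name to a list of attribute names
    (hypothetical structure, used here for illustration only).
    """
    owners = defaultdict(list)
    for entity, attributes in model.items():
        for attribute in attributes:
            owners[attribute].append(entity)
    # An attribute owned by two or more entities breaks "one fact in one place".
    return {attr: ents for attr, ents in owners.items() if len(ents) > 1}


# Illustrative model: "Customer Last Name" is placed in two entities.
sample_model = {
    "Customer": ["Customer Number", "Customer Last Name"],
    "Account": ["Account Number", "Customer Last Name"],
}
duplicates = attributes_in_multiple_entities(sample_model)
```

Here `duplicates` flags "Customer Last Name" as belonging to both Customer and Account, which signals that the attribute should be placed in only one of them.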
Enterprise Data Modeling
Construction of an enterprise data model (EDM) does not need to be fully completed. Instead, the EDM evolves over time and may never be finished because the objective of the modeling process is not to produce a final data model but to discover and resolve data discrepancies resulting from different views of the same data among different business units.
Data Standards
An EDM is not merely a pictorial representation (E/R diagram) of an organization’s data assets. Its ultimate value comes from applying stringent data administration principles during the logical data modeling process. For example, there are formal rules for writing data definitions, for creating data names and for defining valid data content (data domain).
Data Definitions
A data definition should be short (no more than a brief paragraph), precise and meaningful. It must thoroughly describe the data element and, optionally, it may contain an example. Michael Brackett’s book "The Data Warehouse Challenge" offers examples of a poor and a better data definition for the attribute "well depth feet." The definition "The depth of the well in feet" is poor because it is not clear how the depth is measured. A much better definition is "The total depth of the well in feet from the surface of the surrounding ground to the deepest point dug or drilled regardless of the depth of the well casing."
Data Names
Using "favorite" data names or blindly copying informal names from existing systems is not an acceptable standard. There are numerous data naming conventions, the most popular being the "prime-qualifier-class word." It prescribes that every attribute (data element) must have one prime word, have one or more qualifiers and end in one class word. Class words are predetermined and documented on a published list. (See table.)
Furthermore, every attribute must be fully qualified in order to avoid homonyms and limitations on naming future attributes, and it must be fully spelled out. An example of a standardized attribute name is "Checking Account Monthly Average Balance." The main component (prime word) is "Account," which is qualified by the word "Checking," to indicate the type of account. The class word indicating the type of data value contained in this attribute is "Balance," which is qualified by the words "Monthly" and "Average," to indicate the type of balance.
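A naming convention like this lends itself to an automated check. The sketch below is only an illustration under stated assumptions: the class word list and the prime word list are hypothetical stand-ins for an organization's published lists, and attribute names are assumed to be space-separated, fully spelled-out words.

```python
# Hypothetical class word list; a real one comes from the published standard.
CLASS_WORDS = {"Amount", "Balance", "Code", "Date", "Name", "Number", "Text"}


def check_attribute_name(name, prime_words):
    """Return violations of the prime-qualifier-class word convention."""
    words = name.split()
    if not words:
        return ["name is empty"]
    problems = []
    # The name must end in exactly one approved class word.
    if words[-1] not in CLASS_WORDS:
        problems.append("last word is not an approved class word")
    # Exactly one prime word must appear before the class word.
    primes = [w for w in words[:-1] if w in prime_words]
    if len(primes) != 1:
        problems.append("expected exactly one prime word")
    # Every remaining word counts as a qualifier; at least one is required.
    qualifiers = [w for w in words[:-1] if w not in prime_words]
    if not qualifiers:
        problems.append("expected at least one qualifier")
    return problems


# Hypothetical prime word list (business entities of the organization).
PRIME_WORDS = {"Account", "Customer"}
good = check_attribute_name("Checking Account Monthly Average Balance", PRIME_WORDS)
bad = check_attribute_name("Acct Bal", PRIME_WORDS)
```

With these assumed lists, the standardized name from the example passes with no violations, while the abbreviated "Acct Bal" fails both the class word and the prime word checks.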
Data Domains
All attributes must be atomic, which means they cannot be further decomposed. For example, the attribute "Customer Name" is not atomic because it can be decomposed into "Customer First Name," "Customer Initial" and "Customer Last Name." Every attribute must also have a predefined data domain, which refers to data values that are allowed in accordance with its name, definition, and business and quality rules.
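As a rough illustration of atomic attributes and data domains, the sketch below stores the decomposed name attributes separately and validates values against a predefined domain. The "Customer Status Code" attribute, its allowed values and the one-character limit on the initial are assumptions made for the example, not rules stated in this glossary.

```python
from dataclasses import dataclass

# Hypothetical data domain for an assumed attribute "Customer Status Code".
CUSTOMER_STATUS_CODE_DOMAIN = frozenset({"ACTIVE", "INACTIVE", "PROSPECT"})


@dataclass(frozen=True)
class Customer:
    # "Customer Name" decomposed into its atomic attributes.
    customer_first_name: str
    customer_initial: str
    customer_last_name: str
    customer_status_code: str


def domain_violations(customer):
    """Return descriptions of values that fall outside their data domain."""
    violations = []
    # Assumed domain rule: an initial is at most one character.
    if len(customer.customer_initial) > 1:
        violations.append("Customer Initial must be at most one character")
    if customer.customer_status_code not in CUSTOMER_STATUS_CODE_DOMAIN:
        violations.append("Customer Status Code is not an approved value")
    return violations
```

Calling `domain_violations(Customer("Ada", "M", "Lovelace", "ACTIVE"))` returns an empty list, because every value lies inside its assumed domain.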
Data Quality Rules
One of the most important benefits of enterprise data modeling is the purposeful application of data quality rules, which apply to entities, relationships and attributes.
Entities
The identity rules apply to the primary keys, which are called unique identifiers in logical data modeling terminology. The reference rules apply to the foreign keys, which are the physical implementations of data relationships on a logical data model. The inheritance rules apply to supertype/subtype structures on a logical data model. The cardinality rules apply to the cardinality as well as to the optionality notations on the logical data model.
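Two of these rules, identity and reference, are easy to sketch as data checks. The example below is a simplified illustration, assuming rows are held as Python dictionaries and that a missing (None) foreign key represents an optional relationship that was not instantiated. The entity and column names are hypothetical.

```python
def identity_violations(rows, key):
    """Return key values that are missing (None) or appear more than once."""
    seen = set()
    bad = []
    for row in rows:
        value = row.get(key)
        if value is None or value in seen:
            bad.append(value)
        seen.add(value)
    return bad


def reference_violations(child_rows, fk, parent_rows, pk):
    """Return foreign key values that match no parent primary key.

    A foreign key of None is treated as an optional relationship
    that was not instantiated (an assumption of this sketch).
    """
    parent_keys = {row[pk] for row in parent_rows}
    return [
        row[fk]
        for row in child_rows
        if row[fk] is not None and row[fk] not in parent_keys
    ]


# Hypothetical sample data.
customers = [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": 2}]
accounts = [
    {"account_id": 10, "customer_id": 1},
    {"account_id": 11, "customer_id": 3},
]
duplicate_ids = identity_violations(customers, "customer_id")
orphan_ids = reference_violations(accounts, "customer_id", customers, "customer_id")
```

In this sample, customer 2 violates the identity rule because it appears twice, and account 11 violates the reference rule because it points to a customer that does not exist.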
Relationships
The relationship dependency rules apply only to optional data relationships. Three dependency rules dictate whether an optional data relationship must be instantiated. The relationship state dependency rule applies to relationships between two entities where the state (status) of one entity determines whether a data relationship to another entity should be instantiated. The relationship mutual dependency rule mandates that if one relationship exists between two entities, then another relationship must also exist. The relationship mutual exclusivity rule mandates that if one data relationship exists between two entities then another relationship cannot exist.
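The three relationship dependency rules can be sketched as simple checks. The business scenarios here are invented for illustration: a shipped order must be related to a shipment, an employee who manages a project must also be assigned to it, and an employee who audits a project cannot be assigned to it. Relationships between two entities are represented as sets of (employee_id, project_id) pairs.

```python
def state_dependency_violations(orders):
    """Orders whose status and shipment relationship disagree.

    Assumed rule: the relationship to a shipment is instantiated
    if and only if the order status is "SHIPPED".
    """
    return [
        order["order_id"]
        for order in orders
        if (order["status"] == "SHIPPED") != (order["shipment_id"] is not None)
    ]


def mutual_dependency_violations(manages, assigned_to):
    """Pairs where "manages" exists but the required "assigned to" does not."""
    return manages - assigned_to


def mutual_exclusivity_violations(audits, assigned_to):
    """Pairs where "audits" and "assigned to" both exist, which is not allowed."""
    return audits & assigned_to


# Hypothetical sample data.
orders = [
    {"order_id": 1, "status": "SHIPPED", "shipment_id": 500},
    {"order_id": 2, "status": "SHIPPED", "shipment_id": None},
    {"order_id": 3, "status": "OPEN", "shipment_id": None},
]
manages = {(7, "P1"), (8, "P2")}
audits = {(9, "P1"), (7, "P1")}
assigned_to = {(7, "P1")}
```

With this data, order 2 breaks the state dependency rule, the pair (8, "P2") breaks the mutual dependency rule, and the pair (7, "P1") breaks the mutual exclusivity rule.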
Attributes
The attribute domain rules apply to the content (domain) of the attributes. The attribute dependency rules apply to domains of dependent attributes. Four dependency rules dictate what the content of an attribute should be. The attribute state dependency rule applies to two or more dependent attributes where the state (status) of one determines the values of the others. The attribute mutual dependency rules come in two types: derived and constrained. The attribute mutual dependency derived rule applies to two or more dependent attributes where the value of one is determined by a calculation that uses the domains of the others. The attribute mutual dependency constrained rule applies to two or more dependent attributes where the value of one is determined by a business rule and/or by the value of other attributes. The attribute mutual exclusivity rule mandates that if a valid value exists for one attribute then another attribute cannot contain any value (must be NULL).
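The four attribute dependency rules can be illustrated on a single record. This is a sketch under invented business rules: a cancel date exists only for cancelled orders (state dependency), the total equals quantity times unit price (derived), the ship date cannot precede the order date (constrained), and a record carries either an individual or a business tax ID, never both (mutual exclusivity). All attribute names are hypothetical, and amounts are held in whole cents to avoid rounding issues.

```python
from datetime import date


def attribute_violations(order):
    """Return the names of the attribute dependency rules a record breaks."""
    violations = []
    # State dependency: the status determines whether a cancel date exists.
    cancelled = order["order_status_code"] == "CANCELLED"
    if cancelled != (order["order_cancel_date"] is not None):
        violations.append("state dependency")
    # Mutual dependency (derived): the total is calculated from other attributes.
    expected_total = order["order_quantity"] * order["unit_price_cents"]
    if order["order_total_cents"] != expected_total:
        violations.append("mutual dependency (derived)")
    # Mutual dependency (constrained): the ship date is limited by the order date.
    ship_date = order["ship_date"]
    if ship_date is not None and ship_date < order["order_date"]:
        violations.append("mutual dependency (constrained)")
    # Mutual exclusivity: if one tax ID is present, the other must be NULL.
    if order["individual_tax_id"] is not None and order["business_tax_id"] is not None:
        violations.append("mutual exclusivity")
    return violations


# Hypothetical record that satisfies all four rules.
valid_order = {
    "order_status_code": "OPEN",
    "order_cancel_date": None,
    "order_quantity": 3,
    "unit_price_cents": 250,
    "order_total_cents": 750,
    "order_date": date(2024, 1, 10),
    "ship_date": date(2024, 1, 12),
    "individual_tax_id": "X-1",
    "business_tax_id": None,
}
```

Calling `attribute_violations(valid_order)` returns an empty list. Changing any one of the dependent values, such as setting a cancel date on an open order, makes the corresponding rule name appear in the result.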
This glossary was developed by Larissa T. Moss, president of Method Focus Inc., who has 30 years of IT experience with a focus on data warehousing and data management.