Tech2Tech
Ask The Experts
The 'lazy' DBA
Automating processes, documenting procedures and implementing standards are smart ideas for time-pressed administrators.
by Steve Wilmes and Steve Savoye
The best DBA is a “lazy” DBA. However, this does not imply laziness in a literal sense. On the contrary, it means being extremely efficient by using technology to its fullest advantage. The more knowledge administrators gain and use, the more valuable they become.
In the end, that isn’t being lazy at all. It’s just plain smart.
What does it take to be “lazy” and successful at it? Let’s explore several strategies to assist in rapid adoption of efficient approaches to solving many data warehouse deployments today.
Learn
First, working smarter means expanding one’s knowledge. This can be obtained from several avenues: Take classes, attend the PARTNERS Conference along with regional Teradata User Group meetings, and absolutely—get certified. The certification process promotes systematic learning on a broad spectrum of topics. Becoming a Teradata Certified Master provides access to an elite group to share information, knowledge and experiences.
Automate
Automate everything possible by developing proficiency with SQL, macros, stored procedures and dynamic SQL. Most repetitive tasks can be automated to some degree, if not completely. If you haven’t already, it is time to stop manually running reports.
There are many tools such as Teradata Query Scheduler, Teradata Manager, business intelligence (BI) tools and UNIX cron jobs that can help set up and execute these processes. Considerable time will be invested initially to create automated processes, but this will quickly free up more time later on to enable efficient and immediate response when requests come in.
Communicate and document
Documenting and publicizing a service catalog for DBA activities will benefit both the DBA team as well as client organizations. These documents provide a checklist to consider during project planning and help to set clear expectations.
Test
A proper performance benchmark should be one of the first automation exercises. At a minimum, this process should be run before and after every major and patch upgrade so that there is a baseline to compare relative performance. This process should also be implemented on a scheduled basis in order to measure performance impacts on data volumes, new features, functions, hardware, etc.
Extensive regression testing needs to be automated as well, because it will be required for every upgrade, patch application, third-party software upgrade, etc. Minor patch releases have been known to enforce features that were never enforced before in addition to fixing existing defects. This implies the necessity for diligent and thorough testing every time the system is upgraded. Lastly, every event that causes problems or system restarts should be captured and added to the patch test plan or process.
Establish conventions and standards
It may not be obvious, but simple things like establishing standards and conventions early on will save tremendous amounts of time and headaches later. For example, implementing naming conventions for databases, tables and columns will foster the automation of many processes and procedures.
Mind you, this is not the only result by far. Everyone benefits by naming standards and conventions. Imagine the nightmare when some tables and columns use the word PARENT and others use PRNT or PAR. Just writing SQL to accommodate such inconsistencies is difficult enough. Standards should not only be set at the application level, but at the enterprise or corporate level.
It’s also important to establish a governance committee at the enterprise level to institute and enforce modeling and architecture standards. This committee would be responsible for holding each corporate project accountable to the established standards. It should be composed of architects, developers and DBAs, as well as logical and physical data modelers. This will greatly improve the ability of different systems to talk with and play nice with one another. A common vocabulary is priceless when it comes to porting data from source systems to the enterprise data warehouse.
Rules can be established and simple automation used to enforce and report on them when:
- No database or object name can be greater than 25 characters (fewer is better).
- Project-related and production tables require names for indexes and establish a naming convention for them.
The governance committee needs to develop and sign off on such rules in advance. Once this is done, the DBA can start automating processes. The following macro validates data definition language (DDL) against the rules mentioned previously.
REPLACE MACRO VALIDATE_STAN-
DARDS (DB_IN CHAR(30))
/* This reports on General Naming Stan-
dards for a given Database */
AS (
/* Checks Database Names are 25 charac-
ters or less. */
SELECT
d.DatabaseName (Title ‘Conflict//DB//
Name’) (FORMAT ‘X(24)’),
CHAR(TRIM(d.DatabaseName)) (Title
‘Length of DBNM > 25’)
FROM DBC.DataBases d
WHERE (CHAR(TRIM(d.DatabaseName))
> 25)
AND d.DatabaseName = :DB_IN
ORDER BY 1;
/* Checks Object Names are 25 characters
or less. */
SELECT
TableName (Title ‘Conflict//Object//
Name’) (FORMAT ‘X(24)’),
DatabaseName (Title ‘From Dbase/
User’) (FORMAT ‘X(24)’) ,
CASE TableKind
WHEN ‘T’ THEN ‘TABLE’
WHEN ‘V’ THEN ‘VIEW’
WHEN ‘M’ THEN ‘MACRO’
WHEN ‘P’ THEN ‘PROCEDURE’
WHEN ‘X’ THEN ‘TRIGGER’
WHEN ‘J’ THEN ‘INDEX’
ELSE TableKind
END (Title ‘Object Type’)
(FORMAT
‘X(24)’),
CHAR(TRIM(TableName)) (Title ‘Length
of TableName > 25’)
FROM DBC.Tables
WHERE (CHAR(TRIM(TableName)) > 25)
AND DatabaseName = :DB_IN
AND TableKind NOT IN (‘J’)
ORDER BY 1;
/* Lists all Tables, Indices etc. with Null
Names. */
SELECT DatabaseName, TableName,
IndexName,
IndexType AS "Type", UniqueFlag
AS "Unique"
FROM DBC.Indices
WHERE DatabaseName = :DB_IN
AND IndexName IS NULL
ORDER BY 1,2,3; );
The execution macro demonstrates how to automate the rules that are put in place. The SQL from this example can be written in many ways; this just illustrates how to go about automating the rules that are put in place. Again, a little time invested up front to develop validation scripts will have tremendous payoffs down the road.
EXEC VALIDATE_STANDARDS(‘CSQL_
CLASS’);
DBAs will need to create their own, more comprehensive, knowledge book of data modeling, which should cover enterprise-wide conventions and guidelines, as well as conceptual, logical and physical designs; database standards for different database platforms; a dictionary of terms, including abbreviations and reserved words; and so on. The governance committee should be tasked with making sure the rest of the enterprise follows these standards and guidelines.
Just remember that, while guidelines and rules are important, it’s not always a good idea to apply them globally. A friend once said, “Any rule applied globally and absolutely is absolutely applied wrong!”
Implement backup and recovery
Every database platform and application needs a well-documented and tested plan or strategy for how databases will be backed up and restored in the case of failure. This process should not only meet your business requirements, but also be efficient on both the backup and recovery ends. Make sure consensus and sign-off is obtained from management so if something goes awry, all parties will be involved in resolving the problem in a timely manner.
Decisions will need to be discussed, agreed upon, and documented for such topics as retention, purge criteria, off-site storage with full, cluster, partial or partition backups, backup to disk or tape, and backup windows. The actual plan may be a hybrid of some or all of the above.
Real-time processing and intraday loads pose more challenges than, say, monthly batch-loaded systems. A few of the many questions that need to be addressed are:
- Are tables backed up weekly, daily, or more often?
- If the tables are large enough, are the source files backed up instead?
- If so, how long are files retained?
- Are we receiving incremental changes or full data refreshes?
While many organizations deal with these issues themselves, some rely on third-party vendors that can address off-site records storage management, retention and protection requirements.
Work smart
The “lazy” DBA is really about being anything but lazy. It is about being proactive, efficient, working smarter and gaining knowledge. Through proper automation and utilization of tools, DBAs can make their lives much more pleasant. However, there are some up-front costs, because time needs to be invested in education and automation, but the rewards are soon reaped. Eventually, there will be even more time to be proactive for the business.
Steve Wilmes, founder and CEO of Cerulium Corp., has more than 20 years of experience in the computer industry and is a Teradata Certified Master.
Steve Savoye is a senior DBA for Blue Cross and Blue Shield of North Carolina, an independent licensee of the Blue Cross and Blue Shield Association. A Teradata Certified Master, he has nearly 20 years of experience in IT.
This article is taken from Wilmes and Savoye’s book, “Teradata Toolbox: Providing Dynamic Solutions for Today’s Challenges,” published by Cerulium Corp.