Tech2Tech
Applied Solutions 2
Take a shortcut
When a quick and simple data transfer is needed, Teradata command lines can help.
by Alan Greenspan
Data warehouses and broader analytical ecosystems rely on managed and automated data integration processes for smooth operation. Extract, transform and load (ETL) and extract, load and transform (ELT) jobs are designed, developed and tested before being put into production. Additionally, production data movement jobs are integrated with the enterprise data warehouse (EDW) load processes so that any data marts are kept up to date as required by their applications.
As critical as ongoing production processes are, many one-time events occur that require loading data into the EDW or moving data between systems. These might be standard operations not typically repeated, such as moving a unique subset of data to a test system for a one-time performance run, or they might be completely unexpected one-time jobs that are required for unusual situations. Sometimes, these “one-off” jobs seem as common as routine ones.
Load and data-movement tools
The load and data-movement products from Teradata include high-powered and flexible control interfaces for running any type of job and for handling all sorts of error and exception conditions:
- Teradata Parallel Transporter is the primary product for loading data from other sources into the Teradata Database. It includes a scripting language and an application programming interface (API), which defines the load job, source data and target tables, as well as identifies which load operator to use. These interfaces can take full advantage of the product’s performance and error-handling capabilities.
- Teradata Data Mover transfers data and objects between Teradata systems within an analytical ecosystem. A metadata table-driven interface is included for event-based data movement, and a flexible XML file defines jobs initiated from the command line.
These are powerful and flexible tools, but what about those one-off jobs that don’t require such potency? Rather than working with the full functionality and options that these tools provide, DBAs or ETL programmers can employ the easy-to-use command line interfaces that are also a part of these products.
Teradata Parallel Transporter Easy Loader
The Teradata Parallel Transporter Easy Loader command line interface allows rapid job definition to load data from a flat file into the Teradata Database. It has several UNIX-like options, which consist of the single command “tdload” followed by a series of options preceded by a hyphen (or two hyphens for the long option name versions).
For example, the –f <filename> option (or --SourceFilename) is used to specify the name of the file containing the source data to be loaded. There are several options to define basic job parameters, such as the user name, target table, job variables and delimiter other than comma. If preferred, command options can be specified in a job variables file rather than on the command line.
Some Teradata Parallel Transporter job tuning functions are also available by specifying these command line options in the load job
- --LoadInstances <number>
- --MaxSessions <number>
- --MinSessions <number>
- --ErrorLimit <number>
- --TargetWorkingDatabase <database_name>
- --TargetAccountID <accountid>
Because the schema of the load job is derived from the definition of the target table, the layout of the data records in the flat file must match that of the Teradata Database table.
Teradata Data Mover
The Teradata Data Mover command line interface can be used by itself for fairly simple one-time or exception jobs or to further define a job. These commands serve two purposes:
- The “move” command enables an experienced DBA to quickly execute a one-time job to transfer a subset of data from the production system to the test system.
- The “create” command, accompanied by an XML file, will store and run a complex job or define a repeatable production job that is run periodically at a scheduled time.
A complete “move” command consists of its keyword followed by a series of parameter name and value pairs. For example, source_tdpid would be followed by the system name for the source system. Parameters exist that give the user complete flexibility to specify the system, user name and account string to apply on each system, as well as the table or other objects to be copied. The “query” command can be employed before the “move” command to get a list of databases and tables on the source system.
This sample command will move a table from one system (source_tdpid) to another (target_tdpid):
Datamove move –job_name job1
–source_tdpid Checks –source_user
TD_API_user –source_account_string
77112345 –target_tdpid Leo –target_user
NN1945a4 –target_account_string
Tstr1233 –table db1.table2
Teradata Data Mover can also transfer other database objects such as tables with less common characteristics (e.g., partitioned primary index, large objects and user-defined type columns), global temp tables, statistics, join and hash indexes, and triggers. Some objects, such as triggers, can be moved only along with the associated table.
Another way to exercise Teradata Data Mover is to start with an existing saved job and modify it. This allows the full use of features of an XML parameter file and other parameters. Because XML is simple to read and understand, a previously created job can be easily modified with any editor and saved as a new job. A short and simple command line can execute the job by referencing the XML file. Over time, job templates can be accumulated by an experienced DBA so that most short-term data- and object-movement tasks can be quickly handled with little effort or risk of error. (See table for list of commands.)
Quick transfer
Sometimes a graphical user interface is required to complete a complex task. Other times a full-featured language or API with production job attributes is necessary. For simple and one-off jobs that need to move data from one system to another, the command lines available through Teradata Parallel Transporter Easy Loader and Teradata Data Mover offer a quick option.
Alan Greenspan joined Teradata in 1988 and is product marketing manager for the Teradata Database and other core software products.