Wednesday, September 1, 2010

Introduction to teradata utilities : Multiload

"Teradata MultiLoad"


DEFINITION
Teradata MultiLoad, also called "MultiLoad," "MLoad" or "ML," is a command-driven parallel load utility for high-volume batch maintenance on multiple tables and views of the Teradata Database. It is specially designed for high-speed batch creation and maintenance of large databases.

OVERVIEW
Teradata MultiLoad executes a series of MultiLoad commands and Teradata SQL statements written in a batch mode job script or interactively entered. The MultiLoad commands provide the session control and data handling specifications for the data transfer operations, and the Teradata SQL statements perform the actual maintenance functions on the Teradata RDBMS tables and views.

A single MultiLoad job performs a number of different import and delete tasks on database tables and views:

• Each MultiLoad import task can do multiple data insert, update, and delete functions on up to five different tables or views;
• Each MultiLoad import task can have up to 100 DML steps;
• Each MultiLoad delete task can remove large numbers of rows from a single table.

Teradata MultiLoad is characterized by its serialization feature. In certain scenarios, multiple changes are possible to occur to a single row in the same MultiLoad job. By using its serialization feature, Teradata MultiLoad automatically guarantees that the ordering of such operations is maintained correctly.

If a MultiLoad job has only one import task, then Teradata FastLoad has to be chosen for it, since FastLoad will finish it much faster. However, the FastLoad utility works only on empty tables.

A FastLoad job can not be used to:
  • • Insert additional data rows into an existing table;
  • • Update individual rows of existing tables;
  • • Delete individual rows from existing tables;
  • • Load data into multiple tables.


DATA SOURCE

Teradata MultiLoad imports data from:
• Disk or tape files
• Input files on a network-attached workstation;
• Special input module (INMOD) routines that select, validate, and preprocess input data;
• Access modules;
• Any device providing properly formatted source data;
The table or view in the database receiving the data can be any existing table or view for which the user has access privileges for the required maintenance tasks.
MultiLoad command IMPORT has an optional parameter FORMAT, which is used to specify the format of the external data source.
The format may be:
• FASTLOAD;
• BINARY;
• TEXT;
• UNFORMAT;
• VARTEXT.


OPERATING MODES 

Teradata MultiLoad runs in the following operating modes:
• Interactive mode;
• Batch mode.

In interactive mode, Teradata MultiLoad uses terminal screen and keyboard as the standard output and input streams. In batch mode, MultiLoad uses > and < to redirect the standard output / input streams. To invoke Teradata MultiLoad in interactive mode, enter the following command (for Microsoft Windows): c:\ncr\bin\MultiLoad

To invoke MultiLoad in batch mode, use the command below (for Microsoft Windows): c:\ncr\bin\MultiLoad [options] < infile > outfile
Here, the infile is a Teradata MultiLoad job script file which includes all the required MultiLoad commands and Teradata SQL statements, whereas the outfile is the outfile is the output stream file.

Character Sets
Teradata MultiLoad supports Latin, Chinese, Japanese and Korean character sets, along with ASCII, EBCDIC, UTF-8 and UTF-16. It also supports user-defined session character sets.

Task Status Reporting
Teradata MultiLoad has three reporting methods to monitor the status of jobs that are still in progress and those that have just completed:
• MultiLoad utility messages provide job status reports, including:
o Options messages that list the settings of MultiLoad task parameters;
o Logoff/disconnect messages that list key run statistics.
• QrySessn (i.e., Query Session Utility) provides real-time, phase-oriented progress reports at selected intervals during the MultiLoad job;
• Statistics messages present information at the end of the MultiLoad job.

Data Conversion Capabilities
Teradata MultiLoad can redefine the data type specification of input data to match it with that of the destination column in the MultiLoad table on the target database. MultiLoad data conversion types are:
• Numeric-to-numeric (for example integer-to-decimal);
• Character-to-numeric;
• Character-to-date;
• Date-to-character.
Use the DATADESC specification of the MultiLoad FIELD command to convert input data to a different data type before inserting it into the MultiLoad table on the Teradata Database.
Note: Redundant conversions, like integer-to-integer, are legal and necessary to support the zoned decimal format.

MULTILOAD COMMANDS


MultiLoad Command Summary
The MultiLoad commands perform two types of activities:
• Support activities — Support commands perform support activities to establish the MultiLoad sessions with the Teradata Database and define the operational support environment for Teradata MultiLoad. Established support environment options remain in effect until another support command changes them. Support commands do not specify a MultiLoad task;
• Task activities — Task commands perform task activities to specify the actual processing of each MultiLoad task.

1. Teradata MultiLoad Commands for Support Activities:
COMMAND NAME
FUNCTION
ACCEPT
Allows
the value of one or more utility variables to be accepted from either a file
or an environment variable
DATEFORM
Specifies
the form of the DATE data type specifications for the MultiLoad job
DISPLAY
Writes messages to the specified destination

ELSE (see
IF, ELSE, and ENDIF)

Introduces
commands and statements that execute when a preceding IF condition is false

ENDIF
(see IF, ELSE, and ENDIF)

Delimits the group of MultiLoad commands that were subject to previous IF or ELSE conditions

IF (see
IF, ELSE, and ENDIF)

Introduces a conditional expression whose value initiates execution of subsequent commands

LOGOFF

Disconnects all active sessions and terminates MultiLoad on the client system.

LOGON

Specifies the LOGON command string to be used in connecting all sessions established by MultiLoad.
LOGTABLE
Identifies the table to be used to journal checkpoint information required for safe,
automatic restart of MultiLoad when the client or Teradata Database system fails.

ROUTE
MESSAGES

Identifies the destination of output produced by MultiLoad support environment.

RUN FILE

Invokes the specified external file as the current source of utility commands and Teradata SQL statements

SET

Assigns a data type and a value to a utility variable

SYSTEM

Suspends operation of MultiLoad and executes any valid local operating system command

2. Teradata MultiLoad Commands for Task Activities:


COMMAND NAME

FUNCTION

BEGIN
MLOAD and BEGIN DELETE MLOAD

Specifies:
  • The kind of MultiLoad task to
    be executed;
  • The target tables in the
    Teradata Database;
  • The parameters for executing
    the task.

DML LABEL

Defines a label and error treatment options for a following group of DML statements.

END MLOAD
 Indicates completion of MultiLoad command entries and initiates execution of the task.

FIELD

Used with the LAYOUT command to define a field of the data source record that is sent to the Teradata Database.

FILLER

Used with the LAYOUT command to define a field in the data source record that is not sent to the Teradata Database.

IMPORT

Identifies the data source, the layout used to describe the data record, and optional conditions for performing DML operations.

LAYOUT

Introduces the record format of the data source to be used in the MultiLoad task. This command is followed by a sequence or combination of FIELD and TABLE commands.

LOGDATA
Supplies parameters to the LOGMECH command beyond those needed by the logon mechanism, such as userid and password, to successfully authenticate the user.

LOGMECH

Identifies the appropriate logon mechanism by name.
 PAUSE
ACQUISITION

Temporarily suspends the acquisition phase of a MultiLoad job.

TABLE
Used with the LAYOUT command to identify a table whose column names and data descriptions are used as the field names and data descriptions of the data source records.


Teradata SQL Statements in MultiLoad
MultiLoad supports a subset of Teradata SQL statements in order to define and manipulate the data stored in the Teradata Database. Consequently, other utilities do not have to be invoked to perform routine database maintenance functions before executing MultiLoad utility tasks. The Teradata SQL statements supported by Teradata MultiLoad are listed below:

COMMAND NAME

FUNCTION

ALTER TABLE

Changes the column configuration or options of an existing table

CHECKPOINT

Adds a checkpoint entry to a journal table

COLLECT STATISTICS
 Collects statistical data for one or more columns of a table

COMMENT

Stores or retrieves comment string associated with a database object

CREATE DATABASE

CREATE MACRO

CREATE TABLE

CREATE VIEW

Creates a new database, macro, table, or view

DATABASE
Specifies a new default database for the current session

DELETE

Removes rows from a table

DELETE DATABASE
Removes all tables, views, and macros from a database
DROP DATABASE

Drops the definition for an empty database from the Data Dictionary

DROP TABLE

Removes a table from the database
GIVE
Transfers ownership of a database to another user
GRANT
Grants access privileges to a database object
INSERT
Insert  new rows to a table

MODIFY DATABASE

Changes  the options of an existing database

RELEASE MLOAD

Removes the access locks from the target tables in the Teradata Database

This inhibits any attempts to restart the MultiLoad job when a MultiLoad task has been suspended or aborted.

Note: A RELEASE MLOAD statement must be entered from BTEQ. It cannot be entered from MultiLoad.

RENAME

Changes the name of an existing table, view, or macro

REPLACE MACRO
REPLACE VIEW

Redefines an existing macro or view

REVOKE
Rescinds access privileges to a database object

SET QUERY_BAND

Identifies type and source of query to determine prioritization of queries

SET SESSION COLLATION

Overrides the collation specification for the current session

SET SESSION OVERRIDE REPLICATION ON/OFF
 Turns replication services on or off

UPDATE

Changes the column values of an existing row in a table

No comments:

Post a Comment