Saturday, September 25, 2010

How to pass parameter to BTEQ script (Windows) ?


I was wondering whether we could pass parameters to a BTEQ script. After doing some research, I found the closest way to pass a parameter to a BTEQ script: it involves creating a batch file (".bat") and passing the parameter to it.


Let us try this for the following select query:
select * from user.tab_test1  where test1='2010';

STEP1:
The batch file is created as follows (script2.bat):
del script1.txt
echo .logon localhost/test,test >>script1.txt
echo select * from user.tab_%1  where %1 ='2010'; >>script1.txt
echo .logoff >> script1.txt
echo .quit >> script1.txt
bteq < script1.txt


STEP2:
Run the batch file as follows:
C:\>script2.bat test1





STEP3:
We can see the following execution steps:
C:\>del script1.txt
C:\>echo .logon localhost/test,test 1>>script1.txt
C:\>echo select * from user.tab_test1 where test1='2010' ; 1>>script1.txt
C:\>echo .logoff 1>>script1.txt
C:\>echo .quit 1>>script1.txt
C:\>bteq 0<script1.txt

BTEQ 08.02.00.00 Thu Jul 18 09:33:12 2010

+---------+---------+---------+---------+---------+---------+---------+----
.logon localhost/test,

*** Logon successfully completed.
*** Transaction Semantics are BTET.
*** Character Set Name is 'ASCII'.

*** Total elapsed time was 5 seconds.

+---------+---------+---------+---------+---------+---------+---------+----
select * from user.tab_test1  where test1='2010';

*** Query completed. One row found. One column returned.
*** Total elapsed time was 1 second.

test1
-----------
2010

+---------+---------+---------+---------+---------+---------+---------+----
.logoff
*** You are now logged off from the DBC.
+---------+---------+---------+---------+---------+---------+---------+----
.quit
*** Exiting BTEQ...
*** RC (return code) = 0

Purpose of CASESPECIFIC in teradata

The CASESPECIFIC function in Teradata is used to match the exact case of a value, say for example product code 'Rac123Sa'.

SELECT  product_code
FROM product_tab
WHERE product_code (CASESPECIFIC) LIKE '%Ra%';

Note: Usually comparison in Teradata is not case-specific. To enforce a case-specific comparison, we use the CASESPECIFIC function.
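
As a related hedged sketch (assuming the same hypothetical product_tab), case sensitivity can also be declared as a column attribute at table creation time, so that comparisons on that column are case-specific by default:

CREATE TABLE product_tab
( product_code VARCHAR(20) CASESPECIFIC );

With this attribute, WHERE product_code LIKE '%Ra%' behaves case-specifically without the per-query (CASESPECIFIC) modifier.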

How to split a source column into multiple target columns (full name to first and last)


Problem: To split fullname into firstname and lastname to be inserted into Target table.


Approach:

CREATE SET TABLE test
( fullname VARCHAR(30)
);


INSERT INTO test ('nitin raj');
INSERT INTO test ('nitin agarwal');
INSERT INTO test ('abhishek gupta');


sel * FROM    test;
fullname
nitin agarwal
nitin raj
abhishek gupta


Use INDEX to find the position of the space character in the full name, and then use that position to get:
--> firstname = fullname from position 1 till (SPACE-1)
--> lastname = fullname from (SPACE+1)

SELECT INDEX(fullname,' ') AS "a", SUBSTR(fullname,1, a-1), SUBSTR(fullname, a+1) FROM test;

a  Substr(fullname,1,(a-1))  Substr(fullname,(a+1))
6  nitin                     agarwal
6  nitin                     raj
9  abhishek                  gupta



Sunday, September 19, 2010

Difference between read lock and access lock?

For an access lock to be placed, we have to specify it explicitly as LOCKING FOR ACCESS.
With an access lock the table is not locked: insert/update/delete operations can proceed on the table while the access lock reads it. Dirty reads may happen, which means your answer set may not reflect the latest changes made to the table.

A READ lock will lock the table so that inserts/updates and structural changes cannot be made while it is held.
It is placed by a simple SELECT statement, or by explicitly specifying LOCKING FOR READ.

Syntax:
read lock:
LOCKING TABLE tablename FOR READ

access lock:
LOCKING TABLE tablename FOR ACCESS

The main difference between a read lock and an access lock is data integrity. On placing a read lock the user can expect data integrity, whereas with an access lock the user cannot expect data integrity.

Consider following scenarios
1. User A places a READ lock and User B places a WRITE lock.
User B will have to wait for User A to complete its read in order to start inserts/updates/deletes on the data.

2. User A places an ACCESS lock and User B places a WRITE lock.
User A & B access the data simultaneously, hence User A cannot expect to get consistent results.
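
For illustration, a minimal sketch of placing the lock explicitly ahead of a query (employee is a hypothetical table):

LOCKING TABLE employee FOR ACCESS
SELECT * FROM employee;

Here the SELECT reads through any concurrent WRITE locks, at the cost of possible dirty reads; replacing ACCESS with READ would instead make the query wait for writers to finish.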

Wednesday, September 15, 2010

How to create and use Macro ?

A macro is a Teradata extension to ANSI SQL that contains pre-written SQL statements. Macros are used to run a repeatable set of tasks. The details of a macro can be found in the data dictionary (DD). Macros are database objects and thus they belong to a specified user or database. A macro can be executed by Queryman, BTEQ, or by another macro.


How to create a Macro
Create a macro to generate a DOB list for department 321:

CREATE MACRO DOB_Details AS
(SELECT  first_name ,last_name  ,DOB
FROM TERADATA.employees
WHERE dept_numbr =321
ORDER BY DOB asc;);

EXECUTE a Macro
To execute a macro, call it along with the EXEC command.
EXEC DOB_Details;

first_name    last_name    DOB
Ram           Kumar        75/02/22
Laxman        Sinha        79/04/06

DROP a Macro
To drop a macro, use the following command:
DROP MACRO DOB_Details;

REPLACE a Macro
If we need to modify an existing macro, instead of dropping and re-creating it
we can use the REPLACE MACRO command as follows:

REPLACE MACRO DOB_Details AS      
(SELECT first_name,last_name ,DOB
FROM TERADATA.employees
WHERE dept_numbr = 321
ORDER BY DOB, first_name;);

Parameterized Macros

Parameterized macros allow the usage of variables. The advantage of using parameterized macros is that values can be passed to these variables at run-time.

Example
CREATE MACRO dept_list (dept INTEGER) AS
(
SELECT last_name
FROM TERADATA.employees
WHERE dept_numbr = :dept; );

To Execute the macro
EXEC dept_list (321);

Macros may have more than one parameter. Each name and its associated type are separated by a comma from the next name and its associated type. The order is important. The first value in the EXEC of the macro will be associated with the first value in the parameter list. The second value in the EXEC is associated with the second value in the parameter list, and so on.

Example
CREATE MACRO emp_verify (dept INTEGER, sal DEC(18,0))
AS (
SELECT emp_numbr
FROM TERADATA.employees
WHERE dept_numbr = :dept
AND salary < :sal; );

To execute this macro:
EXEC emp_verify (301, 50000);

Key points to note about Macros:
  • Macros are a Teradata extension to SQL.
  • Macros can only be executed with the EXEC privilege.
  • Macros can provide column level security.
NOTE: A user must have the EXEC privilege to execute a macro. It does not matter whether
the user has privileges on the underlying tables or views that the macro uses.
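
For completeness, a hedged example of granting that privilege (report_user is a hypothetical user):

GRANT EXECUTE ON DOB_Details TO report_user;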

How can you view Data Distribution in Teradata ?



Teradata uses hash values to store data on the AMPs. To view data distribution we use hash functions.
Hash functions are usually applied to the primary index columns to find the data distribution, and skewness can be identified using this concept.
The following query can be used to find the AMP-level distribution of the PI columns:

SELECT HASHAMP(HASHBUCKET(HASHROW(<PRIMARY INDEX>))) AS
"AMP#",COUNT(*)
FROM <TABLENAME>
GROUP BY 1
ORDER BY 2 DESC;

By looking at the result of this query you can easily find out the data distribution across all AMPs in your system, and further you can easily identify uneven data distribution.


Definitions:
HASHROW - returns the row hash value for a given value
HASHBUCKET - the grouping of a specific hash value
HASHAMP - the AMP that is associated with the hash bucket
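
For example, a hedged concrete instance of the template above (orders and order_id are hypothetical table and PI column names):

SELECT HASHAMP(HASHBUCKET(HASHROW(order_id))) AS "AMP#", COUNT(*)
FROM orders
GROUP BY 1
ORDER BY 2 DESC;

An AMP appearing with a much larger row count than the others indicates skew on the chosen primary index.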

How to check for table size in Teradata



The following queries are used to check the table size in Teradata.

How can you find the Table Space Size of your table across all AMPs ?
 
SELECT DATABASENAME, TABLENAME, SUM(CURRENTPERM)
FROM DBC.TABLESIZE
WHERE DATABASENAME = '<DATABASE_NAME>'
AND TABLENAME = '<TABLE_NAME>'
GROUP BY DATABASENAME , TABLENAME;


Following query will give the space consumed on each AMP by the Table

SELECT DATABASENAME, TABLENAME, CURRENTPERM
FROM DBC.TABLESIZE
WHERE DATABASENAME = '<DATABASE_NAME>'
AND TABLENAME = '<TABLE_NAME>';

Monday, September 13, 2010

Error code 2616: Numeric overflow occurred during computation

Below is the scenario when this error occurs:

2616 : Numeric overflow occurred during computation

sel count(*) from DPRODUIT;
Numeric overflow occurred during computation


Reason: the result value is outside the INTEGER range. The output was 2,333,452,124 --> which is more than the INTEGER limit. Hence it resulted in a numeric overflow.

How to fix the issue?
Solution:
Using CAST to convert the result from integer to decimal will work.
COUNT(*) always provides its result in INTEGER format. Hence, using CAST we can get values larger than the INTEGER limit of 2,147,483,647.

sel  cast (count(*) as decimal(18,0) ) from DPRODUIT;
result : 2,333,452,124

Saturday, September 11, 2010

Teradata TIMESTAMP tips & tricks - Part 2

This is a continuation of the following topic: http://www.teradatahelp.com/2010/08/teradata-timestamp-tips.html.

Some more tips which might be useful while working on date and timestamp. (Queries run and verified on V2R6)


Tip 1: Adding and Subtracting Dates in Teradata

select my_date + interval '10' day from my_table

And equally to subtract values from a date:
select my_date - interval '10' day from my_table


Tip 2: Calculating the last day in a given month

select add_months(current_date,1) - extract(day from add_months(current_date,1)) ;


Tip 3:  Usage of SYS_CALENDAR.CALENDAR

Teradata provides a handy calendar table for date lookup called  sys_calendar.calendar.
Using this table you can calculate useful dates such as weekending dates, quarter ending dates, etc.

example1 : Find the previous Friday from today:
 select *  from   sys_calendar.calendar where  day_of_week=6 and calendar_date between date -6 and date;

example2 :Find the first Monday one week after the last Friday of previous month:
select a.calendar_date from sys_calendar.calendar a,
(
sel max(calendar_date) + 10 as calendar_date
from sys_calendar.calendar 
where extract(year from add_months( date, -1)) = year_of_calendar
and
extract(month from add_months(date,-1)) = month_of_year and day_of_week = 6
) b
where
a.calendar_date = b.calendar_date;


Tip 4: Adding or Subtracting Months
  
select add_months(date_col,1) from table;  /* adding a month */
select add_months(date_col,-10) from table; /* subtracting months */

Friday, September 10, 2010

How to select Nth row from a table ?

How to select a particular row from the table?
Ans:

By using the ROW_NUMBER() function, we can mark all the selected rows with numbers.
Then use the QUALIFY clause to get the exact row number.


select
columnA,columnB
from
source_table
qualify row_number() over (order by columnA ) = N ;
Here 'N' is the particular row number.

P.S: The same query can be used to select the top N records:
select
columnA,columnB
from 
source_table 
qualify row_number() over (order by columnA ) <= N;

How to generate row numbers in teradata?

We have come across the ROWNUM function in Oracle. This function gives a number for every row/record in Oracle. In Teradata there is no direct equivalent, but there are various approaches suggested around.

Approach 1:
Here the SUM function is used over the preceding rows of SOURCE_TABLE to produce a running count:

select
sum(1) over( rows unbounded preceding ),
columnA,
columnB
from
SOURCE_TABLE;

Approach 2:
Here the ROW_NUMBER function is used to generate a row number ordered by columnA:
select
ROW_NUMBER() over( ORDER BY columnA ),
columnA,
columnB
from
SOURCE_TABLE;


If you have to use the row number concept in the target table as well, then the following approach using an "identity column" (available from V2R6 onwards) can be used:

CREATE MULTISET TABLE TARGET_TABLE
  (
   columnA INTEGER GENERATED BY DEFAULT AS IDENTITY
       (START WITH 1
        INCREMENT BY 20) ,
   columnB VARCHAR(20) NOT NULL
  )
UNIQUE PRIMARY INDEX pidx (columnA);

P.S: Identity columns differ from the sequence concept in Oracle. The numbers assigned in these columns are not guaranteed to be sequential. The identity column in Teradata is used to guarantee row-uniqueness.

The following also works, without use of the identity approach:
CREATE TABLE TARGET_TABLE AS
(
   select
   ROW_NUMBER() over( ORDER BY columnA ) NUMBER_SEQ ,
   columnA,
   columnB,
   columnC
   from a join b on a.id=b.id
) with data ;

Sunday, September 5, 2010

FAQ's on Collect statistics

Collect stats derives the data demographics of a table. These demographics help the optimizer decide how to execute a given query, which in turn improves performance.

It collects information such as the following (a minimal example of the command appears after this list):
  • total row count of the table,
  • how many distinct values are there in the column,
  • how many rows per value, and whether the column is indexed,
  • if so, unique or non-unique, etc.
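
As a minimal hedged sketch (employee, dept_no and emp_id are hypothetical names), statistics are collected and then inspected like this:

COLLECT STATISTICS ON employee COLUMN dept_no;
COLLECT STATISTICS ON employee INDEX (emp_id);
HELP STATISTICS employee;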


What if collect stats is not done on the table?
Ans:
Teradata uses a cost-based optimizer, and cost estimates are done based on statistics.
So if you do not have statistics collected, the optimizer will use a dynamic AMP sampling method to get the stats. If your table is big and the data is unevenly distributed, dynamic sampling may not get the right information and your performance will suffer.


How can I know the tables for which collect stats has been done?
Ans:
Run the HELP STATISTICS command on that table.
e.g. HELP STATISTICS TABLE_NAME;
This will give you the date and time when stats were last collected. You will also see stats for the columns (for which stats were defined) of the table.


Whenever collect stats is done on the particular  table(say on index/column) where can I find information regarding these entries ?
Ans:
Collected statistics are stored in the DBC.TVFields and DBC.Indexes tables. However, these two tables cannot be queried directly.


When to collect stats on tables which already have stats?

Ans:
1.    A typical guideline is when roughly 10% of the data has changed (measured by the delta in perm space since stats were last collected).
2.    Recollect based on stats that have aged 60-90 days (say the last time stats were collected was 2 months ago).

Please note:
Collecting stats can be pretty resource-consuming for large tables, so it is always advisable to schedule the job at an off-peak period.

Saturday, September 4, 2010

Permanent Journal

I did some homework on the topic and would like to share the same. Comments and suggestions are welcome :)


Permanent Journal

The purpose of a permanent journal is to maintain a sequential history of all changes made to the rows of one or more tables.
Permanent journals help protect user data when users commit, uncommit or abort transactions.
A permanent journal can capture a snapshot of rows before a change, after a change, or both.
Permanent journaling is usually used to protect data.
Unlike the contents of an automatic journal, the contents of a permanent journal remain until you drop them.

When you create a new journal table, you can use several options to control the type of information to be captured.


We create a permanent journal when the user or database is created.
Consider the following example for creation of a database:
CREATE DATABASE testdat
FROM xxspace
AS
PERM = 20000000
SPOOL = 2000000
NO FALLBACK
ACCOUNT = '$xxxxx'
NO BEFORE JOURNAL
AFTER JOURNAL
DEFAULT JOURNAL TABLE = testdat.journal;


Here the admin has opted for only AFTER JOURNAL and has named the journal table "testdat.journal".
When a user creates a table in the database "testdat", by default AFTER JOURNAL is available to protect his data when a hardware failure occurs.

He can opt for NO AFTER JOURNAL by overriding the default. Following are the examples.

Scenario 1: Here, by default, the table has the AFTER JOURNAL option.

CREATE TABLE testdat.table1
( field1 INTEGER,
field2 INTEGER)
PRIMARY INDEX (field1);

Scenario 2: In this case, the user has specifically stated that he wants no AFTER JOURNAL for his data. This is how the user can override the default.

CREATE TABLE testdat.table2,
FALLBACK,
NO AFTER JOURNAL
( field1 INTEGER,
field2 INTEGER)
PRIMARY INDEX (field1);

In the first scenario, whenever the user inserts/updates and the transaction is committed, the affected rows will be backed up in the journal table "testdat.journal".


Please note :
You must allocate sufficient permanent space to a database or user that will contain permanent journals. If a database or user that contains a permanent journal runs out of space, all table updates that write to that journal abort.

How to change data-type of primary index column for a given Table

Situation: There is a requirement to change the datatype of primary index column .

Solution:
  1. You could run a SHOW TABLE command to get the exact DDL, then change the datatype (say, for example, CHAR(8)) to the new datatype VARCHAR(10).
  2. Run the DDL script to create the new table.
  3. Then you can run an INSERT ... SELECT command to insert data into the new table.
  4. Since the PI of both tables is the same, the operation would be pretty fast.
  5. Then DROP the original table and rename the new one to the old name.

Inserting data into an empty table is very quick because there is no reference to the transient journal. A sketch of the full sequence appears after the notes below.

Please note, if you use:

  • CREATE TABLE new_table AS existing_table WITH DATA, all column attributes and indexes are preserved (just Triggers and Foreign Keys are removed);
  • but using CREATE TABLE new_table AS (SELECT * FROM existing_table) WITH DATA will remove the NOT NULL and TITLE properties.
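
Putting the steps together, a hedged sketch of the whole sequence (my_table, pi_col and other_col are hypothetical names; the real DDL comes from SHOW TABLE in step 1):

SHOW TABLE my_table;                      /* 1. capture the existing DDL */

CREATE TABLE my_table_new                 /* 2. same DDL with the new datatype */
( pi_col VARCHAR(10),
  other_col INTEGER )
PRIMARY INDEX (pi_col);

INSERT INTO my_table_new                  /* 3. copy the data (fast: same PI) */
SELECT * FROM my_table;

DROP TABLE my_table;                      /* 5. swap the tables */
RENAME TABLE my_table_new TO my_table;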

Thursday, September 2, 2010

Introduction to teradata utilities : BTEQ

BTEQ is a Teradata native query tool for DBA and programmers. BTEQ (Basic
TEradata Query) is a command-driven utility used to 1) access and manipulate
data, and 2) format reports for both print and screen output.




DEFINITION

BTEQ, short for Basic TEradata Query,
is a general-purpose command-driven utility used to access and manipulate data
on the Teradata Database, and format reports for both print and screen output. [1]



OVERVIEW

As part of the Teradata Tools and Utilities (TTU), BTEQ is a
Teradata native query tool for DBA and programmers — a real Teradata workhorse,
just like SQLPlus for the Oracle Database. It enables users on a workstation to
easily access one or more Teradata Database systems for ad hoc queries, report
generation, data movement (suitable for small volumes) and database
administration.

All database requests in BTEQ are expressed in Teradata
Structured Query Language (Teradata SQL). You can use Teradata SQL statements
in BTEQ to:

    * Define data — create and modify data structures;
    * Select data — query a database;
    * Manipulate data — insert, delete, and update data;
    * Control data — define databases and users, establish access rights, and secure data;
    * Create Teradata SQL macros — store and execute sequences of Teradata SQL statements as a single operation.


BTEQ supports Teradata-specific SQL functions for doing
complex analytical querying and data mining, such as:

    * RANK - (Rankings);
    * QUANTILE - (Quantiles);
    * CSUM - (Cumulation);
    * MAVG - (Moving Averages);
    * MSUM - (Moving Sums);
    * MDIFF - (Moving Differences);
    * MLINREG - (Moving Linear Regression);
    * ROLLUP - (One Dimension of Group);
    * CUBE - (All Dimensions of Group);
    * GROUPING SETS - (Restrict Group);
    * GROUPING - (Distinguish NULL rows).


Noticeably, BTEQ supports conditional logic (i.e.,
"IF...THEN..."), which is useful for batch mode export / import
processing.



OPERATING FEATURES

This section is based on Teradata documentation for the
current release.[1]



BTEQ Sessions

In a BTEQ session, you can access a Teradata Database easily
and do the following:

    * enter Teradata SQL statements to view, add, modify, and delete data;
    * enter BTEQ commands;
    * enter operating system commands;
    * create and use Teradata stored procedures.






Operating Modes

BTEQ operates in two modes: interactive mode and batch mode.
In interactive mode, you start a BTEQ session by entering BTEQ at
the system prompt on your terminal or workstation, and submit commands to the
database as needed. In batch mode, you prepare BTEQ scripts or macros, and then
submit them to BTEQ from a scheduler for processing. A BTEQ script is a set of
SQL statements and BTEQ commands saved in a file with the extension
".bteq"; however, it does not matter what file extension is used. The
BTEQ script can be run using the following command (in UNIX or Windows):

bteq < infile > outfile

Here infile is the BTEQ script, and outfile
is the output or log file.
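
For illustration, a minimal hedged sketch of such a script, reusing the hypothetical logon and table from the batch-file post above:

.LOGON localhost/test,test
SELECT COUNT(*) FROM user.tab_test1;
.IF ERRORCODE <> 0 THEN .QUIT 8
.LOGOFF
.QUIT

The .IF line shows the conditional logic mentioned earlier: the script exits with a non-zero return code when the preceding request fails.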



BTEQ COMMANDS

This section is based on Teradata documentation[1],
and for the detailed usage, please refer to Reference 1.



BTEQ Command Summary

BTEQ commands can be categorized into four functional
groups, as described below:

    * Session control — Session control commands begin and end BTEQ sessions, and control session characteristics;
    * File control — specify input and output formats and identify information sources and destinations;
    * Sequence control — control the sequence in which other BTEQ commands and Teradata SQL statements will be executed within scripts and macros;
    * Format control — control the format of screen and printer output.





1. Commands for Session Control



COMMAND NAME - FUNCTION

ABORT - abort any active requests and transactions without exiting BTEQ.
COMPILE - create or replace a Teradata stored procedure.
DECIMALDIGITS - override the precision specified by a CLI System Parameter Block (SPB) max_decimal_returned entry, or if that entry does not exist, indicate what the precision should be for decimal values associated with subsequently issued SQL requests for non-field-mode responses.
DEFAULTS - reset BTEQ command options to the values that were set when BTEQ was first invoked.
EXIT - end the current sessions and exit BTEQ.
HALT EXECUTION - abort any active requests and transactions and exit BTEQ; also called "HX".
LOGOFF - end the current sessions without exiting BTEQ.
LOGON - start a BTEQ session.
LOGONPROMPT - bypass the warnings related to conventional LOGON command use.
QUIT - end the current sessions and exit BTEQ.
SESSION CHARSET - specify the name of a character set for the current session.
SESSION RESPBUFLEN - override the buffer length specified in resp_buf_len.
SESSION SQLFLAG - specify the disposition of warnings issued in response to violations of ANSI-compliant syntax.
SESSION TRANSACTION - specify whether transaction boundaries are determined by Teradata SQL semantics or ANSI semantics.
SESSION TWORESPBUFS - specify whether CLI double-buffering is used.
SESSIONS - specify the number of sessions to use with the next LOGON command.
SHOW CONTROLS - display the current configuration of the BTEQ control command options.
SHOW VERSIONS - display the BTEQ version number, module revision numbers, and linking date.
TDP - specify the Teradata server for subsequent logons during the current session.



2. Commands for File Control



COMMAND NAME - FUNCTION

= - repeat the previous Teradata SQL request a specified number of times.
AUTOKEYRETRIEVE - enable users to specify whether the values of any fields associated with Identity Data are returned in response to a SQL Insert operation.
CMS - execute a VM CMS command from within the BTEQ environment.
ERROROUT - route the standard error stream and the standard output stream to two files or devices for channel-attached systems, or to one file or device for network-attached client systems.
EXPORT - specify the name and format of an export file that BTEQ will use to store database information returned by a subsequent SQL SELECT statement.
EXPORTEJECT - enable suppression of the additional Page Advance ASA Carriage Control Character at the top of the EXPORT file in REPORT mode for MVS/VM BTEQ.
HALT EXECUTION - abort any active requests and transactions and exit BTEQ; also called "HX".
FORMAT - enable all of the page-oriented formatting commands, or disable them and center the response from SQL SELECT statements, using the value of the WIDTH command option to determine the space available.
IMPORT - open a channel- or network-attached system file, of the specified format, to provide data for USING modifiers of subsequent SQL statements.
INDICDATA - specify the mode of information returned from the Teradata Database in response to SQL SELECT statements.
INDICDATA and / or LARGEDATAMODE - specify the response mode, either Field mode, Indicator mode, Record mode, or Multipart Indicator Mode, for data selected from the Teradata Database.
LARGEDATAMODE - enable use of Teradata Database's Multipart Indicator response mode for inline mode retrieval of Large Object (LOB) data. BTEQ limits the record size for exported files to approximately 64K (65473 for workstation builds and 64260 for mainframe builds). If more than 64K is required, SET LARGEDATAMODE allows hex-dump style output (similar to RecordMode directed to standard output).
OS - execute an MS-DOS, PC-DOS, or UNIX command from within the BTEQ environment.
QUIET - limit BTEQ output to errors and request processing statistics; BTEQ displays the results in a format that is suitable for performance testing.
RECORDMODE - return data from SQL SELECT statements in client-oriented data representations rather than character format.
REPEAT - submit the next request a specified number of times.
RUN - execute Teradata SQL requests and BTEQ commands from a specified run file.
TSO - execute an MVS TSO command from within the BTEQ environment.



3. Commands for Sequence Control


Use the following commands to control the sequence in which
BTEQ executes commands:


  • ABORT
  • ERRORLEVEL
  • EXIT
  • GOTO
  • HANG
  • IF... THEN...
  • LABEL
  • MAXERROR
  • QUIT
  • REMARK
  • REPEAT
  • =

For the commands not listed below, refer to the tables
above.



COMMAND NAME - FUNCTION

ERRORLEVEL - assign severity levels to errors.
GOTO - skip over all intervening BTEQ commands and SQL statements until a specified label is encountered, then resume processing in sequence.
HANG - pause BTEQ processing for a specified period of time.
IF... THEN... - test the validity of the condition stated in the IF clause.
LABEL - identify the point at which BTEQ resumes processing, as specified in a previous GOTO command.
MAXERROR - designate a maximum error severity level beyond which BTEQ terminates job processing.




4. Format Control Commands


Use the following BTEQ commands to specify the way BTEQ
presents information for screen-oriented and printer/printer-file oriented
output:


  • DEFAULTS
  • ECHOREQ
  • EXPORT
  • FOLDLINE
  • FOOTING
  • FORMAT
  • HEADING
  • IMPORT
  • INDICDATA
  • NULL
  • OMIT
  • PAGEBREAK
  • PAGELENGTH
  • QUIET
  • RECORDMODE
  • RETCANCEL
  • RETLIMIT
  • RETRY
  • RTITLE
  • SEPARATOR
  • SHOW CONTROLS
  • SIDETITLES
  • SKIPDOUBLE
  • SKIPLINE
  • SUPPRESS
  • TITLEDASHES
  • UNDERLINE
  • WIDTH

For the commands not listed below, refer to the tables
above.



COMMAND NAME - FUNCTION

ECHOREQ - enable the echo required function that returns a copy of each Teradata SQL request and BTEQ command to the standard output stream.
FOLDLINE - split (fold) each line of a report into two or more lines.
FOOTING - specify a footer to appear at the bottom of every page of a report.
HEADING - specify a header to appear at the top of every page of a report.
NULL - specify a character or character string to represent null field values returned from the Teradata Database.
OMIT - exclude specified columns returned from SQL SELECT statements.
PAGEBREAK - eject a page whenever the value for one or more specified columns changes.
PAGELENGTH - specify the page length of printed reports, in lines per page.
RETCANCEL - cancel a request when the value specified by the RETLIMIT command ROWS option is exceeded.
RETLIMIT - specify the maximum number of rows and/or columns displayed or written in response to a Teradata SQL request.
RETRY - resubmit requests that fail under certain error conditions.
RTITLE - specify a header to appear at the top of every page of a report.
SEPARATOR - specify a character string or width (in blank characters) to separate columns of a report.
SIDETITLES - position summary titles to the left of the summary lines in a report.
SKIPDOUBLE - insert two blank lines in a report whenever the value of a specified column changes.
SKIPLINE - insert a blank line in a report whenever the value of a specified column changes.
SUPPRESS - replace all consecutively repeated values with all-blank character strings.
TITLEDASHES - display a row of dash characters before each report line summarized by a WITH clause.
UNDERLINE - display a row of dash characters whenever the value of a specified column changes.
WIDTH - specify the width of screen displays and printed reports, in characters per line.

Teradata interview questions PART 3

Teradata interview questions PART 3, consolidated over a period of time from various sources. Thanks to the original posters!

=================================================================

Can you recover the password of a user in Teradata?

Answers:

No, you can’t recover the password of a user in Teradata. Passwords are stored in this data dictionary table (DBC.DBASE) using a one-way encryption method. You can view the encrypted passwords using the following query

SELECT * FROM DBC.DBASE;

=================================================================

Explain Ferret Utility in Teradata?

Ferret (File Reconfiguration tool) is a utility which is used to display and set Disk Space Utilization parameters within the Teradata RDBMS. When you select the Ferret Utility parameters, it dynamically reconfigures the data on the disks. We can run this utility through Teradata Manager; to start the Ferret Utility, type START FERRET in the database window.

Answers:
Following commands can be used within Ferret Utility:

1. SHOWSPACE – This command reports the amount of disk cylinder space in use and the amount of disk cylinder space available in the system. It gives you information about Permanent Space cylinders, Spool Space cylinders, Temporary Space cylinders, Journaling cylinders, Bad cylinders and Free cylinders. For each of these it presents 3 parameters, i.e. average utilization per cylinder, % of total available cylinders and number of cylinders.
2. SHOWBLOCKS – This command helps you identify the data block size and the number of rows per data block. It displays the disk space information for a defined range of data blocks and cylinders.

=================================================================

Explain TPUMP (Teradata Parallel Data Pump) Utility in Teradata?

Answers: 

* TPUMP allows near real time updates from Transactional Systems into the Data Warehouse.
* It can perform Insert, Update and Delete operations or a combination from the same source.
* It can be used as an alternative to MLOAD for low volume batch maintenance of large databases.
* TPUMP allows target tables to have Secondary Indexes, Join Indexes, Hash Indexes, Referential Integrity, Populated or Empty Table, Multiset or Set Table or Triggers defined on the Tables.
* TPUMP can have many sessions, as it doesn’t have a session limit.
* TPUMP uses row hash locks thus allowing concurrent updates on the same table.

=================================================================

How can you determine I/O and CPU usage at a user level in Teradata?

Answers: 

You can find out I/O and CPU Usage from this Data Dictionary Table DBC.AMPUSAGE;

SELECT ACCOUNTNAME, USERNAME, SUM(CPUTIME) AS CPU, SUM(DISKIO) AS DISKIO FROM DBC.AMPUSAGE GROUP BY 1,2 ORDER BY 3 DESC;

=================================================================

How can you find the Table Space Size of your table across all AMPs?

Answers:

You can find the Table Space Size of your table from this Data Dictionary Table DBC.TABLESIZE

SELECT DATABASENAME, TABLENAME, SUM(CURRENTPERM) FROM DBC.TABLESIZE WHERE DATABASENAME = '<DATABASE_NAME>' AND TABLENAME = '<TABLE_NAME>' GROUP BY DATABASENAME, TABLENAME;

=================================================================
How can you find the Teradata Release and Version information from Data Dictionary Table?
Answers:


To find Release and Version information you can query this Data Dictionary table DBC.DBCINFO

SELECT * FROM DBC.DBCINFO;

=================================================================
How can you track Login Parameters of users in Teradata?
Answers: 


You can view all these parameters in this Data Dictionary Table DBC.LOGONOFF
SELECT LOGDATE, LOGTIME, USERNAME, EVENT FROM DBC.LOGONOFF;

=================================================================
How can you use HASH FUNCTIONS to view Data Distribution across all AMPs in Teradata?
Answers: 


Hash Functions can be used to view the data distribution of rows for a chosen primary index.

SELECT HASHAMP(HASHBUCKET(HASHROW(<PRIMARY INDEX>))) AS "AMP#", COUNT(*) FROM <TABLENAME> GROUP BY 1 ORDER BY 2 DESC;

HASHROW – returns the row hash value for a given value
HASHBUCKET – the grouping of a specific hash value
HASHAMP – the AMP that is associated with the hash bucket

By looking into the result set of the above query you can easily find out the data distribution across all AMPs in your system, and further you can easily identify uneven data distribution.

=================================================================
How do you transfer large amount of data in Teradata?
Answers: 


Transferring of a large amount of data can be done using various application Teradata utilities which reside on the host computer (mainframe or workstation), i.e. BTEQ, FastLoad, MultiLoad, TPump and FastExport.

* BTEQ (Basic Teradata Query) supports all 4 DMLs: SELECT, INSERT, UPDATE and DELETE. BTEQ also support IMPORT/EXPORT protocols.
* Fastload, MultiLoad and Tpump transfer the data from Host to Teradata.
* FastExport is used to export data from Teradata to the Host.

=================================================================
How does Hashing happen in Teradata?

Answers: 

* Hashing is the mechanism through which data is distributed and retrieved to/from AMPs.
* Primary Index (PI) value of a row is the input to the Hashing Algorithm.
* Row Hash (32-bit number) value is the output from this Algorithm.
* Table Id + Row Hash is used to locate Cylinder and Data block.
* Same Primary Index value and data type will always produce same hash value.
* Rows with the same hash value will go to the same AMP.

So data distribution depends directly on the Row Hash uniqueness; be careful while Choosing Indexes in Teradata.
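
As a small hedged illustration of the "same value, same hash" rule (the string literals are arbitrary):

SELECT HASHROW('Teradata') AS hash1,  /* same value, same type... */
       HASHROW('Teradata') AS hash2,  /* ...always the same row hash */
       HASHROW('Oracle')   AS hash3;  /* different value: normally a different hash */

hash1 and hash2 are always identical, while hash3 will normally differ (hash collisions are possible but rare).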

=================================================================
How to eliminate Product Joins in a Teradata SQL query?
Answers: 

 
1. Ensure statistics are collected on join columns and this is especially important if the columns you are joining on are not unique.
2. Make sure you are referencing the correct alias.
3. Also, if you have an alias, you must always reference it instead of a fully qualified tablename.
4. Sometimes product joins happen for a good reason. When joining a small table (100 rows) to a large table (1 million rows), a product join can make sense.


=================================================================
How to select first N Records in Teradata?
Answers: 


To select the first N records in Teradata you can use the RANK function. The query syntax would be as follows:
SELECT BOOK_NAME, BOOK_COUNT, RANK(BOOK_COUNT) A FROM LIBRARY QUALIFY A <= 10;

=================================================================
How to view every column and the columns contained in indexes in Teradata?
Answers:


Following query describes each column in the Teradata RDBMS
SELECT * FROM DBC.TVFields;

Following query describes columns contained in indexes in the Teradata RDBMS
SELECT * FROM DBC.Indexes;

=================================================================
What are the 5 phases in a MultiLoad Utility?
Answers:

* Preliminary Phase – Basic Setup
* DML Phase – Get DML steps down on AMPs
* Acquisition Phase – Send the input data to the AMPs and sort it
* Application Phase – Apply the input data to the appropriate Target Tables
* End Phase – Basic Cleanup

=================================================================
What are the functions of a Teradata DBA?
Answers:

Following are the different functions which a DBA can perform:
1. User Management – Creation and managing Users, Databases, Roles, Profiles and Accounts.
2. Space Allocation – Assigning Permanent Space, Spool Space and Temporary Space.
3. Access of Database Objects – Granting and Revoking Access Rights on different database objects.
4. Security Control – Handling logon and logoff rules for Users.
5. System Maintenance – Specification of system defaults, restart etc.
6. System Performance – Use of Performance Monitor(PMON), Priority Scheduler and Job Scheduling.
7. Resource Monitoring – Database Query Log(DBQL) and Access Logging.
8. Data Archives, Restores and Recovery – ARC Utility and Permanent Journals.

=================================================================
What are the MultiLoad Utility limitations?
Answers:

MultiLoad is a very powerful utility; it has the following limitations:

* MultiLoad Utility doesn’t support SELECT statement.
* Concatenation of multiple input data files is not allowed.
* MultiLoad doesn’t support Arithmetic Functions i.e. ABS, LOG etc. in the MLoad script.
* MultiLoad doesn’t support Exponentiation and Aggregate Operators i.e. AVG, SUM etc. in the MLoad script.
* MultiLoad doesn’t support USIs (Unique Secondary Indexes), Referential Integrity, Join Indexes, Hash Indexes and Triggers.
* Import tasks require the use of a PI (Primary Index).

=================================================================
What are TPUMP Utility Limitations?
Answers:

Following are the limitations of Teradata TPUMP Utility:
* Use of SELECT statement is not allowed.
* Concatenation of Data Files is not supported.
* Exponential & Aggregate Operators are not allowed.
* Arithmetic functions are not supported.

=================================================================
What is FILLER command in Teradata?
Answers:

While running FastLoad or MultiLoad, if you don’t want to load a particular field from the datafile to the target table, then use the FILLER command to achieve this. Syntax for the FILLER command would be as follows:

.LAYOUT FILE_PRODUCT; /* It is input file layout name */
.FIELD Prod_No * char(11); /* To load data into Prod_No */
.FIELD Prod_Name * char(11); /* To load data into Prod_Name */
.FIELD Location * char(11); /* To load data into Location */
.FILLER Prod_Chars * char(20); /* To skip the next 20 characters of the input record */

=================================================================
What is the difference between Access Logging and Query Logging in Teradata?
Answers:

1. Access Logging is concerned with security (i.e. who is doing what). In access logging you ask the database to log who is doing what on a given object. The information stored is based on the object, not on the SQL fired or the user who fired it.
2. Query Logging (DBQL) is used for debugging (i.e. what is happening around?). In case of DBQL, the database keeps tracking various parameters, i.e. the SQLs, resource and spool usage, steps and other things which help you understand what is going on; the information is fruitful for debugging a problem. Further, DBQL is enabled on a user id rather than on an object such as a table.

=================================================================
What is the difference between Sub-Query & Co-Related Sub-Query?
Answers: 


When queries are written in a nested manner, it is termed a sub-query. A sub-query gets executed once for the parent statement, whereas a co-related sub-query gets executed once for each row of the parent query.

Select Empname, Deptno, Salary from Employee Emp where Salary = (Select Max(Salary) from Employee where Deptno = Emp.Deptno) order by Deptno;
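
For contrast, a plain (non-correlated) sub-query against the same Employee table; the inner SELECT runs once for the whole statement rather than once per row:

Select Empname, Deptno, Salary from Employee where Salary = (Select Max(Salary) from Employee);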

=================================================================
What is Reconfiguration Utility in Teradata and What it is used for?
Answers:

* When we feed a Primary Index value to the Hashing Algorithm it gives us a Row Hash (32-bit number) value which is used to make entries into Hash Maps.
* Hash Maps are the mechanism for determining which AMP will be getting that row.
* Each Hash Map is an array of 65,536 entries and its size is close to 128KB.

When Teradata is installed on a system there are some scripts which we need to execute, i.e. the DIP scripts. These create Hash Maps of 65,536 entries for the current configuration. But what if you want to add some more AMPs to your system?

Reconfiguration (Reconfig) is a technique for changing the configuration (i.e. changing the number of AMPs in a system) and is controlled by the Reconfiguration Hash Maps. The system builds Reconfiguration Hash Maps by reassigning hash map entries to reflect the new configuration of the system.

Let's understand this concept with the help of an example; suppose you have a 4-AMP system which holds 65,536 entries. Each AMP is responsible for holding (65,536/4 = 16,384) 16,384 entries.

Now you have added 2 more AMPs to your current configuration, so you need to reconfigure your system. Now each AMP would be responsible for holding (65,536/6 = 10,922) 10,922 entries.

=================================================================