Tuesday 12 November 2013

9.3 Teradata Parallel Transporter - Load Operator - 2 examples


Example 1: We use a delimited file to load a empty table using Load operator
Following is how the file looks like :

 more input_delimited_file.txt
2|sukul
3|uma
4|bhanu
5|chettri

Script:

 
DEFINE JOB LOAD_EMPLOYEE_TABLE_FROM_FILE --> Job name
DESCRIPTION 'Load file  to an empty table'
(
/*****************************/
DEFINE SCHEMA EMPLOYEE_SCHEMA
DESCRIPTION 'SAMPLE EMPLOYEE SCHEMA'
(
EMP_ID   VARCHAR(10),  --> All columns must be defined varchar as we are reading delimited file.
EMP_NAME VARCHAR(30)
);
/*****************************/

DEFINE OPERATOR FILE_READER()
DESCRIPTION 'TERADATA PARALLEL TRANSPORTER DATA CONNECTOR OPERATOR'
TYPE DATACONNECTOR PRODUCER
SCHEMA EMPLOYEE_SCHEMA
ATTRIBUTES
(
VARCHAR PrivateLogName    = 'file_reader_private.log',
VARCHAR FileName          = 'input_delimited_file.txt',
VARCHAR IndicatorMode     = 'N',
VARCHAR OpenMode          = 'Read',
VARCHAR Format            = 'Delimited'
);



/*****************************/
DEFINE OPERATOR LOADOP
DESCRIPTION 'Teradata PT LOAD OPERATOR'
TYPE LOAD
SCHEMA *  --> Use same schema as the producer
ATTRIBUTES
(
VARCHAR TdpId = 'Prd.Database.com',
VARCHAR UserName = 'sukul',
VARCHAR UserPassword = 'mysillypwd',
VARCHAR TargetTable = 'FINANCE.SOURCE_EMP_TABLE',
/* Target table , Log table name, Et1 table name and ET2 table name*/
VARCHAR LogTable = 'FINANCE.SOURCE_EMP_TABLE_LOG',
VARCHAR ErrorTable1 = 'FINANCE.SOURCE_EMP_TABLE_ET1',
VARCHAR ErrorTable2 = 'FINANCE.SOURCE_EMP_TABLE_ET2',
VARCHAR PrivateLogName = 'load_log'
);

/*****************************/

STEP export_to_file
(
APPLY ('INSERT INTO FINANCE.SOURCE_EMP_TABLE VALUES (:EMP_ID,:EMP_NAME);')
/* The insert statements shuld be inside brackets and single quotes. Note that even the semi-colon is inside the semicolon.
Also note we use the column name preceded by colons.*/
TO OPERATOR (LOADOP() )
SELECT EMP_ID,EMP_NAME FROM OPERATOR (FILE_READER());
);
);
Log:

 
Teradata Parallel Transporter Version 13.10.00.05
Job log: /opt/teradata/client/13.10/tbuild/logs/sukul-39.out
Job id is sukul-39, running on testunixbox
Teradata Parallel Transporter Load Operator Version 13.10.00.04
LOADOP: private log specified: load_log
Teradata Parallel Transporter DataConnector Version 13.10.00.05
FILE_READER Instance 1 directing private log report to 'file_reader_private.log-1'.
FILE_READER: TPT19008 DataConnector Producer operator Instances: 1
FILE_READER: TPT19003 ECI operator ID: FILE_READER-7558
FILE_READER: TPT19222 Operator instance 1 processing file 'input_delimited_file.txt'.
LOADOP: connecting sessions
LOADOP: preparing target table
LOADOP: entering Acquisition Phase
LOADOP: entering Application Phase
LOADOP: Statistics for Target Table:  'FINANCE.SOURCE_EMP_TABLE'
LOADOP: Total Rows Sent To RDBMS:      4
LOADOP: Total Rows Applied:            4
FILE_READER: TPT19221 Total files processed: 1.
LOADOP: disconnecting sessions
LOADOP: Total processor time used = '1.51 Second(s)'
LOADOP: Start : Tue Nov 12 12:20:47 2013
LOADOP: End   : Tue Nov 12 12:21:02 2013
Job step export_to_file completed successfully
Job sukul completed successfully

If we try to submit the same script again the job will fail as load operator can load only empty tables:

 
Teradata Parallel Transporter Version 13.10.00.05
Job log: /opt/teradata/client/13.10/tbuild/logs/sukul-40.out
Job id is sukul-40, running on testunixbox
Teradata Parallel Transporter DataConnector Version 13.10.00.05
FILE_READER Instance 1 directing private log report to 'file_reader_private.log-1'.
Teradata Parallel Transporter Load Operator Version 13.10.00.04
LOADOP: private log specified: load_log
FILE_READER: TPT19008 DataConnector Producer operator Instances: 1
FILE_READER: TPT19003 ECI operator ID: FILE_READER-21020
FILE_READER: TPT19222 Operator instance 1 processing file 'input_delimited_file.txt'.
LOADOP: connecting sessions
LOADOP: preparing target table
LOADOP: TPT10508: RDBMS error 2636: SOURCE_EMP_TABLE must be empty for Fast Loading.
LOADOP: disconnecting sessions
LOADOP: Total processor time used = '1.51 Second(s)'
LOADOP: Start : Tue Nov 12 12:32:03 2013
LOADOP: End   : Tue Nov 12 12:32:11 2013
FILE_READER: TPT19221 Total files processed: 0.
Job step export_to_file terminated (status 12)
Job sukul terminated (status 12)

Example 2: We load a file to the table, with Pause Acq = 'Y'. So the data will not actually loaded in the target table.
We also add error limit of 2 and add a bad record in the input file.
Also we add a duplicate record to the file.
Input File:

2|sukul
2|sukul  --> duplicate record
3|uma
4|bhanu
5|chettri
A|neel --> bad record having character A for a numeric field

Script:


DEFINE JOB LOAD_EMPLOYEE_TABLE_FROM_FILE
DESCRIPTION 'Load file  to an empty table'
(
/*****************************/
DEFINE SCHEMA EMPLOYEE_SCHEMA
DESCRIPTION 'SAMPLE EMPLOYEE SCHEMA'
(
EMP_ID   VARCHAR(10),
EMP_NAME VARCHAR(30)
);
/*****************************/

DEFINE OPERATOR FILE_READER()
DESCRIPTION 'TERADATA PARALLEL TRANSPORTER DATA CONNECTOR OPERATOR'
TYPE DATACONNECTOR PRODUCER
SCHEMA EMPLOYEE_SCHEMA
ATTRIBUTES
(
VARCHAR PrivateLogName    = 'file_reader_private.log',
VARCHAR FileName          = 'input_delimited_file.txt',
VARCHAR IndicatorMode     = 'N',
VARCHAR OpenMode          = 'Read',
VARCHAR Format            = 'Delimited'
);

/*****************************/
DEFINE OPERATOR LOADOP
DESCRIPTION 'Teradata PT LOAD OPERATOR'
TYPE LOAD
SCHEMA *
ATTRIBUTES
(
VARCHAR TdpId = 'Prd.Database.com',
VARCHAR UserName = 'sukul',
VARCHAR UserPassword = 'mysillypwd',
VARCHAR PAUSEACQ ='Y',
INTEGER Errorlimit = 2,
VARCHAR TargetTable = 'FINANCE.SOURCE_EMP_TABLE',
VARCHAR LogTable = 'FINANCE.SOURCE_EMP_TABLE_LOG',
VARCHAR ErrorTable1 = 'FINANCE.SOURCE_EMP_TABLE_ET1',
VARCHAR ErrorTable2 = 'FINANCE.SOURCE_EMP_TABLE_ET2',
VARCHAR PrivateLogName = 'load_log'
);

/*****************************/

STEP export_to_file
(
APPLY ('INSERT INTO FINANCE.SOURCE_EMP_TABLE VALUES (:EMP_ID,:EMP_NAME);')
TO OPERATOR (LOADOP() )
SELECT EMP_ID,EMP_NAME FROM OPERATOR (FILE_READER());
);
);

Log:

Teradata Parallel Transporter Version 13.10.00.05
Job log: /opt/teradata/client/13.10/tbuild/logs/sukul-41.out
Job id is sukul-41, running on testunixbox
Found CheckPoint file: /opt/teradata/client/13.10/tbuild/checkpoint/sukulLVCP
This is a restart job; it restarts at step export_to_file.
Teradata Parallel Transporter Load Operator Version 13.10.00.04
LOADOP: private log specified: load_log
Teradata Parallel Transporter DataConnector Version 13.10.00.05
FILE_READER Instance 1 directing private log report to 'file_reader_private.log-1'.
FILE_READER: TPT19008 DataConnector Producer operator Instances: 1
FILE_READER: TPT19003 ECI operator ID: FILE_READER-856
FILE_READER: TPT19222 Operator instance 1 processing file 'input_delimited_file.txt'.
LOADOP: connecting sessions
LOADOP: preparing target table
LOADOP: entering Acquisition Phase  --> notice no application phase
FILE_READER: TPT19221 Total files processed: 1.
LOADOP: disconnecting sessions
LOADOP: Total processor time used = '1.51 Second(s)'
LOADOP: Start : Tue Nov 12 12:51:34 2013
LOADOP: End   : Tue Nov 12 12:51:46 2013
Job step export_to_file terminated (status 4) --> Return code of 4 as the job paused
Job sukul completed successfully, but with warning(s).

Private log:

 
 tlogview -l /opt/teradata/client/13.10/tbuild/logs/sukul-41.out -f load_log

     ===================================================================
     =                                                                 =
     =                  TERADATA PARALLEL TRANSPORTER                  =
     =                                                                 =
     =              LOAD OPERATOR     VERSION 13.10.00.04              =
     =                                                                 =
     =          OPERATOR SUPPORT LIBRARY VERSION 13.10.00.04           =
     =                                                                 =
     = COPYRIGHT 2001-2010, TERADATA CORPORATION. ALL RIGHTS RESERVED. =
     =                                                                 =
     ===================================================================

**** 12:51:34 Processing starting at: Tue Nov 12 12:51:34 2013

     ===================================================================
     =                                                                 =
     =                      Module Identification                      =
     =                                                                 =
     ===================================================================

     Load Operator for HP-UX release B.11.31 on testunixbox
     LoadMain   : 13.10.00.09
     LoadCLI    : 13.10.00.09
     LoadUtil   : 13.10.00.02
     PcomCLI    : 13.10.00.15
     PcomMBCS   : 13.00.00.02
     PcomMsgs   : 13.01.00.07
     PcomNtfy   : 13.00.00.03
     PcomPx     : 13.10.00.16
     PcomUtil   : 13.10.00.07
     PXICU      : 13.10.00.04
     TDICU      : 13.10.00.03
     CLIv2      : 13.10.00.04

     ===================================================================
     =                                                                 =
     =                      Attribute Definitions                      =
     =                                                                 =
     ===================================================================

**** 12:51:34 Options in effect for this job:
              OperatorType:  Consumer
              Instances:     1
              Character set: Not specified; will use default
              Checkpoint:    No checkpoint in effect
              Notify:        Not enabled
              Buffer size:   Maximum allowable
              Error limit:   2 rejected record(s)
              Tenacity:      4 hour limit to successfully connect
              Sleep:         6 minute(s) between connect retries
              Pause Acq:     In effect

     ===================================================================
     =                                                                 =
     =                     Column/Field Definition                     =
     =                                                                 =
     ===================================================================

     Column Name                    Offset Length Type
     ============================== ====== ====== ========================
     EMP_ID                              0     10 VARCHAR
     EMP_NAME                           12     30 VARCHAR
     ============================== ====== ====== ========================
     INDICATOR BYTES NEEDED: 1
     EXPECTED RECORD LENGTH: 45

     ===================================================================
     =                                                                 =
     =                   Control Session Connection                    =
     =                                                                 =
     ===================================================================

**** 12:51:34 Connecting to RDBMS:    'Prd.Database.com'
**** 12:51:34 Connecting with UserId: 'sukul'

     ===================================================================
     =                                                                 =
     =                  Teradata Database Information                  =
     =                                                                 =
     ===================================================================

**** 12:51:35 Teradata Database Version:      '13.10.07.14                     '
**** 12:51:35 Teradata Database Release:      '13.10.07.02                   '
**** 12:51:35 Maximum request size supported: 1MB
**** 12:51:35 Session character set:          'ASCII'
**** 12:51:35 Total AMPs available:           384
**** 12:51:35 Data Encryption:                supported
**** 12:51:35 Restart log table 'FINANCE.SOURCE_EMP_TABLE_LOG' exists from previous job
**** 12:51:35 This job is restarting

     ===================================================================
     =                                                                 =
     =                   Special Session Connection                    =
     =                                                                 =
     ===================================================================

**** 12:51:39 Number of sessions adjusted due to TASM:      16

              Instance Assigned Connected Result
              ======== ======== ========= ======================
                  1        16       16    Successful
              ======== ======== ========= ======================
                Total      16       16    Successful

     ===================================================================
     =                                                                 =
     =                 Target/Error Table Information                  =
     =                                                                 =
     ===================================================================

**** 12:51:39 This job will use the following tables:

              Target Table:  'FINANCE.SOURCE_EMP_TABLE'
              Error Table 1: 'FINANCE.SOURCE_EMP_TABLE_ET1'
              Error Table 2: 'FINANCE.SOURCE_EMP_TABLE_ET2'

     ===================================================================
     =                                                                 =
     =                        Acquisition Phase                        =
     =                                                                 =
     ===================================================================

**** 12:51:40 DML statement for DML Group: 1

     INSERT INTO FINANCE.SOURCE_EMP_TABLE VALUES (:EMP_ID,:EMP_NAME);

**** 12:51:40 Number of records per buffer for this job: 1427
**** 12:51:41 Starting to send data to the RDBMS
**** 12:51:42 Checkpoint complete. Rows sent: 6
**** 12:51:44 Finished sending rows to the RDBMS

                        Instance    Rows Sent
                        ========  =============
                            1               6
                        ========  =============
                          Total             6

**** 12:51:44 Pause Acquisition in effect. Skipping Application Phase. --> application phase was skipped

     ===================================================================
     =                                                                 =
     =                        Logoff/Disconnect                        =
     =                                                                 =
     ===================================================================

**** 12:51:44 Logging off all sessions

              Instance      Cpu Time
              ========  ================
                   1        1.51 Seconds

**** 12:51:46 Total processor time used = '1.51 Second(s)'
     .        Start : Tue Nov 12 12:51:34 2013
     .        End   : Tue Nov 12 12:51:46 2013
     .        Highest return code encountered = '4'.
**** 12:51:46 This job paused
To complete the loading part we can submit the same script as above with PAUSEACQ='N'
Or can generate a script with standalone Load operator as shown below:

 
DEFINE JOB LOAD_EMPLOYEE_TABLE_FROM_FILE
DESCRIPTION 'Load file  to an empty table'
(

/*****************************/
DEFINE OPERATOR LOADOP
DESCRIPTION 'Teradata PT LOAD OPERATOR'
TYPE LOAD /* keyword "StandAlone" is optional*/
SCHEMA *
ATTRIBUTES
(
VARCHAR TdpId = 'Prd.Database.com',
VARCHAR UserName = 'sukul',
VARCHAR UserPassword = 'mysillypwd',
VARCHAR PAUSEACQ ='N',
/* pause acq N means the job will run the application phase*/
INTEGER Errorlimit = 2,
VARCHAR TargetTable = 'FINANCE.SOURCE_EMP_TABLE',
VARCHAR LogTable = 'FINANCE.SOURCE_EMP_TABLE_LOG',
/* Note that the log table name should not change.
Even in case of staged loads all the scripts should use the same log table name*/
VARCHAR ErrorTable1 = 'FINANCE.SOURCE_EMP_TABLE_ET1',
VARCHAR ErrorTable2 = 'FINANCE.SOURCE_EMP_TABLE_ET2',
VARCHAR PrivateLogName = 'load_log'
);

/*****************************/

STEP export_to_file
(
APPLY
TO OPERATOR (LOADOP() )
;
);
);
Log:

 
Teradata Parallel Transporter Version 13.10.00.05
Job log: /opt/teradata/client/13.10/tbuild/logs/sukul-42.out
Job id is sukul-42, running on testunixbox
Teradata Parallel Transporter Load Operator Version 13.10.00.04
LOADOP: private log specified: load_log
LOADOP: connecting sessions
LOADOP: preparing target table
LOADOP: entering Application Phase --> directly entering the application phase
LOADOP: Statistics for Target Table:  'FINANCE.SOURCE_EMP_TABLE'
LOADOP: Total Rows Sent To RDBMS:      6
LOADOP: Total Rows Applied:            4
LOADOP: Total Possible Duplicate Rows: 2
LOADOP: disconnecting sessions
LOADOP: Total processor time used = '1.5 Second(s)'
LOADOP: Start : Tue Nov 12 13:05:41 2013
LOADOP: End   : Tue Nov 12 13:05:50 2013
Job step export_to_file completed successfully
Job sukul completed successfully
Private log:

 
tlogview -l /opt/teradata/client/13.10/tbuild/logs/sukul-42.out -f load_log

     ===================================================================
     =                                                                 =
     =                  TERADATA PARALLEL TRANSPORTER                  =
     =                                                                 =
     =              LOAD OPERATOR     VERSION 13.10.00.04              =
     =                                                                 =
     =          OPERATOR SUPPORT LIBRARY VERSION 13.10.00.04           =
     =                                                                 =
     = COPYRIGHT 2001-2010, TERADATA CORPORATION. ALL RIGHTS RESERVED. =
     =                                                                 =
     ===================================================================

**** 13:05:41 Processing starting at: Tue Nov 12 13:05:41 2013

     ===================================================================
     =                                                                 =
     =                      Module Identification                      =
     =                                                                 =
     ===================================================================

     Load Operator for HP-UX release B.11.31 on testunixbox
     LoadMain   : 13.10.00.09
     LoadCLI    : 13.10.00.09
     LoadUtil   : 13.10.00.02
     PcomCLI    : 13.10.00.15
     PcomMBCS   : 13.00.00.02
     PcomMsgs   : 13.01.00.07
     PcomNtfy   : 13.00.00.03
     PcomPx     : 13.10.00.16
     PcomUtil   : 13.10.00.07
     PXICU      : 13.10.00.04
     TDICU      : 13.10.00.03
     CLIv2      : 13.10.00.04

     ===================================================================
     =                                                                 =
     =                      Attribute Definitions                      =
     =                                                                 =
     ===================================================================

**** 13:05:41 Options in effect for this job:
              OperatorType:  StandAlone
              Instances:     1
              Character set: Not specified; will use default
              Checkpoint:    No checkpoint in effect
              Notify:        Not enabled
              Buffer size:   Maximum allowable
              Error limit:   2 rejected record(s)
              Tenacity:      4 hour limit to successfully connect
              Sleep:         6 minute(s) between connect retries
              Pause Acq:     Not in effect

     ===================================================================
     =                                                                 =
     =                   Control Session Connection                    =
     =                                                                 =
     ===================================================================

**** 13:05:41 Connecting to RDBMS:    'Prd.Database.com'
**** 13:05:41 Connecting with UserId: 'sukul'

     ===================================================================
     =                                                                 =
     =                  Teradata Database Information                  =
     =                                                                 =
     ===================================================================

**** 13:05:42 Teradata Database Version:      '13.10.07.14                     '
**** 13:05:42 Teradata Database Release:      '13.10.07.02                   '
**** 13:05:42 Maximum request size supported: 1MB
**** 13:05:42 Session character set:          'ASCII'
**** 13:05:42 Total AMPs available:           384
**** 13:05:42 Data Encryption:                supported
**** 13:05:42 Restart log table 'FINANCE.SOURCE_EMP_TABLE_LOG' exists from previous job
**** 13:05:42 This job is restarting

     ===================================================================
     =                                                                 =
     =                   Special Session Connection                    =
     =                                                                 =
     ===================================================================

**** 13:05:46 Number of sessions adjusted due to TASM:      16

              Instance Assigned Connected Result
              ======== ======== ========= ======================
                  1        16       16    Successful
              ======== ======== ========= ======================
                Total      16       16    Successful

     ===================================================================
     =                                                                 =
     =                 Target/Error Table Information                  =
     =                                                                 =
     ===================================================================

**** 13:05:46 This job will use the following tables:

              Target Table:  'FINANCE.SOURCE_EMP_TABLE'
              Error Table 1: 'FINANCE.SOURCE_EMP_TABLE_ET1'
              Error Table 2: 'FINANCE.SOURCE_EMP_TABLE_ET2'

**** 13:05:46 Number of rows sent, last job: 6

     ===================================================================
     =                                                                 =
     =                        Application Phase                        =
     =                                                                 =
     ===================================================================

**** 13:05:46 Application Phase Begin

              Statistics for Target Table: 'FINANCE.SOURCE_EMP_TABLE'
              Total Rows Sent To RDBMS:      6
              Total Rows Applied:            4
              Total Possible Duplicate Rows: 2  --> duplicates are just reported

**** 13:05:47 Application Phase End

     ===================================================================
     =                                                                 =
     =                   Load Operator Task Cleanup                    =
     =                                                                 =
     ===================================================================

     Target Table 1: 'FINANCE.SOURCE_EMP_TABLE'
     Number of rows  Error Table Name
     ==============  ===============================================
                  1  'FINANCE.SOURCE_EMP_TABLE_ET1' --> 1 bad record. Had there been two bad records the job would have failed
                  0  'FINANCE.SOURCE_EMP_TABLE_ET2'

**** 13:05:48 Restart log table 'FINANCE.SOURCE_EMP_TABLE_LOG' has been dropped

     ===================================================================
     =                                                                 =
     =                        Logoff/Disconnect                        =
     =                                                                 =
     ===================================================================

**** 13:05:48 Logging off all sessions

              Instance      Cpu Time
              ========  ================
                   1        1.50 Seconds

**** 13:05:50 Total processor time used = '1.5 Second(s)'
     .        Start : Tue Nov 12 13:05:41 2013
     .        End   : Tue Nov 12 13:05:50 2013
     .        Highest return code encountered = '0'.
**** 13:05:50 This job terminated


No comments:

Post a Comment