Teradata Learning - Data Definition (02/21/2013)


DDL : Data Definition Language: Used to create , modify and remove object definitions.

Following are DDL statements :

Object:
DDL Statements:
Databases
CREATE DATABASE
MODIFY DATABASE
DROP DATABASE
Users
CREATE USER
MODIFY USER
DROP USER
Tables
CREATE TABLE
ALTER TABLE
DROP TABLE
Views
CREATE VIEW
REPLACE VIEW
DROP VIEW
Macros
CREATE MACRO
REPLACE MACRO
DROP MACRO
Indexes
CREATE INDEX
DROP INDEX
Triggers
CREATE TRIGGER
DROP TRIGGER
Stored Procedures
CREATE PROCEDURE
DROP PROCEDURE


Note that
  • USER's and DATABASE's get modified.
  • VIEW's and MACRO's get replaced.
  • TABLE's get altered.
  • Rest need to be dropped and recreated.



Create Table statement:

Used to create and store table definition in Data dictionary.

The CREATE TABLE statement allows you to create:
  • TABLE options.
  • COLUMN definitions.
  • TABLE level constraints.
  • INDEX definitions.


CREATE <SET/MULTISET> TABLE employee
     <Create Table Options>
     <Column Definitions>
     <Index Definitions>;

Note that we specify table level options followed by column definitions , which is in turn followed by index definitions. Constraints can be specified a table level or column levels.

Create table Options:

Table level options can be :

  1. Duplicate row options
   -  SET        no duplicate rows allowed (Default). Teradata follows SET theory and hence tables are SET by default
   -  MULTISET   duplicate rows allowed. This is default when mode is ANSI.

  1. Table protection options (Underlines indicate defaults)
   -  FALLBACK or NO FALLBACK PROTECTION
   -  BEFORE JOURNAL (NOSINGLE or DUAL)
   -  AFTER JOURNAL (NOSINGLE (LOCAL or NOT LOCAL) or DUAL)
   -  WITH JOURNAL TABLE (TABLENAME)

FALLBACK specifies that the system builds a duplicate copy of each row of the table and stores it on a different (FALLBACK) AMP within the cluster.

JOURNAL means the type of journal that is maintained. JOURNAL can be BEFORE or AFTER. It can be single or dual.
JOURNAL TABLE indicates the table where JOURNAL records are maintained.

  1. Space Management Options (Defaults are provided if not specified)
    • FREESPACE - % cylinder freespace to maintain during loading operation
    • DATABLOCKSIZE - maximum block size for multi-row data blocks


Column Definitions:

CREATE TABLE can have upto 256 columns. ( This value changes with every release)

Following is what we define for each column:

Column Name
Name of the column
Data Type
Column Data type.

Data type can be any of the following:

Character: CHAR, VARCHAR, CHAR VARYING, LONG VARCHAR
Numeric: INTEGER, SMALLINT,BYTEINT,REAL,FLOAT,DOUBLE,DECIMAL,DATE
BYTE: BYTE, VARBYTE
Data Type attributes
DEFAULT --> Specify user defined Default value in place of NULL
WITH DEFAULT --> Specify system defined Default value in place of NULL.
This converts to a DEFAULT phrase in which the default system value for that data type becomes the value to use in place of null.
Example:
Spaces for character string.
0's for numeric string.

FORMAT --> Default display format
TITLE --> Provide column title
NOT NULL --> This indicates that NULL's are not allowed in the column.
CASESPECIFIC --> stores data as entered. It distinguishes smaller case of upper case.
UPPERCASE --> Converts to upper case to store data
Column Storage attributes
Compress allows one or more values in a column of a permanent table be compressed to 0 space.

The COMPRESS phrase has three variations:

Variation:
What happens:
COMPRESS
Nulls are compressed.
COMPRESS NULL
Nulls are compressed.
COMPRESS <constant>
Nulls and the specified <constant> value are compressed.

NOTE: COMPRESS & COMPRESS NULL mean the same thing.

Ex: ACCOUNT_TYPE CHAR(10) COMPRESS 'SAVINGS' --> Both null and 'Savings' account types will be suppressed. The value of 'Savings' is written in the table header on each AMP in the system.
Column level constraint attributes
We can define column level constraints to add uniqueness (UNIQUE) or checks for valid values , Primary key, Foreign keys etc



Primary Key and Primary Index:

Primary Key:
 
  • Is defined as one or more columns that uniquely identify each row in table.
  • Used with Foreign keys to create table relations.
  • PK must always be unique.
  • PK cannot be NULL

Internally PK is implemented as a unique index (primary or Secondary)

Primary Index:
  • Is defined as one or more columns that are used to distribute and locate rows in table.
  • Index can be unique or non unique
  • Index values can be NULL
  • Every table can have only one primary index.

Indexes may be: 

Primary 
Secondary 
UNIQUE
UPI
USI
NON-UNIQUE
NUPI
NUSI

Following shows the number of AMP's used when using indexes to query a table:


Number of AMPS 
Number of ROWS
UPI
1
0 or 1
NUPI
1(a PI operation irrespective of it’s a unique index or a non-unique index always is a 1 AMP operation)
 Between 0-N  
 where N = any number > 0 
USI
2
Between 0-1
NUSI
All
Between 0-N
Full table scan 
All
Between 0-N


Oftentimes, but not always, the Primary Index and Primary Key are the same. 


Example of CREATE Statement:
CREATE TABLE MJ1.emp_data,FALLBACK,
    NO BEFORE JOURNAL,
    NO AFTER JOURNAL,
    (
    employee_number INTEGER NOT NULL,
    department_number SMALLINT,
    job_code INTEGER COMPRESS ,
    last_name CHAR(20) NOT CASESPECIFIC NOT NULL,
    first_name VARCHAR(20) NOT CASESPECIFIC,
    street_address VARCHAR(30) NOT CASESPECIFIC TITLE 'Address',
    city CHAR(15) NOT CASESPECIFIC     DEFAULT 'Boise     ' 
                                                              COMPRESS 'Boise     ',
    state CHAR(2) NOT CASESPECIFIC DEFAULT ' '
    birthdate DATE FORMAT 'mm/dd/yyyy',
    salary_amount DECIMAL(10,2),
    sex CHAR(1) UPPERCASE NOT CASESPECIFIC)

UNIQUE PRIMARY INDEX ( employee_number ) --> Unique index.
INDEX ( department_number ); --> Non Unique Secondary index.

For INDEX definition,
  • if UNIQUE is not explicit, then the default is a non-unique index.
  • if PRIMARY is not explicit, then the default is a secondary index.



DROP Table :

Used to remove all data associated as well as table structure from data dictionary.
Example:

DROP TABLE emp_data;

This deletes the data, Removes table definition from DD and also removes all explicit rights on the table.

DELETE Statement:

To remove all data associated with a table, without dropping the table definition from the Data Dictionary, use the DELETE statement.

3 valid syntax are:

DELETE FROM emp_data ALL;
DELETE FROM emp_data;
DELETE emp_data; --> without from clause is also correct.

Note that DELETE * FROM  is not correct.

Table definition is not dropped and access rights remain unchanged.


ALTER TABLE statement:

Characteristics like primary index choice, set-multiset are not alterable. To change this we need to recreate the table and populate it.

Other characteristics are alterable:

  1. Adding columns

ALTER TABLE emp_data
ADD salary decimal(10,2)
ADD Saltype char(1)

Note that we don’t use the keyword COLUMN which adding column

  1. Dropping Columns

ALTER TABLE emp_data
DROP salary,saltype;

  1. Changing attributes on existing column

ALTER TABLE emp_data
ADD salary format 'zzzz9.99';

Note that column salary already exists. We are just adding the attribute -Format

  1. Changing FALLBACK, NO FALLBACK

ALTER TABLE emp_data, NO FALLBACK;

Note that we cant alter a multiset table to be set or a set table to be a multiset.
We need to drop and recreate in such situations.

Note that all the above mentioned changes can be added in same ALTER command.




Creating Secondary index:

Indexes can be defined in two ways:
  1. Using CREATE INDEX statement
  2. Using CREATE TABLE while table creation.

Primary index will always be created at the time of table creation irrespective of whether we do or do not specify the Primary index explicitly.

Only secondary indexes can be created using CREATE INDEX statement after the table is created.

The Secondary index created can be named or unnamed.

Example of Named index: CREATE UNIQUE INDEX FULLNAME (last_name,first_name) on TESTTABLE;

Here the index is unique and it is named FULLNAME.

Example of unnamed index: CREATE INDEX (salarycode) ON test_table;

Above is an unnamed index.

Note that in either case the columns forming the index should be specified in parentheses.


Help INDEX:

HELP index < table-name> provided  details of all the indexes on a table.
For unnamed indexes the column "Index Name" will be NULL.

Dropping Indexes:
If you are dropping a named index, you can specify the index by either the index name or its column definition.
If you are dropping an unnamed index, you must specify the index by naming the columns which are associated with it.
Example
Delete all secondary indexes from the employee table.
DROP INDEX       FullName       On   emp_data; --> Using Name
DROP INDEX       (job_code)       On   emp_data; --> Dropping unnamed index using column names

No comments:

Post a comment