Showing posts with label Teradata RANK. Show all posts
Showing posts with label Teradata RANK. Show all posts

Sunday, 21 April 2013

ANSI and Teradata Function Equivalents - Part 2 - RANK with Grouping





Function : RANK with Groupings
Teradata Syntax:

We use the following Teradata syntax perform ranking with groups

Following query gives employee with highest salary with department.

SELECT EMPLOYEEID,DEPARTMENTNO,SALARY,RANK(SALARY) AS RNK
FROM EMPLOYEE
GROUP BY DEPARTMENTNO
QUALIFY RNK = 1;

Employeeid
DepartmentNo
Salary
RNK
9
100
9000
1
124
200
12345.11
1
11
300
11000
1
144
400
12345.11
1

Thus in Teradata Syntax:
 
  • Column used by ranking/ordering is specified within brackets.
     
  • Ordering is DESC is default. .ie Highest value gets rank 1

  • GROUP BY clause creates the groupings. This column also forms a major sort sequence.


    ANSI syntax:

    We can get the same result by making use of ANSI syntax as follows:

    SELECT EMPLOYEEID,DEPARTMENTNO,SALARY,
    RANK() OVER (PARTITION BY DEPARTMENTNO ORDER BY SALARY DESC) AS RNK
    FROM EMPLOYEE
    QUALIFY RNK = 1;

    Employeeid
    DepartmentNo
    Salary
    RNK
    9
    100
    9000
    1
    124
    200
    12345.11
    1
    11
    300
    11000
    1
    144
    400
    12345.11
    1


    Thus in ANSI syntax:

    • Column to be used by ordering is specified by ORDER BY clause.

    • Ordering is ASC by default. We need to specify DESC in order to make highest value get rank 1.

    • PARTITION BY is used to indicate the column to be used for grouping. This column also forms a major sort sequence.

ANSI and Teradata Function Equivalents - Part 1- RANK


Each Teradata OLAP function that we have seen have ANSI equivalents.
ANSI equivalents are sometimes tedious to write.

Following Table shows the Teradata OLAP functions and its ANSI equivalent.

Function: RANK
Teradata  Syntax:

Following is how we write the Teradata RANK function:

SELECT EMPLOYEEID,SALARY, RANK (SALARY) FROM EMPLOYEE;

Employeeid
Salary
Rank(Salary)
144
12345.11
1
124
12345.11
1
12
12000
3
11
11000
4
10
10000
5
9
9000
6
8
8000
7
7
7000
8
6
6000
9
5
5000
10
4
4000
11
3
3000
12
2
2000
13
1
1000
14

The Column name in the brackets indicate the column to be  used for ranking and ordering data.
In this syntax the highest value gets rank 1 by default.

Thus in Teradata Syntax:
  • Column used by ranking/ordering is specified within brackets.
     
  • Ordering is DESC is default. .ie Highest value gets rank 1


ANIS Syntax:

Following is how we write the RANK function in ANSI:

SELECT EMPLOYEEID,SALARY,
RANK() OVER (ORDER BY SALARY DESC) FROM EMPLOYEE;
 
Employeeid
Salary
Rank(Salary)
144
12345.11
1
124
12345.11
1
12
12000
3
11
11000
4
10
10000
5
9
9000
6
8
8000
7
7
7000
8
6
6000
9
5
5000
10
4
4000
11
3
3000
12
2
2000
13
1
1000
14

The column specified by ORDER BY indicates the column to be used for ranking.

Note the word DESC. DESC indicates that ordering is Descending .ie highest value gets 1st rank.

Note that this is by default in Teradata syntax, but not in ANSI syntax.
So we need to add DESC in the syntax.

if we don't add DESC, then default in ANSI syntax is ASC, which means the lowest value gets rank 1 and highest value gets highest rank.

Ex:

SELECT EMPLOYEEID,SALARY,RANK() OVER (ORDER BY SALARY/*BY DEFAULT ORDERING IS ASC*/) FROM EMPLOYEE;

Employeeid
Salary
Rank(Salary ASC)
1
1000
1
2
2000
2
3
3000
3
4
4000
4
5
5000
5
6
6000
6
7
7000
7
8
8000
8
9
9000
9
10
10000
10
11
11000
11
12
12000
12
144
12345.11
13
124
12345.11
13

Thus in ANSI syntax:

  • Column to be used by ordering is specified by ORDER BY clause.

  • Ordering is ASC by default. We need to specify DESC in order to make highest value get rank 1.

Wednesday, 27 March 2013

OLAP - Part 7 - RANK with Qualify and Group BY


RANK with ‘Qualify’ and ‘GROUP BY’:

  1. The QUALIFY clause allows restriction of which rankings will be output in the final result.

  1. QUALIFY performs like the HAVING clause by requesting a specific range in the output.

  1. Example 1:

select employeeid,salary,rank(salary) from employee
qualify rank(salary) < 4

Employeeid
Salary
Rank(Salary)
12
12000
1
11
11000
2
10
10000
3

Note that the Qualify restricts the result only to the rows with rank lesser than 4.

  1. Example 2 :

select employeeid ,DepartmentNO, salary,rank(salary) from employee
group by DepartmentNO
qualify rank(salary) < 3

Employeeid
DepartmentNo
Salary
Rank(Salary)
9
100
9000
1
5
100
5000
2
10
200
10000
1
6
200
6000
2
11
300
11000
1
7
300
7000
2
12
400
12000
1
8
400
8000
2

Above query restricts the result to 1st two rank for each department.

Note that here one more feature is being used . ie GROUP BY.
The GROUP BY isn't doing an aggregation. It is actually changing the scope of the query. It also causes an ascending sort.

The default sort sequence is descending by ranking column .
Due to GROUP BY, the sort is by salary ascending within department. With in a given department the sort is descending by the Ranked column .ie salary.

The GROUP BY clause may be used in conjunction with a RANK function to change the scope of the ranking. 

Without a GROUP BY clause, the scope defaults to the RANK column.
In the above example 1 the scope is the ranking column itself.

As shown in Example 2 ‘GROUP BY” adds a major sorting column as well.
Ascending Sort is performed first on the column in GROUP BY and then sorted descending on the ranked column.


  1. Sometimes we may need to use low order ranking .For example: To find out lowest 3 salaries .This is accomplished by changing the default order of the rank column from descending to ascending.

For this we use RANK(column ASC).
         Example:


select employeeid,DepartmentNO,salary,rank(salary asc) from employee

qualify rank(salary asc ) < 4

Employeeid
DepartmentNo
Salary
Rank(Salary ASC)
1
100
1000
1
2
200
2000
2
3
300
3000
3



The QUALIFY clause <= 4 means produce a list of the ‘first 4’ whatever the sequencing of the list. Qualify Does not look at the absolute mathematical value. Instead it looks only at the rank value irrespective of whether ranking uses descending(default) ordering or ascending ordering.
So if we are using rank with ascending sort on the column then it produces lowest 3 records.

  1. Beware of below query

select employeeid,DepartmentNO,salary,rank(salary asc) from employee

qualify rank(salary) < 4

The select part contains RANK with ascending for ordering, which means the lowest value will get 1st rank. However qualify does not use ascending . So while returning back the row it will return 1st 3 rows with higher salary and their ranks.

Employeeid
DepartmentNo
Salary
Rank(Salary ASC)
12
400
12000
12
11
300
11000
11
10
200
10000
10

To understand this better we can query as follows

select employeeid,
          DepartmentNO,
          salary,
          rank(salary),
          rank(salary asc) from employee

qualify rank(salary) < 4

Employeeid
DepartmentNo
Salary
Rank(Salary)
Rank(Salary ASC)
12
400
12000
1
12
11
300
11000
2
11
10
200
10000
3
10

Now we can see that the column used for qualifying is the 4th column.

  1. Though with RANK we get a default ordering either by the ranking column or GROUP BY column, We can use ORDER BY to change the display ordering. Note that this ordering is applied at the end after the result set is ready.

When order by clause is used, the ORDER BY becomes the major sort in the results list.

select employeeid,DepartmentNO,hiredate,salary,rank(salary) from employee
order by hiredate;

Employeeid
DepartmentNo
Hiredate
Salary
Rank(Salary)
8
400
1/1/1992
8000
5
4
400
1/1/1996
4000
9
7
300
6/1/1996
7000
6
3
300
1/1/1997
3000
10
6
200
3/1/1997
6000
7
10
200
1/1/1998
10000
3
2
200
1/1/1998
2000
11
5
100
2/1/1998
5000
8
1
100
1/1/1999
1000
12
9
100
4/1/1999
9000
4
11
300
5/1/1999
11000
2
12
400
8/1/1999
12000
1