Blog Archive

Sunday 21 April 2013

ANSI and Teradata Function Equivalents - Part 3 - Cumulative SUM(CSUM) and CSUM with Grouping



Function: Cumulative Sum
Teradata Syntax:

SELECT
EMPLOYEEID,
HIREDATE,
SALARY,
CSUM(SALARY,HIREDATE)
FROM EMPLOYEE;

Employeeid
Hiredate
Salary
CSum(Salary,Hiredate)
8
1/1/1992
8000
8000
4
1/1/1996
4000
12000
7
6/1/1996
7000
19000
3
1/1/1997
3000
22000
6
3/1/1997
6000
28000
10
1/1/1998
10000
38000
2
1/1/1998
2000
40000
5
2/1/1998
5000
45000
1
1/1/1999
1000
46000
9
4/1/1999
9000
55000
11
5/1/1999
11000
66000
12
8/1/1999
12000
78000
144
4/20/2013
12345.11
90345.11
124
4/20/2013
12345.11
102690.2



In Teradata Syntax:

  • The 1st column in the brackets indicates the column to be Summed

  • Columns following in the 1st column in brackets indicate the columns used for sorting. BY default its ASCENDING.

ANSI syntax:

Following is the way we write ANSI syntax to get same result:

SELECT EMPLOYEEID,HIREDATE,SALARY,
SUM(SALARY) OVER (ORDER BY HIREDATE ROWS UNBOUNDED PRECEDING)
FROM EMPLOYEE;

 
Employeeid
Hiredate
Salary
Cumulative Sum(Salary)
8
1/1/1992
8000
8000
4
1/1/1996
4000
12000
7
6/1/1996
7000
19000
3
1/1/1997
3000
22000
6
3/1/1997
6000
28000
2
1/1/1998
2000
30000
10
1/1/1998
10000
40000
5
2/1/1998
5000
45000
1
1/1/1999
1000
46000
9
4/1/1999
9000
55000
11
5/1/1999
11000
66000
12
8/1/1999
12000
78000
144
4/20/2013
12345.11
90345.11
124
4/20/2013
12345.11
102690.2

In ANSI syntax:

  • The column do be used for summing up is enclosed in the brackets following keyword SUM.
Note that when we used RANK, we did not write any column in the brackets following the keyword RANK.

  • Column used for ordering is specified by the 'ORDER BY' clause. Defaulting sequence is ASCENDING.

  • ROWS UNBOUNDED PRECEEDING indicates that all the rows preceding the current row (including current row) are to be used for summation. This generate Cumulative Summation.




Function : Cumulative SUM with grouping
Teradata Syntax:
 
Following is the way we perform cumulative sum with grouping
(also called cumulative sum with reset)

SELECT EMPLOYEEID,HIREDATE,DEPARTMENTNO,SALARY,
CSUM(SALARY,HIREDATE)
FROM EMPLOYEE
GROUP BY DEPARTMENTNO;

Employeeid
Hiredate
DepartmentNo
Salary
CSum(Salary,Hiredate)
5
2/1/1998
100
5000
5000
1
1/1/1999
100
1000
6000
9
4/1/1999
100
9000
15000
6
3/1/1997
200
6000
6000
10
1/1/1998
200
10000
16000
2
1/1/1998
200
2000
18000
124
4/20/2013
200
12345.11
30345.11
7
6/1/1996
300
7000
7000
3
1/1/1997
300
3000
10000
11
5/1/1999
300
11000
21000
8
1/1/1992
400
8000
8000
4
1/1/1996
400
4000
12000
12
8/1/1999
400
12000
24000
144
4/20/2013
400
12345.11
36345.11

Group by indicates the column to be used for resetting the cumulative sum.
When the column changes ( in above example as the department changes ) the accumulated value is discarded.

Note that Column indicated by group by also indicates the major sorting sequence. The output is sorted by this column used for grouping.
ANSI syntax:

Following is how we use ANSI syntax to perform cumulative sum with reset


SELECT EMPLOYEEID,HIREDATE,DEPARTMENTNO,SALARY,
SUM(SALARY) OVER (PARTITION BY DEPARTMENTNO ORDER BY HIREDATE ROWS UNBOUNDED PRECEDING)
FROM EMPLOYEE;


Employeeid
Hiredate
DepartmentNo
Salary
Cumulative Sum(Salary)
5
2/1/1998
100
5000
5000
1
1/1/1999
100
1000
6000
9
4/1/1999
100
9000
15000
6
3/1/1997
200
6000
6000
10
1/1/1998
200
10000
16000
2
1/1/1998
200
2000
18000
124
4/20/2013
200
12345.11
30345.11
7
6/1/1996
300
7000
7000
3
1/1/1997
300
3000
10000
11
5/1/1999
300
11000
21000
8
1/1/1992
400
8000
8000
4
1/1/1996
400
4000
12000
12
8/1/1999
400
12000
24000
144
4/20/2013
400
12345.11
36345.11

PARTITION BY indicates the column(s) to be used for grouping.

In the above example as the DEPARTMENTNO changes the cumulative sum is reset.

No comments:

Post a Comment