Showing posts with label Teradata SUM function. Show all posts
Showing posts with label Teradata SUM function. Show all posts

Sunday, 21 April 2013

ANSI and Teradata Function Equivalents - Part 3 - Cumulative SUM(CSUM) and CSUM with Grouping



Function: Cumulative Sum
Teradata Syntax:

SELECT
EMPLOYEEID,
HIREDATE,
SALARY,
CSUM(SALARY,HIREDATE)
FROM EMPLOYEE;

Employeeid
Hiredate
Salary
CSum(Salary,Hiredate)
8
1/1/1992
8000
8000
4
1/1/1996
4000
12000
7
6/1/1996
7000
19000
3
1/1/1997
3000
22000
6
3/1/1997
6000
28000
10
1/1/1998
10000
38000
2
1/1/1998
2000
40000
5
2/1/1998
5000
45000
1
1/1/1999
1000
46000
9
4/1/1999
9000
55000
11
5/1/1999
11000
66000
12
8/1/1999
12000
78000
144
4/20/2013
12345.11
90345.11
124
4/20/2013
12345.11
102690.2



In Teradata Syntax:

  • The 1st column in the brackets indicates the column to be Summed

  • Columns following in the 1st column in brackets indicate the columns used for sorting. BY default its ASCENDING.

ANSI syntax:

Following is the way we write ANSI syntax to get same result:

SELECT EMPLOYEEID,HIREDATE,SALARY,
SUM(SALARY) OVER (ORDER BY HIREDATE ROWS UNBOUNDED PRECEDING)
FROM EMPLOYEE;

 
Employeeid
Hiredate
Salary
Cumulative Sum(Salary)
8
1/1/1992
8000
8000
4
1/1/1996
4000
12000
7
6/1/1996
7000
19000
3
1/1/1997
3000
22000
6
3/1/1997
6000
28000
2
1/1/1998
2000
30000
10
1/1/1998
10000
40000
5
2/1/1998
5000
45000
1
1/1/1999
1000
46000
9
4/1/1999
9000
55000
11
5/1/1999
11000
66000
12
8/1/1999
12000
78000
144
4/20/2013
12345.11
90345.11
124
4/20/2013
12345.11
102690.2

In ANSI syntax:

  • The column do be used for summing up is enclosed in the brackets following keyword SUM.
Note that when we used RANK, we did not write any column in the brackets following the keyword RANK.

  • Column used for ordering is specified by the 'ORDER BY' clause. Defaulting sequence is ASCENDING.

  • ROWS UNBOUNDED PRECEEDING indicates that all the rows preceding the current row (including current row) are to be used for summation. This generate Cumulative Summation.




Function : Cumulative SUM with grouping
Teradata Syntax:
 
Following is the way we perform cumulative sum with grouping
(also called cumulative sum with reset)

SELECT EMPLOYEEID,HIREDATE,DEPARTMENTNO,SALARY,
CSUM(SALARY,HIREDATE)
FROM EMPLOYEE
GROUP BY DEPARTMENTNO;

Employeeid
Hiredate
DepartmentNo
Salary
CSum(Salary,Hiredate)
5
2/1/1998
100
5000
5000
1
1/1/1999
100
1000
6000
9
4/1/1999
100
9000
15000
6
3/1/1997
200
6000
6000
10
1/1/1998
200
10000
16000
2
1/1/1998
200
2000
18000
124
4/20/2013
200
12345.11
30345.11
7
6/1/1996
300
7000
7000
3
1/1/1997
300
3000
10000
11
5/1/1999
300
11000
21000
8
1/1/1992
400
8000
8000
4
1/1/1996
400
4000
12000
12
8/1/1999
400
12000
24000
144
4/20/2013
400
12345.11
36345.11

Group by indicates the column to be used for resetting the cumulative sum.
When the column changes ( in above example as the department changes ) the accumulated value is discarded.

Note that Column indicated by group by also indicates the major sorting sequence. The output is sorted by this column used for grouping.
ANSI syntax:

Following is how we use ANSI syntax to perform cumulative sum with reset


SELECT EMPLOYEEID,HIREDATE,DEPARTMENTNO,SALARY,
SUM(SALARY) OVER (PARTITION BY DEPARTMENTNO ORDER BY HIREDATE ROWS UNBOUNDED PRECEDING)
FROM EMPLOYEE;


Employeeid
Hiredate
DepartmentNo
Salary
Cumulative Sum(Salary)
5
2/1/1998
100
5000
5000
1
1/1/1999
100
1000
6000
9
4/1/1999
100
9000
15000
6
3/1/1997
200
6000
6000
10
1/1/1998
200
10000
16000
2
1/1/1998
200
2000
18000
124
4/20/2013
200
12345.11
30345.11
7
6/1/1996
300
7000
7000
3
1/1/1997
300
3000
10000
11
5/1/1999
300
11000
21000
8
1/1/1992
400
8000
8000
4
1/1/1996
400
4000
12000
12
8/1/1999
400
12000
24000
144
4/20/2013
400
12345.11
36345.11

PARTITION BY indicates the column(s) to be used for grouping.

In the above example as the DEPARTMENTNO changes the cumulative sum is reset.

Finding Individual contribution using SUM OVER (SUM window)


Finding Individual contribution using SUM OVER:

In the employee table we have individual salaries. However we may need to find out what percentage of total salary is an individual salary.

Following query would help us get that result:

select
employeeid,
DepartmentNo,
salary,
SUM(salary) over (PATITION BY departmentNo rows between unbounded preceding and unbounded following) as totsum,
(salary * 100.00 /totsum)
from employee;

Employeeid
Salary
totsum
((Salary*100.00)/totsum)
1
1000
102690.2
0.9738
2
2000
102690.2
1.9476
3
3000
102690.2
2.9214
4
4000
102690.2
3.8952
5
5000
102690.2
4.869
6
6000
102690.2
5.8428
7
7000
102690.2
6.8166
8
8000
102690.2
7.7904
9
9000
102690.2
8.7642
10
10000
102690.2
9.738
11
11000
102690.2
10.7118
12
12000
102690.2
11.6856
124
12345.11
102690.2
12.0217
144
12345.11
102690.2
12.0217

Thus above result tells us that salary of employee 124 is actually 12.02% of the total salaries.

We can use the concept to find how percentage of departmental salary is the individual salary.

select
employeeid,
DepartmentNo,
salary,
SUM(salary) over (PARTITION BY departmentNo rows between unbounded preceding and unbounded following) as totsum,
(salary * 100.00 /totsum)
from employee;


Employeeid
DepartmentNo
Salary
totsum
((Salary*100.00)/totsum)
1
100
1000
15000
6.6667
9
100
9000
15000
60
5
100
5000
15000
33.3333
10
200
10000
30345.11
32.9542
124
200
12345.11
30345.11
40.6824
6
200
6000
30345.11
19.7725
2
200
2000
30345.11
6.5908
3
300
3000
21000
14.2857
7
300
7000
21000
33.3333
11
300
11000
21000
52.381
4
400
4000
36345.11
11.0056
144
400
12345.11
36345.11
33.9664
8
400
8000
36345.11
22.0112
12
400
12000
36345.11
33.0168

From the above result we get information that salary of employee 5 is 33.333% of the departmental salary   .ie sum of salary of department 100 .ie 15000