Tuesday 19 March 2013

OLAP - Part 2 - CSUM with reset logic (GROUP BY)


Cumulative Sums With Reset:

  1. A cumulative sum may be reset to zero at specified breakpoints.
     
  1. This is done with the help of ’GROUP BY’ statement. Group BY indicates that when value of column specified in ‘group by’ changes the cumulative sum should be reset.

  1. Also its very important to note that OLAP and standard aggregations (SUM, COUNT,AVG, MIN,MAX) are not compatible within the same query. Since OLAP and aggregate functions cant be used together , GROUP BY serves a separate purpose for each of the types of query.
     
  1. GROUP BY used in the Teradata CSUM is equivalent to PARTITION BY in the ANSI syntax(explained later).
     
  1. Example:

SELECT employeeid,salary,EXTRACT(year from hiredate),CSUM(salary,hiredate desc) from employee
GROUP BY EXTRACT(YEAR FROM hiredate);

Employeeid
Salary
Year
CSum(Salary,Hiredate DESC)

8
8000
1992
8000

7
7000
1996
7000
Reset happens here
4
4000
1996
11000

6
6000
1997
6000
Reset happens here
3
3000
1997
9000

5
5000
1998
5000
Reset happens here
2
2000
1998
7000

10
10000
1998
17000

12
12000
1999
12000
Reset happens here
11
11000
1999
23000

9
9000
1999
32000

1
1000
1999
33000


Note that we are using GROUP BY. Here the function of group by is different from what it does with aggregate functions. Here group by is used to provide partitioning logic.

3 comments:

  1. Sort sequence is 'DESC‘, why result is showing YEAR as ascending?
    What will result display if I append an 'ORDER BY' clause, like ORDER BY employeeid?

    ReplyDelete
  2. Very clear explanation !!!!!!!!!!!! Very usefull

    ReplyDelete
  3. How can you apply the grouping the other way around, how can you reset when the hireyear repits instead of when it changes?

    ReplyDelete