Saturday 16 March 2013

Correlated Sub-query


First we create sample data for understanding the correlated queries using following queries:

DATABASE YOURTESTDB;
CREATE TABLE EMPLOYEE
(
Employeeid integer,
DepartmentNo integer,
Salary decimal(8,2),
Hiredate date
);

Insert into EMPLOYEE VALUES (1,100,1000.00,DATE '1999-01-01');
Insert into EMPLOYEE VALUES (2,200,2000.00,DATE '1998-01-01');
Insert into EMPLOYEE VALUES (3,300,3000.00,DATE '1997-01-01');
Insert into EMPLOYEE VALUES (4,400,4000.00,DATE '1996-01-01');
Insert into EMPLOYEE VALUES (5,100,5000.00,DATE '1998-02-01');
Insert into EMPLOYEE VALUES (6,200,6000.00,DATE '1997-03-01');
Insert into EMPLOYEE VALUES (7,300,7000.00,DATE '1996-06-01');
Insert into EMPLOYEE VALUES (8,400,8000.00,DATE '1992-01-01');
Insert into EMPLOYEE VALUES (9,100,9000.00,DATE '1999-04-01');
Insert into EMPLOYEE VALUES (10,200,10000.00,DATE '1998-01-01');
Insert into EMPLOYEE VALUES (11,300,11000.00,DATE '1999-05-01');
Insert into EMPLOYEE VALUES (12,400,12000.00,DATE '1999-08-01');




Correlated subquery is where the outer query is processed a row at a time, against the subquery result. The subquery is executed for each row processed by the outer query.

Example 1 : Find the employee with maximum salary with in his department. Note that here we are not finding the total maximum, instead we are finding departmental maximums.

Select * from employee E where salary =(select max(salary) from employee EX where E.DepartmentNo = Ex.departmentNo);

Following is how the query works:
 
  1. Get 1 row from the employee table
     
  2. Using the departmentNo run the inner query to find the maximum salary for the department.
     
  3. Compare the salary in the outer row with the maximum salary from the subquery.
     
  4. If the values are equal the row is output.
     
  5. If the values are not equal then the row is not output.
     
  6. Repeat the same for next row

Answer set:

Employeeid        
DepartmentNo        
Salary
        Hiredate
9        
100        
9000.00        
4/1/1999
10        
200        
10000.00        
1/1/1998
11        
300        
11000.00        
5/1/1999
12        
400        
12000.00        
8/1/1999

Thus we can see that we have got employees with maximum salaries in their respective departments.

Example 2: Find employees whose salary is greater than their departmental averages.

Select * from employee e where salary > (select avg(salary) from employee ex where e.departmentno=ex.departmentno);

Employeeid        
DepartmentNo        
Salary
        Hiredate
9        
100        
9000.00        
4/1/1999
10        
200        
10000.00        
1/1/1998
11        
300        
11000.00        
5/1/1999
12        
400        
12000.00        
8/1/1999

Example 3: Find out the employee who joined latest in each department

Select * from employee e where hiredate = (select max(hiredate) from employee ex where e.departmentno=ex.departmentno);

Employeeid        
DepartmentNo        
Salary
        Hiredate
11        
300        
11000.00        
5/1/1999
12        
400        
12000.00        
8/1/1999
9        
100        
9000.00        
4/1/1999
10        
200        
10000.00        
1/1/1998
2        
200        
2000.00        
1/1/1998

Note that there are two rows output for departmentno = 200. The reason being that the maximum hire date for this department is 1/1/1988 which is the hire date for two of its employees and hence both of them qualify in the output.






NOT IN and EXISTS:


Given a query where we need to find employees without a valid department we would normally use a traditional subquery as follows:

SELECT * FROM EMPLOYEE WHERE DEPARTMENTNO NOT IN (select DISTINCT DEPARTMENTNO FROM DEPARTMENT);

We can get similar result using the EXISTS with correlated Subqueries as follows:

SELECT * FROM EMPLOYEE E WHERE NOT EXISTS (SELECT * FROM DEPARTMENT D WHERE E.DEPARTMENTNO=D.DEPARTMENTNO);

No comments:

Post a Comment