Sample Function
Feature:
Sample
allows user to specify a sample of results specified by the SELECT statement.
The
sample function has following additional features:
Stratified Sampling
|
Ability
to generate different size samples from different groupings of the data.
|
Replacement Sampling
|
Ability
to sample a row and return it to the pool of rows and making it candidate for
sampling.
|
Randomized Sampling
|
Ability
to sample randomly across the entire system instead of each AMP returning an
equal share of the sample.
|
For
this testing lets consider a table EMPLOYEE2 which has following data
distribution:
select departmentno,
count(*)
from employee2
group by
1 ;
DepartmentNo
|
Count(*)
|
100
|
7
|
200
|
7
|
300
|
6
|
400
|
8
|
Stratified sampling:
Following
is the example of stratified sampling. It allows us to take 25% samples of each
department.
Select employeeid,departmentno,sampleid
From employee2
SAMPLE
WHEN departmentno=100
then .25
WHEN departmentno=200
then .25
WHEN departmentno=300
then .25
WHEN departmentno=400
then .25
END ;
Employeeid
|
DepartmentNo
|
SampleId
|
132
|
100
|
1
|
9
|
100
|
1
|
10
|
200
|
2
|
6
|
200
|
2
|
152
|
300
|
3
|
3
|
300
|
3
|
144
|
400
|
4
|
136
|
400
|
4
|
Things
to note:
- The syntax of SAMPLE is very much similar to CASE statement. Each condition is specified using WHEN ..THEN and it ends with a END.
- Rounding off takes place. For department 300 there are only 6 rows in the table. 25% of that comes to 1.5 which is rounded off to 2 and hence we see 2 rows
Following
is another example with different percentages for each sample:
Select employeeid,departmentno,sampleid
From employee2
SAMPLE
WHEN departmentno=100
then .40
WHEN departmentno=200
then .25
WHEN departmentno=300
then .05
WHEN departmentno=400
then .35
END ;
Employeeid
|
DepartmentNo
|
SampleId
|
5
|
100
|
1
|
1
|
100
|
1
|
9
|
100
|
1
|
140
|
200
|
2
|
139
|
200
|
2
|
144
|
400
|
4
|
134
|
400
|
4
|
137
|
400
|
4
|
Things
to note:
- Note that the sample .05 of 6 rows is 0.3 which is not rounded to 1 and hence no row is returned for the department 3.
Thank you! It is really helpful
ReplyDelete