Random Function:
Random
function can be used to generate random numbers between specified range.
Syntax: RANDOM
(lower-limit, upper-limit).
The two
numbers provided must be integers. Decimals are not allowed.
Numbers
should be greater than 1.
SELECT
EMPLOYEEID, random(1,10) FROM employee;
Employeeid
|
Random(1,10)
|
8
|
7
|
2
|
8
|
6
|
6
|
4
|
6
|
12
|
10
|
1
|
7
|
3
|
4
|
10
|
10
|
7
|
3
|
11
|
7
|
9
|
10
|
5
|
4
|
Random
function is invoked for each row processed, thus duplicate are possible .
We can decrease the chances of getting duplicates by
increasing the size of the pool (by increasing the random interval)
Rules for using Random
functions:
- RANDOM can be used in a SELECT list or WHERE clause, but not in both
SELECT EMPLOYEEID, random(1,10) FROM employee where random(1,10)
> 5;
SELECT Failed. 5522: The
RANDOM function can not be used in more than one SELECT query clauses.
- RANDOM is not ANSI standard.
- RANDOM cannot be referenced by numeric position in a GROUP BY or ORDER BY clause
Creating Samples using
RANDOM :
We can
get samples using SAMPLE .
Say we
need 50% sample of row having salary lesser than 10000 bucks.
We
could use SAMPLE as shown below:
SELECT * FROM EMPLOYEE WHERE SALARY <
10000 SAMPLE .5;
Another
way to get a 50 % sample using RANDOM is as follows:
SELECT * FROM EMPLOYEE WHERE SALARY < 10000 AND
RANDOM(1,2) < 2;
However
this technique is approximate and does not always give 50% sample.
If you
run the same query multiple times you may get varying number of rows.
The
reason being the random interval is very small.
To get
a better consistent result we should use a bigger Interval as shown below
SELECT * FROM EMPLOYEE WHERE SALARY < 10000 AND
RANDOM(1,1000) < 501;
Problem
of using sample is that it can use used once in the query. But random can be used multiple times in a the same select and
hence we can produce multiple samples , each having a different criteria.
SELECT * FROM EMPLOYEE WHERE
(SALARY >
50000 AND RANDOM(1,1000) < 501)
OR
(SALARY >
30000 AND SALARY < 50000 AND RANDOM(1,1000) < 701)
OR
(SALARY >
5000 AND SALARY < 30000 AND RANDOM(1,1000) < 301)
;
Above
query gives is
50%
sample of employees having salary > 50000
70%
sample of employees having salary > 30000 and salary < 50000
30%
sample of employees having salary > 5000 and salary < 30000
Locating string position
within a string using- POSITION and
INDEX functions:
Both
the functions POSITION and INDEX are used for finding the position of a string
with in a string.
POSITION
is ANSI standard.
INDEX
is Teradata Standard.
Both
the functions need to arguments- One is the string to search in and other is
the string to be searched.
Syntax:
SELECT POSITION('GOD' IN 'LIKE A GOD');
--> Result is 8
SELECT
INDEX('GOD','LIKE A GOD'); --> Result
is 8
Care
needs to be taken when using these functions in ANSI mode. Text comparisons are
case sensitive in ANSI mode.
No comments:
Post a Comment