onesegun
asked on
Select random records
Hi Experts,
I have this piece of SQL script which I use to select a random sample of 5% of records from the whole data set.
I've looked in the Aginity boards and documentation but I can't find how to select say random sample of 200 records rather than percentage.
Thanks,
OS
I have this piece of SQL script which I use to select a random sample of 5% of records from the whole data set.
I've looked in the Aginity boards and documentation but I can't find how to select say random sample of 200 records rather than percentage.
SELECT *
FROM table_a
WHERE Random() <= 0.05;
Thanks,
OS
what randomness test must the subset pass?
Is this a one time exercise or would the be done multiple times?
If the tables you are selecting from are distributed on random, then you have a sort of built in randomness to the data. You can then use a query like this:
select * from table_a where datasliceid=5 limit 200;
You can use any number for the data slice, but the records would tend to be the same ones if you used the same dataslice repeatedly. They wouldn't always be the same, but it would be close.
You could also use a method where you do something like this:
create table table_a_random as select * from table_a distribute on random;
select * from table_a where datasliceid=5 limit 200;
This would make it a little more random and less likely to return the same rows.
If the tables you are selecting from are distributed on random, then you have a sort of built in randomness to the data. You can then use a query like this:
select * from table_a where datasliceid=5 limit 200;
You can use any number for the data slice, but the records would tend to be the same ones if you used the same dataslice repeatedly. They wouldn't always be the same, but it would be close.
You could also use a method where you do something like this:
create table table_a_random as select * from table_a distribute on random;
select * from table_a where datasliceid=5 limit 200;
This would make it a little more random and less likely to return the same rows.
Compare with this comment using access database. table a(aid, f1)
1
A query to make a table say random_make_table:
SELECT a.aid, a.f1, Int(Rnd(a.f1)*1000) AS rnd INTO random_make_table
FROM a;
2
SELECT top 10 *
FROM random_make_table
ORDER BY random_make_table.[rnd];
1
A query to make a table say random_make_table:
SELECT a.aid, a.f1, Int(Rnd(a.f1)*1000) AS rnd INTO random_make_table
FROM a;
2
SELECT top 10 *
FROM random_make_table
ORDER BY random_make_table.[rnd];
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
In MySQl: Table children (cid, cname), select 5% of records.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Doh!!!!!!!!!!!!!!!!!!!!!!! !!!!!!!!!! !!!!
Why didn't I think of using Limit!!
Thanks guys....
Why didn't I think of using Limit!!
Thanks guys....
SELECT TOP 200 *
FROM table_a
ORDER BY NEWID()
or
SELECT TOP 10 PERCENT *
FROM table_a
ORDER BY NEWID()