Link to home
Start Free TrialLog in
Avatar of Lou Dufresne
Lou DufresneFlag for United States of America

asked on

MS Access 2003 - Random Sample

I need to obtain a random sample of approx 20 records from a table of about 200 people. How would I use a query to obtain a random sample of about 20 people.

Is there any other way to obtain a random sample?

LAD
ASKER CERTIFIED SOLUTION
Avatar of Graham Mandeno
Graham Mandeno
Flag of New Zealand image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
<<select top 20 * from table1 order by rnd(id)>>
This is the method I would have used... up until last week.  But as CactusData has pointed out, it does not work as expected.
The reason is that when you call a built-in function from a query, it executes in a different scope from the VBA code in the database project.  That means that there is no way to call Randomize to initialize the random number seed.
What this means is that if you close Access and then reopen it and run your query again, you will get the same "random" sample as last time.
If you use your own function, you can overcome this problem.
[Thanks to CactusData and JDettman for pointing this out]
--
Graham
There are many ways to get a "Random sampling"

It all depends of How "random", it needs to be...

For example, I use something similar to what andre23 uses, and it works fine for me and my purposes.

Techniques like what Graham are a bit more refined, and slicker regarding how they determine the "randomness" of the sample.

I have even seen systems where the exact time in milliseconds is added into the mix to further "randomize" the sample.

It all depends on how far you want to take it.

;-)

JeffCoachman
Jeff, it's not a question of more or less refined but if you wish to prevent the same sample to be picked whenever you (re)launch Access and run the query the first time.

With the simple method you will get the exact same sample each time (given the same data, of course).
With the method with the external function you will get different samples within the limitations of the Rnd function (this is another and lengthy discussion mostly for true statisticians)

You may conclude, that for a one time selection of samples - running Access in one session without closing it down and relaunch it - the simple query will provide perfect results. For any other purposes, the query with the external function should be used.

/gustav
Hi Gustav
Thank you very much for bringing this matter to light!  
I have been using Access since V1.0 and have been using Rnd() in queries since V2.0 when VBA was introduced.  Occasionally clients and others have reported intermittent "strange" behaviour, but I never really nailed it down, so I thought they must have been imagining things.  I realise now this is partly due to the fact that I almost always have Access open and this seems to affect the behaviour when another database is opened - perhaps the two instances share the same Jet scope or something.
Anyway, I am now a complete convert.  Who says you can't teach an old dog new tricks? :-)
I hope you don't mind me modifying your function as posted by Jim Dettman.  I thought it better not to require the SQL argument to be evaluated as a boolean False, but just to use a dummy argument, and have a secong argument for the rare occasions when a reseed was required.
I'm attaching a little database that demos the problem and the solution.  I hope someone out there finds it useful!
--
Graham

RandomQueriesDemo.mdb
No problem! Feel free to modify as you wish to fit whatever purpose.
The original function was provided to be as simple as possible.

The really funny thing which caught my attention is, that the sequence of random values is fixed - when the last number is reached, the sequence starts over. So what happens when Randomize is called, is just that you are put somewhere in the sequence and start from that point.

/gustav
cactus_data,

Yes I remember having this discussion before.
;-)

Again, my post clearly stated:
  "for me and my purposes"

For example, I do not use this on the same table every time, nor do I ask for the same number of random records.
I simply use this every so often whenever I need to get a quick set of random records for comparison, form various tables.
So, I guess I should have made this clear...
:-)

However, for random samples from the same table, I do use the techniques like the one you advocate.
;-)

So, not knowing the full scope of what the asker was looking for, I merely presented this as just one of many alternatives.

;-)

Jeff
Avatar of Lou Dufresne

ASKER

Thanks everyione for you assistance and the learning experience

LAD