Solved

Pull a set percentage of random data from an Access table to audit.

Posted on 2011-03-24
493 Views
I have an access table that I create daily.  I need to be able to randomly select 3% of the records for a random quality audit.  the number of total records for each day will change but the random pull will be set at 3%.
0
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

• Help others & share knowledge
• Earn cash & points
• 9
• 6
• 6
• +1

LVL 75

Expert Comment

ID: 35210336
Here is an example:

SELECT TOP 3 PERCENT tblEmp.EmpName
FROM tblEmp
ORDER BY Rnd(Int(Now()*[tblEmp].[EmpID])-Now()*[tblEmp].[EmpID]);

mx
0

Author Comment

ID: 35210428
Hi Mx,  this works good but one issue.  Total number of records is 1369.   3% should be 41 or 42 depending on the rounding.  The query is returning 43 records.  ?
0

LVL 75

Accepted Solution

DatabaseMX (Joe Anderson - Microsoft MVP, Access and Data Platform) earned 500 total points
ID: 35210475
Because if you have duplicates ... then TOP will do that. Per Help File - example:

If you set the TopValues property to 3 with a descending sort on the Sales field, Microsoft Access returns the following four records.

Sales Salesperson
90,000 Leverling
80,000 Peacock
70,000 Davolio
70,000 King

So ... yop may need to GROUP BY to avoid duplicates ...

mx

0

LVL 57

Expert Comment

ID: 35210494
It's also got one other problem; because of the way Access works with VBA, you'll get the same "random" records each time you restart Access.

You need to modify it a bit by calling an external function.  Past the following into a module:

Public Function RandomNumber( _
Optional dummy as Variant, _
Optional ByVal Reseed As Boolean) _
As Single
Static Randomized As Boolean
If Reseed Or Not Randomized Then
' A new seed is requested or this is the first run.
Randomize
Randomized = True
End If
' Generate and return a random number.
RandomNumber = Rnd()
End Function

Then use the following SQL:

SELECT TOP 3 PERCENT  tblEmp.EmpName FROM tblEmp ORDER BY RandomNumber([EmpName]);

JimD.

0

LVL 75

Expert Comment

ID: 35210519
"It's also got one other problem; because of the way Access works with VBA, you'll get the same "random" records each time you restart Access."

I don't think so.  The Now() function handles that.

mx
0

LVL 57

Expert Comment

ID: 35210570
@mx

<<I don't think so.  The Now() function handles that.>>

Been down this road myself; try it<g>

1. Execute the query.  Note the records returned.
2. Close Access and re-open.
3. Execute the query.  Note that you get the same records.

JimD.
0

LVL 75

Expert Comment

ID: 35210669
"try it<g>"
Been There, Done That :-)
Here you go ... 1 2 3 times run-close-open-run-close-open-run

Do they look the same to you ?
Capture1.gif
Capture2.gif
Capture3.gif
0

LVL 57

Expert Comment

ID: 35210758
Well I'm not remembering something right then.  Quite a few have been caught by the fact that a query runs in a different space then VBA, so it's imposssible to call Randomize and get a correct result.  Here's the old thread where Catus Data and Graham where discussing it, although it appears pieces of the thread are missing as I'm mentioned, but I don't see any comments from myself.

http://www.experts-exchange.com/Software/Office_Productivity/Office_Suites/MS_Office/Word/Q_24845989.html

All I know is I feel pray to this once and Catus Data was the one that set me straight.  I tired every way possible and could not get a unique result if I restarted access.  I use the function above anytime I pull random records because of that.

I wonder if this is something that's changed in a latter version.  Looks like you tested in 2007 or 10.  The thread was working with 2003.  I'll have to check into that.  If not, somethings different that I'm not taking into account because obviosuly from your test, it works as you posted.

JimD.
0

LVL 75

Expert Comment

ID: 35210853
"Catus Data was the one that set me straight.  "
Kind of like he is trying to do with the Age calc thing :-)

The screen shot I posted were just run in A2003.  In the tread you posted, the Now() is not present, which is why this works.  In fact, I do not see how it could not work.

This method does however depend on using a numeric field (unique I believe - not sure) in order to work with Now().  So, doing so on the EmpName field - no can do.

mx

0

LVL 57

Expert Comment

ID: 35210943
<<In the tread you posted, the Now() is not present, which is why this works.  In fact, I do not see how it could not work.>>

Yeah, I just tried it here and that is the difference.  With just a Rnd([FieldName]), you get what I described. Including the Now() ends up giving you random records.

<<This method does however depend on using a numeric field (unique I believe - not sure) in order to work with Now().  So, doing so on the EmpName field - no can do.>>

Maybe that's why I ended up with the function, because it works in all situations.  But then again, I don't remember anyone talking about using Now() like that, and this has come up several times here and in other forums.  In fact I vaguely remember trying to work with now() when Gustav caught me on this and could not get it to work.

Sorry for the side track...

JimD.

P.S.

<<Kind of like he is trying to do with the Age calc thing :-)>>

He will never get me to agree that someone born on the 29th of Feb should get a birthday card on the 28th in non-leap years.  Someone born on Feb 28th at 12:01 shares a birthday with someone who was born 24 hours latter?  Just doesn't make sense.
0

LVL 75

Expert Comment

ID: 35210963
"Just doesn't make sense."
I will let you guys hash that one out.

mx
0

LVL 50

Expert Comment

ID: 35213948
> I don't think so.  The Now() function handles that.

Joe, this is very interesting and it seems to be true.

I have never thought of combining Rnd and Now this way. Or rather the key and Now, because that's probably what happens: Without Now, Rnd is reading the key only. As most likely the table is read in the same sequence ordered by the key, thus feeding the same sequence of values to Rnd the same sequence will be returned for each run by Rnd, while using Now as shown will feed a different sequence of values to Rnd each time the query is run causing different sequences of values to be returned by Rnd.

I modified two of my test queries to use Now as shown, and the results are consistent - after each relaunch of Access different results are returned.

Where/how did you learn this excellent trick?

/gustav

0

LVL 50

Expert Comment

ID: 35213997
> He will never get me to agree that someone born on the 29th of Feb should
> get a birthday card on the 28th in non-leap years.
> Someone born on Feb 28th at 12:01 shares a birthday with
> someone who was born 24 hours latter?  Just doesn't make sense.

Ha ha, Jim, following your rule would mean that:

> Someone born on Feb 29th at 12:01 shares a birthday with
> someone who was born 24 hours later?  Just doesn't make sense.

Thus, logically, as this argument is identical whether you use the common rule or your rule, it can be ignored.

We just have to accept that like "a month is not a month" because of a varying days' count, "a year is not a year" because of the leap years, thus applying strict math on calculating months, years (and age) is doomed to fail. This means you have to make some non-math rules for those born on Feb. 29th.
I certainly understand your point and method but, as I have stated before, I find it easier to accept that anyone born in February always have their birthday in February like mine of November always falls in November.

/gustav
0

LVL 75

Expert Comment

ID: 35216908
"Where/how did you learn this excellent trick?"

Well ... I thought possibly that ... you had posted this at some point..  I can tell you for SURE ... I learned it here on EE, as well as 100's of other cool tricks.  I'm sure if I went back through a load of Q's, I could eventually find that thread.  If I have time, I will try to do that.

But then again, I'm only here for the points :-)

mx
0

LVL 50

Expert Comment

ID: 35217181
Perhaps here:

http://www.experts-exchange.com/Programming/Languages/SQL_Syntax/Q_24450926.html

Of course, as noted there, if two users run the query at the exact same second, the result sets will be identical.

/gustav
0

LVL 75

Expert Comment

ID: 35217224
Although similar, that was not the Q.
0

LVL 50

Expert Comment

ID: 35217258
If you google on "Rnd(Int(Now()*" you'll find some newer questions on EE as well; it could be one of those. But the one above is the oldest I could find.

The EE search is useless in this regard (Jim?).

/gustav
0

LVL 50

Expert Comment

ID: 35217390
> .. if two users run the query at the exact same second,
> the result sets will be identical.

That can be (almost) sorted by using Timer which has an resolution of 1/18 second:

SELECT TOP 3 PERCENT tblEmp.EmpName
FROM tblEmp
ORDER BY Rnd(Int(Timer()/86400*[tblEmp].[EmpID])-Timer()/86400*[tblEmp].[EmpID]);

/gustav
0

LVL 75

Expert Comment

ID: 35217417
I'm sure I put that Q in my KB here, with keyword Random ... but not able to find :-(

mx
0

LVL 57

Expert Comment

ID: 35217680
@gustav

<<The EE search is useless in this regard (Jim?).>>

Don't know...I get more or less the same with EE search vs Google:

Don't have time at the moment to narrow it down or go through them as I leave for Dallas on Monday for a week and need to get my desk cleared off.

JimD.
0

LVL 50

Expert Comment

ID: 35219097
I see. I always end in some kind of struggle with the EE search engine - and don't win.
Have a nice trip!

/gustav
0

LVL 57

Expert Comment

ID: 35222194
<<I always end in some kind of struggle with the EE search engine - and don't win.>>

Personally I've always found EE interface's less then intuitive, but I have to say, the search has come a long way from what it once was.

JimD.
0

Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A simple tool to export all objects of two Access files as text and compare it with Meld, a free diff tool.
In earlier versions of Windows (XP and before), you could drag a database to the taskbar, where it would appear as a taskbar icon to open that database.  This article shows how to recreate this functionality in Windows 7 through 10.
Familiarize people with the process of retrieving data from SQL Server using an Access pass-thru query. Microsoft Access is a very powerful client/server development tool. One of the ways that you can retrieve data from a SQL Server is by using a paâ€¦
Basics of query design. Shows you how to construct a simple query by adding tables, perform joins, defining output columns, perform sorting, and apply criteria.
Suggested Courses
Course of the Month6 days, 4 hours left to enroll