Link to home
Start Free TrialLog in
Avatar of cav1984
cav1984

asked on

random sample (with a twist?) from ms access 2000

Greetings;

 I've seen solutions to random samples previously but have questions regarding sampling by 'user', ie. I have a table of user events (example: user, date/time, action taken, reference id) and need to draw a random 3% (by user) sample of events for each user. Though records are unique (user and date/time ensures this) they are not autonumbered.


 
Avatar of nico5038
nico5038
Flag of Netherlands image

I would create a function that will:
1) Count the number of rows per user
2) Insert a random number in every row
3) Processes the table to extract the highest 3% for each user.

As you see this will require VBA coding, how are your coding skills ?

Nic;o)
Avatar of VD
VD

There is another way of doing this.  You shouldn't really be adding redundant rows to a table when you don't need to.

So my idea:

1) Count the number of users

select count(*) from userTable

let's say 3540

therefore you will need: 3540 *.03 = 107 members

2) In vb code create an array of 107 numbers the size of the number of users (actually it only needs to have 3% of them), and make it mutlidimensional

This will store an array of numbers representing the rows that a user will be found in.

3) Generate random numbers to add to your array of members, and make sure you do not duplicate numbers:

liMember = int((3540*rnd)+.5)

3b) order you array of rows

4) Open a recordset up of your table, and run through the entire table, counting the rows as you go, and stopping if it is in the array of members you created, storing the PK to that row.

Avatar of cav1984

ASKER

Thanks (and sorry for the delay...)

 I obviously have much to learn...nico5038's approach is similar to the one I took except I ended up running it through an excel VB script,  after the users enters the parameters for how much or whom they want the data for in Access. The end results was requested as a spreadsheet so this works for now.
 I still want to get around to trying this solely in Access and it appears that I'll have to get better at VB coding to do this, especially with arrays etc...my skills are very weak in this area, for now ;).
 I don't understand arrays completely yet, but wouldn't VDs idea only yield 3% of the whole and not 3% by user?



ASKER CERTIFIED SOLUTION
Avatar of nico5038
nico5038
Flag of Netherlands image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Well done, yes it would.  I read the questions really quickly, to get around this you alter your recordset, so adjust the above to:

I am assuming the user id is a number not a character field.  If it is a character field you have to wrap it is single quotes. (...here userId = '" & userID & "'" )

1) Count the number of users

"select count(*) from userTable where userId = " & userID

This would only select the records from that user.


so when you open the recordset up, i.e

dim lrsRecords as new adodb.recordset

lrsRecords.ActiveConnection = CurrentProject.Connection
lrsRecords.open "Select * from userTable where userid = " & userid"
liCount = 0
while not lrsRecords.eof
  liCount = liCount + 1
  'What ever you want to do
  lrsrecords.movenext
wend

Arrays are simply and compact way of storing lots of data, you normally reserve the amount of memory allocated to your array before you start, especially if you are using an iterative process (for loop with a count for example).

lrsRecords.open "Select * from userTable where userid = " & userid"

should me

lrsRecords.open "Select * from userTable where userid = " & userid
Well done, yes it would.  I read the questions really quickly, to get around this you alter your recordset, so adjust the above to:

I am assuming the user id is a number not a character field.  If it is a character field you have to wrap it is single quotes. (...here userId = '" & userID & "'" )

1) Count the number of users

"select count(*) from userTable where userId = " & userID

This would only select the records from that user.


so when you open the recordset up, i.e

dim lrsRecords as new adodb.recordset

lrsRecords.ActiveConnection = CurrentProject.Connection
lrsRecords.open "Select * from userTable where userid = " & userid"
liCount = 0
while not lrsRecords.eof
  liCount = liCount + 1
  'What ever you want to do
  lrsrecords.movenext
wend

Arrays are simply and compact way of storing lots of data, you normally reserve the amount of memory allocated to your array before you start, especially if you are using an iterative process (for loop with a count for example).

Well done, yes it would.  I read the questions really quickly, to get around this you alter your recordset, so adjust the above to:

I am assuming the user id is a number not a character field.  If it is a character field you have to wrap it is single quotes. (...here userId = '" & userID & "'" )

1) Count the number of users

"select count(*) from userTable where userId = " & userID

This would only select the records from that user.


so when you open the recordset up, i.e

dim lrsRecords as new adodb.recordset

lrsRecords.ActiveConnection = CurrentProject.Connection
lrsRecords.open "Select * from userTable where userid = " & userid"
liCount = 0
while not lrsRecords.eof
  liCount = liCount + 1
  'What ever you want to do
  lrsrecords.movenext
wend

Arrays are simply and compact way of storing lots of data, you normally reserve the amount of memory allocated to your array before you start, especially if you are using an iterative process (for loop with a count for example).

Avatar of cav1984

ASKER

Thanks -

 I haven't been able to put this into practice but am looking forward to replacing my 'quick' fix with this much cleaner version...

 Thanks for the info and assistance...