Link to home
Start Free TrialLog in
Avatar of nickleplated
nickleplated

asked on

Access select random records split list

I have a list that I want to split into two segments. The records are random expect that I have one field "LANG" that contains either "EN" or "FR" and I need an equal number from each in the resulting two lists.

ie:
Original Table
------------------
5000 Records Total
3000 are "EN"
2000 are "FR"

New Table 1
------------------
2500 Records Total
1500 are "EN"
1000 are "FR"

New Table 2
-------------------
2500 Records Total
1500 are "EN"
1000 are "FR"


This make sense?
High points for urgency :)
Avatar of Nievergelt
Nievergelt
Flag of Switzerland image

- Add an Autonumber column (e.g. rec_num) into Original_Table.

- Execute the following SQL statements:

   SELECT TOP 1500 * INTO New_Table1
   FROM Original_Table WHERE lang = 'EN';

   INSERT TOP 1000 * INTO New_Table1
   FROM Original_Table WHERE lang = 'FR';

   DELETE FROM Original_Table WHERE rec_num IN (SELECT rec_num FROM New_Table1);

- (Maybe) rename Original_Table
Avatar of chanito
chanito

Ok, add the rec_num for identification of the records, or don't if you already have an ID, and then for a pseudo-random solution:

select top 1500 rand(datepart(ms, getdate())*(abs(checksum(*))/100000000)) as order1, * into New_Table_1
from Original_Table where lang ='EN' order by order1
select top 1000 rand(datepart(ms, getdate())*(abs(checksum(*))/100000000)) as order1, * into New_Table_1
from Original_Table where lang ='FR' order by order1
select * into New_Table_2 from Original_Table where rec_num not in (select rec_num from New_Table_1)

If you get an error expanding the '*' try using instead any changing field in the table, like the name of the person.  Just anything that changes.
Avatar of nickleplated

ASKER

hmmm..seems like we're getting close, but I'd like to be able to run this on any table; Meaning in my example I was arbitrarily putting numbers of each to make a point.

I'd like to run it now matter how many records there are in the table and in each lang segment.
Piece of cake...  you just need to use top 50 percent :

select top 50 percent rand(datepart(ms, getdate())*(abs(checksum(*))/100000000)) as order1, * into New_Table_1
from Original_Table where lang ='EN' order by order1
Getting Syntax error(missing operator) in query expression rand(datepart(ms, getdate())*(abs(checksum(*))/100000000))
That's weird, I copy / paste in my Query Analyzer and works fine, you might be missing a parentheses there.  Try changing the * in checksum(*) with another field like checksum(name).
nada - sorry.

( don't forget that I am using access for this, not SQL - not sure if that makes a dif)
Yes it does... I'm sorry but checking the list of questions in the Database section I can't tell which DB you're using.

In Access there isn't a checksum function.  If you have a numeric id_field you can use:

rnd(rec_num)

Or just use any numeric field you have.  I think that's the only difference with Access, so resuming it should be:  

select top 50 percent rnd(rec_num) as order1, * into New_Table_1
from Original_Table where lang ='EN' order by order1
Sorry, but I can't get it to fly.

Trying:

select top 50 percent rnd(ID) as order1, * into New_Table_1
from F_Donors where lang ='EN' order by order1


And it prompts me for ID Parameter

Looks like you don't have an ID field in the F_Donors table ...
doh - embarassing - I'd changed the ID field name to myID

select top 50 percent rnd(myID) as order1, * into New_Table_1
from F_Donors where myLANG ='EN' order by order1

now prompts me for order1 parameter
ASKER CERTIFIED SOLUTION
Avatar of chanito
chanito

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Sorry, comes up with "You are about to paste 0 row(s) into a new table"
Try the select query without the into clause first...

That's a bit strange.  The problem could be the myLANG = 'EN' condition, maybe the value is different?  Try just:

select top 50 percent * from F_Donors where myLANG ='EN' order by rnd(myID)   or
select top 50 percent * from F_Donors order by rnd(myID)

Also, I found out you can't see the rnd(myID) value because if you include it in the fields and in the order by it actually generates 2 different random values, however give it a try and you'll see it does randomize the sort order.
Sorry, I am a jerk ( working on way too many different things today)

The values in "myLANG" were actually "E" or "F"

Thanks for your patience....points well earned :-S
Thanks, and don't worry.  We all have those days... :-)