Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
Solved

# randomly assign  grouped data to a group

Posted on 2015-02-05
Medium Priority
94 Views
I have the following table named PeopleLocation

PersonID           State                 zipCode
1000                   IL                        60067
2000                   IL                        60067
3000                   TX                      77045
4000                   IL                        60074
5000                   TX                       77045
6000                   TX                      77045
7000                   IL                        60074
8000                   TX                       77045
9000                   IL                         60067

What I want to do is write a query that will group the total number of people by state and zipcode then randomly assign a group value of 1, 2 or 3

(if the number is not divisible by 3 it is ok if group size is not equal or if there are no entries in a group

Using the sample above, I would like to have something like

PersonID           State                 zipCode     Group
1000                   IL                        60067        1
9000                   IL                        60067        2
2000                   IL                        60067        3
4000                   IL                        60074        1
7000                   IL                        60074        2
6000                   TX                      77045         1
5000                   TX                       77045        2
3000                   TX                      77045         3
8000                   TX                       77045        1

Hopefully this makes sense
0
Question by:johnnyg123
[X]
###### Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

• Help others & share knowledge
• Earn cash & points

LVL 12

Accepted Solution

Habib Pourfard earned 2000 total points
ID: 40592602
you can write it with use of row_number() to assign a group number to each group of state and zip code and to make it random you can order it by NEWID():

``````IF OBJECT_ID('tempdb.dbo.#PeopleLocation') IS NOT NULL
DROP TABLE #PeopleLocation
GO

CREATE TABLE #PeopleLocation
(
PersonID INT PRIMARY KEY,
[State] VARCHAR(2),
zipCode INT
)

INSERT INTO #PeopleLocation VALUES (1000,'IL', 60067 )
INSERT INTO #PeopleLocation VALUES (2000,'IL', 60067 )
INSERT INTO #PeopleLocation VALUES (3000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (4000,'IL', 60074 )
INSERT INTO #PeopleLocation VALUES (5000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (6000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (7000,'IL', 60074 )
INSERT INTO #PeopleLocation VALUES (8000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (9000,'IL',  60067)

SELECT *, ((ROW_NUMBER() OVER (PARTITION BY State, zipCode ORDER BY NEWID())) - 1) % 3 + 1 AS [Group] FROM #PeopleLocation
ORDER BY [State], zipCode
``````
0

LVL 75

Expert Comment

ID: 40592608
;WITH c AS (
SELECT PersonID, state, ZipCode, [Group]= ROW_NUMBER() OVER(PARTITION BY state ORDER BY zipcode )
FROM yourTable
)
SELECT * FROM c
0

LVL 49

Expert Comment

ID: 40592626
try these?
``````select
*
, (ABS(CHECKSUM(NewId())) % 3) + 1 as ri
, NTILE(3) over(partition by [State], [zipCode] order by NewId()) as nt
from PeopleLocation
``````

results(1)
``````| PERSONID | STATE | ZIPCODE | RI | NT |
|----------|-------|---------|----|----|
|     1000 |    IL |   60067 |  3 |  2 |
|     2000 |    IL |   60067 |  1 |  3 |
|     9000 |    IL |   60067 |  1 |  1 |
|     4000 |    IL |   60074 |  1 |  1 |
|     7000 |    IL |   60074 |  1 |  2 |
|     3000 |    TX |   77045 |  2 |  1 |
|     5000 |    TX |   77045 |  2 |  2 |
|     6000 |    TX |   77045 |  3 |  3 |
|     8000 |    TX |   77045 |  1 |  1 |
``````

results(2)
``````| PERSONID | STATE | ZIPCODE | RI | NT |
|----------|-------|---------|----|----|
|     1000 |    IL |   60067 |  1 |  1 |
|     2000 |    IL |   60067 |  3 |  3 |
|     9000 |    IL |   60067 |  1 |  2 |
|     4000 |    IL |   60074 |  2 |  1 |
|     7000 |    IL |   60074 |  2 |  2 |
|     3000 |    TX |   77045 |  1 |  1 |
|     5000 |    TX |   77045 |  2 |  1 |
|     6000 |    TX |   77045 |  1 |  3 |
|     8000 |    TX |   77045 |  1 |  2 |

``````

see:
http://sqlfiddle.com/#!3/f1b796/6
http://sqlfiddle.com/#!3/f1b796/7

details:
``````**MS SQL Server 2008 Schema Setup**:

CREATE TABLE PeopleLocation
([PersonID] int, [State] varchar(2), [zipCode] int)
;

INSERT INTO PeopleLocation
([PersonID], [State], [zipCode])
VALUES
(1000, 'IL', 60067),
(2000, 'IL', 60067),
(3000, 'TX', 77045),
(4000, 'IL', 60074),
(5000, 'TX', 77045),
(6000, 'TX', 77045),
(7000, 'IL', 60074),
(8000, 'TX', 77045),
(9000, 'IL', 60067)
;

**Query 1**:

select
*
, (ABS(CHECKSUM(NewId())) % 3) + 1 as ri
, NTILE(3) over(partition by [State], [zipCode] order by NewId()) as nt
from PeopleLocation
order by [State], [zipCode], [PersonID]

**[Results][2]**:

| PERSONID | STATE | ZIPCODE | RI | NT |
|----------|-------|---------|----|----|
|     1000 |    IL |   60067 |  1 |  1 |
|     2000 |    IL |   60067 |  3 |  3 |
|     9000 |    IL |   60067 |  1 |  2 |
|     4000 |    IL |   60074 |  2 |  1 |
|     7000 |    IL |   60074 |  2 |  2 |
|     3000 |    TX |   77045 |  1 |  1 |
|     5000 |    TX |   77045 |  2 |  1 |
|     6000 |    TX |   77045 |  1 |  3 |
|     8000 |    TX |   77045 |  1 |  2 |

[1]: http://sqlfiddle.com/#!3/f1b796/7
``````
0

LVL 27

Expert Comment

ID: 40593648
Try this:
``````select
PersonID,
[State],
zipCode,
(ROW_NUMBER() OVER (PARTITION BY [State], zipCode ORDER BY PersonID)-1)%3+1 as [group]
from
#PeopleLocation
order by
[State],
zipCode,
[group],
PersonID
``````
0

## Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question