Solved

randomly assign  grouped data to a group

Posted on 2015-02-05
4
87 Views
Last Modified: 2015-02-09
I have the following table named PeopleLocation

PersonID           State                 zipCode
1000                   IL                        60067
2000                   IL                        60067
3000                   TX                      77045
4000                   IL                        60074
5000                   TX                       77045
6000                   TX                      77045
7000                   IL                        60074
8000                   TX                       77045
9000                   IL                         60067




What I want to do is write a query that will group the total number of people by state and zipcode then randomly assign a group value of 1, 2 or 3

(if the number is not divisible by 3 it is ok if group size is not equal or if there are no entries in a group

Using the sample above, I would like to have something like

PersonID           State                 zipCode     Group
1000                   IL                        60067        1
9000                   IL                        60067        2
2000                   IL                        60067        3
4000                   IL                        60074        1
7000                   IL                        60074        2
6000                   TX                      77045         1
5000                   TX                       77045        2
3000                   TX                      77045         3
8000                   TX                       77045        1

Hopefully this makes sense
0
Comment
Question by:johnnyg123
4 Comments
 
LVL 12

Accepted Solution

by:
Habib Pourfard earned 500 total points
Comment Utility
you can write it with use of row_number() to assign a group number to each group of state and zip code and to make it random you can order it by NEWID():

IF OBJECT_ID('tempdb.dbo.#PeopleLocation') IS NOT NULL
	DROP TABLE #PeopleLocation
GO

CREATE TABLE #PeopleLocation
(
	PersonID INT PRIMARY KEY,
	[State] VARCHAR(2),
	zipCode INT
)

INSERT INTO #PeopleLocation VALUES (1000,'IL', 60067 )
INSERT INTO #PeopleLocation VALUES (2000,'IL', 60067 )
INSERT INTO #PeopleLocation VALUES (3000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (4000,'IL', 60074 )
INSERT INTO #PeopleLocation VALUES (5000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (6000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (7000,'IL', 60074 )
INSERT INTO #PeopleLocation VALUES (8000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (9000,'IL',  60067)


SELECT *, ((ROW_NUMBER() OVER (PARTITION BY State, zipCode ORDER BY NEWID())) - 1) % 3 + 1 AS [Group] FROM #PeopleLocation
ORDER BY [State], zipCode

Open in new window

0
 
LVL 75

Expert Comment

by:Aneesh Retnakaran
Comment Utility
;WITH c AS (
 SELECT PersonID, state, ZipCode, [Group]= ROW_NUMBER() OVER(PARTITION BY state ORDER BY zipcode )
 FROM yourTable
 )
 SELECT * FROM c
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
try these?
select
      *
      , (ABS(CHECKSUM(NewId())) % 3) + 1 as ri
      , NTILE(3) over(partition by [State], [zipCode] order by NewId()) as nt
from PeopleLocation

Open in new window


results(1)
| PERSONID | STATE | ZIPCODE | RI | NT |
|----------|-------|---------|----|----|
|     1000 |    IL |   60067 |  3 |  2 |
|     2000 |    IL |   60067 |  1 |  3 |
|     9000 |    IL |   60067 |  1 |  1 |
|     4000 |    IL |   60074 |  1 |  1 |
|     7000 |    IL |   60074 |  1 |  2 |
|     3000 |    TX |   77045 |  2 |  1 |
|     5000 |    TX |   77045 |  2 |  2 |
|     6000 |    TX |   77045 |  3 |  3 |
|     8000 |    TX |   77045 |  1 |  1 |

Open in new window


results(2)
| PERSONID | STATE | ZIPCODE | RI | NT |
|----------|-------|---------|----|----|
|     1000 |    IL |   60067 |  1 |  1 |
|     2000 |    IL |   60067 |  3 |  3 |
|     9000 |    IL |   60067 |  1 |  2 |
|     4000 |    IL |   60074 |  2 |  1 |
|     7000 |    IL |   60074 |  2 |  2 |
|     3000 |    TX |   77045 |  1 |  1 |
|     5000 |    TX |   77045 |  2 |  1 |
|     6000 |    TX |   77045 |  1 |  3 |
|     8000 |    TX |   77045 |  1 |  2 |
		

Open in new window


see:
http://sqlfiddle.com/#!3/f1b796/6
http://sqlfiddle.com/#!3/f1b796/7

details:
**MS SQL Server 2008 Schema Setup**:

    CREATE TABLE PeopleLocation
    	([PersonID] int, [State] varchar(2), [zipCode] int)
    ;
    	
    INSERT INTO PeopleLocation
    	([PersonID], [State], [zipCode])
    VALUES
    	(1000, 'IL', 60067),
    	(2000, 'IL', 60067),
    	(3000, 'TX', 77045),
    	(4000, 'IL', 60074),
    	(5000, 'TX', 77045),
    	(6000, 'TX', 77045),
    	(7000, 'IL', 60074),
    	(8000, 'TX', 77045),
    	(9000, 'IL', 60067)
    ;

**Query 1**:

    
    select
          *
          , (ABS(CHECKSUM(NewId())) % 3) + 1 as ri
          , NTILE(3) over(partition by [State], [zipCode] order by NewId()) as nt
    from PeopleLocation
    order by [State], [zipCode], [PersonID]
    

**[Results][2]**:
    
    | PERSONID | STATE | ZIPCODE | RI | NT |
    |----------|-------|---------|----|----|
    |     1000 |    IL |   60067 |  1 |  1 |
    |     2000 |    IL |   60067 |  3 |  3 |
    |     9000 |    IL |   60067 |  1 |  2 |
    |     4000 |    IL |   60074 |  2 |  1 |
    |     7000 |    IL |   60074 |  2 |  2 |
    |     3000 |    TX |   77045 |  1 |  1 |
    |     5000 |    TX |   77045 |  2 |  1 |
    |     6000 |    TX |   77045 |  1 |  3 |
    |     8000 |    TX |   77045 |  1 |  2 |



  [1]: http://sqlfiddle.com/#!3/f1b796/7

Open in new window

0
 
LVL 26

Expert Comment

by:Zberteoc
Comment Utility
Try this:
select 
	PersonID,
	[State],
	zipCode,
	(ROW_NUMBER() OVER (PARTITION BY [State], zipCode ORDER BY PersonID)-1)%3+1 as [group]
from
	#PeopleLocation
order by
	[State],
	zipCode,
	[group],
	PersonID

Open in new window

0

Featured Post

Backup Your Microsoft Windows Server®

Backup all your Microsoft Windows Server – on-premises, in remote locations, in private and hybrid clouds. Your entire Windows Server will be backed up in one easy step with patented, block-level disk imaging. We achieve RTOs (recovery time objectives) as low as 15 seconds.

Join & Write a Comment

If you have heard of RFC822 date formats, they can be quite a challenge in SQL Server. RFC822 is an Internet standard format for email message headers, including all dates within those headers. The RFC822 protocols are available in detail at:   ht…
SQL Server engine let you use a Windows account or a SQL Server account to connect to a SQL Server instance. This can be configured immediatly during the SQL Server installation or after in the Server Authentication section in the Server properties …
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.
Via a live example, show how to set up a backup for SQL Server using a Maintenance Plan and how to schedule the job into SQL Server Agent.

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now