Solved

randomly assign  grouped data to a group

Posted on 2015-02-05
4
90 Views
Last Modified: 2015-02-09
I have the following table named PeopleLocation

PersonID           State                 zipCode
1000                   IL                        60067
2000                   IL                        60067
3000                   TX                      77045
4000                   IL                        60074
5000                   TX                       77045
6000                   TX                      77045
7000                   IL                        60074
8000                   TX                       77045
9000                   IL                         60067




What I want to do is write a query that will group the total number of people by state and zipcode then randomly assign a group value of 1, 2 or 3

(if the number is not divisible by 3 it is ok if group size is not equal or if there are no entries in a group

Using the sample above, I would like to have something like

PersonID           State                 zipCode     Group
1000                   IL                        60067        1
9000                   IL                        60067        2
2000                   IL                        60067        3
4000                   IL                        60074        1
7000                   IL                        60074        2
6000                   TX                      77045         1
5000                   TX                       77045        2
3000                   TX                      77045         3
8000                   TX                       77045        1

Hopefully this makes sense
0
Comment
Question by:johnnyg123
4 Comments
 
LVL 12

Accepted Solution

by:
Habib Pourfard earned 500 total points
ID: 40592602
you can write it with use of row_number() to assign a group number to each group of state and zip code and to make it random you can order it by NEWID():

IF OBJECT_ID('tempdb.dbo.#PeopleLocation') IS NOT NULL
	DROP TABLE #PeopleLocation
GO

CREATE TABLE #PeopleLocation
(
	PersonID INT PRIMARY KEY,
	[State] VARCHAR(2),
	zipCode INT
)

INSERT INTO #PeopleLocation VALUES (1000,'IL', 60067 )
INSERT INTO #PeopleLocation VALUES (2000,'IL', 60067 )
INSERT INTO #PeopleLocation VALUES (3000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (4000,'IL', 60074 )
INSERT INTO #PeopleLocation VALUES (5000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (6000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (7000,'IL', 60074 )
INSERT INTO #PeopleLocation VALUES (8000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (9000,'IL',  60067)


SELECT *, ((ROW_NUMBER() OVER (PARTITION BY State, zipCode ORDER BY NEWID())) - 1) % 3 + 1 AS [Group] FROM #PeopleLocation
ORDER BY [State], zipCode

Open in new window

0
 
LVL 75

Expert Comment

by:Aneesh Retnakaran
ID: 40592608
;WITH c AS (
 SELECT PersonID, state, ZipCode, [Group]= ROW_NUMBER() OVER(PARTITION BY state ORDER BY zipcode )
 FROM yourTable
 )
 SELECT * FROM c
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40592626
try these?
select
      *
      , (ABS(CHECKSUM(NewId())) % 3) + 1 as ri
      , NTILE(3) over(partition by [State], [zipCode] order by NewId()) as nt
from PeopleLocation

Open in new window


results(1)
| PERSONID | STATE | ZIPCODE | RI | NT |
|----------|-------|---------|----|----|
|     1000 |    IL |   60067 |  3 |  2 |
|     2000 |    IL |   60067 |  1 |  3 |
|     9000 |    IL |   60067 |  1 |  1 |
|     4000 |    IL |   60074 |  1 |  1 |
|     7000 |    IL |   60074 |  1 |  2 |
|     3000 |    TX |   77045 |  2 |  1 |
|     5000 |    TX |   77045 |  2 |  2 |
|     6000 |    TX |   77045 |  3 |  3 |
|     8000 |    TX |   77045 |  1 |  1 |

Open in new window


results(2)
| PERSONID | STATE | ZIPCODE | RI | NT |
|----------|-------|---------|----|----|
|     1000 |    IL |   60067 |  1 |  1 |
|     2000 |    IL |   60067 |  3 |  3 |
|     9000 |    IL |   60067 |  1 |  2 |
|     4000 |    IL |   60074 |  2 |  1 |
|     7000 |    IL |   60074 |  2 |  2 |
|     3000 |    TX |   77045 |  1 |  1 |
|     5000 |    TX |   77045 |  2 |  1 |
|     6000 |    TX |   77045 |  1 |  3 |
|     8000 |    TX |   77045 |  1 |  2 |
		

Open in new window


see:
http://sqlfiddle.com/#!3/f1b796/6
http://sqlfiddle.com/#!3/f1b796/7

details:
**MS SQL Server 2008 Schema Setup**:

    CREATE TABLE PeopleLocation
    	([PersonID] int, [State] varchar(2), [zipCode] int)
    ;
    	
    INSERT INTO PeopleLocation
    	([PersonID], [State], [zipCode])
    VALUES
    	(1000, 'IL', 60067),
    	(2000, 'IL', 60067),
    	(3000, 'TX', 77045),
    	(4000, 'IL', 60074),
    	(5000, 'TX', 77045),
    	(6000, 'TX', 77045),
    	(7000, 'IL', 60074),
    	(8000, 'TX', 77045),
    	(9000, 'IL', 60067)
    ;

**Query 1**:

    
    select
          *
          , (ABS(CHECKSUM(NewId())) % 3) + 1 as ri
          , NTILE(3) over(partition by [State], [zipCode] order by NewId()) as nt
    from PeopleLocation
    order by [State], [zipCode], [PersonID]
    

**[Results][2]**:
    
    | PERSONID | STATE | ZIPCODE | RI | NT |
    |----------|-------|---------|----|----|
    |     1000 |    IL |   60067 |  1 |  1 |
    |     2000 |    IL |   60067 |  3 |  3 |
    |     9000 |    IL |   60067 |  1 |  2 |
    |     4000 |    IL |   60074 |  2 |  1 |
    |     7000 |    IL |   60074 |  2 |  2 |
    |     3000 |    TX |   77045 |  1 |  1 |
    |     5000 |    TX |   77045 |  2 |  1 |
    |     6000 |    TX |   77045 |  1 |  3 |
    |     8000 |    TX |   77045 |  1 |  2 |



  [1]: http://sqlfiddle.com/#!3/f1b796/7

Open in new window

0
 
LVL 26

Expert Comment

by:Zberteoc
ID: 40593648
Try this:
select 
	PersonID,
	[State],
	zipCode,
	(ROW_NUMBER() OVER (PARTITION BY [State], zipCode ORDER BY PersonID)-1)%3+1 as [group]
from
	#PeopleLocation
order by
	[State],
	zipCode,
	[group],
	PersonID

Open in new window

0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
SQL Backup Question 2 29
SQL query and VBA 5 45
SSMS Opening Mode 9 18
Get rid of the last Last comma and space in a SQL statement. 7 14
Use this article to create a batch file to backup a Microsoft SQL Server database to a Windows folder.  The folder can be on the local hard drive or on a network share.  This batch file will query the SQL server to get the current date & time and wi…
The Delta outage: 650 cancelled flights, more than 1200 delayed flights, thousands of frustrated customers, tens of millions of dollars in damages – plus untold reputational damage to one of the world’s most trusted airlines. All due to a catastroph…
Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

840 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question