Solved

randomly assign  grouped data to a group

Posted on 2015-02-05
4
91 Views
Last Modified: 2015-02-09
I have the following table named PeopleLocation

PersonID           State                 zipCode
1000                   IL                        60067
2000                   IL                        60067
3000                   TX                      77045
4000                   IL                        60074
5000                   TX                       77045
6000                   TX                      77045
7000                   IL                        60074
8000                   TX                       77045
9000                   IL                         60067




What I want to do is write a query that will group the total number of people by state and zipcode then randomly assign a group value of 1, 2 or 3

(if the number is not divisible by 3 it is ok if group size is not equal or if there are no entries in a group

Using the sample above, I would like to have something like

PersonID           State                 zipCode     Group
1000                   IL                        60067        1
9000                   IL                        60067        2
2000                   IL                        60067        3
4000                   IL                        60074        1
7000                   IL                        60074        2
6000                   TX                      77045         1
5000                   TX                       77045        2
3000                   TX                      77045         3
8000                   TX                       77045        1

Hopefully this makes sense
0
Comment
Question by:johnnyg123
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
4 Comments
 
LVL 12

Accepted Solution

by:
Habib Pourfard earned 500 total points
ID: 40592602
you can write it with use of row_number() to assign a group number to each group of state and zip code and to make it random you can order it by NEWID():

IF OBJECT_ID('tempdb.dbo.#PeopleLocation') IS NOT NULL
	DROP TABLE #PeopleLocation
GO

CREATE TABLE #PeopleLocation
(
	PersonID INT PRIMARY KEY,
	[State] VARCHAR(2),
	zipCode INT
)

INSERT INTO #PeopleLocation VALUES (1000,'IL', 60067 )
INSERT INTO #PeopleLocation VALUES (2000,'IL', 60067 )
INSERT INTO #PeopleLocation VALUES (3000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (4000,'IL', 60074 )
INSERT INTO #PeopleLocation VALUES (5000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (6000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (7000,'IL', 60074 )
INSERT INTO #PeopleLocation VALUES (8000,'TX', 77045 )
INSERT INTO #PeopleLocation VALUES (9000,'IL',  60067)


SELECT *, ((ROW_NUMBER() OVER (PARTITION BY State, zipCode ORDER BY NEWID())) - 1) % 3 + 1 AS [Group] FROM #PeopleLocation
ORDER BY [State], zipCode

Open in new window

0
 
LVL 75

Expert Comment

by:Aneesh Retnakaran
ID: 40592608
;WITH c AS (
 SELECT PersonID, state, ZipCode, [Group]= ROW_NUMBER() OVER(PARTITION BY state ORDER BY zipcode )
 FROM yourTable
 )
 SELECT * FROM c
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40592626
try these?
select
      *
      , (ABS(CHECKSUM(NewId())) % 3) + 1 as ri
      , NTILE(3) over(partition by [State], [zipCode] order by NewId()) as nt
from PeopleLocation

Open in new window


results(1)
| PERSONID | STATE | ZIPCODE | RI | NT |
|----------|-------|---------|----|----|
|     1000 |    IL |   60067 |  3 |  2 |
|     2000 |    IL |   60067 |  1 |  3 |
|     9000 |    IL |   60067 |  1 |  1 |
|     4000 |    IL |   60074 |  1 |  1 |
|     7000 |    IL |   60074 |  1 |  2 |
|     3000 |    TX |   77045 |  2 |  1 |
|     5000 |    TX |   77045 |  2 |  2 |
|     6000 |    TX |   77045 |  3 |  3 |
|     8000 |    TX |   77045 |  1 |  1 |

Open in new window


results(2)
| PERSONID | STATE | ZIPCODE | RI | NT |
|----------|-------|---------|----|----|
|     1000 |    IL |   60067 |  1 |  1 |
|     2000 |    IL |   60067 |  3 |  3 |
|     9000 |    IL |   60067 |  1 |  2 |
|     4000 |    IL |   60074 |  2 |  1 |
|     7000 |    IL |   60074 |  2 |  2 |
|     3000 |    TX |   77045 |  1 |  1 |
|     5000 |    TX |   77045 |  2 |  1 |
|     6000 |    TX |   77045 |  1 |  3 |
|     8000 |    TX |   77045 |  1 |  2 |
		

Open in new window


see:
http://sqlfiddle.com/#!3/f1b796/6
http://sqlfiddle.com/#!3/f1b796/7

details:
**MS SQL Server 2008 Schema Setup**:

    CREATE TABLE PeopleLocation
    	([PersonID] int, [State] varchar(2), [zipCode] int)
    ;
    	
    INSERT INTO PeopleLocation
    	([PersonID], [State], [zipCode])
    VALUES
    	(1000, 'IL', 60067),
    	(2000, 'IL', 60067),
    	(3000, 'TX', 77045),
    	(4000, 'IL', 60074),
    	(5000, 'TX', 77045),
    	(6000, 'TX', 77045),
    	(7000, 'IL', 60074),
    	(8000, 'TX', 77045),
    	(9000, 'IL', 60067)
    ;

**Query 1**:

    
    select
          *
          , (ABS(CHECKSUM(NewId())) % 3) + 1 as ri
          , NTILE(3) over(partition by [State], [zipCode] order by NewId()) as nt
    from PeopleLocation
    order by [State], [zipCode], [PersonID]
    

**[Results][2]**:
    
    | PERSONID | STATE | ZIPCODE | RI | NT |
    |----------|-------|---------|----|----|
    |     1000 |    IL |   60067 |  1 |  1 |
    |     2000 |    IL |   60067 |  3 |  3 |
    |     9000 |    IL |   60067 |  1 |  2 |
    |     4000 |    IL |   60074 |  2 |  1 |
    |     7000 |    IL |   60074 |  2 |  2 |
    |     3000 |    TX |   77045 |  1 |  1 |
    |     5000 |    TX |   77045 |  2 |  1 |
    |     6000 |    TX |   77045 |  1 |  3 |
    |     8000 |    TX |   77045 |  1 |  2 |



  [1]: http://sqlfiddle.com/#!3/f1b796/7

Open in new window

0
 
LVL 26

Expert Comment

by:Zberteoc
ID: 40593648
Try this:
select 
	PersonID,
	[State],
	zipCode,
	(ROW_NUMBER() OVER (PARTITION BY [State], zipCode ORDER BY PersonID)-1)%3+1 as [group]
from
	#PeopleLocation
order by
	[State],
	zipCode,
	[group],
	PersonID

Open in new window

0

Featured Post

Free eBook: Backup on AWS

Everything you need to know about backup and disaster recovery with AWS, for FREE!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Slowly Changing Dimension Transformation component in data task flow is very useful for us to manage and control how data changes in SSIS.
In this article we will learn how to fix  “Cannot install SQL Server 2014 Service Pack 2: Unable to install windows installer msi file” error ?
Familiarize people with the process of retrieving data from SQL Server using an Access pass-thru query. Microsoft Access is a very powerful client/server development tool. One of the ways that you can retrieve data from a SQL Server is by using a pa…
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.

749 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question