Solved

SQL Query - Ranking / Distribution

Posted on 2012-03-22
Last Modified: 2012-03-22
Hi,

SQL Server 2008 R2.  I have a table with 3 relevant columns:

[FileName], [FileSize], [Group]

I'd like to break the recordset up into groups based on an even distribution of [FileSize].

If I decided to break the recordset into 5 groups I would set [Group] to either '1','2','3','4' or '5' and I would expect the SUM([FileSize]) for each group to be similar.

I can think of some round-about ways of accomplishing this but want to know if there is an elegant way to do so.

Thanks
Question by:StrangerDanger
LVL 39

Expert Comment

by:appari
ID: 37751208
can you post sample data and the output you are looking for?
 

Author Comment

by:StrangerDanger
ID: 37751239
Sure. To reiterate: I essentially want to assign a group number to each row. When you sum the file sizes for each group, I'd like the totals to be relatively similar.

EXAMPLE INPUT:
[FileName],[FileSize],[Group]

mypicture.jpg, 200, NULL
mypicture2.jpg, 300, NULL
mydoc.doc,500, NULL
myexcelsheet.xls, 1000, NULL
mytextfile.txt,600, NULL
myslide.ppt,400, NULL
myaudio.mp3, 1100, NULL
myfile.txt, 100, NULL
mysite.html, 400, NULL
mypicture3, 550, NULL

EXAMPLE OUTPUT:
[FileName],[FileSize],[Group]

mypicture.jpg, 200, 1
mypicture2.jpg, 300,1
mydoc.doc,500,1
myexcelsheet.xls, 1000, 2
mytextfile.txt,600,3
myslide.ppt,400,3
myaudio.mp3, 1100,4
myfile.txt, 100, 5
mysite.html, 400, 5
mypicture3, 550, 5

Example aggregate:
[Group], [SumOfFileSize]
1, 1000
2, 1000
3, 1000
4, 1100
5, 950
 
LVL 39

Accepted Solution

by:
appari earned 500 total points
ID: 37751373
Try this. It is not efficient code, but it seems to work:
-- Sample data from the question
declare @mytab table([FileName] varchar(20), [FileSize] int, [Group] int)
insert into @mytab
select 'mypicture.jpg', 200, NULL
union all select 'mypicture2.jpg', 300, NULL
union all select 'mydoc.doc', 500, NULL
union all select 'myexcelsheet.xls', 1000, NULL
union all select 'mytextfile.txt', 600, NULL
union all select 'myslide.ppt', 400, NULL
union all select 'myaudio.mp3', 1100, NULL
union all select 'myfile.txt', 100, NULL
union all select 'mysite.html', 400, NULL
union all select 'mypicture3', 550, NULL

-- Target size per group: total size divided by the number of groups (5)
declare @avgSize int
select @avgSize = sum(filesize)/5 from @mytab
select @avgSize

-- Tolerance above the target, as a fraction; widened whenever a pass places no rows
declare @var decimal(4,2)
select @var = 0

while exists(select 1 from @mytab where [group] is null)
begin
    -- Highest group number that has already reached the target size
    declare @firstGroup int
    select @firstGroup = 0
    ;with a as (
        select [group], sum(filesize) sumSize
        from @mytab
        where [group] is not null
        group by [group]
        having sum(filesize) >= @avgsize
    )
    select @firstGroup = isnull(max([group]), 0) from a

    -- Offer the largest unassigned files to the groups after @firstGroup, one file per group
    ;with dat as (
        select row_number() over(order by filesize desc) + @firstGroup rownum, *
        from @mytab
        where [group] is null
    )
    update dat
    set [group] = rownum
    where rownum <= 5
      and (filesize + isnull((select sum(filesize) from @mytab where [group] = rownum), 0) < (@avgsize + @avgsize * @var)
           or filesize >= @avgsize)

    -- Nothing fit on this pass: let groups exceed the target by a further 10%
    if @@rowcount = 0
        select @var = @var + 0.1
end

--select * from @mytab order by 3
-- Resulting total per group
select [Group], sum(filesize) from @mytab
group by [Group]
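
Editor's note: for readers who want something set-based, one possible alternative is a "serpentine" assignment. This is a different technique from the loop above, not what appari posted. Files are ranked from largest to smallest and dealt out to groups 1..5, then 5..1, and so on. The sketch below assumes SQL Server 2008 or later; the table variable @Files and the variable @Groups are hypothetical names used only for illustration. It will not always balance as tightly as a search-based approach, but it needs a single UPDATE.

```sql
-- Hypothetical stand-in for the asker's table, loaded with the sample data from the question.
DECLARE @Files TABLE ([FileName] varchar(50), [FileSize] int, [Group] int);
INSERT INTO @Files ([FileName], [FileSize])
VALUES ('mypicture.jpg', 200), ('mypicture2.jpg', 300), ('mydoc.doc', 500),
       ('myexcelsheet.xls', 1000), ('mytextfile.txt', 600), ('myslide.ppt', 400),
       ('myaudio.mp3', 1100), ('myfile.txt', 100), ('mysite.html', 400),
       ('mypicture3', 550);

DECLARE @Groups int = 5;  -- assumed number of groups

-- pos cycles through 0 .. 2*@Groups-1 in descending size order.
-- The first half of each cycle maps to groups 1..N, the second half to N..1,
-- so a group that received a large file on the way out gets a small one on the way back.
WITH ranked AS (
    SELECT [Group],
           (ROW_NUMBER() OVER (ORDER BY [FileSize] DESC, [FileName]) - 1) % (2 * @Groups) AS pos
    FROM @Files
)
UPDATE ranked
SET [Group] = CASE WHEN pos < @Groups THEN pos + 1
                   ELSE 2 * @Groups - pos
              END;

-- Check how even the distribution turned out.
SELECT [Group], SUM([FileSize]) AS SumOfFileSize
FROM @Files
GROUP BY [Group]
ORDER BY [Group];
```

On the sample data this should produce group totals of 1200, 1200, 900, 950 and 900, against an ideal of 1030 per group. That is less even than the hand-picked example output in the question, which is the trade-off for avoiding the loop.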

 

Author Comment

by:StrangerDanger
ID: 37755941
@appari - This helped.  Thanks mate.