• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 389
  • Last Modified:

SQL Query - Ranking / Distribution

Hi,

SQL Server 2008 R2.  I have a table with 3 relevant columns:

[FileName], [FileSize], [Group]

I'd like break up the recordset up into groups based on the even distribution of [FileSize];

If I decided to break the recordset into 5 groups I would set [Group] to either '1','2','3','4' or '5' and I would expect the SUM([FileSize]) for each group to be similar.

I can think of some round-about ways of accomplishing this but want to know if there is an elegant way to do so.

Thanks
0
StrangerDanger
Asked:
StrangerDanger
  • 2
  • 2
1 Solution
 
appariCommented:
can you post sample data and the output you are looking for?
0
 
StrangerDangerAuthor Commented:
Sure... to re-iterate... I essentially I want to assign a group number to each row.  When you sum the filesizes for each group, I'd like the totals to be relatively similar.

EXAMPLE INPUT:
[FileName],[FileSize],[Group]

mypicture.jpg, 200, NULL
mypicture2.jpg, 300, NULL
mydoc.doc,500, NULL
myexcelsheet.xls, 1000, NULL
mytextfile.txt,600, NULL
myslide.ppt,400, NULL
myaudio.mp3, 1100, NULL
myfile.txt, 100, NULL
mysite.html, 400, NULL
mypicture3, 550, NULL

EXAMPLE OUTPUT:
[FileName],[FileSize],[Group]

mypicture.jpg, 200, 1
mypicture2.jpg, 300,1
mydoc.doc,500,1
myexcelsheet.xls, 1000, 2
mytextfile.txt,600,3
myslide.ppt,400,3
myaudio.mp3, 1100,4
myfile.txt, 100, 5
mysite.html, 400, 5
mypicture3, 550, 5

Example aggregate:
[Group], [SumOfFileSize]
1, 1000
2, 1000
3, 1000
4, 1100
5, 950
0
 
appariCommented:
try this
not efficient code but seems working
declare @mytab table([FileName] varchar(20),[FileSize] int,[Group] int)
insert into @mytab 
Select 'mypicture.jpg', 200, NULL
union Select 'mypicture2.jpg', 300, NULL
union Select 'mydoc.doc',500, NULL
union Select 'myexcelsheet.xls', 1000, NULL
union Select 'mytextfile.txt',600, NULL
union Select 'myslide.ppt',400, NULL
union Select 'myaudio.mp3', 1100, NULL
union Select 'myfile.txt', 100, NULL
union Select 'mysite.html', 400, NULL
union Select 'mypicture3', 550, NULL

declare @avgSize int
Select @avgSize = sum(filesize)/5 from @mytab
Select @avgSize 
declare @var decimal(4,2)
select @var=0
while exists(Select 1 from @mytab where [group] is null)
begin 
declare @firstGroup int
select @firstGroup =0
;with a as (Select [group], sum(filesize) sumSize from @mytab
where  [group] is not null
group by [group] having sum(filesize) >= @avgsize )
select @firstGroup = isnull(max([group]),0) from a

;with dat as (select row_number() over(order by filesize desc)+@firstGroup rownum,
* from @mytab where [group] is null )
update dat
set [group]= rownum
where rownum<=5
and (filesize + isnull((Select sum(filesize) from @mytab where [group]=rownum),0)<(@avgsize + @avgsize *@var)
or filesize>=@avgsize)
if @@rowcount =0 
select @var= @var+0.1
--select * from @mytab order by 3
end
--select * from @mytab order by 3
select [Group], sum(filesize) from @mytab 
group by [Group]

Open in new window

0
 
StrangerDangerAuthor Commented:
@appari - This helped.  Thanks mate.
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

  • 2
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now