Solved

SQL Query - Ranking / Distribution

Posted on 2012-03-22
Last Modified: 2012-03-22
Hi,

SQL Server 2008 R2.  I have a table with 3 relevant columns:

[FileName], [FileSize], [Group]

I'd like to break the recordset up into groups based on an even distribution of [FileSize].

If I decided to break the recordset into 5 groups, I would set [Group] to '1', '2', '3', '4' or '5', and I would expect SUM([FileSize]) to be similar for each group.

I can think of some round-about ways of accomplishing this but want to know if there is an elegant way to do so.

Thanks
Question by:StrangerDanger
LVL 39

Expert Comment

by:appari
ID: 37751208
Can you post sample data and the output you are looking for?

Author Comment

by:StrangerDanger
ID: 37751239
Sure. To reiterate, I essentially want to assign a group number to each row. When you sum the file sizes for each group, I'd like the totals to be relatively similar.

EXAMPLE INPUT:
[FileName],[FileSize],[Group]

mypicture.jpg, 200, NULL
mypicture2.jpg, 300, NULL
mydoc.doc, 500, NULL
myexcelsheet.xls, 1000, NULL
mytextfile.txt, 600, NULL
myslide.ppt, 400, NULL
myaudio.mp3, 1100, NULL
myfile.txt, 100, NULL
mysite.html, 400, NULL
mypicture3, 550, NULL

EXAMPLE OUTPUT:
[FileName],[FileSize],[Group]

mypicture.jpg, 200, 1
mypicture2.jpg, 300, 1
mydoc.doc, 500, 1
myexcelsheet.xls, 1000, 2
mytextfile.txt, 600, 3
myslide.ppt, 400, 3
myaudio.mp3, 1100, 4
myfile.txt, 100, 5
mysite.html, 400, 5
mypicture3, 550, 5

Example aggregate:
[Group], [SumOfFileSize]
1, 1000
2, 1000
3, 1000
4, 1100
5, 1050
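
To check any assignment, the per-group totals can be computed with a plain GROUP BY. The sketch below is one way to do it. It assumes a table variable named @Files as a stand-in for the real table (the thread never names it) and loads it with the example output rows; the FileCount column is an extra added for illustration.

```sql
-- @Files is a hypothetical stand-in for the real table,
-- loaded with the example output rows from this comment.
DECLARE @Files TABLE ([FileName] varchar(50), [FileSize] int, [Group] int);

INSERT INTO @Files ([FileName], [FileSize], [Group]) VALUES
    ('mypicture.jpg', 200, 1), ('mypicture2.jpg', 300, 1), ('mydoc.doc', 500, 1),
    ('myexcelsheet.xls', 1000, 2),
    ('mytextfile.txt', 600, 3), ('myslide.ppt', 400, 3),
    ('myaudio.mp3', 1100, 4),
    ('myfile.txt', 100, 5), ('mysite.html', 400, 5), ('mypicture3', 550, 5);

-- One row per group: how many files it holds and their combined size.
SELECT [Group],
       COUNT(*)        AS FileCount,
       SUM([FileSize]) AS SumOfFileSize
FROM @Files
GROUP BY [Group]
ORDER BY [Group];
```

Comparing MIN and MAX of SumOfFileSize across the groups gives a quick measure of how even the split is.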
 
LVL 39

Accepted Solution

by:
appari earned 500 total points
ID: 37751373
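Try this. It is not efficient code, but it seems to work on the sample data:

declare @mytab table([FileName] varchar(20), [FileSize] int, [Group] int)

-- Sample data from the question (union all keeps every row, even duplicates)
insert into @mytab
select 'mypicture.jpg', 200, NULL
union all select 'mypicture2.jpg', 300, NULL
union all select 'mydoc.doc', 500, NULL
union all select 'myexcelsheet.xls', 1000, NULL
union all select 'mytextfile.txt', 600, NULL
union all select 'myslide.ppt', 400, NULL
union all select 'myaudio.mp3', 1100, NULL
union all select 'myfile.txt', 100, NULL
union all select 'mysite.html', 400, NULL
union all select 'mypicture3', 550, NULL

-- Target size per group: total size divided by the number of groups (5)
declare @avgSize int
select @avgSize = sum(filesize)/5 from @mytab
select @avgSize   -- debug output: shows the target

-- Tolerance above the target, widened by 0.1 whenever a pass places no rows
declare @var decimal(4,2)
select @var = 0

declare @firstGroup int

while exists (select 1 from @mytab where [group] is null)
begin
    -- Highest-numbered group that has already reached the target;
    -- candidate group numbers for this pass start after it
    select @firstGroup = 0
    ;with a as (
        select [group], sum(filesize) sumSize
        from @mytab
        where [group] is not null
        group by [group]
        having sum(filesize) >= @avgSize
    )
    select @firstGroup = isnull(max([group]), 0) from a

    -- Number the unassigned rows, largest first, and use that number as the
    -- candidate group. A row is placed if it fits under the target plus the
    -- tolerance, or if it is at least as large as the target on its own.
    ;with dat as (
        select row_number() over (order by filesize desc) + @firstGroup rownum, *
        from @mytab
        where [group] is null
    )
    update dat
    set [group] = rownum
    where rownum <= 5
      and (filesize + isnull((select sum(filesize) from @mytab where [group] = rownum), 0)
               < (@avgSize + @avgSize * @var)
           or filesize >= @avgSize)

    -- Nothing was placed in this pass: relax the tolerance and try again
    if @@rowcount = 0
        select @var = @var + 0.1
end

--select * from @mytab order by 3   -- uncomment to inspect the assignments
select [Group], sum(filesize) as SumOfFileSize
from @mytab
group by [Group]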
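
For comparison, here is a minimal sketch of a different technique: a greedy "largest file first" pass that always places the next file in the group with the smallest running total. This is not the accepted code above, only an alternative under stated assumptions. The names @Files, @Totals, @Groups, @name, @size and @target are hypothetical, [FileName] is assumed to be unique, and the loop runs row by row, so it is not tuned for large tables.

```sql
-- Hypothetical sketch: all names below are illustrative, not from the thread.
DECLARE @Groups int;
SET @Groups = 5;

-- Assumes [FileName] is unique so it can identify the row to update.
DECLARE @Files TABLE ([FileName] varchar(50) PRIMARY KEY,
                      [FileSize] int NOT NULL,
                      [Group]    int NULL);

INSERT INTO @Files ([FileName], [FileSize]) VALUES
    ('mypicture.jpg', 200), ('mypicture2.jpg', 300), ('mydoc.doc', 500),
    ('myexcelsheet.xls', 1000), ('mytextfile.txt', 600), ('myslide.ppt', 400),
    ('myaudio.mp3', 1100), ('myfile.txt', 100), ('mysite.html', 400),
    ('mypicture3', 550);

-- Running total per group, starting at zero.
DECLARE @Totals TABLE ([Group] int PRIMARY KEY, Total bigint NOT NULL);
DECLARE @g int;
SET @g = 1;
WHILE @g <= @Groups
BEGIN
    INSERT INTO @Totals ([Group], Total) VALUES (@g, 0);
    SET @g = @g + 1;
END;

DECLARE @name varchar(50), @size int, @target int;

WHILE EXISTS (SELECT 1 FROM @Files WHERE [Group] IS NULL)
BEGIN
    -- Largest file that has not been placed yet.
    SELECT TOP (1) @name = [FileName], @size = [FileSize]
    FROM @Files
    WHERE [Group] IS NULL
    ORDER BY [FileSize] DESC, [FileName];

    -- Group with the smallest running total (ties go to the lowest number).
    SELECT TOP (1) @target = [Group]
    FROM @Totals
    ORDER BY Total, [Group];

    UPDATE @Files  SET [Group] = @target     WHERE [FileName] = @name;
    UPDATE @Totals SET Total = Total + @size WHERE [Group] = @target;
END;

SELECT [Group], SUM([FileSize]) AS SumOfFileSize
FROM @Files
GROUP BY [Group]
ORDER BY [Group];
```

On the ten sample rows this should give totals of 1100, 1000, 1100, 950 and 1000 for groups 1 to 5. The greedy rule does not guarantee the best possible split, but it needs no tolerance variable and finishes after exactly one pass per file.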


Author Comment

by:StrangerDanger
ID: 37755941
@appari - This helped.  Thanks mate.