[Last Call] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

Need a stored procedure to find all records that is needed to make up a specific value

Posted on 2007-10-08
8
Medium Priority
?
190 Views
Last Modified: 2012-05-05
Hi,

I'm using SQL Server 2005.

For the sake of clarity I will simplify the structrue of the information records.

I have a result set with 10 000 records. Each records has two columns, an id and a value:
ID => Unique Identifier
Value => Float

The value column can have any amount from 0.01 to 10 000 000.

What I need to do is find a number of records in this resultset that will add up to 100 000 or the closest possible value to 100 000.

EXAMPLE: (for this example I will make the ID just two characters)

ID     VALUE
AF    123.45
WT   1011.30
WE    239.54
KE    265 392.00
VW   15 935.04
IJ       23.32
OW    80 993.00
PR     1 324.56


In the above Example OW, VW, PR, WE, WT adds up to 99 982.52 so those are the transactions I'm interested in.

QUESTION: How do I determine which transactions to use to get as close as possible to 100 000?
0
Comment
Question by:dbit
  • 4
  • 2
6 Comments
 
LVL 18

Expert Comment

by:Yveau
ID: 20032651
No further restriction ?
Only for your example of 8 records, you have 8! = 40K+ possibilities of combinations ... for 10K records, you don't even want to know how many combinations you can have ...

Do you mean <= 100 000 or would 100 000.01 also be acceptable as an answer ?

0
 
LVL 42

Expert Comment

by:dqmq
ID: 20032716
You would need procedural code to compare all 10,000 1-row combinations with something like 50 million 2-row combinations,  then the 3-row combinations and 4-row combinations, etc... all the way to the 10,000-row combination, all the while keeping track of the closest combination as you go.

Hope you are willing to wait some amount of time for the answer :>).  

0
 
LVL 18

Expert Comment

by:Yveau
ID: 20032897
I think this one is pretty close:

set nocount on

--drop table #Y
create table #Y (ID char(2), Value decimal(12,2))

insert into #Y values('AF',123.45)
insert into #Y values('WT',1011.30)
insert into #Y values('WE',239.54)
insert into #Y values('KE',265392.00)
insert into #Y values('VW',15935.04)
insert into #Y values('IJ',23.32)
insert into #Y values('OW',80993.00)
insert into #Y values('PR',1324.56)


drop table #Temp
create table #Temp (ID varchar(max), Value decimal(12,2))

declare @Nr int
declare @SubNr int
declare @Current int
declare @Target decimal(12,2)
declare @SQL varchar(max)
declare @SQLSub varchar(max)
declare @SQLSub2 varchar(max)
declare @SQLSub3 varchar(max)
declare @SQLSub4 varchar(max)
declare @SQLSub5 varchar(max)

select  @Target = 100000.00
,       @Current = 0
,       @SQL = ''
,       @SQLSub = ''
,       @SQLSub2 = ''
,       @SQLSub3 = 'Y0.ID'
,       @SQLSub4 = 'Y0.Value'
,       @SQLSub5 = ''
,       @SubNr = 0
select  @Nr = count(ID) from #Y where Value <= @Target

while   @Current <= @nr
begin
        if @Current > 0
        begin
                select  @SQLSub = @SQLSub + ' inner join #Y Y' + cast(@current as varchar(4)) + ' on Y' + cast(@current as varchar(4)) + '.ID > Y' + cast(@current - 1 as varchar(4)) + '.ID'
                select  @SQLSub3 = @SQLSub3 + ' + Y' + cast(@current as varchar(4)) + '.ID'
                select  @SQLSub4 = @SQLSub4 + ' + Y' + cast(@current as varchar(4)) + '.Value'
        end

        select  @SQLSub2 = @SQLSub2 + ' and Y' + cast(@current as varchar(4)) + '.value <= '  + convert(varchar(12),@target)
        select  @SQL = 'insert into #Temp select ' + @SQLSub3 + ', sum(' + @SQLSub4 + ') from #Y Y0 ' + @SQLSub + ' where 1 = 1 ' + @SQLSub2 + ' group by ' + @SQLSub3
--        select  @SQL
        execute (@SQL)

        select  @Current = @current + 1
        ,       @SubNr = 0
end

select  Value, 100000 - Value as [Diff], ID
from    #Temp
where   Value <= 100000
order   by 100000 - Value

... and your example adds up to 99503,44. Your optimum would be AF,IJ,OW,PR,VW,WE,WT: 99650.21
As my example code proves ...

Hope this helps ...
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 18

Expert Comment

by:Yveau
ID: 20033533
I did some facelifting on the script to make it more efficient on the resources:

drop table #Y
create table #Y (ID char(2), Value decimal(12,2))

create clustered index idxY on #Y(ID)

insert into #Y values('AF',123.45)
insert into #Y values('WT',1011.30)
insert into #Y values('WE',239.54)
insert into #Y values('KE',265392.00)
insert into #Y values('VW',15935.04)
insert into #Y values('IJ',23.32)
insert into #Y values('OW',80993.00)
insert into #Y values('PR',1324.56)

drop table #Temp
create table #Temp (ID varchar(max), Value decimal(12,2))

declare @Nr int
declare @SubNr int
declare @Current int
declare @Target decimal(12,2)
declare @SQL varchar(max)
declare @SQLSub varchar(max)
declare @SQLSub2 varchar(max)
declare @SQLSub3 varchar(max)
declare @SQLSub4 varchar(max)
declare @Y varchar(6)
declare @Optimum varchar(max)

select  @Target = 100000.00
,       @Current = 0
,       @SQL = ''
,       @SQLSub = ''
,       @SQLSub2 = ''
,       @SQLSub3 = 'Y0.ID'
,       @SQLSub4 = 'Y0.Value'
,       @SubNr = 0
select  @Nr = count(ID) from #Y where Value <= @Target

while   @Current <= @nr
begin
        select  @Y = cast(@current as varchar(6))
        if @Current > 0
        begin
                select  @SQLSub = @SQLSub + ' inner join #Y Y' + @Y + ' on Y' + @Y + '.ID > Y' + cast(@current - 1 as varchar(4)) + '.ID'
                select  @SQLSub3 = @SQLSub3 + ' + '';'' + Y' + @Y + '.ID'
                select  @SQLSub4 = @SQLSub4 + ' + Y' + @Y + '.Value'
        end

        select  @SQLSub2 = @SQLSub2 + ' and Y' + @Y + '.value <= '  + convert(varchar(12),@target)
        select  @SQL = 'insert into #Temp select ' + @SQLSub3 + ', sum(' + @SQLSub4 + ') from #Y Y0 ' + @SQLSub + ' where 1 = 1 ' + @SQLSub2 + ' group by ' + @SQLSub3
--        select  @SQL
        execute (@SQL)

--      Find current optimum
        select  top 1 @Optimum = Value
        from    #Temp
        where   Value <= @Target
        order   by Value desc

--      remove all but the best option so far
        delete
        from    #Temp
        where   Value <> @Optimum

--      Show current status and check for match
        if (@Optimum = @target)
        begin
                select @Current = @nr + 1
        end
        else
        begin
                print   'Loop #' + cast(@Current as varchar(10)) + ' finished; Current Optimum: ' + cast(@Optimum as varchar(12))
                select  @Current = @current + 1
                ,       @SubNr = 0
        end
end

select  Value, @Target - Value as [Diff], ID
from    #Temp
where   Value <= @Target
order   by 2


Next to that I started the code to search for an optimal solution on 100 records. I canceled the script after an hour ... So think twice before running this on 10K records ... or be very patient. To give you an idea, the 8 record example takes one second, an example with 16 records takes 3 seconds on my machine.

Hope this helps ...
0
 
LVL 42

Accepted Solution

by:
dqmq earned 1000 total points
ID: 20037315
Yveau,
Your solution is intriguing.  I can tell you really like to accept a challenge.  

But it doesn't solve the fundamental problem that there are a lot of combinations to chunk through.  I did some testing with different table sizes.  

Rows        Time             Sums Required
8               < 1 sec          92
16              >6 sec           696
18              >35 sec         987
20              > 3 min        1350
22              > 18 min      1793


So, I think the trend is obvious and the prospect of processing 10,000 rows in this manner is formiddable.  Maybe there's some hueristic way to reduce the number of calc's required.    




           
0
 
LVL 18

Assisted Solution

by:Yveau
Yveau earned 1000 total points
ID: 20041124
I did take in account that all values larger than the target will be ignored.
Next to that I don't think there are other tricks you can use. Any value larger than 0 can contribute to a possible optimum ... you can set it to use only a combination of n ID's in stead of the possible 10K.

If you change the line
    while   @Current <= @nr
to something like this:
    while   @Current <= 5
it will only try to find an optimum, using 1-5 different values. This might not be the optimum, but might be an acceptable solution.

Your fundamental problem will remain a problem as you need to calculate A LOT of combinations !!!

Hope this helps ...
0

Featured Post

Get quick recovery of individual SharePoint items

Free tool – Veeam Explorer for Microsoft SharePoint, enables fast, easy restores of SharePoint sites, documents, libraries and lists — all with no agents to manage and no additional licenses to buy.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this article we will learn how to fix  “Cannot install SQL Server 2014 Service Pack 2: Unable to install windows installer msi file” error ?
Recently we ran in to an issue while running some SQL jobs where we were trying to process the cubes.  We got an error saying failure stating 'NT SERVICE\SQLSERVERAGENT does not have access to Analysis Services. So this is a way to automate that wit…
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function
Suggested Courses

830 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question