?
Solved

Finding the 5 closest numbers from a larger array a numbers

Posted on 2009-02-23
13
Medium Priority
?
336 Views
Last Modified: 2012-05-06
Given an array of 15 numbers (already sorted Ascending) I need to find the 5 closest-together numbers.  That is, the 5 numbers with the least total range between them.
For example, 1,2,3,4,5 have small range of 4, whereas 100,200,300,400,500 have a large range of 400.

I can brute force it with a loop like the attached code

but I wondering if there way a more elegant algorithm I could use, because once I get this working I could be looking through an array as large as 1000 numbers
trying to find the 5 'closest together' numbers.

Given the following array, my brute force finds elements 2-6 with a range of 12 to be the closest together set of 5.

array[0]=216;
array[1]=291;
array[2]=315;
array[3]=316;
array[4]=320;
array[5]=326;
array[6]=327;
array[7]=384;
array[8]=412;
array[9]=467;
array[10]=589;
array[11]=602;
array[12]=662;
array[13]=663;
array[14]=709;

keystart=0;
range=array[4]-array[0];
for(start=1;start<=10;start++){
	if ((array[start+4]-array[start])<range){
		range=array[start+4]-array[start];
		keystart=start;
	}
}

Open in new window

0
Comment
Question by:dban00b
  • 5
  • 3
  • 2
  • +3
13 Comments
 
LVL 10

Expert Comment

by:oleber
ID: 23714517
this is a complexaty n algorithm. So it is the fastest
0
 
LVL 85

Expert Comment

by:ozo
ID: 23714667
if array accesses are the slowest operation, you may be able to cut them by as much as half by  caching them or noting that any inteval smaller than 5 that has a range larger than the smallest range seen so far, cannot be contained in an interval of 5 that has a smaller range, but the added complexity may be more likely to slow down the search if the tine for array accesses is is the same as the time for, say a comparison operation.
0
 
LVL 39

Expert Comment

by:abel
ID: 23714727
I've given it some though and came to the conclusion that this is a typical O(n) problem. Same as, for instance, Max/Min on an unordered array.

However, if fuzziness is allowed, or when it is ok to be imprecise in a way, then you can use a divide and conquer style algorithm, or just loop every fifth element.

Speeding things up a bit will differ on the data. You can put in some logic where if you have two elements that are already too far apart as the current lowest - 3, you can skip the next three. But be aware, you might need to go back if the following two (after skipping) are low enough to make a low - 3 sum complete. You could mark positions that way and go back later, because it is likely that you will have a five-combi that will defeat those jumps.

Note that looping an array of 1000 numbers is not such a big deal considering the opcodes needed. Any modern computer can do an array of 1 million integers and do a compare on each of them in less then a tenth of a second (depending on language / interpreter / compiler).
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 39

Expert Comment

by:abel
ID: 23714735
Sorry for doubling the answers, the question didn't refresh well...
0
 
LVL 16

Expert Comment

by:t0t0
ID: 23715313
There is no 'elegant' solution to this problem. This can only be solved using iteration as you have already shown.

There is no need for any more than a single-pass 'scan' of the array.

Your loop loops from 0 through n-6 (n-5-1) where 'n' is the nukmber of elements in the array, -5 is the 'range' offset and -1 because we're starting from 0 rather than 1.

Something like...

range = array(4) - array(0);
keystart = 0;

for (i = 0, i < (n - 4), i++) {
   if (array(i + 4) - array(i)) < range {
      range = array(i + 4) - array(i);
      keystart = i;
   }
}

The only problem with this code is it performs the 'array(i + 4) - array(i)' twice and this is wasteful. It may be better to asign the result of this calculation to an temporary variable and then use the this in the 'if' statement as in:

range = array(4) - array(0);
keystart = 0;

for (i = 0, i < (n - 4), i++) {
   temp = array(i + 4) - array(i);
   if temp < range {
      range = temp;
      keystart = i;
   }
}

You may find this performs faster on a large array. In terms of elegance, this would be the best solution.




0
 
LVL 16

Expert Comment

by:t0t0
ID: 23715981
Reposted due to error...

There is no 'elegant' solution to this problem. This can only be solved using iteration as you have already shown.

There is no need for any more than a single-pass 'scan' of the array.

Your loop loops from 0 through n-4 (n - 4) where 'n' is the nukmber of elements in the array, and -4 is the 'range' offset.

Something like (which is similar to your code)...

range = array(4) - array(0);
keystart = 0;

for (i = 0, i < (n - 4), i++) {
   if (array(i + 4) - array(i)) < range {
      range = array(i + 4) - array(i);
      keystart = i;
   }
}

However, the only problem with this code is it performs the 'array(i + 4) - array(i)' twice and this is wasteful. It may be better to asign the result of this calculation to an temporary variable and then use the this in the 'if' statement as in:

range = array(4) - array(0);
keystart = 0;

for (i = 0, i < (n - 4), i++) {
   temp = array(i + 4) - array(i);
   if temp < range {
      range = temp;
      keystart = i;
   }
}

You may find this performs faster on a large array. In terms of elegance, this would be the best solution.

PS. I'm not going to quibble for starting my loop fro 0 rahter than 1 as you have done. In fact, it makes the logic seem clearer as we're reminded the first element of our array is 0.
0
 
LVL 20

Expert Comment

by:thehagman
ID: 23716122
If the cost is in the reading from array[] (but not from internal data; e.g. the array might be on a tape or it might in fact be a complicated function call), the following looks up each entry only once.


int temp[5];
k = 0
keystart = 0
range = +infinity  // or the biggest possible finite value
for (i=0; i<n; ++i) {
    t = array[i]
    if (i>4) {
       d = t-temp[k]
       if (d<range) {
          range = d
          keystart = i-4
       }
    }
    temp[k] = t
    if (k==4) k=0; else ++k; // thus always k = i % 5
}

Open in new window

0
 
LVL 16

Expert Comment

by:t0t0
ID: 23716281
thehangman

I don't think you can make it any simpler, slicker nor elegant as it already stands....  I couldn't get to grips with your solution. Are you trying to implement some sort of read-ahead 'buffer'?
0
 
LVL 1

Author Comment

by:dban00b
ID: 23724917
Okay it seems I really can't do any better.  It's just one of those things that any more-complicated solution is simply going to make it worse.

Yeah I wasn't having any speed problems, I just saw myself bruting through and thought there might be a better way.

I'm not exactly sure how to close this question.
0
 
LVL 10

Expert Comment

by:oleber
ID: 23725144
you give the points to the one giving the easer explanation.

Or split the points between the persons giving useful information for you to take this conclusion
0
 
LVL 16

Accepted Solution

by:
t0t0 earned 2000 total points
ID: 23725541
dban00b

In my post  ID: 23715981, I compared my code (which I wrote from the ground up) and compared it to your own code.

They are almost identical (save for the names of variables etc)

I then went on to suggest an improvement which you don't seem to have picked up on so I will now use your own code and moedl it after mine. Please see solution below:

keystart = 0;
range = array[4] - array[0];

for (start = 1; start <= 10; start++) {
   temp = array[start + 4] - array[start];
   if (temp < range) {
      range = temp;
      keystart = start;
   }
}


Notice the use of an intermedairy variable. This is so we don't perform the same calculation twice (array[start+4] - array[start]). From a programmer's point of view, it is more efficient to look up a value than it is to recalculate the same value.
 
As for the points, your question asks if there is a more elegant way in which to solve your problem. I have suggested the only possible change necessary to achieve a better solution. Please judge accordingly.
0
 
LVL 39

Expert Comment

by:abel
ID: 23727745
> From a programmer's point of view, it is more efficient to look up a value than it is to recalculate the same value.

Agreed. But bare in mind that, with IL-compiled languages like C# you may end up with the same execution speed. If you would do it in assembly, you may find out that moving memory is more expensive than calculating, but assembly optimization, especially with modern processors who pipeline and predict the next action, it is extremely hard to come with a "best" or "most speedy" solution (that would be a chapter on its own). But the readability definitely increases.
0
 
LVL 16

Expert Comment

by:t0t0
ID: 23728479
dban00b

Thank you for accepting my solution.
0

Featured Post

Hire Technology Freelancers with Gigs

Work with freelancers specializing in everything from database administration to programming, who have proven themselves as experts in their field. Hire the best, collaborate easily, pay securely, and get projects done right.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Article by: Nadia
Suppose you use Uber application as a rider and you request a ride to go from one place to another. Your driver just arrived at the parking lot of your place. The only thing you know about the ride is the license plate number. How do you find your U…
Iteration: Iteration is repetition of a process. A student who goes to school repeats the process of going to school everyday until graduation. We go to grocery store at least once or twice a month to buy products. We repeat this process every mont…
Although Jacob Bernoulli (1654-1705) has been credited as the creator of "Binomial Distribution Table", Gottfried Leibniz (1646-1716) did his dissertation on the subject in 1666; Leibniz you may recall is the co-inventor of "Calculus" and beat Isaac…
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa…
Suggested Courses

864 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question