asked on

Algorithm right?

I need to write for the program that computes A th smallest number in a set of K distinct integer. But there's an restriction that the running time has to be
O(K+Alog(K)). This is my way of thinking. I think I will need to sort the data first. And I'm think to use basket sorting. Then find the number I want. I just want to know if it is right? Or I should use heap sort? I'll appreciate any idea. I'm not asking for the actual code, I just need the algorithm so that I can start coding.
Thanks,

bcladd

Problem with sorting first:

Sorting is O(K log(K))

so you have blown your constraints out of the gate.

The important insight here is that you don't care about the elements in positions [0,A-2] and [A,(K-1)]. If you could "simulate" sorting just position A-1 you would be happy. Think about the sorting algorithms you know and try to imagine running one of them WITHOUT finishing the sort that could put just the Ath element in the right place (or rather some number of elements in the right place, the number dependent on A).

Note: It does NOT have to be the elements [0, A-2].

-bcl

efn

To find the smallest number in a set of integers, you have to look at each integer, and you don't have to look at each integer more than once. So the cost should be O(K). If A is the value of the smallest number, it shouldn't enter into the complexity calculation as in Alog(K). The value of the number doesn't affect how long the search takes.

Are you sure you have stated the problem correctly? Did I misunderstand something?

--efn

efn

Oh, from bcladd's comment, I see that you probably meant the Ath smallest number. I read it as a typo for "A the smallest number." Sorry!

--efn

bcladd

efn-

I think the problem is to find the A'th smallest or the number that would be in position A-1 if the set were sorted and put into a C++ vector.

ks5d

Hi.

There are many, many pages of information about sorting algorithms on the web. I am studying for a C++ exam, and have been looking at a lot of these sites for answers to questions like the one you are asking here. I really liked the explanations by the author of these pages here:

http://linux.wku.edu/~lamonml/algor/sort/heap.html

You are correct that heap sort is O(k + log(K)) for a data set of size K. Once the data is sorted, you can just grab the Ath item, assuming you are using a container that supports random access, like an array or a vector.

I think, but I'm not sure, the basket sort you are mentioning is more commonly called 'bucket sort', and it takes O(N + K) time where N is the number of buckets and K is the number of items. If the number of buckets is chosen wisely, then the running time can be almost linear O(K). A short article on Bucket sort can be found here:

http://www.ics.uci.edu/~eppstein/161/960123.html

In summary, both bucket sort and heap sort will accomplish your goals in your bounds of O(K + Alog(K)) or better.

I hope this helps,

- Ken

ks5d

OOPS!

Heapsort runs in O(K log(K)), NOT O(K + log(K))! That plus makes a big difference. That difference will constrain you to using basket sort. Sorry for the typo.

- Ken

bcladd

It is possible to sort the integers using a radix sort BUT that only works for this limited set of input. The problem is acually solvable for any sortable type within the given constraints using only key comparisons. Think about all of the sorting algorithms and see what you can come up with.

-bcl

DanRollins

I don't think you need to sort the whole list.
Just make an array of length A and cycle through the length-K list once, inserting into the A array whenever you find a new item that is less than the highest item that is in the A array.

I haven't a clue how to typify that as
Whatzat log W00t,
but that's how I'd do it.

-- Dan

bcladd

DanRollins-

Ordered insert into the A array: O(A) (might have to move the entire content) and it might happen on each item in the K list (reverse sorted order) so O(KA) which blows the assigned execution budget. Your solution is also O(A) in space complexity which is non-optimal (it can be done in the required time with constant (O(1)) additional space).

-bcl

DanRollins

Again, I don't know how to express in those terms, but at most, you would need to insert A items and they would not need to be in any particular order. Once you have A items in the list, you can compare the next item from K to the current max and min in A. If it is less than the min, replace the max with that min and find the new max.

I feel stupoid. Where can I learn what Alog(K) means in this context?

-- Dan

bcladd

And finding the new max requires scanning the whole list if it is unordered. Still O(A) to insert a new item. When I reread my post I KNEW someone would suggest an unordered array and a single comparison. For a minute I was sure that fixed the problem...but then I remembered that finding the largest remaining element in that array is still a linear scan (if the list is unordered).

It is quite possible that your constants are better than if the A array were kept sorted with an insertion sort but I am not even sure that is the case.

Wonder if templeavenue is even reading this anymore.

The trick is to ALMOST sort original array. One well known sort algorithm starts with an ALMOST sort and then recurs to finish the job. For this problem you can use the same start (almost sort the array) but THEN you only need to work on part of the almost sorted space (where in the sort you have to work on all of the space).

-bcl

templeavenue

ASKER

Sorry for coming back lately. I don't have an access to internet on Sunday.
To "bcl" - Should I sort the array completely first? If so, using what method? or should I look for the Kth element as while I sort the array? I'm not quite sure what are you trying to say. Pls explain me more if you can.

Thanks everyone!

ASKER CERTIFIED SOLUTION

bcladd

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

bcladd

templeavenue-

How goes the programming? Any questions?

-bcl

efn

Answered by bcladd.