# Algorithm right?

I need to write for the program that computes A th smallest number in a set of K distinct integer.  But there's an restriction that the running time has to be
O(K+Alog(K)).  This is my way of thinking.  I think I will need to sort the data first.  And I'm think to use basket sorting. Then find the number I want.  I just want to know if it is right?  Or I should use heap sort?  I'll appreciate any idea.  I'm not asking for the actual code, I just need the algorithm so that I can start coding.
Thanks,
###### Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Commented:
Problem with sorting first:

Sorting is O(K log(K))

so you have blown your constraints out of the gate.

The important insight here is that you don't care about the elements in positions [0,A-2] and [A,(K-1)]. If you could "simulate" sorting just position A-1 you would be happy. Think about the sorting algorithms you know and try to imagine running one of them WITHOUT finishing the sort that could put just the Ath element in the right place (or rather some number of elements in the right place, the number dependent on A).

Note: It does NOT have to be the elements [0, A-2].

-bcl
Commented:
To find the smallest number in a set of integers, you have to look at each integer, and you don't have to look at each integer more than once.  So the cost should be O(K).  If A is the value of the smallest number, it shouldn't enter into the complexity calculation as in Alog(K).  The value of the number doesn't affect how long the search takes.

Are you sure you have stated the problem correctly?  Did I misunderstand something?

--efn
Commented:
Oh, from bcladd's comment, I see that you probably meant the Ath smallest number.  I read it as a typo for "A the smallest number."  Sorry!

--efn
Commented:
efn-

I think the problem is to find the A'th smallest or the number that would be in position A-1 if the set were sorted and put into a C++ vector.

Commented:
Hi.

There are many, many pages of information about sorting algorithms on the web.  I am studying for a C++ exam, and have been looking at a lot of these sites for answers to questions like the one you are asking here.  I really liked the explanations by the author of these pages here:

http://linux.wku.edu/~lamonml/algor/sort/heap.html

You are correct that heap sort is O(k + log(K)) for a data set of size K.  Once the data is sorted, you can just grab the Ath item, assuming you are using a container that supports random access, like an array or a vector.

I think, but I'm not sure, the basket sort you are mentioning is more commonly called 'bucket sort', and it takes O(N + K) time where N is the number of buckets and K is the number of items.  If the number of buckets is chosen wisely, then the running time can be almost linear O(K).  A short article on Bucket sort can be found here:

http://www.ics.uci.edu/~eppstein/161/960123.html

In summary, both bucket sort and heap sort will accomplish your goals in your bounds of O(K + Alog(K)) or better.

I hope this helps,

- Ken
Commented:
OOPS!

Heapsort runs in O(K log(K)), NOT O(K + log(K))!  That plus makes a big difference.  That difference will constrain you to using basket sort.  Sorry for the typo.

- Ken
Commented:
It is possible to sort the integers using a radix sort BUT that only works for this limited set of input. The problem is acually solvable for any sortable type within the given constraints using only key comparisons. Think about all of the sorting algorithms and see what you can come up with.

-bcl
Commented:
I don't think you need to sort the whole list.
Just make an array of length A and cycle through the length-K list once, inserting into the A array whenever you find a new item that is less than the highest item that is in the A array.

I haven't a clue how to typify that as
Whatzat log W00t,
but that's how I'd do it.

-- Dan
Commented:
DanRollins-

Ordered insert into the A array: O(A) (might have to move the entire content) and it might happen on each item in the K list (reverse sorted order) so O(KA) which blows the assigned execution budget. Your solution is also O(A) in space complexity which is non-optimal (it can be done in the required time with constant (O(1)) additional space).

-bcl
Commented:
Again, I don't know how to express in those terms, but at most, you would need to insert A items and they would not need to be in any particular order.  Once you have A items in the list, you can compare the next item from K to the current max and min in A.  If it is less than the min, replace the max with that min and find the new max.

I feel stupoid.  Where can I learn what Alog(K) means in this context?

-- Dan
Commented:
And finding the new max requires scanning the whole list if it is unordered. Still O(A) to insert a new item. When I reread my post I KNEW someone would suggest an unordered array and a single comparison. For a minute I was sure that fixed the problem...but then I remembered that finding the largest remaining element in that array is still a linear scan (if the list is unordered).

It is quite possible that your constants are better than if the A array were kept sorted with an insertion sort but I am not even sure that is the case.

Wonder if templeavenue is even reading this anymore.

The trick is to ALMOST sort original array. One well known sort algorithm starts with an ALMOST sort and then recurs to finish the job. For this problem you can use the same start (almost sort the array) but THEN you only need to work on part of the almost sorted space (where in the sort you have to work on all of the space).

-bcl
Author Commented:
Sorry for coming back lately. I don't have an access to internet on Sunday.
To "bcl" - Should I sort the array completely first? If so, using what method?  or should I look for the Kth element as while I sort the array?  I'm not quite sure what are you trying to say.  Pls explain me more if you can.

Thanks everyone!
Commented:
No, no, no: Remember that the lower bound on any sorting method that uses only key-to-key comparison is O(K lg K). You do NOT want to sort.  You want to do something like sort.

If you were to sort it you would probably use quicksort, right? It is a fast sort and if you have unique values and use a median of 5 (or median of 3) to select your pivot you will get optimal performance. The question is, in your problem, why do you have to sort both to the left of the pivot _and_ to the right of the pivot? You know the A'th element is only in one of those two subsets.

That is as far as I can go in spelling it out; any further and I am writing the algorithm for you. Look at quicksort and think about what you DON'T have to do for your solution.

-bcl

Experts Exchange Solution brought to you by

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Commented:
templeavenue-

How goes the programming? Any questions?

-bcl
Commented: