Databases have solved this problem with indexing. You can still sort with essentially zero cost of moving items.

0

CASorterAuthor Commented:

no, i am talking about physically walking an item being sorted from one spot to another...
how much WORK.
not logically sorting items.... physically sorting items

You cannot assume zero cost for moving the elements. The usual assumption is that the cost is constant.

The efficiency is measured using special functions called O (big O). Constants are not important for that kind of evaluation. This is a theoretical tool. You will not get the exact result. But you can use it to compare whether some algorithm is not "badly invented" with respect to efficiency.

The best sorting algorithm for an array (direct indexing) using a single processor is O(n log n) where n is the size of the problem (number of sorted elements).

Google for sort+visualization to get the feeling how it works. Or (better), train your imagination, or (even better) write your own visualization.

Similarly, you can compute a space efficiency (how much memory you need to do the task).

0

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Having a pile of things, you should think more about the cost of moving. It is better to plan everything first and move the item to the final position. For software, if the elements to be moved are big, it is better to create an auxiliary index array to move only the tiny indexes. Then the last pass is used to reorganize the physical positions of the big elements.

In the world of sorting, it seems that the fastest algorithms are those that move items so that it moves to the final position as fast as possible. In other words, having an item, you want to move it as close to the final position as possible (I mean, during one step, one move). There is quite old algorithm named Shell sort (Donald Shell, 1959) that worked this way, even though it was not clear why it was more efficient. Actually, it added more phases to an insertion sort. However, the overall time of the solution is faster than the case when only the basic variant of the sort is done. See here https://en.wikipedia.org/wiki/Shellsort

The problem with sorts is that the time for the solution (their time complexity) heavily depends on the initial order of the elements. The time complexity is calculated for the worst case. The best case of simple sorts (that are actually not very efficient otherwise) can be a kind of ideal O(n). Basically, you just check whether the order is correct.

To summarize, the real world does not work with plain "steps", "indexing", and "elements". When you know more about the situation, you can create a faster sort (that possibly does not work for a general case). Or you even cannot use a general sort for in-memory sorting, because you have too much data that would not fit into your memory.

>> i have a pile of things... in my yard.. i want to put them in order.. with the least amount of work in carrying the things from place to place.

Is this literally the problem you are trying to solve or is this an analogy to a general type of computer problem?
If this is actually the job you have to do, how many helpers do you have? Also, wouldn't the weight of the objects be an important factor?
If not literal, then never mind.

1

CASorterAuthor Commented:

it is an analogy...
but what @pepr said is correct i am .. .in fact .. thinking about the cost of actually moving the item.

the actual real world problem i am trying to optimize is this

and ... truly i am looking for validation of the way that we have been doing it... so, i am going to be a little vague in the details so as to not bias any ideas that may come.

i have 2000 items. these items are "owned" by 230 owners. each owner has 1 to10 items.
the owners have a predefined sequence that they MUST be placed in. so the final result is that all the pieces for each owner are together with the owners in the determined sequence

sorting them in place.. in 1 row is not an option, each item takes up space equivalent to about 1/20th of a step.. so to walk from one end of the line to another is about 100 steps... LOTS of time.
so the "work" involved in swapping 2 pieces in line that are 1000 apart would take 50 steps (not to mention the swap would require the person doing the sorting to KNOW where the item they are swapping with is)

i will further describe our current method in a later post, but are there any ideas now?

If you have one extra array of integer of the same length as the array with big objects, you need (say) two bytes (for 2^16 elements or four bytes if you need more) to store the index. Initially, you assign 0 to element on index 0, 1 to element on index one, etc.

In your original algorithm, whenever you tested big[i].key, you will now test big[a[i]].key -- this means that you access the big object indirectly through the auxiliary array a.

Whenever your original algorithm moved the big objects, you will move only content of the a element (index values). For example, if your stored value 2 moves to the a[5], then you know that the big object that sits on its own index 2 will indirectly sit on index 5.

In other words, the sorting algorithm tests the big object, but it moves only the values in the auxiliary array a.

When finished, you will find that a[0] contains (say) 555. Then you know, that the big[555] should be physically moved to big[0]. So, you have to finally move the big objects, bur you only need n (plus one or so) assignments.

0

CASorterAuthor Commented:

i am not explaining my problem well.
and/or
i am not understanding how what you said applies to it i am not sure what you mean by a "big" object. and there is no "index array" these items physically must be picked up and moved.

maybe it isn't a programming problem, but an engineering one.

i was hoping to mathematically represent what would happen if 1 person, had to physically walk to put 2000 pieces in order, grouped in "bundles" of 1-10 items for 230 item owners.

(not taking into consideration that some how, he would need to KNOW *where* to walk to find the swapping item)
and
(as well as not knowing which item to start with either)...

I see. OK. How the order of the result is prescribed? Does "in bundles" mean that each bundle may have a different space requirements? Is each piece of the same size. Should the result be lined in one row.

Can I simplify it so that each owner has a different number of the same-size objects, but they must be moved as one package? How many pieces can (is capable) the person move at once? Should be the order of pieces inside the package be preserved?

Can you describe the real motivation, the real situation?

<<EDIT>>
I just saw pepr's questions. Likely, some of them overlap with mine. If addressing my questions, it would be helpful if you use my numbers so that I clearly know which answer is applied to which question. Thanks.
<<END EDIT>>

I have some question in trying to understand the real world problem which I suspect will result in an algorithm.

1) Do you care whether a piece replaces another piece in its position; or can the piece be place in a separate position that is not part of the original set of 2000 positions? In other words, is it ok after the 2000 pieces are sorted that they do not occupy the spaces of the pieces before they were sorted? Or, do you have 2000 bins for the pieces, and after the sort, all the pieces will be in those bins.

2) Is the primary sort key something that uniquely identifies on of the 230 item owners; say, a name, or a number?

3) Does the primary sort key have an ordering associated with it; say, an ascending number or names arranged in alphabetical order?

4) Is it true that an owner will own at least 1 piece, and at most 10 pieces?

5) Is it true that after the sort, all the pieces owned by a single owner will be contiguous in the array?

6) Within a grouping of pieces owned by a single owner, is there a secondary key for the pieces; or can they just be lumped together in consecutive bins?

7) Can one person (or machine) carry more than one piece at a time, or can he gather, say, 10 pieces that belong to one owner and move them to their final destination?

8) It appears that you are trying to minimize either the 'the amount of "work" involved' or trying to minimize the distance travelled by the single person.

8a) Do all the pieces weigh the same?
I ask this because you ask in the OP that one goal is "to put a value on the amount of "work" involved in doing the sort." And if I take you literally, the work in lifting and moving a heavy piece will be more than doing the same with a lighter piece.

8b) If weight is not a factor, then are you trying to minimize the total distance a person (or machine) travels to complete the sort?

9) Is scalability part of the problem? That is, did you just pick 2000 pieces, 1-10 pieces per bundle, and 240 owners as an example; but, in fact, you need to solve this problem where those numbers are multiplied by 10, 100, ..., 100,000, ... ?

Since you are trying to minimize some physical entity (work/effort, distance, time?) of a single person (or machine) in sorting the pieces in a physical production line, it may actually be more efficient to walk through the array simply to scan the information of each item, so that you know what attributes are associated with each piece in its original bin. Then you take that information, plug it into a computer sort routine and now you know where every piece should land. And now, the remaining problem is to decide whether there is a movement path that minimizes your chosen physical entity.

Starting with the known answer, the immediate naÃ¯ve algorithm (in-optimized solution) is for the person to take the first piece and walk to the destination bin; take the existing piece out putting it by the side of the bin. Then he walks all the way back to the 2nd piece and does the same. ... Then he walks all the way back to the Kth bin, and if he finds a piece by the side of the bin, then he takes that piece to its final destination.

Surely we can do better than that. But first, let's understand the problem better by addressing the questions since I may have a total misunderstanding of the physical problem.

0

CASorterAuthor Commented:

i apologize.. i had started answering the questions but it looks like i never finished or sent the answers...

here are the questions and answers... as well as a bit more detail...

1) Do you care whether a piece replaces another piece in its position; or can the piece be place in a separate position that is not part of the original set of 2000 positions? In other words, is it ok after the 2000 pieces are sorted that they do not occupy the spaces of the pieces before they were sorted? Or, do you have 2000 bins for the pieces, and after the sort, all the pieces will be in those bins.

-- no, in fact the pieces *must* end up in a different location...

2) Is the primary sort key something that uniquely identifies on of the 230 item owners; say, a name, or a number?

-- yes it is a sequence number associated with the item owner

3) Does the primary sort key have an ordering associated with it; say, an ascending number or names arranged in alphabetical order?

-- alphanumeric

4) Is it true that an owner will own at least 1 piece, and at most 10 pieces?

-- no.. at least 1 but could have N pieces.. average 8

5) Is it true that after the sort, all the pieces owned by a single owner will be contiguous in the array?

-- yes, but in no significant order within the owner

6) Within a grouping of pieces owned by a single owner, is there a secondary key for the pieces; or can they just be lumped together in consecutive bins?

-- see question before... also it might be helpful that the pieces could have a final state of all being in 1 bin for 1 owner...

7) Can one person (or machine) carry more than one piece at a time, or can he gather, say, 10 pieces that belong to one owner and move them to their final destination?

-- for the purposes of this exersise, i am trying to determine "work" of an algorithm

8) It appears that you are trying to minimize either the 'the amount of "work" involved' or trying to minimize the distance travelled by the single person.
8a) Do all the pieces weigh the same?

-- for this purpose, yes

I ask this because you ask in the OP that one goal is "to put a value on the amount of "work" involved in doing the sort." And if I take you literally, the work in lifting and moving a heavy piece will be more than doing the same with a lighter piece.

8b) If weight is not a factor, then are you trying to minimize the total distance a person (or machine) travels to complete the sort?

-- i think the answer is yes

9) Is scalability part of the problem? That is, did you just pick 2000 pieces, 1-10 pieces per bundle, and 240 owners as an example; but, in fact, you need to solve this problem where those numbers are multiplied by 10, 100, ..., 100,000, ... ?

-- scalability to a factor of 10's more ... but not 100 or more multiplier

Being more of an practical engineer that mathemagician, I would try to solve that a more statistical way ... with some programming effort.

You'll have to implement all of the sorting algorithms in programming code to test them. In the part of the code that swaps the items (or - in some kind of algorithms - just moves an item from here to there) you'll just calculate the moving effort prior to swapping the items (i.e. the distanceof the items in the containing array) and then sum that up.

Then you create a random set of items and throw it on every sort you have implemented. For a good statistics, report the results and repeat with new item sets until the picture from the combined results gets stable.

And keep in mind that the best algorithm choice depends even on the assumtion if the items in the set are fully chaotic or usually almost sorted ...

there is no "index array" these items physically must be picked up and moved.

the methods using an index array means that you do the sorting in two phases: a sorting phase where you only find out the new positions of each item, and then a second moving phase.

phase 1 (sorting phase):
for each item you find out at to which final position it has to moved. add a label with the slot number to each item. the list of the relation current position -> new position now builds the index array needed for phase 2.

phase 2 (moving phase):
the moving then is done by taking the first item (if this is already correct take the next item which is wrong) and go to its new slot. remove the item sitting here and bring it to its new slot, and so on. if the destination slot is empty, next item could be found by going forward until there is an item sitting at the wrong position. the total costs are the time you need to determine the final positions + the sum of the distances you had to walk when moving the items. the second part is identical for all "sorting algorithms" you have used in phase 1, since you did not any moving in the sorting phase. the moving phase only needs an own optimization if there are costs for ways without carrying an item. of course you have different costs if you could carry more than one item or have different weights for items.

those kind of problems can be solved with graph theory. if you assume that the efforts for creating the index array is negligible (maybe because it is made by a computer and data are already existing), then only phase 2 needs to get optimized.

7) Can one person (or machine) carry more than one piece at a time, or can he gather, say, 10 pieces that belong to one owner and move them to their final destination?

-- for the purposes of this exersise, i am trying to determine "work" of an algorithm

8b) If weight is not a factor, then are you trying to minimize the total distance a person (or machine) travels to complete the sort?

-- i think the answer is yes

From these answers and all the others, I think we consider all items to have the same weight, and "work" can be defined as the total distance traveled by the one person moving the items from the array to the containers, where each container is associated with one of the owners. But one point is not clear to me.

10) Is the "work" the same for these two scenarios:
a) The person carries two items for a distance, d1.
b) The person carries one item for a distance, d1; and then the person carries another item for a distance, d1.

11) For each of the 2000 slots holding an item, do you know in advance which owner is associated with that item?

0

CASorterAuthor Commented:

because there is no guarentee that 2 pieces will be physically near each other, i would say the only method i would consider is B.

The travelling salesman problem (TSP) asks the following question: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?"

There are some nice animations in the above link to further your understanding.

My understanding of your problem is that you have 2000 slots (one item per slot), and 240 destination containers (one for each owner). The analogy to the TSP is that you have 2240 cities.

Your problem has some similarities to the TSP, but some obvious differences, such as you are allowed (or actually, may be required) to visit the 240 container cities.

Since the TSP problem is harder than a sort, by initial intuition is this. If your answer to question (11) is that you do not know which items are in which slots, then the first step is to walk to every slot and scan in the item. Now you can say that a particular owner has all of his items in a specific set of slots. This initial distance is small compared to the rest of the algorithm. The good news is that having done this, there is not necessarily a need to sort these items in their slots.

If the person is allowed to carry only one item at a time, then the first step in your graph is to simply draw directed arrows from each slot to their proper owner's container. These paths must be travelled in the direction given. Unlike TSP, of course, you would never go from one owner's container to a different owner's container.

I asked about carrying more than one item in my questions (7) and (10).

If a person is allowed to carry a multiple items belonging to different owners, then after dropping one or more items in one of the owner's containers, he then can move to another owner's containers and possibly save some steps.

Whether allowed to carry only one item, or more, the problem appears to me to be as hard as TSP.

>> because there is no guarentee that 2 pieces will be physically near each other, i would say the only method i would consider is B.

One consideration is to identify owner's pieces that are clustered in possibly multiple sections. For example, owner1 may have all 12 of his pieces in three clusters at slots:
{ 5, 9, 14, 22}
{455, 477, 488, 506}
{1444, 1481, 1523, 1566}
It may prove optimal for the person to take the steps from 5 to 9 to14 to 22 and then to owner1's container. But the decision to do that has to consider all the clusters.

But let's say that the person can carry 10 items at time. He may pick items in the first 40 slots whos owners' containers are near each other. Once he has his set of items, he can move get to an owner's container, and then move to another close owner's container.

Let's go to another extreme. Have the person walk through the 2000 slots and pick up every item. (As he picks them up in his cart, he puts the items in separate bins, OR NOT.) Then he walks to the nearest owner's container. Then takes all the items belonging to that owner into the owner's container. I don't know whether you consider this last step "work" - it certainly does not involve distance, but it does take up "time".

It seems like having to walk amongst 240 containers is going to require less total distance traveled then walking amongst 2000 slots.

then are you trying to minimize the total distance a person (or machine) travels to complete the sort?

don't think that is true. it is not the total travel distance but the total transport distances. if only one piece can be transported at a time, the total transport distance is constant for a given Distribution granted we know the final position of each piece.

D C B E A

to move all pieces into order A B C D E you may take D at slot 1 and exchange it with E at Slot 4, then move E to slot 5, then A to Slot 1, then C to slot 3 and B to slot 2:

1 -> 4 ->5 -> 1 2 -> 3 -> 2

the transport way has length 3 + 1 + 4 + 1 + 1 = 10 plus 1 empty travel of 1.

the 10 is constant regardless which piece you take first, for example if you start in the middle with C you have

I include the number of owners as part of the problem since you mentioned it; whereas in #a41852386 , the number of owners is not part of that post. If that post is intended to be complete, then what is the need for the number of owners? Apparently we have different views of the problem. Please clarify.

I believe the experts here may have different models of what your real situation is. But, going back to your original statement in the OP:
>> we all can agree that in a pure sorting methodology, the good old quick sort is likely best.

This is not always true, especially if we are given an array of integers. Instead of the on average quicksort complexity being O( n*log(n) ), for integers, the complexity may be even linear, O(n).

Your owners could be mapped to the integers 1..230. As such, each piece can have a number in the range 1..230.

On pass 1, you walk through the 1000 pieces, noting the number; and in an auxiliary counting array of 230 elements (initialized to 0), you increment by one the corresponding auxiliary counting array element.

For example, suppose you have 10 pieces and 3 owners, and the pieces are identified as:
2 1 2 3 1 1 1 3 3 1

The auxiliary counting array consisting of 3 elements will then have values:
5 2 3

On pass 2, we convert the auxiliary counting array to an auxiliary offset array:
1 6 8
(No need to create a separate array since converting to offsets can be done easily in-situ.)

You create another 10 element array to hold the sorted pieces.

On pass 3, move the pieces from the unsorted array to the sorted array as follows:
All the pieces having the number 1 will end up in the final sorted array in positions 1-5
All the pieces having the number 2 will end up in the final sorted array in positions 6-7
All the pieces having the number 3 will end up in the final sorted array in positions 8-10

Clarification:
The first piece in the unsorted array belongs to owner 2, so it is moved to the sorted array in position 6.
Then increment the auxiliary offset array by 1 so that the next time we encounter the other piece 2, it will be moved to the sorted array in position 7.

Complexity:
Pass 1: we walked through n pieces in the unsorted array: O(n)
Pass 2: we convert the auxiliary counting array to an auxiliary offset array: O(k)
where k is the number of owners
Pass 3: move the n pieces in the unsorted array to the sorted array using the auxiliary offset array,
and increment the corresponding elements in the auxiliary offset array by 1 : O(n)

Total Complexity: O(n) + O(k) + O(n) = 2 O(n) = O(n), since k < n (in your problem).

But, going back to your original statement in the OP:

@phoffric, the original statement contained of 3 sentences:

we all can agree that in a pure sorting methodology, the good old quick sort is likely best.
this assumes a 0 cost for actually moving the element from one sorting position to another.
my question is... what sorting algorithm would be best if there was a cost associated with actually moving the item to its new sorting positon.

apparently, the question is about moving costs.

note, on modern computers it is difficult to measure differences between sorting algorithms for counts less than 10k. this changes if the moves will cost a significant amount of time. so - as already elaborated above - if the count of items is moderate and the moves are expensive, we would use a proved sort like quicksort or mergesort (if we need a stable sort where same values keep their original order) to sort only references to the items and then after old and new position are known for each item, we would try to find the optimal (less expensive) way to move all items from their old position to their new position.

Sara

0

CASorterAuthor Commented:

thank you all for your comments and contributions.

i fear some of the difficulty in answering this question goes to my poor attempt at defining the question.

here are the comments that earned points

sara - if the count of items is moderate and the moves are expensive, we would use a proved sort
this is the essence of my problem... the moves are what expensive (actually moving an item and getting it sorted, not just a number in an array)

phoffric - for integers, the complexity may be even linear, O(n).
i think your description is the closest to what we are doing...

i believe the solution i am currently using is linear....
we feed the 1000 random items for 240 wearers to an operator we determine (by using a mechanism to identify the owner to whom the piece belongs) which of 10 buckets the piece would go in, based on the sequence of the owners with the 1st 24 wearers going in bucket 1, the 2nd 24 owners in bucket 2, ......... the 10th 24 owners in bucket 10.
we then feed ( in a second pass on a second set of equipment) the random pieces for the 1st 24 owners to a second operator. these are then sorted into 24 buckets where each of the items for each owner a put into its own bucket.

i have attached a file outlining the flow....
i wanted to get some thinking going on before i showed what we were doing currently.

the real question is .....

is there a faster way

i will split the points between sara and phoffric after yo have a chance to look at the file and respond.

I think I get the gist of your diagram, which I am sure that you could elaborate on to help me understand it better.

It looks like your coding system of the pieces (perhaps, lab coats) actually have several codes rather than just a single owner code. Or, possibly the owner code is broken into contiguous subsections where the first subsection represents a division, or even a building, or a floor; and the next subsection represents a floor, or an aisle, and the last subsection represents a hangar rack or even a cubicle. (Just guessing a wag from your nice chart.)

Now, this breakdown of a code reminds me of a radix sort where we are not really comparing numbers with each other, but using numbers (or other kinds of codes - perhaps, alphanumeric) to begin moving the pieces into a hierarchy of sectional areas to reduce the effort of getting each piece to their final destination. Now, when I posted a conclusion of complexity O(n), I was thinking more of a counting sort where k << n; but after seeing your diagram, I think you have more of a radix sort. Since k << n, I was just estimating the complexity to be O(n), but if we talk more about your picture, the complexity may be O(n+k).

I don't see any way to go faster than O(n+k).
Roughly, I think you indicate that k ~ n/10, so O(n+k) = O(n + n/10) = 1.1 O(n) = O(n).
So, if n doubles, then I expect the effort to sort to double. (But, I would need to get a good discussion of your image to confirm that I am not missing something.)

0

CASorterAuthor Commented:

this is exactly right.... the inital expected order is set by the LXX boxes and colors in the lower right. this is a predetermined order by owner in a specific sequence.
there are 24 owners (wearers) in each group. each owner with 1-30+ pieces
all the pieces that belong to the first 24 owners get put on the first sort position of "Lot" sort, 2nd on 2nd and so on. this is represented by the vertical lines at the top.

then each of those lines is brought down sequentially and fed to the final sort area where the owners items are placed on the appropriate 1st, 2nd, 3rd, etc position for the 1,2,3rd owner in sequence.

every piece is handled 2 times and the sort is complete.
AS LONG as they keep the number of owners less than the # positions in "Lot" sort times # positions in "Final" sort.