Hi experts,
I'll try to explain my problem as simple as I can.
Imagine objects (each of them are complicated STRUCT objects) "A", "B" and "C".
Based on application's logic, object "B" depends on object "A" and object "C" depends on object "B".
I'd like to use QSort and custom sort function to sort these objects based on their dependency.
Of course, the expected result should be {"A"; "B"; "C"}.
And now the problem:
custom sort function used by QSort expects two arguments (compared objects) and should return one of the values {-1; 0; 1}. When comparing "A" and "B", or "B" and "C", I'm absolutely able to compare those two and decide the return value. However, there's no direct dependency between "A" and "C" and therefore I'm unable to decide which one of them is "greater". Of course, there is dependency (transitive dependency), but not direct dependency. Is this doable using QSort (and custom sort function) or any other existing sort algorithms?
So, where/how is the dependency stored? For example, does an object encode its dependency information? If so, you'll need to query deeper into each object in the comparison function.
It'll be very difficult to sort your element with any normal algorithm, if you don't have a way of comparing all the objects.
Consider to use some dynamic programming, and calculate the transitivity.
Note: QSort performance depends allot of your starting data. There are some contexts where QSort is the fastest algorithm (almost ordered lists), but check if your are in this situation.
0
With monday.comâ€™s project management tool, you can see what everyone on your team is working in a single glance. Its intuitive dashboards are customizable, so you can create systems that work for you.
peetm: yes, this is the worst scenario - not to use any of the "known" algorithms and to do it by myself by going deeper into the dependencies.
patrickriva: yes, I have up to 500.000 objects that need to be sorted.
oleber: this is exactly my problem - that I can compare only objects with a direct dependency. Calculating the transitivity is my worst scenario. Therefore wanted to ask if there's any better solution.
- how are the relations between items stored? (do you have parents/children collections on each object or are they in a table that uses id/guid as references?
I'm using a dynamically created array of STRUCT objects. I was unable to achieve such a performance using other concept.
no IDs, no GUIDs, I have another array of pointers, each pointer points to a single object of the "main" array (some kind of table of contents:-). And my goal is to sort the array of pointers, not the array of structs itself.
no loop dependencies, of course. this would cause an infinite loop...
ok, each element (struct object) has a parameter - pointer to an array of pointers, each of them referencing to predecessor objects.
Take my example from the top - B depends on A and C depends on B.
object B has a parameter - an array storing only 1 pointer - referencing object A.
similary, object C has a parameter - an array storing only 1 pointer - referencing object B.
of course, there are more complicated dependencies, but for illustration this should be enough...
oleber: of course, each object can depend on any number of predecessors.
and my goal is to sort the objects based on their dependency.
imagine my objects as Excel cells with formulas, and the predecessors as other cells used in the formula. I need to "sort" the cells into an order, so that I can go through the sorted list and calculate each cell based on its formula and arguments (predecessors). Due to the right order I'll have to calculate each cell only once.
in your case, it's a bad idea to do a formal sorting, from the begging.
A simple way to do this is the count the predecessors.
you have a list that will have your sorted list.
1 You get one cell without predecessors, and add it in the list.
2 You go to all the child's and you decrease in one the number of parents.
3 you go again to the step 1 until find no more cell
Your problem can be mapped by a directed graph with no cycles. There is not really a fast way to get the nodes of such a graph sorted but if the complexity really is comparable with the nested algorithms of Excel cells, the graph may have many vertices but hardly is deep. If so, you could write a compare function which makes a deep search on the predecessor list of the left operand looking for the right operand. If it finds the right operand in any of the predecessor's lists it returns 1. If not it returns -1.
If the dependency lists contain some kind of mathematical 'operands', a deep search will have only a relatively small number of dependicies on avarage and the lookup search should terminate quickly. Assuming the avarage amount of operands after full resolution is less than 10, you have at worst case (bubble sort with a reverse sorted array) 10 * 500k comparisions, which should last far less than a minute. Using a fast sort algorithm, I would guess it is a few seconds only.
You don't really need to sort the dependency graph. To Calculate object B you just need to traverse it's dependency graph and calculate it's children. So, I would consider:
1. Create a dummy header object that has as it's dependency list All objects that are not dependent on other objects (Best to build your graph so this is already true.)
2. Do a depth first traversal from the header and calculate everything.
3. Assuming that multiple objects can be dependent on the same object, you can store the calculated value at each level so that you only need to perform the calculation once.
3a. When an object updates, everything "above" it on the heirarchy needs to be recalculated.
--
JimFive
Do you know that every two objects are comparable? It may be possible that the dependencies do not generate a well-ordering. A partial ordering and a topological sort may be all you can achieve.
I will take the idea of paths in directed graphs, since it maps perfectly your problem.
If the assumption that the digraph does not have cycles is false, you're kinda doomed: your problem is exactly finding the Hamiltonian path, if there is one.
This problem is described in http://en.wikipedia.org/wiki/Hamiltonian_path_problem and it is, as complexity, NP-hard.
Thanks to all of you. Your solutions seem to work perfectly, however I decided to use another approach. I created second array of pointers pointing to my objects (something like a topic of content). Each object was assigned a sequence_number (0,1,....,n-1). Then I simply go through my "index" array (pointers to objects) and compare the sequence_number of the current object and all its "children". Once child's sequence_number is higher than the current object's sequence_number, i swap their position and sequence_numbers, as well. Using this method I get a sorted list of objects, where all "children" of every object are "in front" of the object itself...
Thanks a lot again to all of you...
0
Featured Post
The revolutionary project management tool is here! Plan visually with a single glance and make sure your projects get done.