Link to home
Start Free TrialLog in
Avatar of NovaDenizen
NovaDenizen

asked on

Looking for information about a kind of partial ordering

This is a very Computer-Science-type question.  I'm looking for information about algorithms, data structures, books, and implementation techniques to help me efficiently create what I call a minimal partial ordering graph, explained below.

I have a bunch of objects (say, 50 thousand) and a partial ordering function.  I want to find the minimal partial ordering graph for all of these objects.  I want to get the objects organized in terms of the partial ordering function.

Partial ordering function:  f(a_i,a_j) that returns '+' if a_i exceeds a_j, '-' if a_j exceeds a_i, and '?' if there is no sure ordering between a_i and a_j.  This function has the transitive and reflexive properties you would expect from a comparison function (f(a,a) == '?', f(a,b)=='+' and f(b,c)=='+' implies f(a,c)=='+', f(a,b)=='?' iff f(b,a)=='?', f(a,b)=='+' iff f(b,a)=='-')

Partial ordering graph:  A directed acyclic graph that completely reflects the partial ordering function.  You can brute-force construct a valid one in O(n^2) space and time by performing the ordering function on all pairs of objects, adding an edge (a,b) whenever f(a,b)=='+'.  

Minimal partial ordering graph:  A partial ordering graph with all redundant edges removed.  For instance, if the edges (a,b), (b,c), and (a,c) are in a partial ordering graph, the (a,c) edge is redundant because you can deduce it from the transitive property.  

Creating a minimal PO graph is obviously brute-forceable in O(n^3) time and O(n^2) space, but I suspect it can be done faster than O(n^2) and with space proportional to the number of edges in the minimal graph.  I would also like to efficiently remove nodes from and add nodes to a minimal graph.

Is this unexplored territory, or has someone already solved this problem?  Should I be thinking about this problem in a different way?
Avatar of kouroshparsa
kouroshparsa
Flag of Canada image

I'm not sure what you want in what programming language. What type of objects? If you know the basics, and you're looking for a solution...Well
I have had similar projects with functions...
Let's say you're comparing f(a,b) and f(d,e)
if you read it as a string from index=2 to just before ")" you'll get:   a,b     and     d,e
Now, if you remove commas:  ab       and     de
Then you can compare them as strings...

The similar thing goes with f(a_i,a_j)... by removing "_"
I personally analyse or derive mathematics or physics functions using strings...
Avatar of Claudix
Claudix

What you are talking about is actually topological sort
which can be done in time proportional to O(V+E)
which could mean n^2 if you have the number
of edges(E) close to n^2

here is a quick link :
http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/GraphAlgor/topoSort.htm

Now about the graph.You can get the edges you don't need off using
depth first search and the edge classification it produces like
this following book explains

T. Cormen, C. Leiserson, R. Rivest, and C. Stein, Introduction to Algorithms

the basic summary is that you remove the edges that span from a node ,from the
tree you build there, to its descendent
Avatar of NovaDenizen

ASKER

kouroshparsa:  I don't think you get the question very well.

Claudix:  You're close, but I don't think I made myself quite clear enough.  My problem is closely related to topological sorts, except coming from the opposite direction.  

I have a bunch of nodes, and a comparison function that tells me whether a node is greater than, less than, or of indeterminate comparison with another node.  I want to use the information I get from the comparison function to efficiently build a minimal directed acyclic graph, "efficiently" being the most important term.



ASKER CERTIFIED SOLUTION
Avatar of Claudix
Claudix

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Actually, f is quite cheap.   Not that that matters in terms of O() notation.  

I don't think your algorithm will keep a graph minimal.   I think I have found an N^2 algorithm inspired by yours, though.  That DFS can be pretty useful.

Each node must have a "dfsID" field, which will contain the index of the originating node from the last time a DFS search traversed this node.

This is an insertion algorithm for a new node going into a minimal graph.  New node is node n+1

// This loop is O(n+edges) because, in total, the DFS's will iterate over at most all existing nodes and edges.  Marking the dfsIDs keeps the DFS from doing redundant work
for i = 1 to n {
    if node i dfsID == n+1 then next i
    if f(n+1, i) == '+' {
        if node i dfsID != n+1 {
            DFS in the '-' direction from node i.  If any node j traversed already has its dfsID==n+1, then remove edge (j, n+1) if it exists and don't continue the DFS down this node.  Set dfsID on each traversed node to n+1.
            add edge (i, n+1)
        }
    } else if f(n+1,i) == '-' {
        if node i dfsID != n+1 {
              DFS in the '+' direction from node i.  If any node j traversed already has its dfsID==n+1, then remove edge (n+1, j) if it exists and don't continue the DFS down this node.  Set dfsID on each traversed node to n+1.
               add edge(n+1, i)
         }
    }
}

// since none of the nodes < n+1 should be pointing directly at nodes > n+1, get rid of those edges.  This loop is <= O(nodes + edges)
for all i such that edge (i, n+1) exists {
        remove all edges (i, j) where node j dfsID == (n+1)
}

The removal algorithm is tricky, and can take longer than O(n+edges), something like O((# of predecessors i such that edge(i, r) exists)*(avg. # nodes of indeterminate comparison to r and greater than some predecessors)), not counting the set operations.  To remove node r:
Create set A of all nodes such that edge (a, r) exists.
Create set B of all nodes such that edge (r, b) exists.
DFS from all nodes in B, in the '+' direction, and mark these nodes 'out of bounds'
Remove all edges (i,r) and (r,i)
For all a in set A {
    let set T = B
    dfs from a in the '+' direction.  When a traversed node j is in set T, remove j from set T.  do not traverse 'out of bounds' nodes.
    for all nodes t left in set T, add edge (a, t)
}

I've got a couple other ideas that might bring the insertion below O(n) or the wholesale creation of the graph below O(n^2), but I'll have to actually implement them to see if they work OK.  Thanks for your help.