We have an application that allows project managers to list any other project that they are

dependent on. Each time they enter a new dependency a new record is created in our database.

Below is a table that shows how these records are stored:

Predecessor (delivering) Project Successor (Dependent) Project

--------------------------

------ --------------------------------

Project 47 Project 69

Project 47 Project 78

Project 47 Project 355

Project 47 Project 408

Project 355 Project 229

Project 355 Project 78

Project 355 Project 408

Project 408 Project 355

In the table above, Project 47 has 4 dependent projects, project 355 has 3 dependent projects and project 408 has 1 depedent project.

What I want to do is create an algorithm that takes a predecessor project, traverses through a table (like the one above) a makes a makes a tree type view of all the Successor projects..

For example, if I choose to start at Project 47, I would find all of project 47's Successors, which are 69,78,355,408, and then the successors of those projects, etc. In the end, I would have a created a tree like the one pictures in this link:http://www.alexalapatt.com/myspace/Drawing1.jpg

I've been trying to do this in arrays but it is getting ugly. Does anyone know an algorithm that I can use to store this data in this fashion? I would do this by hand, but there are thousands of records like this to go through!

Thanks!!

Alex

Seems not to be a tree.

Trees have n nodes and n-1 edges.

Actual projects have tasks that are dependent on more than just one predecessor and each task can be predecessor of more than one dependent task. The same applies to interdependent projects. So, there are n nodes and much more than n-1 edges. That are general directed graphs. Can be a tree, of course, but this would be a special case.

General graphs are defined by an adjacent matrix (of dependencies) or by adjacent lists. That approach allow multi-dependency to be represented. If the dependencies have the same value, on put 0 or 1 in each matrix cell.

Example:

Suppose a n by n matrix with columns 1, 2,...,n and rows 1,2,...,n.

It is an array P[n][n].

If n=5 and row 3 is 1,0,0,1,1 then project 3 is sucessor of projects 1, 4 and 5.

If column 2 is 0,0,1,0,1 then projects 2 is predecessor of projects 3 and 5.

Disadvantage of matrix is the large space it wastes. If you have 1000 projects, will spend 1 million cells to store data. And, worst, only half of the matrix is actually needed, because the non-redundant data is a triangular matrix (a right triangle with sides = n and the hypotenuse is the diagonal of such matrix). As you are storing just 0 or 1, you need 1MB by chosing the cell to be a char (or byte) or bool. Not so much.

To use less memory, you can chose the adjacent list, a little bit more difficult to maintain.

Example of adjacent list (for n=5) is:

1: 2,4,0

2: 3,4,5,0

3: 0

4: 3,5,0

5: 3,4,0

That means: Project 1 is predecessor of projects 2 and 4; Project 2 is predecessor of projects 3, 4 and 5. The zero is the list row terminator.

To query which projects Project 2 depends on, we must search for 2 in each row.

Actually the above data is stored this way:

Dependents = {2,4,0,3,4,5,0,0,3,5,0,3,4

Predecessors = {1, 4, 9, 10, 12}

It spends 14 + 5 = 19 cells. A matriz would spend 25 cells to store the same information. Is that economy a real economy? Must examine your case to decide.

By using adjacent list, we have the below algorithm:

print "Project 2 depends on"

for i=1,n

j=Predecessors[i]

stop=false

while Dependents[j] not = 0 and not stop

if P[j]=2 print "Project ", i

stop=true

end if

j=j+1

end while

end for

The above will print the list of direct predecessors of project 2. By using it recursively for each found predecessor, we have the complete dependency tree.

If I am understanding correctly the question, this solves the core algorithm. The graphical output is another question...

Hope it helps,

Jose