Improve company productivity with a Business Account.Sign Up

  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 292
  • Last Modified:

Help with Stored Procedure

I have a stored procedure that contains two INSERT statements. When I perform the inserts manually, everything works fine, but when I put them in my stored procedure, they do not work. Basically, there aren't supposed to be any duplicates. The first insert inserts an edge into my graph, then the next insert is supposed to transitively close the graph, excluding the edge I've already inserted. These queries work fine by themeselves but inside the procedure, the initial edge does not get excluded.

Here is the Procedure:


ALTER       PROCEDURE TransClose
 @uid INTEGER,
 @is_cross_poll INTEGER

begin tran
-- create new edge first
INSERT INTO tblGHGraph (cross_poll, fk_uid, init, term) VALUES (@is_cross_poll, @uid, @j, @k)
commit tran

begin tran
INSERT INTO tblGHGraph (cross_poll, fk_uid, init, term)
SELECT DISTINCT @is_cross_poll AS cross_poll, @uid AS fk_uid, G2.init, G3.term
  FROM tblGHGraph AS G1, tblGHGraph AS G2, tblGHGraph AS G3
WHERE -- define G1 as the new edge
  (G1.init IN (@j,@k) )--AND G1.term IN (4,2))
 --    AND (G2.term = G1.init AND G1.term = G3.init)
-- link G2 and G3 together thru G1
     AND G2.term IN (@j,@k)
     AND G3.init IN (@j,@k)
-- do not recreate any existing edges
         FROM tblGHGraph AS G4
         WHERE G4.init = G2.init
         AND G4.term = G3.term)
commit tran


1 Solution
dignifiedAuthor Commented:
Failed to mention, this is also in MS SQL. Should have posted in the MS SQL section....
Wow, transitive closure in a database, who'd of thunk it.

This will be horribly inefficient, as you add each vertex, you'll wind up with a cross join of 3 views of the same table only to eliminate a big chunk of rows via the "distinct" clause. You really should take a look at Warshall's algorithm (aka Floyd Warshall since Floyd is a refinement on Warshall). Plenty of this on the web. To be honest, I gave up looking at your database code once I figured out what you were attempting.

Warshall's algorithm is quite easy to implement, run's in N^3 (for the full transitive closure, not just 1 edge), and run's in-memory with a simple triple-nested loop using an adjacency matrix. If you must have the results in a database for some reason, use Warshall's, then dump the results to the database.

Warshall's is not the ultimate algorithm, but it is easy to understand and program. I guess you can run Warshall's on up to a couple of hundred nodes in under a second, I can't guess just slow your database solution would be (minutes? hours?)

Some simple Warshall's refs - source included. (There is many of these online). Sedgewick and similar books should explain this topic well too.
If you have to get the database to "do" the computation, build your edge list, dumping it into a table, then write a C program, etc. to run on the database server, slurp up the edge list, then dump the output from Warshall into a table. BTW, I don't intend to sound condescending, you would have a pretty smart guy to figure out how to do this using a database, it's just not the right way in this case.
Improve Your Query Performance Tuning

In this FREE six-day email course, you'll learn from Janis Griffin, Database Performance Evangelist. She'll teach 12 steps that you can use to optimize your queries as much as possible and see measurable results in your work. Get started today!


I have a couple of comments.

First, you seem to have one edge hard-coded in the procedure, which doesn't seem right
    WHERE -- define G1 as the new edge
       (G1.init IN (@j,@k) )--AND G1.term IN (4,2)) <----

Second, tho whole idea of transactions is to commit the whole procedure as one block.
The two insert statements are atomical by themselves and don't need to be enclosed by a transaction

I can't say for sure what's wrong with your code, but for starters I would take out all transactions.

You say you tested the insert queries by themselves. Are you sure there isn't some unique index
or foreign key constraint that's being violated ?

dignifiedAuthor Commented:
rvooijs - That hard coded edge is commented out....

Maybe I'll just have to have two procedures because I don't know whats up at the moment. As far as the complexity of the procedure. I don't plan on having to transitively close a huge dataset. The complexity for my purposes would be M^3 where M is a subset of the nodes N. M will prolly be no larger than 20. It is an interesting question to what would be the most efficient way to acheive the transitive closure. I'd like to see something more scientific to indicate if a SQL implementation is significantly slower than  taking the time to pipe the output to a C program and back. In either case I think what I have will be fine. If it isn't then I'll look into a platform optimized alternative.
dignifiedAuthor Commented:
Seems I found the problem... .this line....


M$ did this for me, my initial nodes have NULL values... thanks M$!
PAQed with points refunded (250)

Community Support Moderator
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Free Tool: Subnet Calculator

The subnet calculator helps you design networks by taking an IP address and network mask and returning information such as network, broadcast address, and host range.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now