Solved

Help with Stored Procedure

Posted on 2004-09-13
8
230 Views
Last Modified: 2009-07-29
I have a stored procedure that contains two INSERT statements. When I perform the inserts manually, everything works fine, but when I put them in my stored procedure, they do not work. Basically, there aren't supposed to be any duplicates. The first insert inserts an edge into my graph, then the next insert is supposed to transitively close the graph, excluding the edge I've already inserted. These queries work fine by themeselves but inside the procedure, the initial edge does not get excluded.

Here is the Procedure:

SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS OFF
GO






ALTER       PROCEDURE TransClose
 @j INTEGER,
 @k INTEGER,
 @uid INTEGER,
 @is_cross_poll INTEGER
AS
BEGIN

begin tran
-- create new edge first
INSERT INTO tblGHGraph (cross_poll, fk_uid, init, term) VALUES (@is_cross_poll, @uid, @j, @k)
commit tran

begin tran
INSERT INTO tblGHGraph (cross_poll, fk_uid, init, term)
SELECT DISTINCT @is_cross_poll AS cross_poll, @uid AS fk_uid, G2.init, G3.term
  FROM tblGHGraph AS G1, tblGHGraph AS G2, tblGHGraph AS G3
WHERE -- define G1 as the new edge
  (G1.init IN (@j,@k) )--AND G1.term IN (4,2))
 --    AND (G2.term = G1.init AND G1.term = G3.init)
-- link G2 and G3 together thru G1
     AND G2.term IN (@j,@k)
     AND G3.init IN (@j,@k)
-- do not recreate any existing edges
     AND NOT EXISTS (SELECT *
         FROM tblGHGraph AS G4
         WHERE G4.init = G2.init
         AND G4.term = G3.term)
commit tran

END

GO
SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS ON
GO
0
Comment
Question by:dignified
8 Comments
 

Author Comment

by:dignified
Comment Utility
Failed to mention, this is also in MS SQL. Should have posted in the MS SQL section....
0
 
LVL 6

Expert Comment

by:gwalkeriq
Comment Utility
Wow, transitive closure in a database, who'd of thunk it.

This will be horribly inefficient, as you add each vertex, you'll wind up with a cross join of 3 views of the same table only to eliminate a big chunk of rows via the "distinct" clause. You really should take a look at Warshall's algorithm (aka Floyd Warshall since Floyd is a refinement on Warshall). Plenty of this on the web. To be honest, I gave up looking at your database code once I figured out what you were attempting.

Warshall's algorithm is quite easy to implement, run's in N^3 (for the full transitive closure, not just 1 edge), and run's in-memory with a simple triple-nested loop using an adjacency matrix. If you must have the results in a database for some reason, use Warshall's, then dump the results to the database.

Warshall's is not the ultimate algorithm, but it is easy to understand and program. I guess you can run Warshall's on up to a couple of hundred nodes in under a second, I can't guess just slow your database solution would be (minutes? hours?)

Some simple Warshall's refs - source included. (There is many of these online). Sedgewick and similar books should explain this topic well too.

http://www.cs.rochester.edu/u/leblanc/csc173/graphs/tc.html
http://datastructures.itgo.com/graphs/transclosure.htm
0
 
LVL 6

Expert Comment

by:gwalkeriq
Comment Utility
If you have to get the database to "do" the computation, build your edge list, dumping it into a table, then write a C program, etc. to run on the database server, slurp up the edge list, then dump the output from Warshall into a table. BTW, I don't intend to sound condescending, you would have a pretty smart guy to figure out how to do this using a database, it's just not the right way in this case.
0
Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

 
LVL 6

Expert Comment

by:rvooijs
Comment Utility
Hi,

I have a couple of comments.

First, you seem to have one edge hard-coded in the procedure, which doesn't seem right
    WHERE -- define G1 as the new edge
       (G1.init IN (@j,@k) )--AND G1.term IN (4,2)) <----

Second, tho whole idea of transactions is to commit the whole procedure as one block.
The two insert statements are atomical by themselves and don't need to be enclosed by a transaction

I can't say for sure what's wrong with your code, but for starters I would take out all transactions.

You say you tested the insert queries by themselves. Are you sure there isn't some unique index
or foreign key constraint that's being violated ?

Robert
0
 

Author Comment

by:dignified
Comment Utility
rvooijs - That hard coded edge is commented out....

Maybe I'll just have to have two procedures because I don't know whats up at the moment. As far as the complexity of the procedure. I don't plan on having to transitively close a huge dataset. The complexity for my purposes would be M^3 where M is a subset of the nodes N. M will prolly be no larger than 20. It is an interesting question to what would be the most efficient way to acheive the transitive closure. I'd like to see something more scientific to indicate if a SQL implementation is significantly slower than  taking the time to pipe the output to a C program and back. In either case I think what I have will be fine. If it isn't then I'll look into a platform optimized alternative.
0
 

Author Comment

by:dignified
Comment Utility
Seems I found the problem... .this line....

SET ANSI_NULLS OFF


M$ did this for me, my initial nodes have NULL values... thanks M$!
0
 

Accepted Solution

by:
modulo earned 0 total points
Comment Utility
PAQed with points refunded (250)

modulo
Community Support Moderator
0

Featured Post

Why You Should Analyze Threat Actor TTPs

After years of analyzing threat actor behavior, it’s become clear that at any given time there are specific tactics, techniques, and procedures (TTPs) that are particularly prevalent. By analyzing and understanding these TTPs, you can dramatically enhance your security program.

Join & Write a Comment

In today’s complex data management environments, it is not unusual for UNIX servers to be dedicated to a particular department, purpose, or database.  As a result, a SAS® data analyst often works with multiple servers, each with its own data storage…
APEX (Application Express) is used to develop a web application from Oracle. SQL Workshop is one of the tools that comes with Oracle APEX to query or modify the database objects or to make any changes to the structure.
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

771 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now