Solved

Stack Table Problem With Join & random Component - Tough One for an SQL champion.

Posted on 2004-10-23
300 Views
Last Modified: 2008-02-01
This is quite a complex problem.
Basically I have a table called categories that looks like this

category_id INT (pkey)
parent_id INT
category_name VARCHAR (50)
cateogry_description VARCHAR (500)

As you might imagine, categoris can have subcategories etc.. and the parent_id field reflects the subcategories' parent category id. Pretty simple.
Top level categories have a parent_id of 0 (the default value of the field when a new record is created)

But wait, there's more.
I have another table called products:

product_id INT (pkey)
category_id INT (Fkey category_id)
name VARCHAR(50)
description VARCHAR(8000)
stock_status_id (INT) (fkey stock_status_id)

in this way a product can be a child of a subcategory, even though it's in another table...
The application rules only allow a category to contain EITHER products or other subcategories but not both.

I have the need to build a stored procedure that when passed in a category ID will return N number of products, randomly, from the products table, provided that the category_id's of the products are descendants of the passed in category_id.
For example Given the category structure:

Bikes
      Racing Bikes
                 10 Speeds
                 15 Speeds
      Mountain Bikes
                 Brand A
                 Brand B
Skateboards
      Long Boards


For 10 Speeds, there might be say 50 products in the products table. If I passed in the ID for Racing Bikes, I would want to get N number of products, randomly selected from
the products under both 10 speeds and 15 speeds.

Similarly, if given simply the ID for Bikes, all products 10 speeds, 15 speeds, brand A and Brand B would be eligible as a pool to randomly select N from.
It's further complicated by the need to only select those items which are in stock (stock_status_id=4).

I hope I have explained this in enough detail, It's a dire situation and I sure could use the help! I have alreadt trolled the q's here and checekd out the MSDN article that deals with stack tables... it didn't seem to make sense in this scenario.

Thanks In advance,
jaemo
0
Question by:jaemo
    19 Comments
     
    LVL 7

    Expert Comment

    by:ala_frosty
    what's the range on 'N' 10, 100, 1000, 10000, 1000000000000?

    Is N selected randomly or built into the query?
    0
     
    LVL 8

    Expert Comment

    by:sigmacon
    CAUTION: The following method may work well for a fairly small number of products found in all categories. Otherwise, this approach gets too slow. Please test with real data.

    First, you need an easy way to get all children categories back. Since this is SQL Server, we don't have the luxury of simple statements for this as provided in Oracle.  This user-defined function has been modified for your special case - it uses a push-pop stack approach for going thru all category level. Although recursive would be possible too, there is a limit of 32 recursion levels in SQL Server:

    create function dbo.getAllChildren ( @category int )
        returns @result table ( category_id int )
    as
    begin

        declare
            @nLevel integer,
            @current int

        declare @Stack table ( nLevel int, category_id int )
       
        if @category is null
            return
        else begin
            set @nLevel = 1
            set @current = @category
        end
       
        insert into @Stack values ( @nLevel, @current )

        while ( @nLevel > 0 ) begin

            if exists ( select * from @Stack where nLevel = @nLevel ) begin

                -- Pop One Record
                select top 1 @current = category_id from @Stack where @nLevel = nLevel
                delete from @Stack where category_id = @current and @nLevel = nLevel

                -- Insert the record into hierarchy Table
                insert into @result values ( @current )
       
                -- Insert Children with no more Children Directly in result Table
                insert into @result
                select category_id
                from categories c
                where
                    parent_id = @current and
                    ( select count(*) from categories d where d.parent_id = c.category_id ) = 0

                --Insert children with children in stack
                insert into @Stack
                select
                    @nLevel,
                    category_id
                from categories c
                where
                    parent_id = @current and
                    ( select count(*) from categories d where d.parent_id = c.category_id )> 1

                -- if any children where inserted, increase @nLevel
                if @@rowcount > 0
                    set @nLevel = @nLevel + 1

            end else begin
                set @nLevel = @nLevel - 1
            end -- if exists

        end -- while

    return

    end
    go


    Once we have this function, randomly deleting products from a list of resulting products in all concerned categories leaves you with a random list of products:


    create procedure GetRandomProducts(
        @category_id int,
        @max_products int
    )
    as

    select distinct p.product_id
    into #temp
    from
        dbo.getAllChildren(@category_id) d
            inner join products p
            on p.category_id = d.category_id
    where d.category_id <> @category_id


    -- deleting randomly leaves a number of random of products left in the table!
    declare
        @maxId int, -- max product_id in temp table, used to calculate random id
        @affected int -- continue only if something was actually deleted (@affected > 0)

    select @maxId = max(product_id) from #temp

    while ( select count(*) from #temp ) > @max_products begin

        -- try deleting until something actually got deleted
        set @affected = 0
        while @affected = 0 begin
            -- delete a random product from #temp ( may not delete anything )
            delete #temp
            where product_id = round((rand() * @maxId), 0) + 1
            set @affected = @@rowcount
        end
       
        select @maxId = max(product_id) from #temp
    end

    select p.*
    from
        #temp t
            inner join products p
            on p.product_id = t.product_id

    drop table #temp



    0
     
    LVL 12

    Accepted Solution

    by:
    Here is an iterative method that should perform better than a recursive solution and it does not have any limit on the depth of the parent/child relationships.

    create procedure p_get_products (@id int, @rows int)
    as
    begin

    declare @level int
    create table #results (id int identity(1,1), product_id int)
    set @level = 1

    -- Insert first root level catetegory into #tree table
    select category_id id, @level level
    into #tree  
    from categories where category_id = @id

    -- While there are children of prior rows, add them to #tree.  Notice first time will be children of selected category.
    -- Subsequent inserts will be children of children and so on, increaseing levels each time
    While @@rowcount > 0
    begin
          set @level = @level + 1
          insert #tree
                select category_id, @level
                from categories c
                inner join #tree t on t.id = c.parent_id
                where t.level = @level - 1
    end

    -- Insert list of product_ids into #results ordered by newid() to get random ordering.
    -- Top 100000 here limits results to some reasonable limit. Change to suit max @rows

    insert #results (product_id)
          select top 100000 product_id
          from product p
          inner join #tree t on t.id = p.category_id
          where stock_status_id = 4
          order by newid()

    -- Return products that exist in category or subcategories of our selected category.
    -- Identity on #results allows us to select where r.id <= @rows.
    select c.category_name, p.name
    from product p inner join categories c on p.category_id = c.category_id
    inner join #results r on r.product_id = p.product_id
    where r.id <= @rows

    end
    0
     
    LVL 8

    Expert Comment

    by:sigmacon
    Just a note, in case it was missed: The method I posted to get all the child categories is NOT recursive and does NOT have a restiction on the categories. It's just WAY slower and more cumbersome then kselvia's. I just referred to a recursive method, because it is more elegant, just not practical in SQL.

    Otherwise, kselvia's solution is way more elegant then mine, and I also missed the restriction for stock_status_id=4 which could be simply added.

    I have to remember all those neat tricks kselvia used in his approach to those common problems in more complex queries! Thanks for the education.

    select distinct p.product_id
    ...
    where
       d.category_id <> @category_id
       and stock_status_id=4
    0
     
    LVL 12

    Expert Comment

    by:kselvia
    I should have read closer because I did think your solution was recursive since it is a common way to do heirarchies.  I have never addressed a parent child type question before so I thought I would try my hand at it.  Points to sigma or split them because I don't need any points, I just like the challenge. Also I noticed I called the products table product.  BTW: If you wished to join parent_id back to #tree.id in the final select you could return the level of the product allowing you to display a tree like output with replicate('--',level), name or however you do that in your front end.
    0
     
    LVL 8

    Expert Comment

    by:sigmacon
    kselvia

    This parent-child thing drives most people nuts. I need that a lot and over the years have come across MANY solutions of doing that but NONE as elegant, short, fast and inexpensive as yours.
    0
     
    LVL 12

    Expert Comment

    by:kselvia
    That's rather suprising. I would question if it actually works then :)  I just happened to have done something like this last week to represent directories and files and that basically what I used.
    0
     

    Author Comment

    by:jaemo
    kselvia, I am very excited to try your solution but am getting hung up on the following error when I try to execute the SP:


    (1 row(s) affected)

    Server: Msg 8101, Level 16, State 1, Procedure p_get_products, Line 19
    An explicit value for the identity column in table '#tree' can only be specified when a column list is used and IDENTITY_INSERT is ON.

    Haven't come accross this error before so I am not too sure what it means... any ideas?

    thanks!
    -jaemo
    0
     
    LVL 12

    Expert Comment

    by:kselvia
    Category_id is an identity in your table so the SELECT INTO attempts to create #tree.id as an identity also.

    Create #tree then insert into it;

    create table #tree (id int, level int)

    insert #tree
      select category_id, @level
      from categories where category_id = @id

    I shouldn't have used select into anyway.
    0
     

    Author Comment

    by:jaemo
    kselvia,

    I've modified the proc as you've suggested and it executes without error, but it does not seem to return any results, regardless of any combination i seem to pass in.
    To clarify: @rows is the number of results to return correct?
    Anything I can do to debug this so I can see what is going on behind the scenes?

    sigmacon, i have tried your approach as well but as you mentioned it executes kind of...painfully slow...

    Thanks all for your help so far.. I see light at the end of this tunnel!
    -jaemo

    PS: ala_frosty to answer your initial query, N will be a max of 6, and a min of 1... usually 1.
    0
     
    LVL 12

    Expert Comment

    by:kselvia
    Not sure. I made up some data that looks like this:

    insert categories select 1, 0, 'Bikes', 'Things with 2 wheels'
    insert categories select 2, 1, 'Racing Bikes', 'Sport Bikes'
    insert categories select 3, 2, '10 Speeds', 'Multi-speed Racing Bikes'
    insert categories select 4, 2, '15 Speeds', 'Multi-speed Racing Bikes'
    insert categories select 5, 2, 'Mountain Bikes', 'Trail Bikes'
    insert categories select 6, 5, 'Brand A', 'Brand A Trail Bikes'
    insert categories select 7, 5, 'Brand B', 'Brand B Trail Bikes'
    insert categories select 8, 0, 'Skateboards', 'Skate Boards'
    insert categories select 9, 8, 'Longboards', 'Long Skate Boards'

    insert product (product_id, category_id, name, description, stock_status_id) select 1, 3, '10 Speed blue bike', '10 Speed blue bike' , 4
    insert product (product_id, category_id, name, description, stock_status_id) select 2, 4, '15 Speed blue bike', '15 Speed blue bike' , 4
    insert product (product_id, category_id, name, description, stock_status_id) select 3, 3, '10 Speed red bike', '10 Speed red bike' , 4
    insert product (product_id, category_id, name, description, stock_status_id) select 4, 4, '15 Speed red bike', '15 Speed red bike' , 4
    insert product (product_id, category_id, name, description, stock_status_id) select 5, 6, '1999 Brand A Bike', '1999 Brand A Bike' , 4
    insert product (product_id, category_id, name, description, stock_status_id) select 6, 6, '2000 Brand A Bike', '2000 Brand A Bike' , 1
    insert product (product_id, category_id, name, description, stock_status_id) select 7, 7, '1999 Brand B Bike', '1999 Brand B Bike' , 4
    insert product (product_id, category_id, name, description, stock_status_id) select 8, 9, 'Cheap Longboard', 'Cheap Longboard' , 4
    insert product (product_id, category_id, name, description, stock_status_id) select 8, 9, 'Expensive Longboard', 'Expensive Longboard' , 4

    and

    p_get_products 2, 2

    generates this:

    category_name  name              
    -------------- -------------------
    15 Speeds      15 Speed blue bike
    15 Speeds      15 Speed red bike

    What is the difference between your data and my test data? Did I set it up correctly?   Yes rows is the number of rows to return.

    0
     

    Author Comment

    by:jaemo
    kselvia,

    oops, that was my bad actually.. The stock_status id was supposed to be 1 and not 4...
    It seems to work very well now. I am going to run some final tests on this but I do believe you have solved this for me... Thanks!!!

    -james
    0
     

    Author Comment

    by:jaemo
    kselvia, this is working fine in SQL server (query analyzer) but when I try to use it on my ASP page, i get an "Operation is not allowed when the object is closed." each time I call it.
    This is very odd as it is the ONLY stored proc i am calling that does this.

    Are there any restrctions that you can think of that would prevent me from using it in this environment?
    I have MANY other stored procs that I use in the ASP and they all seem to work without any grief...

    thanks!
    -jaemo
    0
     

    Author Comment

    by:jaemo
    Here is the code I slapped together to test this....


    **********************************
    <%@language=jscript%>

    <%

    var strDsn = 'A connection String I *KNOW* to be working as I use it in a global include file with success everywhere but here....';
    var cn = Server.CreateObject('ADODB.Connection');
    var rs = Server.CreateObject('ADODB.Recordset');
    cn.Open(strDsn);
    var sql = "EXEC p_get_products @id=3,@rows=1";
    rs = cn.Execute(sql);
    var result = "";
    if(!rs.EOF){
          while(!rs.EOF){
                result = rs("image");
                rs.MoveNext();
          }

    }
    cn.Close();
    %>
    Should work....<br />
    <%=result%>
    <br />
    did it?
    ***********************************
    This just gives me:

    ASP Error occurred 10/25/2004 12:14:04 AM in ADODB.Recordset
    Error number: -2146824584 (0x800A0E78)
    File: /test.asp, line 12
    Operation is not allowed when the object is closed.


    Like I said, this works perfectly in SQL Server's Query Analyzer...
    any ideas guys?

    -jameo



    0
     
    LVL 12

    Expert Comment

    by:kselvia
    Try adding set nocounnt on to the beginning of the procedure
    0
     
    LVL 12

    Expert Comment

    by:kselvia
    Actually I looked at your code.

    result = rs("image");

    There is no image column being returned unless you changed it to do so.
    0
     

    Author Comment

    by:jaemo
    correct,

    I have edited the code to read

    select p.product_id, c.category_id, p.name, p.image
    from products p inner join categories c on p.category_id = c.category_id
    inner join #results r on r.product_id = p.product_id
    where r.id <= @rows

    so Image is now valid. It is behaving 100% normally in Query Analyzer, I just cannot seem to make a call from ASP. I am beginning to think it's a permissions error but it has the same permissions as other SP's that work fine from ASP too..

    Stumped. (I'll try that NOCOUNT though)

    -jaemo
    0
     

    Author Comment

    by:jaemo
    Added SET NOTCOUNT ON to the start of the procedure... it's still throwing an error on ASP but we seem to be making SOME progress at least ;)

    ASP Error occurred 10/25/2004 5:21:06 AM in ADODB.Field
    Error number: -2147352567 (0x80020009)
    File: /test.asp, line 22
    Object is no longer valid.

    Googling for this error has not yielded a ton of useful results unfortunately...
    0
     

    Author Comment

    by:jaemo
    kselvia, i was able to get it to work by playing with the ASP

    oddly enough:
    if(!rs.EOF){
          while(!rs.EOF){
                result += rs("image");
                Counter++;
                rs.MoveNext();
          }

    }

    WORKS while
    if(!rs.EOF){
          while(!rs.EOF){
                result = rs("image");
                Counter++;
                rs.MoveNext();
          }

    }
    gets an error....
    anyways, I think I can manage it from here... thanks a lot for your help and time...

    -jaemo
    0

    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone. Privacy Policy Terms of Use

    Featured Post

    How to improve team productivity

    Quip adds documents, spreadsheets, and tasklists to your Slack experience
    - Elevate ideas to Quip docs
    - Share Quip docs in Slack
    - Get notified of changes to your docs
    - Available on iOS/Android/Desktop/Web
    - Online/Offline

    This is my first ever Article on EE or elsewhere; therefore, please bear with me if I have some discrepancies in my writing. I read many articles and questions related to "how to pass values to SSIS packages at run-time?"  Hence, this common ques…
    Slowly Changing Dimension Transformation component in data task flow is very useful for us to manage and control how data changes in SSIS.
    This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
    Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.

    875 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    12 Experts available now in Live!

    Get 1:1 Help Now