?
Solved

Multi-term queries across many-to-many tables

Posted on 2013-06-24
13
Medium Priority
?
465 Views
Last Modified: 2013-07-01
Hello Experts

I am using a database to store "Items" which possess varying numbers of "Assets". Each Item has some fixed columns and may also may have zero, one or more Assets associated with it. Each Asset has a fixed number of columns of different types.

So I have a table of Items, a table of Assets, a many-to-many bridge table between them, and then do a two-stage join on the Item ID and Asset ID. This gives a results set containing one row per Item per Asset, which all works fine.

I can easily find Items that possess a single named Asset, but how do I draft a multi-term SQL query for a combination of named Assets, e.g. "find Items that possess Assets A and B but not C"?

I am sure this is a well-known problem :)
0
Comment
Question by:NickDuffill
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 5
  • 3
13 Comments
 
LVL 49

Expert Comment

by:PortletPaul
ID: 39271236
what query are you using now for "I can easily find Items that possess a single named Asset"?

it will be far easier for you if we provide a sql query that will work straight away.

I could invent one, with names that won't be familiar to you, if you prefer.
0
 

Author Comment

by:NickDuffill
ID: 39271408
Hello PortletPaul

Thanks for your quick response. At the moment I do not have a skeleton query for this. An extract from the schema is something like:

tblItem: item_ID, item_Name, item_Description
tblAsset: ast_ID, ast_Type, ast_Name, ast_Value
tblItemAsset: item_ID, ast_ID

Create view v1 as select * from tblItem left outer join (tblItemAsset on (tblItem.ItemID = tblItemAsset.ItemID) inner join tblAsset on (tblAsset.ast_ID = tblItemAsset.Ast_ID));

... returns a table of items x individual assets possessed, where they exist. I can then query this with

select * from v1 where ast_Name = "Asset A";

The above lines may not be precise, as I have simplified them. It also relies on the assumption that each Item can only possess one copy of a specific Asset.

If you can give me a hint on what approach to use to find combinations of assets, that would be really helpful - it doesn't have to be working code, just a strategy. If it involves redesign of the tables then that would be OK, although a solution in the query would be ideal.

Thanks again
Nick
0
 
LVL 32

Expert Comment

by:awking00
ID: 39271728
If you could provide some sample data for the tblItem, tblAsset and tblItemAsset tables that include astName values of "A", "B", and "C" it would be most helpful.
0
Get your Conversational Ransomware Defense e‑book

This e-book gives you an insight into the ransomware threat and reviews the fundamentals of top-notch ransomware preparedness and recovery. To help you protect yourself and your organization. The initial infection may be inevitable, so the best protection is to be fully prepared.

 
LVL 49

Expert Comment

by:PortletPaul
ID: 39271794
>> find Items ... a multi-term SQL query for a combination of named Assets, e.g. "find Items that possess Assets A and B but not C"?
select distinct
item_ID, item_Name, item_Description
from v1
where ast_name in('Asset A','Asset B')
and not exists (
                select v1b.item_id
                from v1 as v1b
                where ast_name in('Asset C')
                and v1.item_id = v1b.item_id
               )
;

Open in new window

Is this what you are after?  i.e. There may be no need to alter that view - just amend the where clause. Note I'm assuming you just want the related item(s) hence the use of distinct to remove unwanted repetition.
see: http://sqlfiddle.com/#!3/39cf0/3
0
 

Author Comment

by:NickDuffill
ID: 39272023
@PortletPaul

Thank you for taking the time to enter this in SQLFiddle - much appreciated.

This is a good step forwards but I think the "ast_name in(...)" term returns Items that have either Asset A or Asset B, rather than having both. I deleted the last line (2,2) of the tblItemAsset initialisation and still get "e-e" returned from the query. It is the "anding" part that has me stumped.

(Regarding the return of distinct lines, the back-end code that renders the results expects and handles repeated lines anyway, but is ignorant of the query. It just knows how to merge assets across multiple lines).

@Awking00 - thank you for your reply. PortletPaul has been kind enough to put together some sample data.

Best regards
Nick
0
 
LVL 32

Assisted Solution

by:awking00
awking00 earned 400 total points
ID: 39272205
I think PortletPaul has you on the right track although, if you remove the record from tblItemAsset where the item_id is 2 and the ast_id is 2 (i.e. item2 has no ast_type 'B'), it will still return item_id 2, item_name 'item2', and item_description 'e-e'. If the query is modified as follows:
select distinct
item_ID, item_Name, item_Description
from v1
where ast_name = 'Asset A'
and exists (select 1 from v1 v1b where ast_name = 'Asset B'
            and v1.item_id = v1b.item_id)
and not exists (select 1 from v1 v1c
                where ast_name ='Asset C'
                and v1.item_id = v1c.item_id);
I think that will resolve the issue.
0
 
LVL 49

Accepted Solution

by:
PortletPaul earned 1600 total points
ID: 39273406
:) awking00 is correct by making such an adjustment to the data my original would still return item2. I have expanded the test set, and prepared 2 alternatives using group by for consideration together with the corrected use of exists/not exists here : http://sqlfiddle.com/#!3/a0e5e/8 

Please consider the execution plans (although not all tables are indexed). I'm now suggesting the lower variant (using the case expression) as I suspect it may be the easiest to adopt for query flexibility (? an unknown, may be a poor assumption).
select
    item_ID
  , item_Name
  , item_Description
  , count(*)
from v1
where ast_name in('Asset A','Asset B')
and not exists (
                select v1b.item_id
                from v1 as v1b
                where ast_name in('Asset C')
                and v1.item_id = v1b.item_id
               )
group by
    item_ID
  , item_Name
  , item_Description
having
    count(*) = 2
;
  
select
    item_ID
  , item_Name
  , item_Description
  , sum(case when ast_name in('Asset A','Asset B') then 1 
             when ast_name in('Asset C')           then -1 
        end) as matched
from v1
group by
    item_ID
  , item_Name
  , item_Description
having
    sum(case when ast_name in('Asset A','Asset B') then 1 
             when ast_name in('Asset C')           then -1 
        end) = 2

Open in new window

note I have included use of IN() on the basis it would be easy for query flexibility. Use on IN() isn't essential (such as when ast_name in('Asset C')   which could be a simple = comparison.
0
 

Author Comment

by:NickDuffill
ID: 39279770
Thank you so much for these detailed answers. I will study them by the end of the week and respond again.

In general it seems that although the many-to-many junction table is elegant in terms of data structure, it makes some types of query more complex. Is this a reasonable interpretation, and might there be better ways to structure the data?

Best regards
Nick
0
 
LVL 49

Expert Comment

by:PortletPaul
ID: 39279960
Nick, the "query complexity" isn't due to the junction table. It is due to this:

>>the "ast_name in(...)" term returns Items that have either Asset A or Asset B, rather than having both. ... It is the "anding" part that has me stumped.

a simple query will produce the "either" effect - which you do not want
a slightly harder query will produce the "anding" effect - which you do want

not only that, you want that "anding", plus you want "excluding", together as 1 query

SQL is "row oriented" so as it scrolls through the rows it evaluates the where conditions and decides if that row is "in" or "out" (of the result). To produce that "anding" effect those row-by-row decisions are NOT sufficient by themselves so there needs to me more.

The "more" can be done via those "exists ( ... )" or "not exists ( ...)" subqueries, or it can be achieved using "group by" and "having". It is these extra bits that allow the query to combine the "in/out" decisions of multiple rows together, and hence providing that "anding" you crave.

short answer: it's your filtering desire that is "complex"

btw: I don't view these queries as complex :) and I suspect awking00 would agree
0
 
LVL 32

Expert Comment

by:awking00
ID: 39281268
>> btw: I don't view these queries as complex :) and I suspect awking00 would agree <<
Wholeheartedly!
0
 

Author Comment

by:NickDuffill
ID: 39290676
Thank you to PortletPaul and awking00 for your help, which has allowed this to be solved with a slightly modified version of the first variant suggested by PortletPaul. This makes an "exists" test on the junction table between items and assets. It does not use the counting technique so is a bit more long-winded, but because the application accepts user-defined elements of the SQL, I wanted to keep the number of clauses as low as possible and use repeated patterns. An example of the eventual code for a term in the Where clause that tests for an asset is

exists (select item_pk from v2 where (v2.ast_type = x) and (v2.ast_name like 'abc') and (v2.item_guid = v1.item_guid))

where
item_pk, item_guid are the primary key and guid
ast_name and ast_type are asset fields
v2 is a view that is a join between the item/asset junction table and the asset table
v1 is a view that is a join between the item table and v2.
The first two terms find specific assets, and the third ensures that it refers to the row currently under examination. Expressions of this form can be daisy-chained and bracketed to test combinations of multiple assets, which is what I originally needed to achieve.

PortletPaul, you hit the nail on the head when you identified that the root cause is the row-oriented nature of the SQL filtering. A generic filter of the form "where A and B but not C" is straightforward if all the fields exist in the same row, but not if they are in the same columns across multiple rows. I accept the comments about this not being a particularly complex query, but as part of the project I need to be able to explain the querying logic in broad terms to non-programmers. The "exists" test across the junction table will not be as easy to explain as an expression that references fields that are already part of the row being tested.

This seems to be working pretty well and allows queries on one or more assets attached to an item.

Thanks again for your help.
Nick
0
 

Author Closing Comment

by:NickDuffill
ID: 39290682
See last comment.
0
 
LVL 49

Expert Comment

by:PortletPaul
ID: 39292116
Thanks Nick, & good luck explaining any of it to non-programmers. Cheers, Paul.
0

Featured Post

Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A Stored Procedure in Microsoft SQL Server is a powerful feature that it can be used to execute the Data Manipulation Language (DML) or Data Definition Language (DDL). Depending on business requirements, a single Stored Procedure can return differe…
It is possible to export the data of a SQL Table in SSMS and generate INSERT statements. It's neatly tucked away in the generate scripts option of a database.
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…

777 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question