Link to home
Create AccountLog in
Avatar of rwheeler23
rwheeler23Flag for United States of America

asked on

SQL query that returns records in one list that are outside of another list.

I have two tables.The first SQL query pulls all the orders for September. The second query pulls the orders for September that have a qualifying value for ACTINDX.which comes from a master list. In this case we are talking a ledger chart of accounts. I am trying list list all the orders that did not have a qualifying value for ACTINDX.

Examples:

SOPNUMBE 10000 has three lines with values of 1000,1100 and 1200.

SOPNUMBE 20000 has three lines with values 0f 1000,1100 and 1150

A qualifying number is 1150.

So SOPNUMBE 10000 is no good since none of the three values are 1150.

SOPNUMBE 20000 is good because one of the values is 1150

There are a total of 602 orders for September and 518 that have qualifying values.

My question is how to structure these two queries into one so I will see a list orders that do not have any of the qualifying values coming from GL00100? In this case there are about 100 qualifying values coming from GL00100 for account 4100.


===========================================================

This returns all orders

===========================================================

SELECT  DISTINCT SOPNUMBE,SOPTYPE
FROM (
SELECT T1.SOPNUMBE,T2.SOPTYPE
FROM [dbo].[sop10102] T1
               INNER JOIN [dbo].[sop30200] T2
                       ON T1.sopnumbe = T2.sopnumbe
                          AND T1.soptype = T2.soptype
        WHERE  T1.soptype = 4
               AND T2.docdate >= '20230901'
               AND T2.docdate <= '20230930') T
============================================================

This returns orders with a qualifying values for    ACTINDX

============================================================
SELECT DISTINCT SOPNUMBE,SOPTYPE
FROM   (SELECT T1.sopnumbe AS SOPNUMBE,
               T1.soptype  AS SOPTYPE
        FROM   [dbo].[sop10102] T1
               INNER JOIN [dbo].[sop30200] T2
                       ON T1.sopnumbe = T2.sopnumbe
                          AND T1.soptype = T2.soptype
        WHERE  T1.soptype = 4
               AND T2.docdate >= '20230901'
               AND T2.docdate <= '20230930'
               AND T1.actindx IN(SELECT actindx
                                 FROM   [dbo].[gl00100]
                                 WHERE  actnumbr_1 = '4100')) T  

Avatar of slightwv (䄆 Netminder)
slightwv (䄆 Netminder)

I changed the Topic Area from SQL (Structured Query Laguage) to SQL Server since all your previous questions were for SQL Server and the 'dbo' in this one.

If that is incorrect, let me know and I'll change it back or you can change it back.

Your most recent question was using SQL Server 2008.  Is that still the case?

The reason I ask is that you 'can' select the first query WHERE the SOPNUMBE and SOPTYPE are NOT IN, NOT EXISTS or LEFT JOIN the second query.

There are better ways to solve this problem that will only access the base tables once and will be much more efficient but likely require SQL statements that probably weren't around in SQL Server 2008.
Avatar of rwheeler23

ASKER

This is MS SQL Server 2019. Are there different topics for each version of SQL?

Something like:


SELECT SOPNUMBE,SOPTYPE, MAX(GL_MATCH) AS GL_MATCH
FROM (
SELECT T1.SOPNUMBE,T2.SOPTYPE, 

    CASE WHEN EXISTS(SELECT * 

           FROM [dbo].[gl00100] gl  

           WHERE gl.actnumbr_1 = '4100' AND gl.actindx = T1.actindx) THEN 'Y' ELSE 'N' END AS GL_MATCH   
FROM [dbo].[sop10102] T1
               INNER JOIN [dbo].[sop30200] T2
                       ON T1.sopnumbe = T2.sopnumbe
                          AND T1.soptype = T2.soptype
        WHERE  T1.soptype = 4
               AND T2.docdate >= '20230901'
               AND T2.docdate <= '20230930') T 

GROUP BY SOPNUMBE,SOPTYPE

/*ORDER BY SOPNUMBE,SOPTYPE*/


This query gets me the full set. Now I need something that will remove the records that already have a match.
Total record count is
602 total records
518 that already have a 4100 record(these need to be excluded from the result set)
84 that do not. This is group I am after.

SELECT SOPNUMBE,SOPTYPE, MAX(GL_MATCH) AS GL_MATCH
FROM (
SELECT T1.SOPNUMBE,T2.SOPTYPE

FROM [dbo].[sop10102] T1
               INNER JOIN [dbo].[sop30200] T2
                       ON T1.sopnumbe = T2.sopnumbe
                          AND T1.soptype = T2.soptype
        WHERE  T1.soptype = 4
               AND T2.docdate >= '20230901'
               AND T2.docdate <= '20230930'

               AND NOT EXISTS(SELECT * FROM [dbo].[gl00100] gl WHERE gl.actnumbr_1 = '4100' AND gl.actindx = T1.actindx) ) T 

GROUP BY SOPNUMBE,SOPTYPE

/*ORDER BY SOPNUMBE,SOPTYPE*/


This is Microsoft SQL Server 2019.
These tables reference a sales order transaction. The SOP10102 are the ledger distributions corresponding to the sales document. In this case we are talking sales order returns. There will always be one that goes to 4100 and there could 'N' others depending on what is being returned. Shoes get one sales accounts, shirts get another and so on.These accounts come from the items being returned. The problem is that some of these invoices do not include the one line that contains 4100. The GL00100 table contains hundreds of records where the first segment is 4100. The complete account string is xxxx-xxxx-xxxx. So any account that begins with 4100 is good.The ACTINDX value  is the account index value corresponding to each account. As an example:
4100-0234 has actindx of 1000
4100-1234 has actindx of 1100
Going to GL00100 gets the subset of all valid ACTINDX values.

In the SOP10102 table are the ACTINDX values for each line on the return.
There may be 4 lines for return 7890 like this:
1000
2000
3000
4000
This example is good because at least one of the lines for return 7890 has a valid ACTINDX value. This value being 1000.

Now along comes return 6789 with 4 lines of:
2000
3000
4000
5000
This return is no good because none of the lines contain a valid ACTINDX.

What I am trying to do is take one subset of numbers and compare it to another subset of numbers. If they they inclusive of at least one ACTINDX value, this is good. The end result of what I need is all those records where the two subsets are exclusive of each other.

602 total records
518 that already have a 4100 record(these need to be excluded from the result set)
84 that do not. This is group I am after.
Since there are only 3 tables involved and not many columns, can you create a simple test case with tables, data and expected results?

That was we can set up our own test case with tested SQL that produces the results from the data you provide.

The data will reinforce the explanation without us having to understand the requirements at the level you do.

I cannot speak for all the Experts here but I understand the data much quicker and easier than trying to understand a detail explanation of what you are trying to do.

and I'll be setting up a simple test case anyway and if you provide a more realistic model for us, the closer out SQL will be to:  copy/paste/run.

If we have the test case provided for us, this should be a a quick solution instead of a LOT of typing back and forth.
I understand. This is a 25GB database so what I will do is extract just these three tables into a separate database. Hopefully that will be small enough. I do have my concern about SOP30200 table. That contains 30 years of sales orders. I will probably only extract the data for September. I will get to this tonight.
We aren't looking for ALL data or even real data.  DEFINITLY not ALL the data!

Just make it up.  Just try to keep the column names the same so you can just do a copy/paste against your real tables.

This took about 20 seconds to type in:
create table gl00100(actindx varchar(10), actnumbr_1 varchar(10);
insert into gl00100 values
('aaa','bbb'),
('ccc','ddd');

create table sop10102( ...

Open in new window


Just create enough rows to show 'good' and 'bad' data.  Could be as few as 1 to 4 rows per table.

Then from the made up data for the tables, give is the results you want from it.

========================================================================================
GL0010 -
CREATE TABLE [dbo].[GL00100](
      [ACTINDX] [int] NOT NULL,
      [ACTNUMBR_1] [char](5) NOT NULL,
      [ACTNUMBR_2] [char](7) NOT NULL,
      [ACTNUMBR_3] [char](5) NOT NULL,
)
GO

insert into gl00100 values
(1000,'4000','0000','0000'),
(1001,'4000','ABCD','0000'),
(1002,'4000','EFGH','0000'),
(1003,'4100','0000','0000'),
(1004,'4100','ABCD','0000'),
(1005,'4100','EFGH','0000'),
(1006,'4100','IJKL','0000'),
(1007,'4100','LMNP','0000'),
(1008,'4100','RSTY','0000')
GO
=============================================================================================
SOP10102 -
CREATE TABLE [dbo].[SOP10102](
      [SOPTYPE] [smallint] NOT NULL,
      [SOPNUMBE] [char](21) NOT NULL,
      [SEQNUMBR] [int] NOT NULL,
      [ACTINDX] [int] NOT NULL
)

insert into SOP10102 values
(4,'5678',1,1000),
(4,'5678',2,1001),
(4,'5678',3,1002),
(4,'6789',1,1001),
(4,'6789',2,1002),
(4,'6789',3,1005),
(4,'6789',4,1001),
(4,'6789',5,1002),
(4,'6789',6,1003)
GO
===========================================================================================
SOP 30200 -
CREATE TABLE [dbo].[SOP30200](
      [SOPTYPE] [smallint] NOT NULL,
      [SOPNUMBE] [char](21) NOT NULL,
      [DOCDATE] [datetime] NOT NULL
)
GO

insert into SOP30200 values
(4,'5678','20230901'),
(4,'5678','20230901'),
(4,'5678','20230901'),
(4,'6789','20230901'),
(4,'6789','20230901'),
(4,'6789','20230901'),
(4,'6789','20230901'),
(4,'6789','20230901'),
(4,'6789','20230901')
GO
===========================================================================================
Using the values as an example, order 5678 would be the record I would want returned. This is because the ACTINDX values
for this return are 1000,1001 and 10002. These values are NOT associated with GL account(ACTNUMBR_1) 4100 but only with account 4000.

On the other hand, return 6789 I would not want included due the third line in SOP10102. The ACTINDX value here is 1005 which is associated account 4100. It only takes 1 line to qualify.
ASKER CERTIFIED SOLUTION
Avatar of slightwv (䄆 Netminder)
slightwv (䄆 Netminder)

Link to home
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
See answer
Thanks guys. You are the best. I ran both and I barely click the button and they are both done.
Might I suggest you also give Scott's solution an accept as well since he was the first to fully understand the question and provide a working solution?

All he did was not give you the final WHERE clause.

https://www.experts-exchange.com/questions/29266715/SQL-query-that-returns-records-in-one-list-that-are-outside-of-another-list.html?anchorAnswerId=43570621#a43570621

Sorry Scott, it is late and I am tired. Your suggestion has been marked as helpful.

Thank you both.  I'm very happy with it.  I thought I might indeed have misunderstood, so it's all good.