Link to home
Start Free TrialLog in
Avatar of Geoff Sutton
Geoff SuttonFlag for Canada

asked on

Need assistance expanding TSQL query

I am using the attached query to pull records from 2 tables in SQL Server.  I need to expand this query (or at least the filter) so that amount1=paidtous OR amount1=sum(paidtous WHERE date is in range and account# is the same).  I have no idea how to perform the appropriate join to build this section.  Any help is appreciated.
ALTER PROCEDURE dbo.getRecords
	
	(
	@GetFound int = 1
	)
	
AS
/*
declare @rec table( invoiceNumber nvarchar(40), debtor nvarchar(80),
					accountNumber nvarchar(40), paidToUs numeric(18,2), 
					GST numeric(10,2),ourDate nvarchar(10))
insert into @rec SELECT DISTINCT OurRecords.invoiceNumber, OurRecords.debtor, 
						OurRecords.AccountNumber, SUM(OurRecords.paidToUs), 
						SUM(OurRecords.GST), ourDate From OurRecords 
				 GROUP BY
						invoiceNumber, debtor, AccountNumber, OurDate
						
*/
if @getFound=1
BEGIN
SELECT DISTINCT 
                      OurRecords.fileName, OurRecords.invoiceNumber, OurRecords.debtor, OurRecords.AccountNumber, OurRecords.paidToUs, OurRecords.dueToUs, 
                      OurRecords.GST, OurRecords.ourDate, ExcelData.DateAgency, ExcelData.DateDirect
FROM         ExcelData INNER JOIN
                      OurRecords ON ExcelData.Account = OurRecords.AccountNumber AND ExcelData.Amount1 = OurRecords.paidToUs AND 
                      ExcelData.Amount3 = OurRecords.dueToUs  And ExcelData.DateAgency<=OurRecords.ourDate ANd ExcelData.DateDirect>=OurRecords.ourDate
END
ELSE
BEGIN
	SELECT DISTINCT  Debtor, Account, Amount1 AS PaidToUs, Amount3 AS DueToUs, Amount4 AS GST, DateAgency, DateDirect
	FROM         ExcelData
	WHERE     (id NOT IN
	                          (SELECT DISTINCT XD.id
	                            FROM          ExcelData AS XD INNER JOIN
	                                                   OurRecords AS OD ON XD.Account = OD.AccountNumber 
	                                                   AND XD.Amount1 = OD.paidToUs
	                                                   AND XD.Amount3 = OD.dueToUs 
	                                                   AND (XD.DateAgency<=OD.OurDate AND XD.DateDirect>=OD.OurDate)))
END

Open in new window

Avatar of aaronakin
aaronakin
Flag of United States of America image

Give this a shot.


  SELECT DISTINCT
         OR1.fileName
       , OR1.invoiceNumber
       , OR1.debtor
       , OR1.AccountNumber
       , OR1.paidToUs
       , OR1.dueToUs
       , OR1.GST
       , OR1.ourDate
       , XD1.DateAgency
       , XD1.DateDirect
    FROM ExcelData XD1
      INNER JOIN OurRecords OR1 ON XD1.Account = OR1.AccountNumber
      INNER JOIN
            (
            SELECT OR2.AccountNumber
                 , SUM(OR2.paidToUs) AS paidToUsSum
              FROM ExcelData XD2
                INNER JOIN OurRecords OR2 ON XD2.Account = OR1.AccountNumber
              WHERE OR2.ourDate BETWEEN XD2.DateAgency AND XD2.DateDirect
              GROUP BY OR2.AccountNumber
            ) a ON OR1.AccountNumber = a.AccountNumber
    WHERE OR1.ourDate BETWEEN XD1.DateAgency AND XD1.DateDirect
      AND XD1.Amount1 IN (OR1.paidToUs,a.paidToUsSum)
      AND XD1.Amount3 = OR1.dueToUs
Avatar of Geoff Sutton

ASKER

It looks about what I wanted.  Getting multi-part Identifier OR1.AccountNumber could not be bound.  I will take this approach and see what I can do, unless you have a quick and ready answer :)  I am guessing this is the query for the first part, if GetFound=1?
Yes, this is for the first part.

As for the error....I do see a mistake.  In the derived table query, replace the OR1.AccountNumber in the JOIN with OR2.AccountNumber.  That should fixt it.

SELECT DISTINCT
         OR1.fileName
       , OR1.invoiceNumber
       , OR1.debtor
       , OR1.AccountNumber
       , OR1.paidToUs
       , OR1.dueToUs
       , OR1.GST
       , OR1.ourDate
       , XD1.DateAgency
       , XD1.DateDirect
    FROM ExcelData XD1
      INNER JOIN OurRecords OR1 ON XD1.Account = OR1.AccountNumber
      INNER JOIN
            (
            SELECT OR2.AccountNumber
                 , SUM(OR2.paidToUs) AS paidToUsSum
              FROM ExcelData XD2
                INNER JOIN OurRecords OR2 ON XD2.Account = OR2.AccountNumber
              WHERE OR2.ourDate BETWEEN XD2.DateAgency AND XD2.DateDirect
              GROUP BY OR2.AccountNumber
            ) a ON OR1.AccountNumber = a.AccountNumber
    WHERE OR1.ourDate BETWEEN XD1.DateAgency AND XD1.DateDirect
      AND XD1.Amount1 IN (OR1.paidToUs,a.paidToUsSum)
      AND XD1.Amount3 = OR1.dueToUs
No rows returned :(
Since I don't have your tables and data here, I can't quite tell what the problem would be, so try to run only the query from the derived table (below) and see if that brings back the correct results.  Let me know and we can work from there to see what the problem is.

            SELECT OR2.AccountNumber
                 , SUM(OR2.paidToUs) AS paidToUsSum
              FROM ExcelData XD2
                INNER JOIN OurRecords OR2 ON XD2.Account = OR2.AccountNumber
              WHERE OR2.ourDate BETWEEN XD2.DateAgency AND XD2.DateDirect
              GROUP BY OR2.AccountNumber
No results, and there is guaranteed one result which should come of this.  Out of curiosity:  You are using BETWEEN for the date value.  Is that inclusive of the dates listed?  I was using >= AND <= to make sure of proper coverage.  Tried that in the subquery and it gave no results.  I can see now, though, that I only need get the subquery working correctly then add it in, should give correct results.  Except should the join on the subquery not be a left outer join, since there is nu guarantee there will be any results in the sub?
THanks,
Geoff
Did you find a solution for this problem?  I am having the exact same issue connecting to an IBM AS400 and can't figure it out.  Thanks.
Oops.  Wrong problem :$
Ok.  This is what I have so far.  It returns correct values for the first part, but the second part is returning all values. (I am getting all the values from first query, plus those which didn't match in the first).
Thanks.

if @GetFound=1
BEGIN
SELECT DISTINCT 
         OR1.fileName
       , OR1.invoiceNumber
       , OR1.debtor
       , OR1.AccountNumber
       , OR1.paidToUs
       , OR1.dueToUs
       , OR1.GST
       , OR1.ourDate
       , XD1.DateAgency
       , XD1.DateDirect
    FROM ExcelData XD1
      INNER JOIN OurRecords OR1 ON XD1.Account = OR1.AccountNumber
      INNER JOIN
            (
            SELECT OR2.AccountNumber
                 , SUM(OR2.paidToUs) AS paidToUsSum
              FROM ExcelData XD2
                INNER JOIN OurRecords OR2 ON XD2.Account = OR2.AccountNumber
              WHERE OR2.ourDate BETWEEN XD2.DateAgency AND XD2.DateDirect
              GROUP BY OR2.AccountNumber
            ) a ON OR1.AccountNumber = a.AccountNumber
    WHERE OR1.ourDate BETWEEN XD1.DateAgency AND XD1.DateDirect
      AND XD1.Amount1 IN (OR1.paidToUs,a.paidToUsSum)
      AND XD1.Amount3 = OR1.dueToUs
END
ELSE
BEGIN
	SELECT DISTINCT  Debtor, Account, Amount1 AS PaidToUs, Amount3 AS DueToUs, Amount4 AS GST, DateAgency, DateDirect
	FROM         ExcelData
	WHERE     (id NOT IN
					(SELECT DISTINCT 
					 OR1.id
				FROM ExcelData XD1
				  INNER JOIN OurRecords OR1 ON XD1.Account = OR1.AccountNumber
				  INNER JOIN
						(
						SELECT OR2.AccountNumber
							 , SUM(OR2.paidToUs) AS paidToUsSum
						  FROM ExcelData XD2
							INNER JOIN OurRecords OR2 ON XD2.Account = OR2.AccountNumber
						  WHERE OR2.ourDate BETWEEN XD2.DateAgency AND XD2.DateDirect
						  GROUP BY OR2.AccountNumber
						) a ON OR1.AccountNumber = a.AccountNumber
				WHERE OR1.ourDate BETWEEN XD1.DateAgency AND XD1.DateDirect
				  AND XD1.Amount1 IN (OR1.paidToUs,a.paidToUsSum)
				  AND XD1.Amount3 = OR1.dueToUs))
END

Open in new window

Can you clarify?  When you mention the first and second part/query, are you referring to each query within the IF ELSE statement, or are you talking about the main query and the derived table query from the first part of the IF statement that we've been working with.

From a prev post, you asked me about the BETWEEN condition.  Yes, this is inclusive of the test values.  Also, I don't think you need to make the INNER JOIN on the derived table into a LEFT OUTER JOIN, but you can try that to see if it helps.
Thanks.  I experimented with both the Between and with the inner join/left outer join.  They both gave the same results so I went with your suggestions.  The first are second are referring to the queries in the if/else statement.  This procedure will return either all the records matching the conditions in the initial if then, or it will return all the records which do NOT match those conditions, depending on the @GetFound parameter.
Geoff
Try just running the first query and add the XD1.id & OR1.id columns.  Make sure they are the same values.  If not, this is why your second query isn't working properly.  Also, are these two columns the same datatypes?  If not, you will need to cast/convert them to be the same datatype.

On another note, I added one thing to the SELECT clause of the query.  I added a CASE statement to return the appropriate paidToUs value based on whether or not it matched the regular value or the summed value.  Not sure if you need this, but thought I would add it just in case.
SELECT DISTINCT
         XD1.id
       , OR1.id
       , OR1.fileName
       , OR1.invoiceNumber
       , OR1.debtor
       , OR1.AccountNumber
       , CASE XD1.Amount1
           WHEN OR1.paidToUs THEN OR1.paidToUs
           WHEN a.paidToUsSum THEN a.paidToUsSum
         END AS paidToUs
       , OR1.dueToUs
       , OR1.GST
       , OR1.ourDate
       , XD1.DateAgency
       , XD1.DateDirect
    FROM ExcelData XD1
      INNER JOIN OurRecords OR1 ON XD1.Account = OR1.AccountNumber
      INNER JOIN
            (
            SELECT OR2.AccountNumber
                 , SUM(OR2.paidToUs) AS paidToUsSum
              FROM ExcelData XD2
                INNER JOIN OurRecords OR2 ON XD2.Account = OR2.AccountNumber
              WHERE OR2.ourDate BETWEEN XD2.DateAgency AND XD2.DateDirect
              GROUP BY OR2.AccountNumber
            ) a ON OR1.AccountNumber = a.AccountNumber
    WHERE OR1.ourDate BETWEEN XD1.DateAgency AND XD1.DateDirect
      AND XD1.Amount1 IN (OR1.paidToUs,a.paidToUsSum)
      AND XD1.Amount3 = OR1.dueToUs

Open in new window

Thanks.  I will add the ID's to check.  And yes, when I define any table I always put in id as the first column.  Identity, primary key, integer.  Auto increment (identity (1,1)) so they are both the same type.  And it should be just eh excel id's.  In the second query (the else clause) it should be drawhing from Excel where the id is not in the initial query.
You said ourData matching excel...? Maybe I am drawing the wrong ID for comparison.  Thanks.
 
Yes.. I see the problem..
(SELECT DISTINCT
OR1.id
FROM ExcelData XD1

SHould be (SELECT DISTINCT
XD1.id
FROM ExcelData XD1
 
Thanks for pointing that out.
No problem...hope that solves the issue.
Just resizing my VHD so I couldn't tell you.  If it does you will be the second to know :)  May take a couple of hours though - ran out of space trying to install Silverlight development patch.
Well... I just finished creating a whole new VHD, and installing Server on it, and getting updates.  Tomorrow I install Visual Studio and try to salvage my settings.  Took a little longer than I expected...
No problem...keep me posted.
Still not quite there.  I did some more modifying on this.  I am enclosing the excelData schema (can't give values) and the ourRecords schema.  Hopefully this will help to understand what I am attempting.  The account numbers in the 2 files do not correspond in any way, FYI.
This is what I have done so far, but it gives me an error.  The inner query on this one returns 2 different values since i am noticing that EXPR1 has a couple of values which are coming up NULL for the XD2.id column.

SELECT DISTINCT Debtor, Account, Amount1 AS PaidToUs, Amount3 AS DueToUs, Amount4 AS GST, DateAgency, DateDirect
FROM         ExcelData
WHERE     (id NOT IN
                          (SELECT DISTINCT XD1.id, a.id as EXPR1
                            FROM          ExcelData AS XD1 INNER JOIN
                                                   OurRecords AS OR1 ON XD1.Account = OR1.AccountNumber AND XD1.Amount1 = OR1.paidToUs AND XD1.Amount3 = OR1.dueToUs AND 
                                                   XD1.Amount4 = OR1.GST FULL OUTER JOIN
                                                       (SELECT     XD2.id
                                                         FROM          ExcelData AS XD2 INNER JOIN
                                                                                OurRecords AS OR2 ON XD2.Account = OR2.AccountNumber AND XD2.Amount3 = OR2.dueToUs
                                                         WHERE      (OR2.ourDate BETWEEN XD2.DateAgency AND XD2.DateDirect)
                                                         GROUP BY OR2.AccountNumber, XD2.Amount1, XD2.id
                                                         HAVING      (XD2.Amount1 = SUM(OR2.paidToUs))) AS a ON XD1.id = a.id)))

Open in new window

excelData.xls
ourRecords.xls
It looks like there were a few problems with the query.  I'm not sure if I understand all the changes you made, but if it helps you get the results you're looking for, then great.

The first problem was the "id NOT IN (SELECT DISTINCT XD1.id, a.id as EXPR1 ..." section.  The subquery for an IN condition cannot have more than one value in the SELECT clause.  You need to use a NOT EXISTS condition instead.

The second problem was with the inner most subquery that contained the GROUP BY ... HAVING clauses.  All non-aggregated columns in the SELECT and HAVING clauses must be in the GROUP BY clause, and all columns in the GROUP BY must exist in the SELECT and/or HAVING clauses.  You had the OR2.AccountNumber column in the GROUP BY, but it wasn't in the SELECT or HAVING.  On that note, I'm not sure if grouping by the id column will do much since this is an identity column.

I fixed the query so that it at least will run based on the data you gave me, but I'm not quite sure what you want the final output to look like, so I cannot verify its accuracy.
  SELECT DISTINCT
         Debtor
       , Account
       , Amount1 AS PaidToUs
       , Amount3 AS DueToUs
       , Amount4 AS GST
       , DateAgency
       , DateDirect
    FROM ExcelData XD
    WHERE NOT EXISTS --id NOT IN
            (
            SELECT 1 --DISTINCT XD1.id, a.id as EXPR1
              FROM ExcelData XD1
                INNER JOIN OurRecords OR1 ON XD1.Account = OR1.AccountNumber
                FULL OUTER JOIN
                    (
                    SELECT XD2.id
                      FROM ExcelData XD2
                        INNER JOIN OurRecords OR2 ON XD2.Account = OR2.AccountNumber
                      WHERE OR2.ourDate BETWEEN XD2.DateAgency AND XD2.DateDirect
                        AND XD2.Amount3 = OR2.dueToUs
                      GROUP BY XD2.id, XD2.Amount1 --, OR2.AccountNumber
                      HAVING XD2.Amount1 = SUM(OR2.paidToUs)
                    ) a ON XD1.id = a.id
              WHERE XD.id NOT IN (XD1.id, a.id) --Added line
                AND XD1.Amount1 = OR1.paidToUs
                AND XD1.Amount3 = OR1.dueToUs
                AND XD1.Amount4 = OR1.GST
            )

Open in new window

Accoutn number is not an identity column :)  It will join on many to many.  But I think I understand the NOT IN vs NOT EXISTS...
As for the group by and all that... I am in the process of trying to learn LINQ well enough to do this using that since it seems like it would be simpler there.  Or am I mistaken?  I will plug this in and take a look.  Thanks again.
 
I notice you changed the SELECT DISTINCT to Select 1 - Was that necessary for the Exists clause to work?  My reason for asking is I expect there will be instances where multiple rows are returned from that inner query.
 
 
ASKER CERTIFIED SOLUTION
Avatar of aaronakin
aaronakin
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Ahh.  Now I understand.  Thanks.  As for the DBA in our company, well, I guess that's me by default :)  I understand the aversion to dynamic SQL of any sort, but LINQ allows me to write the queries in c# and use c# controls to filter the data.  Anyhow... thanks for the help on this.  It was certainly a learning experience.
Glad you got it working and glad I could help out.

BTW...if you're wanting to write queries using C#, you could always use CLR instead of LINQ.  http://www.developer.com/net/net/article.php/3528601
:)  I have heard of it, but not ventured into it yet.  Isn't that writing .NET code against the database in SQL server?
 
FYI this is this query LINQ produced:

SELECT     Debtor, Account, Amount1, Amount3, Amount4, DateAgency, DateDirect
FROM         ExcelData AS t0
WHERE     (NOT EXISTS
                          (SELECT     NULL AS EMPTY
                            FROM          OurRecords AS t1 INNER JOIN
                                                   ExcelData AS t2 ON t1.AccountNumber = t2.Account AND t1.paidToUs = t2.Amount1 AND t1.dueToUs = t2.Amount3 AND
                                                   t1.GST = t2.Amount4 AND t1.ourDate >= t2.DateAgency AND t1.ourDate <= t2.DateDirect
                            WHERE      (t2.id = t0.id)))
Which gives perfect results.  Thanks again for the help.  I will look at the CLR you mentioned.