Solved

Duplicate records on a SQL query using left joins

Posted on 2011-03-24
20
393 Views
Last Modified: 2012-05-11
Ok, I know this is pretty much by design; I'm just trying to find a way around it for this specific instance...

In short, (and I guess expected) I get multiple records returned when I call a query with an inner join.  But in this case I do not want to know what's on the 2nd table; I just want to know it exists.

MSSQL 2008:

tblMain
pkMainID, int
szName, string(20)
...

tblDates
pkDateID
fkMainID
dtOpen  (datetime)
dtClose (datetime)

In my case I want to be able to search the dates for open/close records without pulling up the dates.  in my C# app, I handle this when the individual record is opened.  It would be way too much information to pull in during the initial query.

I need to be able to search the dates for open records and closed records.  Eg.  if dtOpen > 2010-05-11' and dtClose = '1900-01-01'  would mean that the record was opened but not closed yet.  I also need to search the dtOpen and dtClose for specific dates.

What I have for my query looks something like this:
SELECT DISTINCT szName, pkID, DTS.dtOpen, DTS.dtClose
FROM
tblMain (LEFT JOIN tblDates as DTS on tblMain.pkMainID = DTS.fkMainID)
WHERE
DTS.dtOpen = '2011-05-11' AND DTS.dtOpen = '1900-01-01' .... AND other qualifications

If the tblDate has more than one record, the SQL will give me multiple records for each which I don't want.  Does anyone know how I can use that table for checking but not reporting?

TIA!
0
Comment
Question by:sej69
  • 9
  • 4
  • 2
  • +4
20 Comments
 
LVL 14

Expert Comment

by:Christopher Gordon
ID: 35209946
You can use a subquery.  

Select *
from basetable
where pkMainId in (select fkMainId from detailTable where dtOpen between @startDate and @endDAte)
0
 
LVL 5

Expert Comment

by:bitref
ID: 35210070
SELECT m.szName, m.pkID, MAX(d.dtOpen), MAX(d.dtClose) 
FROM tblMain as m
	LEFT JOIN tblDates as d
		ON m.pkMainID = d.fkMainID
GROUP BY m.szName, m.pkID
HAVING MAX(d.dtOpen) = '2011-05-11' AND MAX(d.dtOpen) = '1900-01-01' 

Open in new window

0
 
LVL 39

Expert Comment

by:lcohan
ID: 35210097
I suggest you check out the IF EXISTS, INTERSECT, EXCEPT in SQL and build your queries with CASE if needed.
0
 
LVL 22

Expert Comment

by:8080_Diver
ID: 35210100
WEell, that poses another question.  Suppose you have a row in the tblDates for pkMainID = 1 with dtOpen = '2011-05-11' and a dtdtClose of '2011-05-11' and then another row in that table for pkMainID = 1 with dtOpen = '2011-05-11' and a dtdtClose of '1900-01-01'.  Which one do you want to get back?

SQL_1 should provide a single row for a distinct dtOpen and dtClose combination for pkMainID;
SQL_2 should provide a single row for a pkMainID and a dtOpen but there is no guarantee on which dtClose will be used;
SQL_3 should provide a single row for a pkMainID and a dtClose but there is no guarantee on which dtOpendtClose will be used;
SELECT DISTINCT szName
  ,M.pkID
  ,DTS.dtOpen
  ,DTS.dtClose
FROM tblMain M
LEFT JOIN 
(
 SELECT  pkMainID
        ,dtOpen 
        ,dtClose
        ,row_number() OVER(PARTITION BY pkMainID, dtOpen, dtClose 
                           ORDER BY pkMainID, dtOpen, dtClose) AS Selector
 FROM   tblDates
) DTS
ON  M.pkMainID = DTS.fkMainID
WHERE DTS.Selector = 1
--    AND DTS.dtOpen = '2011-05-11'
--    AND DTS.dtOpen = '1900-01-01'
;

-- SQL_2
SELECT DISTINCT szName
  ,M.pkID
  ,DTS.dtOpen
  ,DTS.dtClose
FROM tblMain M
LEFT JOIN 
(
 SELECT  Top 1
         pkMainID
        ,dtOpen 
        ,dtClose
        ,row_number() OVER(PARTITION BY pkMainID, dtOpen 
                           ORDER BY pkMainID, dtOpen) AS Selector
 FROM   tblDates
) DTS
ON  M.pkMainID = DTS.fkMainID
WHERE DTS.Selector = 1
--    AND DTS.dtOpen = '2011-05-11'
--    AND DTS.dtOpen = '1900-01-01'
;

SELECT DISTINCT szName
  ,M.pkID
  ,DTS.dtOpen
  ,DTS.dtClose
FROM tblMain M
LEFT JOIN 
(
 SELECT  pkMainID
        ,dtOpen 
        ,dtClose
        ,row_number() OVER(PARTITION BY pkMainID, dtClose 
                           ORDER BY pkMainID, dtClose) AS Selector
 FROM   tblDates
) DTS
ON  M.pkMainID = DTS.fkMainID
WHERE DTS.Selector = 1
--    AND DTS.dtOpen = '2011-05-11'
--    AND DTS.dtOpen = '1900-01-01'

Open in new window

0
 

Author Comment

by:sej69
ID: 35211996
With this app, the data will in inconsistent if the open and close are on the same dates as well as having the same open or same close date for the record.

These are events that start one day and end another, then some time later (And always later) the even will be reopened.

Also for the records, all I need is the ID of the record that contains the dates requested.  I don't care if the date is pulled in the initial query because I run another query if the user wants full detail on the record and clicks on it.  At that point the query goes out and pulls all dates for the listed pkMainId...

All I need is to search on the dates.

I read in a couple of other places about using HAVING but I'm "having" a hard time getting the syntax right for MSSQL 2008.  I keep getting an incorrect syntax near the keyword 'having'.
0
 

Author Comment

by:sej69
ID: 35212002
This is the line that is causing me issues:

ORDER BY Event DESC  HAVING  MAX(CD.dtOpened) = '2011-03-23'
0
 
LVL 40

Expert Comment

by:Sharath
ID: 35212101
It should be other way.
HAVING  MAX(CD.dtOpened) = '2011-03-23' ORDER BY Event DESC

Open in new window

0
 

Author Comment

by:sej69
ID: 35212658
Ok, I finally figured out what I was doing with the HAVING clause... I needed to group by all the selected columns in the query before the having...

However, the query duplicated the dates in the all the rows returned...  Every record has the MAX(CD.Opened) date of '2011-03-23' that I used in the query...
0
 

Author Comment

by:sej69
ID: 35212683
8080, I tried your number 1 but the query seems to have put the sql server management studio into limbo...(not responding) When I looked at the task manager the query was eating up memory really fast...
0
 
LVL 40

Expert Comment

by:Sharath
ID: 35212741
Can you provide some sample data with expected result?
0
Highfive + Dolby Voice = No More Audio Complaints!

Poor audio quality is one of the top reasons people don’t use video conferencing. Get the crispest, clearest audio powered by Dolby Voice in every meeting. Highfive and Dolby Voice deliver the best video conferencing and audio experience for every meeting and every room.

 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 35214681
>>When I looked at the task manager the query was eating up memory really fast...<<
That would probably mean that your table is not appropriately indexed.
0
 

Author Comment

by:sej69
ID: 35214759
This is my current query
SELECT DISTINCT 
nMatter Matter, szClaimant Claimant, szComments Comments, TATT.szInitials Attorney, CLT.szClientName Client, 
CLT.pkClientID ClientID, AOL.szName LawTypeName, fkAttorney, AOL.pkLawID Area, dtLoss DateOfLoss, 
szInsured Insured, szDestroyComment Destroy, MAX(CD.dtOpened) 
FROM 
tblCaseDates CD,(tblMatter 
LEFT JOIN tblAttorney as TATT on tblMatter.fkAttorney = TATT.pkAttorneyID) 
LEFT JOIN tblClient as CLT on tblMatter.fkClient = CLT.pkClientID 
LEFT JOIN tblAreaOfLaw as AOL on tblMatter.fkAreaOfLaw = AOL.pkLawID  
WHERE  pkLawID = '2'  
GROUP BY 
nMatter, szClaimant, szComments, TATT.szInitials, CLT.szClientName, CLT.pkClientID, 
AOL.szName, fkAttorney, AOL.pkLawID, dtLoss, szInsured, szDestroyComment 
HAVING  MAX(CD.dtOpened) = '2011-03-23' 
ORDER BY Matter DESC 

Open in new window


Matter	Claimant	Comments	Attorney	Client	ClientID	LawTypeName	fkAttorney	Area	DateOfLoss	Insured	Destroy	dtOpened
77778	George	Sue vs. George	AAA	Max	1234	Thisone	1818	2	2007-02-13	Sue	123	2011-03-23
55555	Frontmall	Nation Insurance      	BBB	Hanson	121	Thisone	1802	2	1900-01-01	Sue Smith	02-19-20	2011-03-23

Open in new window


I put the dtOpened up to see what was being returned.  The dtOpened is different for both records but yet it shows the same when the query is run.

I don't care about seeing dtOpened or dtClosed in the query results; I just want to search to find an event that had that date to include it in my results list.
0
 

Author Comment

by:sej69
ID: 35215722
I've got indexes on all fields in the main table, and on anything that is searched in the other tables (which is about everything...)
0
 
LVL 22

Expert Comment

by:8080_Diver
ID: 35216956
Uh . . . that query bears NO resemblence to the information in the original post.

Also, you appear to have a cartesian join between tblCaseDates and the rest of the tables involved in the query.  That would explain how this query would consume a lot fo memory.

Is this the query you really want debugged?  I

f so, since it does not resemble your original post:
What are you trying to accomplish with this query?
how does the tblCaseDates relate to at least oneof the other tables? (I am anticipating that it would be best to have it relate to the tblMatter table. ;-)
What is this query doing or not doing that is the problem?
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 35217067
>>you appear to have a cartesian join between tblCaseDates and the rest of the tables involved in the query<<
Yep.  Not to mention
The pointless use of DISTINCT when there is already a GROUP BY clause.
The LEFT JOIN tblAreaOfLaw AS AOL which is an implicit INNER JOIN as the following is in the WHERE clause: pkLawID = '2'
0
 

Author Comment

by:sej69
ID: 35219416
Yes, I was trying to keep the information as confidential as possible.  Right now, I just need to get it to work so I posted the whole thing.  I'm a very good programmer that is just 'ok' with SQL queries (but learning).  I like to add things to the search engine but I probably will not select this now when I close this case.  I have legal requirements that I need to follow on this side non-programming related... =/  

The issue I am having is the same as stated above.

tblCaseDates
 pkDateID
 fkMatterID

Where tblCaseDates.fkMatterID  = tblMatter.nMatter

Again, all I need to do is be able to query the dtOpened and dtClosed for specific dates; I don't need to display that data as I make an implicit query for that data later on if the user wants to look at the record.

In otherwords, if record has a date (dtOpened = 'xx/yy/zzz') in it's table and with the FK = to the matter ID then I want the record to show.  

The DISTINCT is "legacy" when I was experimenting with the effects.  The group by was just put in last night.

What do you mean by "implicit INNER JOIN"?  I have a feeling that shouldn't be the case.....

Thanks guys.
0
 
LVL 75

Accepted Solution

by:
Anthony Perkins earned 500 total points
ID: 35220285
>>What do you mean by "implicit INNER JOIN"?<<
This:
LEFT JOIN tblAreaOfLaw as AOL on tblMatter.fkAreaOfLaw = AOL.pkLawID  
WHERE  pkLawID = '2'  

Is the same as:
INNER JOIN tblAreaOfLaw as AOL on tblMatter.fkAreaOfLaw = AOL.pkLawID  
WHERE  pkLawID = '2'  

Probably what you mean is this:
LEFT JOIN tblAreaOfLaw as AOL on tblMatter.fkAreaOfLaw = AOL.pkLawID  AND pkLawID = '2'  
0
 

Author Comment

by:sej69
ID: 35233317
Ah, I see.  Yes, that is probably what I meant; I didn't know that syntax - thanks.

Any ideas on how I can get the data out in the format I want?
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 35237131
Please post your current query and the schema for all the tables involved.
0
 

Author Closing Comment

by:sej69
ID: 35405016
THanks
0

Featured Post

What Security Threats Are You Missing?

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
sql query 7 37
How to create a delete query to delete entire records 2 19
Caste datetime 2 25
Update in Sql 7 12
SQL Server engine let you use a Windows account or a SQL Server account to connect to a SQL Server instance. This can be configured immediatly during the SQL Server installation or after in the Server Authentication section in the Server properties …
JSON is being used more and more, besides XML, and you surely wanted to parse the data out into SQL instead of doing it in some Javascript. The below function in SQL Server can do the job for you, returning a quick table with the parsed data.
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now