sej69
asked on
Duplicate records on a SQL query using left joins
Ok, I know this is pretty much by design; I'm just trying to find a way around it for this specific instance...
In short, (and I guess expected) I get multiple records returned when I call a query with an inner join. But in this case I do not want to know what's on the 2nd table; I just want to know it exists.
MSSQL 2008:
tblMain
pkMainID, int
szName, string(20)
...
tblDates
pkDateID
fkMainID
dtOpen (datetime)
dtClose (datetime)
In my case I want to be able to search the dates for open/close records without pulling up the dates. in my C# app, I handle this when the individual record is opened. It would be way too much information to pull in during the initial query.
I need to be able to search the dates for open records and closed records. Eg. if dtOpen > 2010-05-11' and dtClose = '1900-01-01' would mean that the record was opened but not closed yet. I also need to search the dtOpen and dtClose for specific dates.
What I have for my query looks something like this:
SELECT DISTINCT szName, pkID, DTS.dtOpen, DTS.dtClose
FROM
tblMain (LEFT JOIN tblDates as DTS on tblMain.pkMainID = DTS.fkMainID)
WHERE
DTS.dtOpen = '2011-05-11' AND DTS.dtOpen = '1900-01-01' .... AND other qualifications
If the tblDate has more than one record, the SQL will give me multiple records for each which I don't want. Does anyone know how I can use that table for checking but not reporting?
TIA!
In short, (and I guess expected) I get multiple records returned when I call a query with an inner join. But in this case I do not want to know what's on the 2nd table; I just want to know it exists.
MSSQL 2008:
tblMain
pkMainID, int
szName, string(20)
...
tblDates
pkDateID
fkMainID
dtOpen (datetime)
dtClose (datetime)
In my case I want to be able to search the dates for open/close records without pulling up the dates. in my C# app, I handle this when the individual record is opened. It would be way too much information to pull in during the initial query.
I need to be able to search the dates for open records and closed records. Eg. if dtOpen > 2010-05-11' and dtClose = '1900-01-01' would mean that the record was opened but not closed yet. I also need to search the dtOpen and dtClose for specific dates.
What I have for my query looks something like this:
SELECT DISTINCT szName, pkID, DTS.dtOpen, DTS.dtClose
FROM
tblMain (LEFT JOIN tblDates as DTS on tblMain.pkMainID = DTS.fkMainID)
WHERE
DTS.dtOpen = '2011-05-11' AND DTS.dtOpen = '1900-01-01' .... AND other qualifications
If the tblDate has more than one record, the SQL will give me multiple records for each which I don't want. Does anyone know how I can use that table for checking but not reporting?
TIA!
SELECT m.szName, m.pkID, MAX(d.dtOpen), MAX(d.dtClose)
FROM tblMain as m
LEFT JOIN tblDates as d
ON m.pkMainID = d.fkMainID
GROUP BY m.szName, m.pkID
HAVING MAX(d.dtOpen) = '2011-05-11' AND MAX(d.dtOpen) = '1900-01-01'
I suggest you check out the IF EXISTS, INTERSECT, EXCEPT in SQL and build your queries with CASE if needed.
WEell, that poses another question. Suppose you have a row in the tblDates for pkMainID = 1 with dtOpen = '2011-05-11' and a dtdtClose of '2011-05-11' and then another row in that table for pkMainID = 1 with dtOpen = '2011-05-11' and a dtdtClose of '1900-01-01'. Which one do you want to get back?
SQL_1 should provide a single row for a distinct dtOpen and dtClose combination for pkMainID;
SQL_2 should provide a single row for a pkMainID and a dtOpen but there is no guarantee on which dtClose will be used;
SQL_3 should provide a single row for a pkMainID and a dtClose but there is no guarantee on which dtOpendtClose will be used;
SELECT DISTINCT szName
,M.pkID
,DTS.dtOpen
,DTS.dtClose
FROM tblMain M
LEFT JOIN
(
SELECT pkMainID
,dtOpen
,dtClose
,row_number() OVER(PARTITION BY pkMainID, dtOpen, dtClose
ORDER BY pkMainID, dtOpen, dtClose) AS Selector
FROM tblDates
) DTS
ON M.pkMainID = DTS.fkMainID
WHERE DTS.Selector = 1
-- AND DTS.dtOpen = '2011-05-11'
-- AND DTS.dtOpen = '1900-01-01'
;
-- SQL_2
SELECT DISTINCT szName
,M.pkID
,DTS.dtOpen
,DTS.dtClose
FROM tblMain M
LEFT JOIN
(
SELECT Top 1
pkMainID
,dtOpen
,dtClose
,row_number() OVER(PARTITION BY pkMainID, dtOpen
ORDER BY pkMainID, dtOpen) AS Selector
FROM tblDates
) DTS
ON M.pkMainID = DTS.fkMainID
WHERE DTS.Selector = 1
-- AND DTS.dtOpen = '2011-05-11'
-- AND DTS.dtOpen = '1900-01-01'
;
SELECT DISTINCT szName
,M.pkID
,DTS.dtOpen
,DTS.dtClose
FROM tblMain M
LEFT JOIN
(
SELECT pkMainID
,dtOpen
,dtClose
,row_number() OVER(PARTITION BY pkMainID, dtClose
ORDER BY pkMainID, dtClose) AS Selector
FROM tblDates
) DTS
ON M.pkMainID = DTS.fkMainID
WHERE DTS.Selector = 1
-- AND DTS.dtOpen = '2011-05-11'
-- AND DTS.dtOpen = '1900-01-01'
ASKER
With this app, the data will in inconsistent if the open and close are on the same dates as well as having the same open or same close date for the record.
These are events that start one day and end another, then some time later (And always later) the even will be reopened.
Also for the records, all I need is the ID of the record that contains the dates requested. I don't care if the date is pulled in the initial query because I run another query if the user wants full detail on the record and clicks on it. At that point the query goes out and pulls all dates for the listed pkMainId...
All I need is to search on the dates.
I read in a couple of other places about using HAVING but I'm "having" a hard time getting the syntax right for MSSQL 2008. I keep getting an incorrect syntax near the keyword 'having'.
These are events that start one day and end another, then some time later (And always later) the even will be reopened.
Also for the records, all I need is the ID of the record that contains the dates requested. I don't care if the date is pulled in the initial query because I run another query if the user wants full detail on the record and clicks on it. At that point the query goes out and pulls all dates for the listed pkMainId...
All I need is to search on the dates.
I read in a couple of other places about using HAVING but I'm "having" a hard time getting the syntax right for MSSQL 2008. I keep getting an incorrect syntax near the keyword 'having'.
ASKER
This is the line that is causing me issues:
ORDER BY Event DESC HAVING MAX(CD.dtOpened) = '2011-03-23'
ORDER BY Event DESC HAVING MAX(CD.dtOpened) = '2011-03-23'
It should be other way.
HAVING MAX(CD.dtOpened) = '2011-03-23' ORDER BY Event DESC
ASKER
Ok, I finally figured out what I was doing with the HAVING clause... I needed to group by all the selected columns in the query before the having...
However, the query duplicated the dates in the all the rows returned... Every record has the MAX(CD.Opened) date of '2011-03-23' that I used in the query...
However, the query duplicated the dates in the all the rows returned... Every record has the MAX(CD.Opened) date of '2011-03-23' that I used in the query...
ASKER
8080, I tried your number 1 but the query seems to have put the sql server management studio into limbo...(not responding) When I looked at the task manager the query was eating up memory really fast...
Can you provide some sample data with expected result?
>>When I looked at the task manager the query was eating up memory really fast...<<
That would probably mean that your table is not appropriately indexed.
That would probably mean that your table is not appropriately indexed.
ASKER
This is my current query
I put the dtOpened up to see what was being returned. The dtOpened is different for both records but yet it shows the same when the query is run.
I don't care about seeing dtOpened or dtClosed in the query results; I just want to search to find an event that had that date to include it in my results list.
SELECT DISTINCT
nMatter Matter, szClaimant Claimant, szComments Comments, TATT.szInitials Attorney, CLT.szClientName Client,
CLT.pkClientID ClientID, AOL.szName LawTypeName, fkAttorney, AOL.pkLawID Area, dtLoss DateOfLoss,
szInsured Insured, szDestroyComment Destroy, MAX(CD.dtOpened)
FROM
tblCaseDates CD,(tblMatter
LEFT JOIN tblAttorney as TATT on tblMatter.fkAttorney = TATT.pkAttorneyID)
LEFT JOIN tblClient as CLT on tblMatter.fkClient = CLT.pkClientID
LEFT JOIN tblAreaOfLaw as AOL on tblMatter.fkAreaOfLaw = AOL.pkLawID
WHERE pkLawID = '2'
GROUP BY
nMatter, szClaimant, szComments, TATT.szInitials, CLT.szClientName, CLT.pkClientID,
AOL.szName, fkAttorney, AOL.pkLawID, dtLoss, szInsured, szDestroyComment
HAVING MAX(CD.dtOpened) = '2011-03-23'
ORDER BY Matter DESC
Matter Claimant Comments Attorney Client ClientID LawTypeName fkAttorney Area DateOfLoss Insured Destroy dtOpened
77778 George Sue vs. George AAA Max 1234 Thisone 1818 2 2007-02-13 Sue 123 2011-03-23
55555 Frontmall Nation Insurance BBB Hanson 121 Thisone 1802 2 1900-01-01 Sue Smith 02-19-20 2011-03-23
I put the dtOpened up to see what was being returned. The dtOpened is different for both records but yet it shows the same when the query is run.
I don't care about seeing dtOpened or dtClosed in the query results; I just want to search to find an event that had that date to include it in my results list.
ASKER
I've got indexes on all fields in the main table, and on anything that is searched in the other tables (which is about everything...)
Uh . . . that query bears NO resemblence to the information in the original post.
Also, you appear to have a cartesian join between tblCaseDates and the rest of the tables involved in the query. That would explain how this query would consume a lot fo memory.
Is this the query you really want debugged? I
f so, since it does not resemble your original post:
Also, you appear to have a cartesian join between tblCaseDates and the rest of the tables involved in the query. That would explain how this query would consume a lot fo memory.
Is this the query you really want debugged? I
f so, since it does not resemble your original post:
What are you trying to accomplish with this query?
how does the tblCaseDates relate to at least oneof the other tables? (I am anticipating that it would be best to have it relate to the tblMatter table. ;-)
What is this query doing or not doing that is the problem?
>>you appear to have a cartesian join between tblCaseDates and the rest of the tables involved in the query<<
Yep. Not to mention
The pointless use of DISTINCT when there is already a GROUP BY clause.
The LEFT JOIN tblAreaOfLaw AS AOL which is an implicit INNER JOIN as the following is in the WHERE clause: pkLawID = '2'
Yep. Not to mention
The pointless use of DISTINCT when there is already a GROUP BY clause.
The LEFT JOIN tblAreaOfLaw AS AOL which is an implicit INNER JOIN as the following is in the WHERE clause: pkLawID = '2'
ASKER
Yes, I was trying to keep the information as confidential as possible. Right now, I just need to get it to work so I posted the whole thing. I'm a very good programmer that is just 'ok' with SQL queries (but learning). I like to add things to the search engine but I probably will not select this now when I close this case. I have legal requirements that I need to follow on this side non-programming related... =/
The issue I am having is the same as stated above.
tblCaseDates
pkDateID
fkMatterID
Where tblCaseDates.fkMatterID = tblMatter.nMatter
Again, all I need to do is be able to query the dtOpened and dtClosed for specific dates; I don't need to display that data as I make an implicit query for that data later on if the user wants to look at the record.
In otherwords, if record has a date (dtOpened = 'xx/yy/zzz') in it's table and with the FK = to the matter ID then I want the record to show.
The DISTINCT is "legacy" when I was experimenting with the effects. The group by was just put in last night.
What do you mean by "implicit INNER JOIN"? I have a feeling that shouldn't be the case.....
Thanks guys.
The issue I am having is the same as stated above.
tblCaseDates
pkDateID
fkMatterID
Where tblCaseDates.fkMatterID = tblMatter.nMatter
Again, all I need to do is be able to query the dtOpened and dtClosed for specific dates; I don't need to display that data as I make an implicit query for that data later on if the user wants to look at the record.
In otherwords, if record has a date (dtOpened = 'xx/yy/zzz') in it's table and with the FK = to the matter ID then I want the record to show.
The DISTINCT is "legacy" when I was experimenting with the effects. The group by was just put in last night.
What do you mean by "implicit INNER JOIN"? I have a feeling that shouldn't be the case.....
Thanks guys.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Ah, I see. Yes, that is probably what I meant; I didn't know that syntax - thanks.
Any ideas on how I can get the data out in the format I want?
Any ideas on how I can get the data out in the format I want?
Please post your current query and the schema for all the tables involved.
ASKER
THanks
Select *
from basetable
where pkMainId in (select fkMainId from detailTable where dtOpen between @startDate and @endDAte)