Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

Pure SQL Question (expert :-)

Posted on 2016-09-14
16
Medium Priority
?
121 Views
Last Modified: 2016-10-08
Hello,
We have a table of events. Each event has a date, is associated with an object, and has a type. So the table is like :
eventsTable (eventUniqueKey, objectID, EventDate, eventType)
We want a select that lists all the objectID that have an event of eventType 'X' without an eventType 'Y' after.
To make things clear :
eventUniqueKey / objectID / EventDate / eventType
1 / 1 / 2016-01-01 / X
2 / 1 / 2016-01-02 / Y
In this case, object 1 has X and an Y after, so it must not be pulled out.
1 / 1 / 2016-01-01 / X
2 / 2 / 2016-01-02 / Y
In this case, objet 1 has and X event but no Y event (Y event is to object 2), so objectID 1 must be pulled out, and not objectID 2.
1 / 1 / 2016-01-02 / X
2 / 1 / 2016-01-01 / Y
In this case, object 1 has X but the Y is before, so objectID 1 must be pulled out.
We ask for a 'pure' SQL that does that, we did it with temp table and scalar function, etc. but it is pretty slow.
Thanks for help
0
Comment
Question by:javilmer2
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 3
  • 3
  • +1
16 Comments
 
LVL 44

Accepted Solution

by:
zephyr_hex (Megan) earned 1000 total points (awarded by participants)
ID: 41798923
I assume we're only seeing a small subset of your data, and that other scenarios may exist... which could throw a wrench into answers you receive.

In any event, this is one way to get the rows you've identified in your 3 scenarios:

Given the following data set in #tmp_table:
data
Rows 1 and 6 are a valid pair.  Rows 2 and 4 are invalid X events.

SELECT * FROM (
	SELECT ROW_NUMBER() OVER(PARTITION BY o_id ORDER BY e_date ASC) AS rn_asc, 
	ROW_NUMBER() OVER(PARTITION BY o_id ORDER BY e_date DESC) AS rn_desc,
	*
	FROM #tmp_table
	) u
WHERE (u.rn_asc = u.rn_desc AND e_type = 'X') OR (rn_asc = 2 AND e_type = 'X')

Open in new window


The basic premise is to assign row numbers based on the original id and date ASC and then also DESC, and then look for the invalid scenarios:  where the row number for ASC and DESC is the same on an X event, or where the X event occurs before the Y event.

Result:
result
0
 
LVL 16

Assisted Solution

by:Megan Brooks
Megan Brooks earned 1000 total points (awarded by participants)
ID: 41798925
I think you are looking for something like this:
SELECT *
FROM eventsTable et1
WHERE EventType = 'X'
AND NOT EXISTS (
    SELECT * FROM eventsTable et2
    WHERE et1.objectID = et2.objectID
        AND et1.EventDate < et2.EventDate
        AND et2.EventType = 'Y'
)

Open in new window


This isn't tested (supplying a bit of formatted test data would help with that).
0
 
LVL 44

Expert Comment

by:zephyr_hex (Megan)
ID: 41798943
@Megan (hey!  that's my real name, too ;) )

You can use my test data set:

SELECT 1 AS u_id, 1 AS o_id, CAST('2016-01-01' AS DATE) AS e_date, 'X' AS e_type
INTO #tmp_table

INSERT INTO #tmp_table VALUES (2,1,CAST('2016-01-02' AS DATE),'Y'), (1,3,CAST('2016-01-01' AS DATE), 'X'),(2,4,CAST('2016-01-02' AS DATE), 'Y'),(1,5,CAST('2016-01-02' AS DATE), 'X'),(2,5,CAST('2016-01-01' AS DATE), 'Y')

Open in new window


With your query modified to match my lazily named columns:
SELECT *
FROM #tmp_table et1
WHERE e_type = 'X'
AND NOT EXISTS (
    SELECT * FROM #tmp_table et2
    WHERE et1.o_id = et2.o_id
        AND et1.e_date < et2.e_date
        AND et2.e_type = 'Y'
)

Open in new window


Also returns the rows I would expect to be identified as invalid:
result
0
NFR key for Veeam Backup for Microsoft Office 365

Veeam is happy to provide a free NFR license (for 1 year, up to 10 users). This license allows for the non‑production use of Veeam Backup for Microsoft Office 365 in your home lab without any feature limitations.

 
LVL 49

Expert Comment

by:PortletPaul
ID: 41798947
You have not specified what you want as the result of the query, but if you just want a summary you probably need to group by object_id, for example:
select x.objectID, max(x.EventDate) as max_EventDate
from eventsTable x
left join eventsTable y
  on x.objectID = y.objectID
 and x.EventDate < y.EventDate
 and y.eventType = 'Y' 
where x.eventType = 'X'
group by x.objectID
having count(y.objectID) = 0

Open in new window

try this slightly expanded test:
declare @eventsTable table
    (eventUniqueKey int, objectID int, EventDate datetime, eventType varchar(1))
;
    
INSERT INTO @eventsTable 
    ("eventUniqueKey", "objectID", "EventDate", "eventType")
VALUES
    (1, 11, '2016-01-01 00:00:00', 'X'),
    (2, 11, '2016-01-02 00:00:00', 'Y'),
    (3, 12, '2016-01-01 00:00:00', 'X'),
    (4, 12, '2016-01-02 00:00:00', 'Y'),
    (5, 13, '2016-01-02 00:00:00', 'X'),
    (6, 13, '2016-01-01 00:00:00', 'Y')
;

select x.objectID, max(x.EventDate) as max_EventDate
from @eventsTable x
left join @eventsTable y
  on x.objectID = y.objectID
 and x.EventDate < y.EventDate
 and y.eventType = 'Y' 
where x.eventType = 'X'
group by x.objectID
having count(y.objectID) = 0

Open in new window

test result
objectID max_EventDate 
-------- ------------- 
13       2016-01-02 00:00:00

Open in new window

0
 
LVL 49

Expert Comment

by:PortletPaul
ID: 41798952
apologies, think the previous sample set was wrong and so was the query/outcome.I think this corrects my errors.
declare @eventsTable table
    (eventUniqueKey int, objectID int, EventDate datetime, eventType varchar(1))
;
    
INSERT INTO @eventsTable 
    ("eventUniqueKey", "objectID", "EventDate", "eventType")
VALUES
    (1, 11, '2016-01-01 00:00:00', 'X'),
    (2, 11, '2016-01-02 00:00:00', 'Y'),
    (3, 12, '2016-01-01 00:00:00', 'X'),
    (4, 22, '2016-01-02 00:00:00', 'Y'),
    (5, 13, '2016-01-02 00:00:00', 'X'),
    (6, 13, '2016-01-01 00:00:00', 'Y')
;

select x.objectID, max(x.EventDate) as max_EventDate
from @eventsTable x
left join @eventsTable y
  on x.objectID = y.objectID
 and x.EventDate > y.EventDate
 and y.eventType = 'Y' 
where x.eventType = 'X'
group by x.objectID
having count(y.objectID) > 0

Open in new window

result
objectID max_EventDate 
-------- ------------- 
13       2016-01-02 00:00:00

Open in new window

i.e. ONLY the third test case returns a row
1
 
LVL 32

Expert Comment

by:Pawan Kumar
ID: 41799320
Most scalable solution , Enjoy :)

CREATE TABLE eventsTable
(
	eventUniqueKey int, objectID int, EventDate datetime, eventType varchar(1)
)
GO
    
INSERT INTO eventsTable 
    ("eventUniqueKey", "objectID", "EventDate", "eventType")
VALUES
    (1, 11, '2016-01-01 00:00:00', 'X'),
    (2, 11, '2016-01-02 00:00:00', 'Y'),

    (3, 12, '2016-01-01 00:00:00', 'X'),

    (4, 22, '2016-01-02 00:00:00', 'Y'),
    
	(5, 13, '2016-01-02 00:00:00', 'X'),
    (6, 13, '2016-01-01 00:00:00', 'Y')
;

SELECT tbl.objectid, tbl.eventdate FROM eventsTable tbl 
OUTER APPLY (SELECT MAX(tbl1.eventdate) eventdate FROM eventsTable tbl1 WHERE tbl1.eventtype='Y' AND tbl.objectid = tbl1.objectid) tbl2
WHERE tbl.eventtype='X' AND tbl.eventdate > tbl2.eventdate

--

Open in new window


--
0
 
LVL 49

Expert Comment

by:PortletPaul
ID: 41799384
Remarkable how a proposed solution can be "best" or "most" when we are all guessing what the full requirements are.

We also have no idea what the scale is or what indexing has been applied.

Personally I would ignore "sales speak" when considering the options provided.
1
 
LVL 44

Expert Comment

by:zephyr_hex (Megan)
ID: 41800054
I concur with PortletPaul.  I see nothing "scaleable" about Pawan's solution.  I'm not even sure why someone would claim their query is scaleable in this situation.
0
 
LVL 32

Expert Comment

by:Pawan Kumar
ID: 41801176
@Author - Could you please verify the answers and mark one as accepted solution if you dont need more help on this?
0
 
LVL 44

Expert Comment

by:zephyr_hex (Megan)
ID: 41823710
I object.  There is nothing in Pawan's suggested solution that makes it better than the other proposed solutions.  In fact, I don't even get the right results from Pawan's query.

Going back to the OP's request:  
We want a select that lists all the objectID that have an event of eventType 'X' without an eventType 'Y'

Using Pawan's code with the test data provided by the OP:

SELECT 1 AS u_id, 1 AS o_id, CAST('2016-01-01' AS DATE) AS e_date, 'X' AS e_type
INTO #tmp_table

INSERT INTO #tmp_table VALUES (2,1,CAST('2016-01-02' AS DATE),'Y'), (1,3,CAST('2016-01-01' AS DATE), 'X'),(2,4,CAST('2016-01-02' AS DATE), 'Y'),(1,5,CAST('2016-01-02' AS DATE), 'X'),(2,5,CAST('2016-01-01' AS DATE), 'Y')


SELECT tbl.o_id, tbl.e_date FROM #tmp_table tbl 
OUTER APPLY (SELECT MAX(tbl1.e_date) e_date FROM #tmp_table tbl1 WHERE tbl1.e_type='Y' AND tbl.o_id = tbl1.o_id) tbl2
WHERE tbl.e_type='X' AND tbl.e_date > tbl2.e_date

Open in new window


Yields:
result
But the expected result is these TWO records because they both are event type X with no corresponding Y event.
expected result
PortletPaul's solution is also incorrect.  I get the same result PortletPaul shows in his answer, but it's not the right answer.  It's missing the other type X record with no corresponding Y.

As such, I counter the close request with the following proposal:

#41798923 - 250 points
#41798925 - 250 points

These two answers provide the solution the OP requested.
0
 
LVL 32

Expert Comment

by:Pawan Kumar
ID: 41823724
zephyr_hex - The author has not responded. You think our answer is incorrect. It may also possible that your answer is incorrect.  

Only Author can decide on that, what he needs. We waited for 14 days.

In thin case lets close this question with awarding 0 points to all.
0
 
LVL 44

Expert Comment

by:zephyr_hex (Megan)
ID: 41823735
Except, the OP DID give us what the correct / expected result should be in the question. So we know what result is correct and what result is not correct for the data set the OP gave us.
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backup…
In this article we will learn how to fix  “Cannot install SQL Server 2014 Service Pack 2: Unable to install windows installer msi file” error ?
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.

618 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question