Solved

Pure SQL Question (expert :-)

Posted on 2016-09-14
16
102 Views
Last Modified: 2016-10-08
Hello,
We have a table of events. Each event has a date, is associated with an object, and has a type. So the table is like :
eventsTable (eventUniqueKey, objectID, EventDate, eventType)
We want a select that lists all the objectID that have an event of eventType 'X' without an eventType 'Y' after.
To make things clear :
eventUniqueKey / objectID / EventDate / eventType
1 / 1 / 2016-01-01 / X
2 / 1 / 2016-01-02 / Y
In this case, object 1 has X and an Y after, so it must not be pulled out.
1 / 1 / 2016-01-01 / X
2 / 2 / 2016-01-02 / Y
In this case, objet 1 has and X event but no Y event (Y event is to object 2), so objectID 1 must be pulled out, and not objectID 2.
1 / 1 / 2016-01-02 / X
2 / 1 / 2016-01-01 / Y
In this case, object 1 has X but the Y is before, so objectID 1 must be pulled out.
We ask for a 'pure' SQL that does that, we did it with temp table and scalar function, etc. but it is pretty slow.
Thanks for help
0
Comment
Question by:javilmer2
  • 5
  • 3
  • 3
  • +1
16 Comments
 
LVL 42

Accepted Solution

by:
zephyr_hex (Megan) earned 250 total points (awarded by participants)
ID: 41798923
I assume we're only seeing a small subset of your data, and that other scenarios may exist... which could throw a wrench into answers you receive.

In any event, this is one way to get the rows you've identified in your 3 scenarios:

Given the following data set in #tmp_table:
data
Rows 1 and 6 are a valid pair.  Rows 2 and 4 are invalid X events.

SELECT * FROM (
	SELECT ROW_NUMBER() OVER(PARTITION BY o_id ORDER BY e_date ASC) AS rn_asc, 
	ROW_NUMBER() OVER(PARTITION BY o_id ORDER BY e_date DESC) AS rn_desc,
	*
	FROM #tmp_table
	) u
WHERE (u.rn_asc = u.rn_desc AND e_type = 'X') OR (rn_asc = 2 AND e_type = 'X')

Open in new window


The basic premise is to assign row numbers based on the original id and date ASC and then also DESC, and then look for the invalid scenarios:  where the row number for ASC and DESC is the same on an X event, or where the X event occurs before the Y event.

Result:
result
0
 
LVL 14

Assisted Solution

by:Megan Brooks
Megan Brooks earned 250 total points (awarded by participants)
ID: 41798925
I think you are looking for something like this:
SELECT *
FROM eventsTable et1
WHERE EventType = 'X'
AND NOT EXISTS (
    SELECT * FROM eventsTable et2
    WHERE et1.objectID = et2.objectID
        AND et1.EventDate < et2.EventDate
        AND et2.EventType = 'Y'
)

Open in new window


This isn't tested (supplying a bit of formatted test data would help with that).
0
 
LVL 42

Expert Comment

by:zephyr_hex (Megan)
ID: 41798943
@Megan (hey!  that's my real name, too ;) )

You can use my test data set:

SELECT 1 AS u_id, 1 AS o_id, CAST('2016-01-01' AS DATE) AS e_date, 'X' AS e_type
INTO #tmp_table

INSERT INTO #tmp_table VALUES (2,1,CAST('2016-01-02' AS DATE),'Y'), (1,3,CAST('2016-01-01' AS DATE), 'X'),(2,4,CAST('2016-01-02' AS DATE), 'Y'),(1,5,CAST('2016-01-02' AS DATE), 'X'),(2,5,CAST('2016-01-01' AS DATE), 'Y')

Open in new window


With your query modified to match my lazily named columns:
SELECT *
FROM #tmp_table et1
WHERE e_type = 'X'
AND NOT EXISTS (
    SELECT * FROM #tmp_table et2
    WHERE et1.o_id = et2.o_id
        AND et1.e_date < et2.e_date
        AND et2.e_type = 'Y'
)

Open in new window


Also returns the rows I would expect to be identified as invalid:
result
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 
LVL 48

Expert Comment

by:PortletPaul
ID: 41798947
You have not specified what you want as the result of the query, but if you just want a summary you probably need to group by object_id, for example:
select x.objectID, max(x.EventDate) as max_EventDate
from eventsTable x
left join eventsTable y
  on x.objectID = y.objectID
 and x.EventDate < y.EventDate
 and y.eventType = 'Y' 
where x.eventType = 'X'
group by x.objectID
having count(y.objectID) = 0

Open in new window

try this slightly expanded test:
declare @eventsTable table
    (eventUniqueKey int, objectID int, EventDate datetime, eventType varchar(1))
;
    
INSERT INTO @eventsTable 
    ("eventUniqueKey", "objectID", "EventDate", "eventType")
VALUES
    (1, 11, '2016-01-01 00:00:00', 'X'),
    (2, 11, '2016-01-02 00:00:00', 'Y'),
    (3, 12, '2016-01-01 00:00:00', 'X'),
    (4, 12, '2016-01-02 00:00:00', 'Y'),
    (5, 13, '2016-01-02 00:00:00', 'X'),
    (6, 13, '2016-01-01 00:00:00', 'Y')
;

select x.objectID, max(x.EventDate) as max_EventDate
from @eventsTable x
left join @eventsTable y
  on x.objectID = y.objectID
 and x.EventDate < y.EventDate
 and y.eventType = 'Y' 
where x.eventType = 'X'
group by x.objectID
having count(y.objectID) = 0

Open in new window

test result
objectID max_EventDate 
-------- ------------- 
13       2016-01-02 00:00:00

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 41798952
apologies, think the previous sample set was wrong and so was the query/outcome.I think this corrects my errors.
declare @eventsTable table
    (eventUniqueKey int, objectID int, EventDate datetime, eventType varchar(1))
;
    
INSERT INTO @eventsTable 
    ("eventUniqueKey", "objectID", "EventDate", "eventType")
VALUES
    (1, 11, '2016-01-01 00:00:00', 'X'),
    (2, 11, '2016-01-02 00:00:00', 'Y'),
    (3, 12, '2016-01-01 00:00:00', 'X'),
    (4, 22, '2016-01-02 00:00:00', 'Y'),
    (5, 13, '2016-01-02 00:00:00', 'X'),
    (6, 13, '2016-01-01 00:00:00', 'Y')
;

select x.objectID, max(x.EventDate) as max_EventDate
from @eventsTable x
left join @eventsTable y
  on x.objectID = y.objectID
 and x.EventDate > y.EventDate
 and y.eventType = 'Y' 
where x.eventType = 'X'
group by x.objectID
having count(y.objectID) > 0

Open in new window

result
objectID max_EventDate 
-------- ------------- 
13       2016-01-02 00:00:00

Open in new window

i.e. ONLY the third test case returns a row
1
 
LVL 28

Expert Comment

by:Pawan Kumar
ID: 41799320
Most scalable solution , Enjoy :)

CREATE TABLE eventsTable
(
	eventUniqueKey int, objectID int, EventDate datetime, eventType varchar(1)
)
GO
    
INSERT INTO eventsTable 
    ("eventUniqueKey", "objectID", "EventDate", "eventType")
VALUES
    (1, 11, '2016-01-01 00:00:00', 'X'),
    (2, 11, '2016-01-02 00:00:00', 'Y'),

    (3, 12, '2016-01-01 00:00:00', 'X'),

    (4, 22, '2016-01-02 00:00:00', 'Y'),
    
	(5, 13, '2016-01-02 00:00:00', 'X'),
    (6, 13, '2016-01-01 00:00:00', 'Y')
;

SELECT tbl.objectid, tbl.eventdate FROM eventsTable tbl 
OUTER APPLY (SELECT MAX(tbl1.eventdate) eventdate FROM eventsTable tbl1 WHERE tbl1.eventtype='Y' AND tbl.objectid = tbl1.objectid) tbl2
WHERE tbl.eventtype='X' AND tbl.eventdate > tbl2.eventdate

--

Open in new window


--
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 41799384
Remarkable how a proposed solution can be "best" or "most" when we are all guessing what the full requirements are.

We also have no idea what the scale is or what indexing has been applied.

Personally I would ignore "sales speak" when considering the options provided.
1
 
LVL 42

Expert Comment

by:zephyr_hex (Megan)
ID: 41800054
I concur with PortletPaul.  I see nothing "scaleable" about Pawan's solution.  I'm not even sure why someone would claim their query is scaleable in this situation.
0
 
LVL 28

Expert Comment

by:Pawan Kumar
ID: 41801176
@Author - Could you please verify the answers and mark one as accepted solution if you dont need more help on this?
0
 
LVL 42

Expert Comment

by:zephyr_hex (Megan)
ID: 41823710
I object.  There is nothing in Pawan's suggested solution that makes it better than the other proposed solutions.  In fact, I don't even get the right results from Pawan's query.

Going back to the OP's request:  
We want a select that lists all the objectID that have an event of eventType 'X' without an eventType 'Y'

Using Pawan's code with the test data provided by the OP:

SELECT 1 AS u_id, 1 AS o_id, CAST('2016-01-01' AS DATE) AS e_date, 'X' AS e_type
INTO #tmp_table

INSERT INTO #tmp_table VALUES (2,1,CAST('2016-01-02' AS DATE),'Y'), (1,3,CAST('2016-01-01' AS DATE), 'X'),(2,4,CAST('2016-01-02' AS DATE), 'Y'),(1,5,CAST('2016-01-02' AS DATE), 'X'),(2,5,CAST('2016-01-01' AS DATE), 'Y')


SELECT tbl.o_id, tbl.e_date FROM #tmp_table tbl 
OUTER APPLY (SELECT MAX(tbl1.e_date) e_date FROM #tmp_table tbl1 WHERE tbl1.e_type='Y' AND tbl.o_id = tbl1.o_id) tbl2
WHERE tbl.e_type='X' AND tbl.e_date > tbl2.e_date

Open in new window


Yields:
result
But the expected result is these TWO records because they both are event type X with no corresponding Y event.
expected result
PortletPaul's solution is also incorrect.  I get the same result PortletPaul shows in his answer, but it's not the right answer.  It's missing the other type X record with no corresponding Y.

As such, I counter the close request with the following proposal:

#41798923 - 250 points
#41798925 - 250 points

These two answers provide the solution the OP requested.
0
 
LVL 28

Expert Comment

by:Pawan Kumar
ID: 41823724
zephyr_hex - The author has not responded. You think our answer is incorrect. It may also possible that your answer is incorrect.  

Only Author can decide on that, what he needs. We waited for 14 days.

In thin case lets close this question with awarding 0 points to all.
0
 
LVL 42

Expert Comment

by:zephyr_hex (Megan)
ID: 41823735
Except, the OP DID give us what the correct / expected result should be in the question. So we know what result is correct and what result is not correct for the data set the OP gave us.
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this article I will describe the Detach & Attach method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
In this article we will get to know that how can we recover deleted data if it happens accidently. We really can recover deleted rows if we know the time when data is deleted by using the transaction log.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.
Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question