Solved

Pure SQL Question (expert :-)

Posted on 2016-09-14
16
92 Views
Last Modified: 2016-10-08
Hello,
We have a table of events. Each event has a date, is associated with an object, and has a type. So the table is like :
eventsTable (eventUniqueKey, objectID, EventDate, eventType)
We want a select that lists all the objectID that have an event of eventType 'X' without an eventType 'Y' after.
To make things clear :
eventUniqueKey / objectID / EventDate / eventType
1 / 1 / 2016-01-01 / X
2 / 1 / 2016-01-02 / Y
In this case, object 1 has X and an Y after, so it must not be pulled out.
1 / 1 / 2016-01-01 / X
2 / 2 / 2016-01-02 / Y
In this case, objet 1 has and X event but no Y event (Y event is to object 2), so objectID 1 must be pulled out, and not objectID 2.
1 / 1 / 2016-01-02 / X
2 / 1 / 2016-01-01 / Y
In this case, object 1 has X but the Y is before, so objectID 1 must be pulled out.
We ask for a 'pure' SQL that does that, we did it with temp table and scalar function, etc. but it is pretty slow.
Thanks for help
0
Comment
Question by:javilmer2
  • 5
  • 3
  • 3
  • +1
16 Comments
 
LVL 42

Accepted Solution

by:
zephyr_hex earned 250 total points (awarded by participants)
Comment Utility
I assume we're only seeing a small subset of your data, and that other scenarios may exist... which could throw a wrench into answers you receive.

In any event, this is one way to get the rows you've identified in your 3 scenarios:

Given the following data set in #tmp_table:
data
Rows 1 and 6 are a valid pair.  Rows 2 and 4 are invalid X events.

SELECT * FROM (
	SELECT ROW_NUMBER() OVER(PARTITION BY o_id ORDER BY e_date ASC) AS rn_asc, 
	ROW_NUMBER() OVER(PARTITION BY o_id ORDER BY e_date DESC) AS rn_desc,
	*
	FROM #tmp_table
	) u
WHERE (u.rn_asc = u.rn_desc AND e_type = 'X') OR (rn_asc = 2 AND e_type = 'X')

Open in new window


The basic premise is to assign row numbers based on the original id and date ASC and then also DESC, and then look for the invalid scenarios:  where the row number for ASC and DESC is the same on an X event, or where the X event occurs before the Y event.

Result:
result
0
 
LVL 13

Assisted Solution

by:Megan Brooks
Megan Brooks earned 250 total points (awarded by participants)
Comment Utility
I think you are looking for something like this:
SELECT *
FROM eventsTable et1
WHERE EventType = 'X'
AND NOT EXISTS (
    SELECT * FROM eventsTable et2
    WHERE et1.objectID = et2.objectID
        AND et1.EventDate < et2.EventDate
        AND et2.EventType = 'Y'
)

Open in new window


This isn't tested (supplying a bit of formatted test data would help with that).
0
 
LVL 42

Expert Comment

by:zephyr_hex
Comment Utility
@Megan (hey!  that's my real name, too ;) )

You can use my test data set:

SELECT 1 AS u_id, 1 AS o_id, CAST('2016-01-01' AS DATE) AS e_date, 'X' AS e_type
INTO #tmp_table

INSERT INTO #tmp_table VALUES (2,1,CAST('2016-01-02' AS DATE),'Y'), (1,3,CAST('2016-01-01' AS DATE), 'X'),(2,4,CAST('2016-01-02' AS DATE), 'Y'),(1,5,CAST('2016-01-02' AS DATE), 'X'),(2,5,CAST('2016-01-01' AS DATE), 'Y')

Open in new window


With your query modified to match my lazily named columns:
SELECT *
FROM #tmp_table et1
WHERE e_type = 'X'
AND NOT EXISTS (
    SELECT * FROM #tmp_table et2
    WHERE et1.o_id = et2.o_id
        AND et1.e_date < et2.e_date
        AND et2.e_type = 'Y'
)

Open in new window


Also returns the rows I would expect to be identified as invalid:
result
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
You have not specified what you want as the result of the query, but if you just want a summary you probably need to group by object_id, for example:
select x.objectID, max(x.EventDate) as max_EventDate
from eventsTable x
left join eventsTable y
  on x.objectID = y.objectID
 and x.EventDate < y.EventDate
 and y.eventType = 'Y' 
where x.eventType = 'X'
group by x.objectID
having count(y.objectID) = 0

Open in new window

try this slightly expanded test:
declare @eventsTable table
    (eventUniqueKey int, objectID int, EventDate datetime, eventType varchar(1))
;
    
INSERT INTO @eventsTable 
    ("eventUniqueKey", "objectID", "EventDate", "eventType")
VALUES
    (1, 11, '2016-01-01 00:00:00', 'X'),
    (2, 11, '2016-01-02 00:00:00', 'Y'),
    (3, 12, '2016-01-01 00:00:00', 'X'),
    (4, 12, '2016-01-02 00:00:00', 'Y'),
    (5, 13, '2016-01-02 00:00:00', 'X'),
    (6, 13, '2016-01-01 00:00:00', 'Y')
;

select x.objectID, max(x.EventDate) as max_EventDate
from @eventsTable x
left join @eventsTable y
  on x.objectID = y.objectID
 and x.EventDate < y.EventDate
 and y.eventType = 'Y' 
where x.eventType = 'X'
group by x.objectID
having count(y.objectID) = 0

Open in new window

test result
objectID max_EventDate 
-------- ------------- 
13       2016-01-02 00:00:00

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
apologies, think the previous sample set was wrong and so was the query/outcome.I think this corrects my errors.
declare @eventsTable table
    (eventUniqueKey int, objectID int, EventDate datetime, eventType varchar(1))
;
    
INSERT INTO @eventsTable 
    ("eventUniqueKey", "objectID", "EventDate", "eventType")
VALUES
    (1, 11, '2016-01-01 00:00:00', 'X'),
    (2, 11, '2016-01-02 00:00:00', 'Y'),
    (3, 12, '2016-01-01 00:00:00', 'X'),
    (4, 22, '2016-01-02 00:00:00', 'Y'),
    (5, 13, '2016-01-02 00:00:00', 'X'),
    (6, 13, '2016-01-01 00:00:00', 'Y')
;

select x.objectID, max(x.EventDate) as max_EventDate
from @eventsTable x
left join @eventsTable y
  on x.objectID = y.objectID
 and x.EventDate > y.EventDate
 and y.eventType = 'Y' 
where x.eventType = 'X'
group by x.objectID
having count(y.objectID) > 0

Open in new window

result
objectID max_EventDate 
-------- ------------- 
13       2016-01-02 00:00:00

Open in new window

i.e. ONLY the third test case returns a row
1
 
LVL 17

Expert Comment

by:Pawan Kumar Khowal
Comment Utility
Most scalable solution , Enjoy :)

CREATE TABLE eventsTable
(
	eventUniqueKey int, objectID int, EventDate datetime, eventType varchar(1)
)
GO
    
INSERT INTO eventsTable 
    ("eventUniqueKey", "objectID", "EventDate", "eventType")
VALUES
    (1, 11, '2016-01-01 00:00:00', 'X'),
    (2, 11, '2016-01-02 00:00:00', 'Y'),

    (3, 12, '2016-01-01 00:00:00', 'X'),

    (4, 22, '2016-01-02 00:00:00', 'Y'),
    
	(5, 13, '2016-01-02 00:00:00', 'X'),
    (6, 13, '2016-01-01 00:00:00', 'Y')
;

SELECT tbl.objectid, tbl.eventdate FROM eventsTable tbl 
OUTER APPLY (SELECT MAX(tbl1.eventdate) eventdate FROM eventsTable tbl1 WHERE tbl1.eventtype='Y' AND tbl.objectid = tbl1.objectid) tbl2
WHERE tbl.eventtype='X' AND tbl.eventdate > tbl2.eventdate

--

Open in new window


--
0
Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
Remarkable how a proposed solution can be "best" or "most" when we are all guessing what the full requirements are.

We also have no idea what the scale is or what indexing has been applied.

Personally I would ignore "sales speak" when considering the options provided.
1
 
LVL 42

Expert Comment

by:zephyr_hex
Comment Utility
I concur with PortletPaul.  I see nothing "scaleable" about Pawan's solution.  I'm not even sure why someone would claim their query is scaleable in this situation.
0
 
LVL 17

Expert Comment

by:Pawan Kumar Khowal
Comment Utility
@Author - Could you please verify the answers and mark one as accepted solution if you dont need more help on this?
0
 
LVL 42

Expert Comment

by:zephyr_hex
Comment Utility
I object.  There is nothing in Pawan's suggested solution that makes it better than the other proposed solutions.  In fact, I don't even get the right results from Pawan's query.

Going back to the OP's request:  
We want a select that lists all the objectID that have an event of eventType 'X' without an eventType 'Y'

Using Pawan's code with the test data provided by the OP:

SELECT 1 AS u_id, 1 AS o_id, CAST('2016-01-01' AS DATE) AS e_date, 'X' AS e_type
INTO #tmp_table

INSERT INTO #tmp_table VALUES (2,1,CAST('2016-01-02' AS DATE),'Y'), (1,3,CAST('2016-01-01' AS DATE), 'X'),(2,4,CAST('2016-01-02' AS DATE), 'Y'),(1,5,CAST('2016-01-02' AS DATE), 'X'),(2,5,CAST('2016-01-01' AS DATE), 'Y')


SELECT tbl.o_id, tbl.e_date FROM #tmp_table tbl 
OUTER APPLY (SELECT MAX(tbl1.e_date) e_date FROM #tmp_table tbl1 WHERE tbl1.e_type='Y' AND tbl.o_id = tbl1.o_id) tbl2
WHERE tbl.e_type='X' AND tbl.e_date > tbl2.e_date

Open in new window


Yields:
result
But the expected result is these TWO records because they both are event type X with no corresponding Y event.
expected result
PortletPaul's solution is also incorrect.  I get the same result PortletPaul shows in his answer, but it's not the right answer.  It's missing the other type X record with no corresponding Y.

As such, I counter the close request with the following proposal:

#41798923 - 250 points
#41798925 - 250 points

These two answers provide the solution the OP requested.
0
 
LVL 17

Expert Comment

by:Pawan Kumar Khowal
Comment Utility
zephyr_hex - The author has not responded. You think our answer is incorrect. It may also possible that your answer is incorrect.  

Only Author can decide on that, what he needs. We waited for 14 days.

In thin case lets close this question with awarding 0 points to all.
0
 
LVL 42

Expert Comment

by:zephyr_hex
Comment Utility
Except, the OP DID give us what the correct / expected result should be in the question. So we know what result is correct and what result is not correct for the data set the OP gave us.
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Join & Write a Comment

Suggested Solutions

SQL Server engine let you use a Windows account or a SQL Server account to connect to a SQL Server instance. This can be configured immediatly during the SQL Server installation or after in the Server Authentication section in the Server properties …
This article explains how to reset the password of the sa account on a Microsoft SQL Server.  The steps in this article work in SQL 2005, 2008, 2008 R2, 2012, 2014 and 2016.
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now