Solved

Pure SQL Question (expert :-)

Posted on 2016-09-14
16
111 Views
Last Modified: 2016-10-08
Hello,
We have a table of events. Each event has a date, is associated with an object, and has a type. So the table is like :
eventsTable (eventUniqueKey, objectID, EventDate, eventType)
We want a select that lists all the objectID that have an event of eventType 'X' without an eventType 'Y' after.
To make things clear :
eventUniqueKey / objectID / EventDate / eventType
1 / 1 / 2016-01-01 / X
2 / 1 / 2016-01-02 / Y
In this case, object 1 has X and an Y after, so it must not be pulled out.
1 / 1 / 2016-01-01 / X
2 / 2 / 2016-01-02 / Y
In this case, objet 1 has and X event but no Y event (Y event is to object 2), so objectID 1 must be pulled out, and not objectID 2.
1 / 1 / 2016-01-02 / X
2 / 1 / 2016-01-01 / Y
In this case, object 1 has X but the Y is before, so objectID 1 must be pulled out.
We ask for a 'pure' SQL that does that, we did it with temp table and scalar function, etc. but it is pretty slow.
Thanks for help
0
Comment
Question by:javilmer2
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 3
  • 3
  • +1
16 Comments
 
LVL 43

Accepted Solution

by:
zephyr_hex (Megan) earned 250 total points (awarded by participants)
ID: 41798923
I assume we're only seeing a small subset of your data, and that other scenarios may exist... which could throw a wrench into answers you receive.

In any event, this is one way to get the rows you've identified in your 3 scenarios:

Given the following data set in #tmp_table:
data
Rows 1 and 6 are a valid pair.  Rows 2 and 4 are invalid X events.

SELECT * FROM (
	SELECT ROW_NUMBER() OVER(PARTITION BY o_id ORDER BY e_date ASC) AS rn_asc, 
	ROW_NUMBER() OVER(PARTITION BY o_id ORDER BY e_date DESC) AS rn_desc,
	*
	FROM #tmp_table
	) u
WHERE (u.rn_asc = u.rn_desc AND e_type = 'X') OR (rn_asc = 2 AND e_type = 'X')

Open in new window


The basic premise is to assign row numbers based on the original id and date ASC and then also DESC, and then look for the invalid scenarios:  where the row number for ASC and DESC is the same on an X event, or where the X event occurs before the Y event.

Result:
result
0
 
LVL 14

Assisted Solution

by:Megan Brooks
Megan Brooks earned 250 total points (awarded by participants)
ID: 41798925
I think you are looking for something like this:
SELECT *
FROM eventsTable et1
WHERE EventType = 'X'
AND NOT EXISTS (
    SELECT * FROM eventsTable et2
    WHERE et1.objectID = et2.objectID
        AND et1.EventDate < et2.EventDate
        AND et2.EventType = 'Y'
)

Open in new window


This isn't tested (supplying a bit of formatted test data would help with that).
0
 
LVL 43

Expert Comment

by:zephyr_hex (Megan)
ID: 41798943
@Megan (hey!  that's my real name, too ;) )

You can use my test data set:

SELECT 1 AS u_id, 1 AS o_id, CAST('2016-01-01' AS DATE) AS e_date, 'X' AS e_type
INTO #tmp_table

INSERT INTO #tmp_table VALUES (2,1,CAST('2016-01-02' AS DATE),'Y'), (1,3,CAST('2016-01-01' AS DATE), 'X'),(2,4,CAST('2016-01-02' AS DATE), 'Y'),(1,5,CAST('2016-01-02' AS DATE), 'X'),(2,5,CAST('2016-01-01' AS DATE), 'Y')

Open in new window


With your query modified to match my lazily named columns:
SELECT *
FROM #tmp_table et1
WHERE e_type = 'X'
AND NOT EXISTS (
    SELECT * FROM #tmp_table et2
    WHERE et1.o_id = et2.o_id
        AND et1.e_date < et2.e_date
        AND et2.e_type = 'Y'
)

Open in new window


Also returns the rows I would expect to be identified as invalid:
result
0
Complete VMware vSphere® ESX(i) & Hyper-V Backup

Capture your entire system, including the host, with patented disk imaging integrated with VMware VADP / Microsoft VSS and RCT. RTOs is as low as 15 seconds with Acronis Active Restore™. You can enjoy unlimited P2V/V2V migrations from any source (even from a different hypervisor)

 
LVL 48

Expert Comment

by:PortletPaul
ID: 41798947
You have not specified what you want as the result of the query, but if you just want a summary you probably need to group by object_id, for example:
select x.objectID, max(x.EventDate) as max_EventDate
from eventsTable x
left join eventsTable y
  on x.objectID = y.objectID
 and x.EventDate < y.EventDate
 and y.eventType = 'Y' 
where x.eventType = 'X'
group by x.objectID
having count(y.objectID) = 0

Open in new window

try this slightly expanded test:
declare @eventsTable table
    (eventUniqueKey int, objectID int, EventDate datetime, eventType varchar(1))
;
    
INSERT INTO @eventsTable 
    ("eventUniqueKey", "objectID", "EventDate", "eventType")
VALUES
    (1, 11, '2016-01-01 00:00:00', 'X'),
    (2, 11, '2016-01-02 00:00:00', 'Y'),
    (3, 12, '2016-01-01 00:00:00', 'X'),
    (4, 12, '2016-01-02 00:00:00', 'Y'),
    (5, 13, '2016-01-02 00:00:00', 'X'),
    (6, 13, '2016-01-01 00:00:00', 'Y')
;

select x.objectID, max(x.EventDate) as max_EventDate
from @eventsTable x
left join @eventsTable y
  on x.objectID = y.objectID
 and x.EventDate < y.EventDate
 and y.eventType = 'Y' 
where x.eventType = 'X'
group by x.objectID
having count(y.objectID) = 0

Open in new window

test result
objectID max_EventDate 
-------- ------------- 
13       2016-01-02 00:00:00

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 41798952
apologies, think the previous sample set was wrong and so was the query/outcome.I think this corrects my errors.
declare @eventsTable table
    (eventUniqueKey int, objectID int, EventDate datetime, eventType varchar(1))
;
    
INSERT INTO @eventsTable 
    ("eventUniqueKey", "objectID", "EventDate", "eventType")
VALUES
    (1, 11, '2016-01-01 00:00:00', 'X'),
    (2, 11, '2016-01-02 00:00:00', 'Y'),
    (3, 12, '2016-01-01 00:00:00', 'X'),
    (4, 22, '2016-01-02 00:00:00', 'Y'),
    (5, 13, '2016-01-02 00:00:00', 'X'),
    (6, 13, '2016-01-01 00:00:00', 'Y')
;

select x.objectID, max(x.EventDate) as max_EventDate
from @eventsTable x
left join @eventsTable y
  on x.objectID = y.objectID
 and x.EventDate > y.EventDate
 and y.eventType = 'Y' 
where x.eventType = 'X'
group by x.objectID
having count(y.objectID) > 0

Open in new window

result
objectID max_EventDate 
-------- ------------- 
13       2016-01-02 00:00:00

Open in new window

i.e. ONLY the third test case returns a row
1
 
LVL 28

Expert Comment

by:Pawan Kumar
ID: 41799320
Most scalable solution , Enjoy :)

CREATE TABLE eventsTable
(
	eventUniqueKey int, objectID int, EventDate datetime, eventType varchar(1)
)
GO
    
INSERT INTO eventsTable 
    ("eventUniqueKey", "objectID", "EventDate", "eventType")
VALUES
    (1, 11, '2016-01-01 00:00:00', 'X'),
    (2, 11, '2016-01-02 00:00:00', 'Y'),

    (3, 12, '2016-01-01 00:00:00', 'X'),

    (4, 22, '2016-01-02 00:00:00', 'Y'),
    
	(5, 13, '2016-01-02 00:00:00', 'X'),
    (6, 13, '2016-01-01 00:00:00', 'Y')
;

SELECT tbl.objectid, tbl.eventdate FROM eventsTable tbl 
OUTER APPLY (SELECT MAX(tbl1.eventdate) eventdate FROM eventsTable tbl1 WHERE tbl1.eventtype='Y' AND tbl.objectid = tbl1.objectid) tbl2
WHERE tbl.eventtype='X' AND tbl.eventdate > tbl2.eventdate

--

Open in new window


--
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 41799384
Remarkable how a proposed solution can be "best" or "most" when we are all guessing what the full requirements are.

We also have no idea what the scale is or what indexing has been applied.

Personally I would ignore "sales speak" when considering the options provided.
1
 
LVL 43

Expert Comment

by:zephyr_hex (Megan)
ID: 41800054
I concur with PortletPaul.  I see nothing "scaleable" about Pawan's solution.  I'm not even sure why someone would claim their query is scaleable in this situation.
0
 
LVL 28

Expert Comment

by:Pawan Kumar
ID: 41801176
@Author - Could you please verify the answers and mark one as accepted solution if you dont need more help on this?
0
 
LVL 43

Expert Comment

by:zephyr_hex (Megan)
ID: 41823710
I object.  There is nothing in Pawan's suggested solution that makes it better than the other proposed solutions.  In fact, I don't even get the right results from Pawan's query.

Going back to the OP's request:  
We want a select that lists all the objectID that have an event of eventType 'X' without an eventType 'Y'

Using Pawan's code with the test data provided by the OP:

SELECT 1 AS u_id, 1 AS o_id, CAST('2016-01-01' AS DATE) AS e_date, 'X' AS e_type
INTO #tmp_table

INSERT INTO #tmp_table VALUES (2,1,CAST('2016-01-02' AS DATE),'Y'), (1,3,CAST('2016-01-01' AS DATE), 'X'),(2,4,CAST('2016-01-02' AS DATE), 'Y'),(1,5,CAST('2016-01-02' AS DATE), 'X'),(2,5,CAST('2016-01-01' AS DATE), 'Y')


SELECT tbl.o_id, tbl.e_date FROM #tmp_table tbl 
OUTER APPLY (SELECT MAX(tbl1.e_date) e_date FROM #tmp_table tbl1 WHERE tbl1.e_type='Y' AND tbl.o_id = tbl1.o_id) tbl2
WHERE tbl.e_type='X' AND tbl.e_date > tbl2.e_date

Open in new window


Yields:
result
But the expected result is these TWO records because they both are event type X with no corresponding Y event.
expected result
PortletPaul's solution is also incorrect.  I get the same result PortletPaul shows in his answer, but it's not the right answer.  It's missing the other type X record with no corresponding Y.

As such, I counter the close request with the following proposal:

#41798923 - 250 points
#41798925 - 250 points

These two answers provide the solution the OP requested.
0
 
LVL 28

Expert Comment

by:Pawan Kumar
ID: 41823724
zephyr_hex - The author has not responded. You think our answer is incorrect. It may also possible that your answer is incorrect.  

Only Author can decide on that, what he needs. We waited for 14 days.

In thin case lets close this question with awarding 0 points to all.
0
 
LVL 43

Expert Comment

by:zephyr_hex (Megan)
ID: 41823735
Except, the OP DID give us what the correct / expected result should be in the question. So we know what result is correct and what result is not correct for the data set the OP gave us.
0

Featured Post

NFR key for Veeam Backup for Microsoft Office 365

Veeam is happy to provide a free NFR license (for 1 year, up to 10 users). This license allows for the non‑production use of Veeam Backup for Microsoft Office 365 in your home lab without any feature limitations.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
T-SQL: Stored Procedure Syntax 3 34
SQL Instance service gone? 5 39
SQL Convert "vertical" data to "horizontal" data 4 30
SQL syntax question 6 44
Occasionally there is a need to clean table columns, especially if you have inherited legacy data. There are obviously many ways to accomplish that, including elaborate UPDATE queries with anywhere from one to numerous REPLACE functions (even within…
International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

726 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question