Link to home
Create AccountLog in
Avatar of Litation
Litation

asked on

Theoretical - search and grouping by odd criteria.

Hey.
 I have a bit of a conundrum and I have no idea how to really get the data I need.  I'm not really interested in someone doing the code for me (yet!!) as I want to have a crack as I'm sure I can do it I just need some help getting my brain to think in the right way.

Just to outline the data.
Test candidates are given an ID and take 2-4 test's each day for 14 days and get a score for each test from 0.0 - 100.0 (yes there are floating points rather than whole numbers on the result).  Each test is date/time stamped.

What I want to pull off is essentially where a candidate gets three tests above or below a certain value in a row and then pull back all of their tests within a set time frame.

I.E.  Joe (ID 899).   and my criteria is over 70
On day one he does 4 tests and gets the scores 75.2,  54.5,  12.8,  70.3
On day two he does 1 test and gets 99.9
On day three he does 3 tests and gets 79.1,  56.7 and 10.5
12 other candidates have done tests in this time as well but none meet the requirement


So Joe has the three consecutive tests over 70 spanning three days so I want to pull back Joes data for the duration.

So is there a simple way to do this?   I can sort by the candidate ID then  date/time but then its saying "when the result is greater than 'X' Ntimes give all results for candidate"

Also I will be exporting this data into Crystal so if there's a function there that might make it easier

Thanks in advance.
Avatar of Patrick Matthews
Patrick Matthews
Flag of United States of America image

It can be done.  I'd do it with self-joins...
Avatar of Mike McCracken
Mike McCracken

It can be done in Crystal though you will have to bring all the data for the date range into the report.  There is no built in function to do it.

Are you only comparing the tests scores for tests within the timeframe or do you need to check tests for all 14 days then pull data for the time frame?

Do 3 scores on 1 day count for selecting all the data for the person?

Are you going to run the reports from an application?

Are you trying to do this in SQL?

mlmcc
The following SQL approach appears to be working for me.

Please note that if you get 4 days in a row where at least one score hit 70, my query will see those as two different blocks (e.g., 1-2-3 and 2-3-4) and so the tests on the overlap days will be reported twice.


CREATE TABLE People (PersonID int, PersonName varchar(50))
INSERT INTO People(PersonID, PersonName)
SELECT 1, 'Joe' UNION ALL
SELECT 2, 'Fred' UNION ALL
SELECT 3, 'Lucy' UNION ALL
SELECT 4, 'Ethel'

CREATE TABLE TestScores (TestID int, PersonID int, TestDateTime datetime, TestScore numeric(6, 1))
INSERT INTO TestScores (TestID, PersonID, TestDateTime, TestScore)
SELECT 1, 1, '2012-05-01 03:25:01', 80.6 UNION ALL
SELECT 2, 1, '2012-05-01 05:15:22', 68.4 UNION ALL
SELECT 3, 1, '2012-05-02 00:53:07', 75.8 UNION ALL
SELECT 4, 1, '2012-05-02 05:54:49', 22.1 UNION ALL
SELECT 5, 1, '2012-05-02 08:41:37', 60.7 UNION ALL
SELECT 6, 1, '2012-05-02 13:01:57', 71.2 UNION ALL
SELECT 7, 1, '2012-05-03 04:15:57', 57.1 UNION ALL
SELECT 8, 1, '2012-05-03 09:29:21', 43.8 UNION ALL
SELECT 9, 1, '2012-05-04 02:48:55', 71.0 UNION ALL
SELECT 10, 1, '2012-05-04 07:51:55', 42.1 UNION ALL
SELECT 11, 1, '2012-05-04 08:54:29', 60.0 UNION ALL
SELECT 12, 1, '2012-05-05 00:30:31', 91.7 UNION ALL
SELECT 13, 1, '2012-05-05 01:12:13', 18.7 UNION ALL
SELECT 14, 1, '2012-05-05 05:36:59', 64.4 UNION ALL
SELECT 15, 1, '2012-05-05 11:17:18', 81.7 UNION ALL
SELECT 16, 1, '2012-05-06 04:59:41', 21.2 UNION ALL
SELECT 17, 1, '2012-05-06 06:10:48', 46.9 UNION ALL
SELECT 18, 1, '2012-05-06 11:08:12', 28.3 UNION ALL
SELECT 19, 1, '2012-05-07 02:01:06', 73.1 UNION ALL
SELECT 20, 1, '2012-05-07 03:09:16', 100.0 UNION ALL
SELECT 21, 1, '2012-05-07 04:10:04', 91.8 UNION ALL
SELECT 22, 1, '2012-05-07 07:20:59', 34.0 UNION ALL
SELECT 23, 1, '2012-05-08 03:20:37', 29.6 UNION ALL
SELECT 24, 1, '2012-05-08 04:17:29', 64.9 UNION ALL
SELECT 25, 1, '2012-05-08 05:25:44', 46.7 UNION ALL
SELECT 26, 1, '2012-05-08 06:16:27', 91.9 UNION ALL
SELECT 27, 1, '2012-05-09 05:55:40', 30.3 UNION ALL
SELECT 28, 1, '2012-05-09 11:45:24', 70.4 UNION ALL
SELECT 29, 1, '2012-05-09 16:11:06', 66.7 UNION ALL
SELECT 30, 1, '2012-05-10 05:06:16', 37.8 UNION ALL
SELECT 31, 1, '2012-05-10 10:15:45', 72.2 UNION ALL
SELECT 32, 1, '2012-05-11 02:48:48', 46.5 UNION ALL
SELECT 33, 1, '2012-05-11 06:06:57', 57.5 UNION ALL
SELECT 34, 1, '2012-05-11 08:44:08', 56.4 UNION ALL
SELECT 35, 1, '2012-05-11 09:17:23', 41.9 UNION ALL
SELECT 36, 1, '2012-05-12 03:51:57', 65.9 UNION ALL
SELECT 37, 1, '2012-05-12 06:16:43', 38.8 UNION ALL
SELECT 38, 1, '2012-05-12 07:27:40', 52.8 UNION ALL
SELECT 39, 1, '2012-05-13 02:17:05', 56.6 UNION ALL
SELECT 40, 1, '2012-05-13 04:57:00', 42.0 UNION ALL
SELECT 41, 1, '2012-05-14 05:52:43', 64.3 UNION ALL
SELECT 42, 1, '2012-05-14 08:20:30', 45.3 UNION ALL
SELECT 43, 2, '2012-05-01 05:48:23', 65.8 UNION ALL
SELECT 44, 2, '2012-05-01 09:53:47', 63.3 UNION ALL
SELECT 45, 2, '2012-05-01 13:28:28', 45.3 UNION ALL
SELECT 46, 2, '2012-05-02 03:55:38', 77.6 UNION ALL
SELECT 47, 2, '2012-05-02 07:10:44', 22.0 UNION ALL
SELECT 48, 2, '2012-05-02 12:18:42', 66.2 UNION ALL
SELECT 49, 2, '2012-05-02 15:07:42', 48.1 UNION ALL
SELECT 50, 2, '2012-05-03 02:25:48', 89.0 UNION ALL
SELECT 51, 2, '2012-05-03 05:06:33', 36.8 UNION ALL
SELECT 52, 2, '2012-05-03 08:40:03', 45.8 UNION ALL
SELECT 53, 2, '2012-05-03 14:00:22', 79.0 UNION ALL
SELECT 54, 2, '2012-05-04 02:26:09', 39.1 UNION ALL
SELECT 55, 2, '2012-05-04 05:54:15', 32.5 UNION ALL
SELECT 56, 2, '2012-05-04 07:05:36', 37.8 UNION ALL
SELECT 57, 2, '2012-05-05 03:34:33', 81.5 UNION ALL
SELECT 58, 2, '2012-05-05 05:25:50', 17.5 UNION ALL
SELECT 59, 2, '2012-05-05 08:46:29', 31.6 UNION ALL
SELECT 60, 2, '2012-05-06 02:52:00', 18.5 UNION ALL
SELECT 61, 2, '2012-05-06 07:12:51', 53.3 UNION ALL
SELECT 62, 2, '2012-05-07 01:53:01', 39.9 UNION ALL
SELECT 63, 2, '2012-05-07 05:24:06', 64.9 UNION ALL
SELECT 64, 2, '2012-05-07 06:38:52', 57.3 UNION ALL
SELECT 65, 2, '2012-05-07 07:58:19', 70.4 UNION ALL
SELECT 66, 2, '2012-05-08 05:02:11', 64.1 UNION ALL
SELECT 67, 2, '2012-05-08 06:50:28', 53.5 UNION ALL
SELECT 68, 2, '2012-05-08 10:46:20', 100.0 UNION ALL
SELECT 69, 2, '2012-05-08 12:28:52', 47.9 UNION ALL
SELECT 70, 2, '2012-05-09 03:21:09', 42.5 UNION ALL
SELECT 71, 2, '2012-05-09 08:12:23', 34.8 UNION ALL
SELECT 72, 2, '2012-05-10 04:05:15', 54.1 UNION ALL
SELECT 73, 2, '2012-05-10 08:18:00', 33.3 UNION ALL
SELECT 74, 2, '2012-05-10 12:40:38', 82.8 UNION ALL
SELECT 75, 2, '2012-05-11 04:33:08', 54.2 UNION ALL
SELECT 76, 2, '2012-05-11 06:49:11', 22.5 UNION ALL
SELECT 77, 2, '2012-05-11 10:14:01', 55.3 UNION ALL
SELECT 78, 2, '2012-05-11 14:12:51', 47.8 UNION ALL
SELECT 79, 2, '2012-05-12 01:44:57', 34.5 UNION ALL
SELECT 80, 2, '2012-05-12 05:39:47', 76.0 UNION ALL
SELECT 81, 2, '2012-05-13 05:45:11', 46.2 UNION ALL
SELECT 82, 2, '2012-05-13 08:09:56', 58.9 UNION ALL
SELECT 83, 2, '2012-05-13 12:08:31', 73.6 UNION ALL
SELECT 84, 2, '2012-05-14 03:26:11', 44.9 UNION ALL
SELECT 85, 2, '2012-05-14 05:16:45', 80.4 UNION ALL
SELECT 86, 3, '2012-05-01 04:26:33', 19.3 UNION ALL
SELECT 87, 3, '2012-05-01 06:13:27', 44.7 UNION ALL
SELECT 88, 3, '2012-05-01 11:07:24', 60.0 UNION ALL
SELECT 89, 3, '2012-05-01 14:57:55', 31.9 UNION ALL
SELECT 90, 3, '2012-05-02 03:58:32', 44.1 UNION ALL
SELECT 91, 3, '2012-05-02 09:42:17', 66.6 UNION ALL
SELECT 92, 3, '2012-05-02 11:39:24', 52.8 UNION ALL
SELECT 93, 3, '2012-05-02 12:19:12', 72.8 UNION ALL
SELECT 94, 3, '2012-05-03 00:39:17', 58.3 UNION ALL
SELECT 95, 3, '2012-05-03 02:27:42', 45.9 UNION ALL
SELECT 96, 3, '2012-05-03 07:23:44', 70.3 UNION ALL
SELECT 97, 3, '2012-05-03 12:52:54', 44.5 UNION ALL
SELECT 98, 3, '2012-05-04 00:51:19', 70.0 UNION ALL
SELECT 99, 3, '2012-05-04 03:30:54', 38.5 UNION ALL
SELECT 100, 3, '2012-05-04 08:51:47', 25.8 UNION ALL
SELECT 101, 3, '2012-05-05 01:54:02', 54.3 UNION ALL
SELECT 102, 3, '2012-05-05 06:32:00', 52.0 UNION ALL
SELECT 103, 3, '2012-05-06 05:26:39', 36.8 UNION ALL
SELECT 104, 3, '2012-05-06 10:15:01', 65.5 UNION ALL
SELECT 105, 3, '2012-05-06 15:46:27', 59.3 UNION ALL
SELECT 106, 3, '2012-05-06 20:47:20', 35.9 UNION ALL
SELECT 107, 3, '2012-05-07 04:23:42', 47.5 UNION ALL
SELECT 108, 3, '2012-05-07 09:02:46', 40.9 UNION ALL
SELECT 109, 3, '2012-05-07 09:36:38', 48.7 UNION ALL
SELECT 110, 3, '2012-05-07 11:49:10', 66.9 UNION ALL
SELECT 111, 3, '2012-05-08 01:35:27', 45.2 UNION ALL
SELECT 112, 3, '2012-05-08 06:17:47', 50.4 UNION ALL
SELECT 113, 3, '2012-05-08 09:59:05', 52.1 UNION ALL
SELECT 114, 3, '2012-05-09 01:05:04', 70.5 UNION ALL
SELECT 115, 3, '2012-05-09 04:46:42', 56.7 UNION ALL
SELECT 116, 3, '2012-05-09 05:16:59', 30.9 UNION ALL
SELECT 117, 3, '2012-05-10 00:56:47', 47.7 UNION ALL
SELECT 118, 3, '2012-05-10 04:17:02', 32.0 UNION ALL
SELECT 119, 3, '2012-05-10 09:29:36', 68.5 UNION ALL
SELECT 120, 3, '2012-05-10 12:37:32', 63.0 UNION ALL
SELECT 121, 3, '2012-05-11 02:22:56', 75.8 UNION ALL
SELECT 122, 3, '2012-05-11 05:05:45', 41.4 UNION ALL
SELECT 123, 3, '2012-05-11 07:16:12', 50.3 UNION ALL
SELECT 124, 3, '2012-05-11 08:44:46', 46.7 UNION ALL
SELECT 125, 3, '2012-05-12 05:58:47', 65.1 UNION ALL
SELECT 126, 3, '2012-05-12 10:26:55', 68.4 UNION ALL
SELECT 127, 3, '2012-05-12 11:04:33', 49.5 UNION ALL
SELECT 128, 3, '2012-05-13 05:41:25', 72.1 UNION ALL
SELECT 129, 3, '2012-05-13 10:39:13', 88.1 UNION ALL
SELECT 130, 3, '2012-05-13 13:30:13', 67.4 UNION ALL
SELECT 131, 3, '2012-05-13 15:58:57', 51.2 UNION ALL
SELECT 132, 3, '2012-05-14 01:40:44', 56.1 UNION ALL
SELECT 133, 3, '2012-05-14 03:19:01', 55.4 UNION ALL
SELECT 134, 4, '2012-05-01 02:59:09', 41.5 UNION ALL
SELECT 135, 4, '2012-05-01 07:59:43', 58.9 UNION ALL
SELECT 136, 4, '2012-05-02 02:32:30', 92.9 UNION ALL
SELECT 137, 4, '2012-05-02 04:10:26', 55.1 UNION ALL
SELECT 138, 4, '2012-05-02 06:38:56', 68.1 UNION ALL
SELECT 139, 4, '2012-05-02 08:04:57', 45.1 UNION ALL
SELECT 140, 4, '2012-05-03 04:52:46', 74.3 UNION ALL
SELECT 141, 4, '2012-05-03 09:40:02', 40.6 UNION ALL
SELECT 142, 4, '2012-05-04 00:36:20', 65.9 UNION ALL
SELECT 143, 4, '2012-05-04 04:20:05', 60.0 UNION ALL
SELECT 144, 4, '2012-05-04 09:18:07', 61.0 UNION ALL
SELECT 145, 4, '2012-05-04 10:01:38', 62.0 UNION ALL
SELECT 146, 4, '2012-05-05 00:41:48', 43.9 UNION ALL
SELECT 147, 4, '2012-05-05 01:31:04', 57.4 UNION ALL
SELECT 148, 4, '2012-05-05 03:33:21', 82.9 UNION ALL
SELECT 149, 4, '2012-05-05 09:19:49', 61.5 UNION ALL
SELECT 150, 4, '2012-05-06 04:31:01', 80.7 UNION ALL
SELECT 151, 4, '2012-05-06 05:14:09', 25.0 UNION ALL
SELECT 152, 4, '2012-05-06 07:30:58', 39.5 UNION ALL
SELECT 153, 4, '2012-05-07 01:40:13', 62.8 UNION ALL
SELECT 154, 4, '2012-05-07 05:16:29', 46.1 UNION ALL
SELECT 155, 4, '2012-05-07 10:57:39', 59.9 UNION ALL
SELECT 156, 4, '2012-05-07 13:49:26', 88.1 UNION ALL
SELECT 157, 4, '2012-05-08 04:28:34', 43.9 UNION ALL
SELECT 158, 4, '2012-05-08 10:07:23', 22.0 UNION ALL
SELECT 159, 4, '2012-05-08 14:33:25', 57.1 UNION ALL
SELECT 160, 4, '2012-05-09 02:25:00', 49.7 UNION ALL
SELECT 161, 4, '2012-05-09 05:57:11', 87.9 UNION ALL
SELECT 162, 4, '2012-05-09 07:03:47', 48.3 UNION ALL
SELECT 163, 4, '2012-05-10 02:59:51', 48.1 UNION ALL
SELECT 164, 4, '2012-05-10 03:30:01', 71.5 UNION ALL
SELECT 165, 4, '2012-05-11 03:50:25', 54.8 UNION ALL
SELECT 166, 4, '2012-05-11 09:39:00', 68.1 UNION ALL
SELECT 167, 4, '2012-05-12 04:35:58', 80.6 UNION ALL
SELECT 168, 4, '2012-05-12 07:01:06', 42.4 UNION ALL
SELECT 169, 4, '2012-05-12 10:48:16', 100.0 UNION ALL
SELECT 170, 4, '2012-05-12 15:17:07', 73.0 UNION ALL
SELECT 171, 4, '2012-05-13 00:39:32', 65.7 UNION ALL
SELECT 172, 4, '2012-05-13 05:25:59', 54.7 UNION ALL
SELECT 173, 4, '2012-05-13 06:02:13', 18.4 UNION ALL
SELECT 174, 4, '2012-05-13 11:28:58', 74.0 UNION ALL
SELECT 175, 4, '2012-05-14 05:29:20', 11.9 UNION ALL
SELECT 176, 4, '2012-05-14 07:19:40', 75.0

SELECT x1.PersonID, p.PersonName, x1.TestDate AS StartDate, x3.TestDate AS EndDate, 
    x1.MaxScore AS Day1Max, x2.MaxScore AS Day2Max, x3.MaxScore AS Day3Max, 
    ts.TestDateTime, ts.TestScore
FROM 
    (SELECT ts1.PersonID, CONVERT(date, ts1.TestDateTime) AS TestDate, 
        MAX(ts1.TestScore) AS MaxScore
    FROM TestScores ts1
    WHERE ts1.TestScore >= 70
    GROUP BY ts1.PersonID, CONVERT(date, ts1.TestDateTime)) x1 INNER JOIN
    (SELECT ts2.PersonID, CONVERT(date, ts2.TestDateTime) AS TestDate, 
        MAX(ts2.TestScore) AS MaxScore
    FROM TestScores ts2
    WHERE ts2.TestScore >= 70
    GROUP BY ts2.PersonID, CONVERT(date, ts2.TestDateTime)) x2 ON x1.PersonID = x2.PersonID AND
        x1.TestDate = DATEADD(d, -1, x2.TestDate) INNER JOIN
    (SELECT ts3.PersonID, CONVERT(date, ts3.TestDateTime) AS TestDate, 
        MAX(ts3.TestScore) AS MaxScore
    FROM TestScores ts3
    WHERE ts3.TestScore >= 70
    GROUP BY ts3.PersonID, CONVERT(date, ts3.TestDateTime)) x3 ON x2.PersonID = x3.PersonID AND
        x2.TestDate = DATEADD(d, -1, x3.TestDate) INNER JOIN
    TestScores ts ON ts.PersonID = x1.PersonID AND ts.TestDateTime >= x1.TestDate AND 
        ts.TestDateTime < DATEADD(d, 1, x3.TestDate) INNER JOIN
    People p ON ts.PersonID = p.PersonID
ORDER BY x1.PersonID, x1.TestDate, ts.TestDateTime

DROP TABLE TestScores
DROP TABLE People

Open in new window

Can you confirm that in your example, the split across 3 days is not relevant  and that what matters is the test sequence; so that the important results are those highlighted below..

On day one .. 75.2,  54.5,  12.8, 70.3
On day two ... 99.9
On day three . 79.1,  56.7 and 10.5

So this sequence would NOT qualify...

On day one .. 75.2,  12.8, 70.3,  54.5
On day two ... 99.9
On day three . 79.1,  56.7 and 10.5
Litation,

Yes, please clarify.

My comment was based on the assumption that if a candidate scored a 70 on three consecutive days, then you wanted all of that candidate's scores for that three-day period.

However, as peter57r notes, it is at least equally plausible that you really wanted three consecutive tests scoring 70 or above.

Which is it?

:)

Patrick
Avatar of Litation

ASKER

Hi sorry for the delay (work getting in the way!!)

The split over the days is irrelevant.  Its any three consecutive (or more) tests so it could be they did one a day and the split is over three days.  It maybe the tests where all done on the same day.  

Basically any three plus constitutive tests that meet the criteria but then pulling back ALL their test data.  So candidates who don't get the three in a  row criteria are ignored but anyone getting three in a row to pull back all their tests (and for the sake of crystal etc.. it maybe that the tests shown in the report will be after the fact as well so it would show all of the tests.

@mlmcc essentially im trying to do the grunt work in SQL to create a view of the data that I can then use in Crystal.  This may be bad practice (feel free to let me know =) ) but as I've had to learn myself it seemed  like the best way as the view is dynamically created and rather than getting crystal to run the sample data each time you run the report it just grabs it from the view.

@ peter57r  Yes you're correct the first sequence would qualify but the second would not as it needs to be three consecutive tests as far as the complete time stamp.

@matthewspatrick the overlap would be a problem as it would show that candidate as two separate entries where all I would want is "Frank meets the criteria here are all his results"

(I guess its hard to explain when you have it in your head sometimes)

Regards.
It is generally better to do the filtering in the database or SQL rather than in the report.  This is particularly true when there is a lot of data and you only really need a relatively small part of it.

IN this case it might be ok to handle it in the report since I suspect you aren't with a small subset of millions of records.

mlmcc
ASKER CERTIFIED SOLUTION
Avatar of Patrick Matthews
Patrick Matthews
Flag of United States of America image

Link to home
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
See answer
>> So is there a simple way to do this? <<

There are definitely several ways to do this in SQL Server.  

Probably the only genuinely "simple" one would be to insert rows into a temp table first, then base the checks on the temp table.  Since the temp table can be fully indexed, this method should also provide good performance.