set up a counter to count absent students

Hi

one more thing I need to put a counter in there to say how many days they have been absent

eg for this particular student last class attended was 17/12/2010 but from their first absence on 07/01/2011 until the 11/03/2011 the period which cause them to breach 4 weeks absent they have missed 17 days
if object_id('tempdb..#absences','U') is not null drop table #absences

select dense_rank() over (partition by student_id order by week_no) as weekrn
--    ,dense_rank() over (partition by student_id order by week_no,day_num) as dayrn    -- not used
      ,isnull(absence_code,'') as absence_ind
      ,*
into  #absences
from  nrc_eRegisters_allstudents_testing_tb 
where acad_period = '10/11'

create index idx_temp_absences_1 on #absences (student_id, weekrn, absence_ind)    -- dayrn not used so removed from index
create index idx_temp_absences_2 on #absences (student_id, date, absence_ind)

;with absences_cte as
( -- this will return the absence "blocks" of days - at the moment 4 school weeks using weekrn+4 indicates 4x7 = 28 "school" days not 28 calendar days
 select a.student_id, a.date as start_date, b.date as end_date  
 from #absences a
 outer apply (select top 1 date from #absences c where c.student_id = a.student_id and c.weekrn = a.weekrn+4 and c.day_num >= a.day_num ) b   
 where a.absence_ind = 'O'
 and a.date < getdate()
 and b.date is not null
 and not exists (select NULL from #absences c where c.student_id = a.student_id and c.date between a.date and b.date and c.absence_ind <> 'O')
 group by a.student_id, a.date, b.date
)
Select distinct ab.*
from absences_cte c
outer apply (select top 1 * from #absences na where na.student_id = c.student_id and na.date < c.start_date and na.absence_code <> 'O' order by student_id, date desc, end_time desc, start_time desc) ab
where ab.absence_code is not NULL

Open in new window

LVL 3
lisa_mcAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Mark WillsTopic AdvisorCommented:
Hmmmm....

Kinda opposing thoughts. The query is showing the last attended, so, really is ignoring the absences at this stage.

What are all the things you want to see in the summary ?
0
lisa_mcAuthor Commented:
at the moment I am tying this output to a crystal report

I want to see course information, and staff name for each course
campus location and school

Then student id, student name , date of class last attended, and if possible number of full days missed (remeber to consideri the fact that they may have more than one class in a day) in that period which has caused then to breach four weeks absent
0
Mark WillsTopic AdvisorCommented:
OK, lots of columns, some of which I dont have in my test data (like name, staff and a few others).

But the basis will be combining the two different queries from your previous questions. The "detail" view will be based on (except for those missing columns):

select dense_rank() over (partition by student_id order by week_no) as weekrn
--    ,dense_rank() over (partition by student_id order by week_no,day_num) as dayrn    -- not used
      ,isnull(absence_code,'') as absence_ind
      ,*
into  #absences
from  nrc_eRegisters_allstudents_testing_tb 
where acad_period = '10/11'

create index idx_temp_absences_1 on #absences (student_id, weekrn, absence_ind)    -- dayrn not used so removed from index
create index idx_temp_absences_2 on #absences (student_id, date, absence_ind)

;with absences_cte as
( -- this will return the absence "blocks" of days - at the moment 4 school weeks using weekrn+4 indicates 4x7 = 28 "school" days not 28 calendar days
 select a.student_id, a.date as start_date, b.date as end_date  
 from #absences a
 outer apply (select top 1 date from #absences c where c.student_id = a.student_id and c.weekrn = a.weekrn+4 and c.day_num >= a.day_num ) b   
 where a.absence_ind = 'O'
 and a.date < getdate()
 and b.date is not null
 and not exists (select NULL from #absences c where c.student_id = a.student_id and c.date between a.date and b.date and c.absence_ind <> 'O')
 group by a.student_id, a.date, b.date
)
Select distinct c.student_id, 'missed >>' as missed_ind, m.*, 'lastclass >>' as last_ind, l.*
from absences_cte c
inner join #absences m on m.student_id = c.student_id and m.date between c.start_date and c.end_date
outer apply (select top 1 * from #absences na where na.student_id = c.student_id and na.date < c.start_date and na.absence_code <> 'O' order by student_id, date desc, end_time desc, start_time desc) l
--where ab.absence_code is not NULL

Open in new window


So, now what we will need to do is to replace certain details with aggregated counts and such like. Have added a couple of literal columns so you can see where details belong. Please note this is pretty much a "raw" result and we have to clean it up. But all the correct detail is there.

But it is now 3:15am "down under" and really need my beauty sleep. So, might need to leave it for now and come back later in the day. Is that OK ?
0
IT Pros Agree: AI and Machine Learning Key

We’d all like to think our company’s data is well protected, but when you ask IT professionals they admit the data probably is not as safe as it could be.

lisa_mcAuthor Commented:
yes thats fine

thanks again mark
0
lisa_mcAuthor Commented:
hi again mark I just wanted to ask a question

I am using the code in a stored procedure and I was wondering do you have to drop/deallocated indexes as each time I run my sp it takes longer each time

maybe a stupid question but im curious
0
Mark WillsTopic AdvisorCommented:
The only stupid question is the one that doesnt get asked.

But it is a difficult question.

Short answer is no, you should not have to drop deallocate indexes every time. But the indexes and database might not be sized sufficiently to accommodate new growth. Or, you might be a victim of parameter sniffing.

Have a read of : http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/A_1258-Speedier-Execution-of-Stored-Procedures-in-SQL-Server.html

Bottom line is that there could be a few different issues with SP performance.

For example, what is the "health" of your database - do you back up often, do you run index rebuilds, what recovery model is it (and if full recovery, then how often do you backup transaction logs). Then there are things like any potential conflicts with other process and or resources (like disk IO or memory).

So, it can sometimes get very involved - thats what makes it a bit difficult to determine. Not a stupid question at all :)

0
lisa_mcAuthor Commented:
Hi mark

I'll check that question out thanks,  server must have been slow yesterday as it is running quicker now

in the above code
line 16 I changed c.day_num >= a.day_num to c.day_num >= a.day_num-1

This allowed for 3 weeks exactly

ie tues week one to mon of week 3

I have been thinking about the scenario about checking for students who have been absent 3 weeks or more but less than 4 weeks and I see what you mean about day num any suggestions about this as this is the more urgent problem

Did you have your beauty sleep :-)
0
Mark WillsTopic AdvisorCommented:
>> Did you have your beauty sleep :-)

Given the time I have been away I must be looking fantastic - but alas - I will need more than sleep...

I am so very sorry for not coming back sooner, but thankfully the alert went out and (sheepishly) realised that I had a lot more to do here.

Also, will add in the with (nolock) query hint to avoid possible lock contention - that could be part of the slowness on a busy system and our query only has to read the data.

Back in a moment - will set up the test data again.


0
Mark WillsTopic AdvisorCommented:
Goodness, have lots of little spreadsheets I have been playing with... Might need some fresh tricky stuff to test against.

Now, when we were wanting to show just the last class, then we had to suppress those NULL values from the (last) cross apply. This time we want to see that "last class" as either NULL, or blank / zero. Otherwise we will miss the counter. We can also add more (or less) columns depending on what is needed for the "last" details etc.

So, please have a look at :

if object_id('tempdb..#absences','U') is not null drop table #absences

select dense_rank() over (partition by student_id order by week_no) as weekrn
--    ,dense_rank() over (partition by student_id order by week_no,day_num) as dayrn    -- not used
      ,isnull(absence_code,'') as absence_ind
      ,*
into  #absences
from  nrc_eRegisters_allstudents_testing_tb with (nolock)
where acad_period = '10/11'

create index idx_temp_absences_1 on #absences (student_id, weekrn, absence_ind)    -- dayrn not used so removed from index
create index idx_temp_absences_2 on #absences (student_id, date, absence_ind)

;with absences_cte as
( -- this will return the absence "blocks" of days - at the moment 4 school weeks using weekrn+4 indicates 4x7 = 28 "school" days not 28 calendar days
 select a.student_id, a.date as start_date, b.date as end_date  
 from #absences a 
 outer apply (select top 1 date from #absences c where c.student_id = a.student_id and c.weekrn = a.weekrn+4 and c.day_num >= a.day_num ) b   
 where a.absence_ind = 'O'
 and a.date < getdate()
 and b.date is not null
 and not exists (select NULL from #absences c where c.student_id = a.student_id and c.date between a.date and b.date and c.absence_ind <> 'O')
 group by a.student_id, a.date, b.date
)
select student_id, count(*) as days_absent, last_date,last_start_time,last_end_time,last_module_code,last_module_desc
from (
      Select distinct c.student_id
            ,'missed  >>' as missed_ind, m.date as missed_date, m.start_time as missed_start,   m.end_time as missed_end,   m.module_code as missed_module
            ,'lastclass >>' as last_ind, isnull(l.date,0) as last_date,  isnull(l.start_time,0) as last_start_time,isnull(l.end_time,0) as last_end_time,isnull(l.module_code,'') as last_module_code,isnull(l.module_desc,'') as last_module_desc
      from absences_cte c
      inner join #absences m on m.student_id = c.student_id and m.date between c.start_date and c.end_date
      outer apply (select top 1 * from #absences na where na.student_id = c.student_id and na.date < c.start_date and na.absence_code <> 'O' order by student_id, date desc, end_time desc, start_time desc) l
     ) sq
group by student_id, last_date,last_start_time,last_end_time,last_module_code, last_module_desc

Open in new window

0
Mark WillsTopic AdvisorCommented:
>> I have been thinking about the scenario about checking for students who have been absent 3 weeks or more but less than 4 weeks and I see what you mean about day num any suggestions about this as this is the more urgent problem

This requirement will be a bit more intense :)  maybe a new question ? or is it part of one of the other existing questions that we should re-open ? Because I do have some questions for you to help clarify the requirement.
0
lisa_mcAuthor Commented:
hi mark

thank you so much for coming back to this question I really appreciate it

The comment above I will ask a new question and post the code I have for currently working out students who have been absent 3 or more weeks

I will have a look at the code above and get back asap

Also I will post this question

thanks again
0
lisa_mcAuthor Commented:
Hi again mark

I tried the code below and I was getting an error about a missing ")"
so I placed in here

from absences_cte )  c  - is this the right place for it to go?

when I complied this code with the ) as shown above I get errors about c.student_id could not be bound, m.date could not be bound and so on

Maybe I put the ')' in the wrong place???
if object_id('tempdb..#absences','U') is not null drop table #absences

select dense_rank() over (partition by student_id order by week_no) as weekrn
--    ,dense_rank() over (partition by student_id order by week_no,day_num) as dayrn    -- not used
      ,isnull(absence_code,'') as absence_ind
      ,*
into  #absences
from  nrc_eRegisters_allstudents_tb with (nolock)
where acad_period = '10/11'
and student_id = 'kir07000408'

create index idx_temp_absences_1 on #absences (student_id, weekrn, absence_ind)    -- dayrn not used so removed from index
create index idx_temp_absences_2 on #absences (student_id, date, absence_ind)

;with absences_cte as
( -- this will return the absence "blocks" of days - at the moment 4 school weeks using weekrn+4 indicates 4x7 = 28 "school" days not 28 calendar days
 select a.student_id, a.date as start_date, b.date as end_date  
 from #absences a 
 outer apply (select top 1 date from #absences c where c.student_id = a.student_id and c.weekrn = a.weekrn+4 and c.day_num >= a.day_num ) b   
 where a.absence_ind = 'O'
 and a.date < getdate()
 and b.date is not null
 and not exists (select NULL from #absences c where c.student_id = a.student_id and c.date between a.date and b.date and c.absence_ind <> 'O')
 group by a.student_id, a.date, b.date
)
select student_id, count(*) as days_absent, last_date,last_start_time,last_end_time,last_module_code,last_module_desc
from (
      Select distinct c.student_id
            ,'missed  >>' as missed_ind, m.date as missed_date, m.start_time as missed_start,   m.end_time as missed_end,   m.module_code as missed_module
            ,'lastclass >>' as last_ind, isnull(l.date,0) as last_date,  isnull(l.start_time,0) as last_start_time,isnull(l.end_time,0) as last_end_time,isnull(l.module_code,'') as last_module_code,isnull(l.module_desc,'') as last_module_desc
      
		from absences_cte )  c 
      inner join #absences m on m.student_id = c.student_id and m.date between c.start_date and c.end_date
      outer apply (select top 1 * from #absences na where na.student_id = c.student_id and na.date < c.start_date and na.absence_code <> 'O' order by student_id, date desc, end_time desc, start_time desc) l

Open in new window

0
lisa_mcAuthor Commented:
oh my

please ignore the above post I am soooooo embarassed, I didnt copy the last two lines of code

Well at least I gave us both a laugh :-)

I'll check the "whole" code now
0
lisa_mcAuthor Commented:
Right I have tested one student so far and everything is correct except the days absent

I am getting the answer of days absent as 80 but this does represent the number of classes he has missed which could be a piece of information I could use so will keep that I just need the number of whole days missed
0
lisa_mcAuthor Commented:
sorry what I meant to say above was the number of classes missed is a good piece of information to have so I will keep it in my query

All I need now is to calculate the whole number of days missed

Have tested it for a few more students and so far so good
0
Mark WillsTopic AdvisorCommented:
*laughing* yes, it is classes and I should have thought a bit more about that (embarrassing)

try this please :

if object_id('tempdb..#absences','U') is not null drop table #absences

select dense_rank() over (partition by student_id order by week_no) as weekrn
--    ,dense_rank() over (partition by student_id order by week_no,day_num) as dayrn    -- not used
      ,isnull(absence_code,'') as absence_ind
      ,*
into  #absences
from  nrc_eRegisters_allstudents_testing_tb with (nolock)
where acad_period = '10/11'

create index idx_temp_absences_1 on #absences (student_id, weekrn, absence_ind)    -- dayrn not used so removed from index
create index idx_temp_absences_2 on #absences (student_id, date, absence_ind)

;with absences_cte as
( -- this will return the absence "blocks" of days - at the moment 4 school weeks using weekrn+4 indicates 4x7 = 28 "school" days not 28 calendar days
 select a.student_id, a.date as start_date, b.date as end_date  
 from #absences a 
 outer apply (select top 1 date from #absences c where c.student_id = a.student_id and c.weekrn = a.weekrn+4 and c.day_num >= a.day_num ) b   
 where a.absence_ind = 'O'
 and a.date < getdate()
 and b.date is not null
 and not exists (select NULL from #absences c where c.student_id = a.student_id and c.date between a.date and b.date and c.absence_ind <> 'O')
 group by a.student_id, a.date, b.date
)
select student_id, count(*) as classes_absent, count(distinct missed_date) as days_absent, last_date,last_start_time,last_end_time,last_module_code,last_module_desc
from (
      Select distinct c.student_id
            ,'missed  >>' as missed_ind, m.date as missed_date, m.start_time as missed_start,   m.end_time as missed_end,   m.module_code as missed_module
            ,'lastclass >>' as last_ind, isnull(l.date,0) as last_date,  isnull(l.start_time,0) as last_start_time,isnull(l.end_time,0) as last_end_time,isnull(l.module_code,'') as last_module_code,isnull(l.module_desc,'') as last_module_desc
      from absences_cte c
      inner join #absences m on m.student_id = c.student_id and m.date between c.start_date and c.end_date
      outer apply (select top 1 * from #absences na where na.student_id = c.student_id and na.date < c.start_date and na.absence_code <> 'O' order by student_id, date desc, end_time desc, start_time desc) l
     ) sq
group by student_id, last_date,last_start_time,last_end_time,last_module_code, last_module_desc

Open in new window

0
lisa_mcAuthor Commented:
thats great everything is working out

I have one problem tho

the particular student I am testing has two seperate periods of 4 or more weeks absence

I am wanting to show the course code and course description which I can show but with this particular student before the first block of 4 or more weeks absent the student hasn't actually attended any classes so the module code is blank which is fine but the course code is also blank which causes a problem when I try to link this to crystal

Is there anyway to always show the course_code and course_period as I will be grouping by this in crystal

basically if the module code is empty or not I always want to show the course code

Hope this makes sense
0
lisa_mcAuthor Commented:
sorry have just answered my own question

I just added  m.course_code to line 28 and put this into my select statement and it worked

lastly could you give me an explanation of what the code is doing if you don't mind
0
Mark WillsTopic AdvisorCommented:
Sure can...  Without putting in the bits you have already accounted for (e.g. day_num - 1 and the course_code) hope the following comments help explain the code.

-- because we will be "selecting into" a temp table, it cannot already exist. 
-- so we test to see if our temp table is there and remove it - more a precaution than a reality
-- because if in a procedure, the temp table will only exist for the duration of the procedure

if object_id('tempdb..#absences','U') is not null drop table #absences

-- now we want to select from our data source into our temp table
-- the challenge is getting the week number as a sequential count
-- the sequential week counter is based on the classes scheduled for the student
-- to get that sequential week counter we use the dense_rank() function
-- dense_rank allows for duplicate values and numbers each unique occurance sequentially, 
-- so it is possible for our calculated week number to have multiple occurances of 1's 2's etc which is fine by us.
-- the only other thing we need to do is "manage" our NULL values, 
-- we dont want to exclude any data, so we use the ISNULL() function to ensure a value because NULLs can sometimes be excluded
-- by the same token, we also want to check NULL's so we use a new alias for that column.

select dense_rank() over (partition by student_id order by week_no) as weekrn
      ,isnull(absence_code,'') as absence_ind
      ,*
into  #absences
from  nrc_eRegisters_allstudents_testing_tb with (nolock)
where acad_period = '10/11'

-- now we have our raw data, and assuming there is a fair bit of it, we will create indexes
-- this was kind of a retrospective step, based on the subsequent queries and looking at the actual execution plans.

create index idx_temp_absences_1 on #absences (student_id, weekrn, absence_ind)    -- dayrn not used so removed from index
create index idx_temp_absences_2 on #absences (student_id, date, absence_ind)

-- finally... we are ready to run our query
-- we are using a CTE query which is basically a "named subquery" so we can refer to the subquery by name a few times after the event.
-- The subquery (after the "as") is selecting data from our temp table
-- it uses the "cross apply" to extract the end date of the 4th week for the student
-- we then use an "exists" to find out if there are any non-absent days
-- this basically extracts a rolling period of 4weeks - so there could be more than that

;with absences_cte as
( -- this will return the absence "blocks" of days - at the moment 4 school weeks using weekrn+4 indicates 4x7 = 28 "school" days not 28 calendar days
 select a.student_id, a.date as start_date, b.date as end_date  
 from #absences a 
 outer apply (select top 1 date from #absences c where c.student_id = a.student_id and c.weekrn = a.weekrn+4 and c.day_num >= a.day_num ) b   
 where a.absence_ind = 'O'
 and a.date < getdate()
 and b.date is not null
 and not exists (select NULL from #absences c where c.student_id = a.student_id and c.date between a.date and b.date and c.absence_ind <> 'O')
 group by a.student_id, a.date, b.date
)
-- the second part of the CTE is the select - see note 1 below
select student_id, count(*) as classes_absent, count(distinct missed_date) as days_absent, last_date,last_start_time,last_end_time,last_module_code,last_module_desc
from (-- see note 2 below
      Select distinct c.student_id
            ,'missed  >>' as missed_ind, m.date as missed_date, m.start_time as missed_start,   m.end_time as missed_end,   m.module_code as missed_module
            ,'lastclass >>' as last_ind, isnull(l.date,0) as last_date,  isnull(l.start_time,0) as last_start_time,isnull(l.end_time,0) as last_end_time,isnull(l.module_code,'') as last_module_code,isnull(l.module_desc,'') as last_module_desc
      from absences_cte c
      inner join #absences m on m.student_id = c.student_id and m.date between c.start_date and c.end_date
      outer apply (select top 1 * from #absences na where na.student_id = c.student_id and na.date < c.start_date and na.absence_code <> 'O' order by student_id, date desc, end_time desc, start_time desc) l
     ) sq
group by student_id, last_date,last_start_time,last_end_time,last_module_code, last_module_desc

-- note 1
-- we need to aggregate information for the counters so we end up running a select from a subquery "sq" described in note 2
-- the number of classes is defined by a straight counter
-- the number of days, we need to count the different days hence the count(distinct missed_date)
-- because we are using aggregate functions, then we also have to group by all the columns we want to select that are not part of the aggregated values

-- note 2
-- similar to the original CTE we need to identify three main things. 
-- First is the key criteria defined for the student which we get from our CTE
-- Next are the missing classes which we can get by using the CTE data as a join back to our original dataset #absences - this is where indexing helps
-- finally for our "sq" subquery, we use another cross apply to get the last class prior to this "absent" row.
-- this is critical, because this also defines the starting point for a "group" of missing days
-- we have to also manage our NULL values and again we resort to the ISNULL() function.
-- now because of the potential duplication cause by the "rolling periods" we do use a select distinct to get the data we want inside the "sq" subquery

Open in new window

0
lisa_mcAuthor Commented:
aw one more problem mark

I have tried to put this into a stored procedure and using the student who has 2 seperate periods of 4 weeks absence who should have 2 outputs, I am only getting one output combining the results

Do you need to see the stored proc

Thanks for the explanation above really informative
0
Mark WillsTopic AdvisorCommented:
Are they contiguous periods ? Meaning is there anything in between as a non-absent day ?
0
lisa_mcAuthor Commented:
aw rite I see what you mean

the only codes inbetween these period is c - class cancelled
and a - authorised absence

which I have excluded in my code so the reason why nothing is showing for the module is becuase the student hasnt actually attended anything

so if I had a student with 2 periods of 4 week absences who had actually attended classes between these periods then they would show up twice

is that correct?
0
lisa_mcAuthor Commented:
yeah i just found a student who fits the above criteria and they have appeared twice

Im going to do abit more testing on it
0
lisa_mcAuthor Commented:
I have posted that other question about students absent >= 3 weeks but less than 4 weeks

http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/SQL-Server-2005/Q_26936686.html
0
lisa_mcAuthor Commented:
Hi again mark

Im just testing the students the sp has brought out and the first 10 are ok then the 11 one the classes missed is 6 less and days missed is 2 less and I cannot see why its dropping these results off
0
Mark WillsTopic AdvisorCommented:
If you can send me the data, then another two eyes might be able to reveal more ?

in the meantime, I'll have a look at the other question.
0
lisa_mcAuthor Commented:
yeah I thought that to so I have copied 2 student information to a spreadsheet for you

The first sheet has the data which is stored in my table for that particular student at the bottom of that page gives the output from the code and below that I have given what I get when I count the days absent and classes missed

On sheet 2 is another example of a different student
example-students.xls
0
Mark WillsTopic AdvisorCommented:
Ummmmm.... My code shows same as your count.

-- because we will be "selecting into" a temp table, it cannot already exist. 
-- so we test to see if our temp table is there and remove it - more a precaution than a reality
-- because if in a procedure, the temp table will only exist for the duration of the procedure

if object_id('tempdb..#absences','U') is not null drop table #absences

-- now we want to select from our data source into our temp table
-- the challenge is getting the week number as a sequential count
-- the sequential week counter is based on the classes scheduled for the student
-- to get that sequential week counter we use the dense_rank() function
-- dense_rank allows for duplicate values and numbers each unique occurance sequentially, 
-- so it is possible for our calculated week number to have multiple occurances of 1's 2's etc which is fine by us.
-- the only other thing we need to do is "manage" our NULL values, 
-- we dont want to exclude any data, so we use the ISNULL() function to ensure a value because NULLs can sometimes be excluded
-- by the same token, we also want to check NULL's so we use a new alias for that column.

select dense_rank() over (partition by student_id order by week_no) as weekrn
      ,isnull(absence_code,'') as absence_ind
      ,*
into  #absences
from  nrc_eRegisters_allstudents_testing_tb with (nolock)
where acad_period = '10/11'

-- now we have our raw data, and assuming there is a fair bit of it, we will create indexes
-- this was kind of a retrospective step, based on the subsequent queries and looking at the actual execution plans.

create index idx_temp_absences_1 on #absences (student_id, weekrn, absence_ind)    -- dayrn not used so removed from index
create index idx_temp_absences_2 on #absences (student_id, date, absence_ind)

-- finally... we are ready to run our query
-- we are using a CTE query which is basically a "named subquery" so we can refer to the subquery by name a few times after the event.
-- The subquery (after the "as") is selecting data from our temp table
-- it uses the "cross apply" to extract the end date of the 4th week for the student
-- we then use an "exists" to find out if there are any non-absent days
-- this basically extracts a rolling period of 4weeks - so there could be more than that

;with absences_cte as
( -- this will return the absence "blocks" of days - at the moment 4 school weeks using weekrn+4 indicates 4x7 = 28 "school" days not 28 calendar days
 select a.student_id, a.date as start_date, b.date as end_date  
 from #absences a 
 outer apply (select top 1 date from #absences c where c.student_id = a.student_id and c.weekrn = a.weekrn+3 and c.day_num >= a.day_num - 1 ) b   
 where a.absence_ind = 'O'
 and a.date < getdate()
 and b.date is not null
 and not exists (select NULL from #absences c where c.student_id = a.student_id and c.date between a.date and b.date and c.absence_ind <> 'O')
 group by a.student_id, a.date, b.date
)
-- the second part of the CTE is the select - see note 1 below
select student_id, count(*) as classes_absent, count(distinct missed_date) as days_absent, last_date,last_start_time,last_end_time,last_module_code,last_module_desc
from (-- see note 2 below
      Select distinct c.student_id
            ,'missed  >>' as missed_ind, m.date as missed_date, m.start_time as missed_start,   m.end_time as missed_end,   m.module_code as missed_module
            ,'lastclass >>' as last_ind, isnull(l.date,0) as last_date,  isnull(l.start_time,0) as last_start_time,isnull(l.end_time,0) as last_end_time,isnull(l.module_code,'') as last_module_code,isnull(l.module_desc,'') as last_module_desc
      from absences_cte c
      inner join #absences m on m.student_id = c.student_id and m.date between c.start_date and c.end_date
      outer apply (select top 1 * from #absences na where na.student_id = c.student_id and na.date < c.start_date and na.absence_code not in ('A','C','E','H','O','P', 'V','W','S','T') order by student_id, date desc, end_time desc, start_time desc) l
     ) sq
group by student_id, last_date,last_start_time,last_end_time,last_module_code, last_module_desc

-- note 1
-- we need to aggregate information for the counters so we end up running a select from a subquery "sq" described in note 2
-- the number of classes is defined by a straight counter
-- the number of days, we need to count the different days hence the count(distinct missed_date)
-- because we are using aggregate functions, then we also have to group by all the columns we want to select that are not part of the aggregated values

-- note 2
-- similar to the original CTE we need to identify three main things. 
-- First is the key criteria defined for the student which we get from our CTE
-- Next are the missing classes which we can get by using the CTE data as a join back to our original dataset #absences - this is where indexing helps
-- finally for our "sq" subquery, we use another cross apply to get the last class prior to this "absent" row.
-- this is critical, because this also defines the starting point for a "group" of missing days
-- we have to also manage our NULL values and again we resort to the ISNULL() function.
-- now because of the potential duplication cause by the "rolling periods" we do use a select distinct to get the data we want inside the "sq" subquery

Open in new window



results :


--student 1

MAR10055848	72	22	2011-02-10 00:00:00.000	1900-01-01 10:45:00.000	1900-01-01 12:59:59.997	M03223     	Image Manipulation Applications

--student 2

GRE09038405	58	19	2010-12-20 00:00:00.000	1900-01-01 11:30:00.000	1900-01-01 12:59:59.997	M02529     	Working in Social Care

Open in new window


Or, are the numbers above the wrong numbers and should be 59+18  and 52+17 respectively ?  
Manually going through the data, it does appear that 72 is the correct count for student1 - has there been some other date range applied ?
0
lisa_mcAuthor Commented:
hi mark

i'll try the code again tomorrow and see maybe I was doing something wrong

il get back asap
0
lisa_mcAuthor Commented:
yeah ur rite 77, 22

and 58 and 19 are the correct answers for the example set of students
0
lisa_mcAuthor Commented:
hi again

the code above that you are using has a.weekrn+3 - i get that result as well when I use weekrn+3

Testing for student_id = MAR10055848

I checked and you should see from the spreadsheet that this student was absent

fri week 24 absent this whole day (friday)
week 26 absent
week 27 absent
week 28 absent
week 29 absent
week 30 absent
week 31 absent up to and including wed of this week

so it is far to say that this student has been absent roughly 5 and a half weeks

so if I run the sp using weekrn+3 OR weekrn+4 then this student will appear on either occassion as they are off greater than 3 OR greater than 4 weeks

I changed day_num to day_num - 1 to cater for exactly that week number

eg if weekrn+3 and day_num - 1 this would cater for exactly 3 weeks

eg absent from wed week1
absent week2
absent week3
absent until tues week 4

This is exactly 3 weeks

But I think puttin day_num - 1 is causing problems when the code is trying to calculate days missed and classes missed

I ran a few tests for this student and this is the results I am getting
week variable, day variable  = showing classes missed, days absent

weekrn+3 and day_num   results   =   72,22
weekrn+4 and day_num   results   =   68,21

weekrn+3 and day_num - 1   results   =   63,19
weekrn+4 and day_num - 1   results   =   59,18

Clearly from the results above we can see that the first result is correct but I cant understand why the count is different from weekrn+3 to weekrn+4 as this is only a variable to show how many weeks we need to look for

Also day_num -1 as this is causing issues to the classes missed and days absent total is there anyway for changing this to catering for someone being off exactly a certain week number

hope this makes sense
0
lisa_mcAuthor Commented:
>>>>Testing for student_id = MAR10055848

I checked and you should see from the spreadsheet that this student was absent

fri week 24 absent this whole day (friday)
week 26 absent
week 27 absent
week 28 absent
week 29 absent
week 30 absent
week 31 absent up to and including wed of this week

so it is far to say that this student has been absent roughly 5 and a half weeks <<<<<<

sorry this student has been off up until thursday of week 31 so they have been off exactly 6 weeks
0
lisa_mcAuthor Commented:
Think I may have found the problem (fingers crossed **)

I Have changed line 41 above to the line below

outer apply (select  date from #absences c where c.student_id = a.student_id and c.weekrn > a.weekrn+3 and c.day_num >= a.day_num - 1  ) b  

(>weekrn+3 would be 4 weeks)
(>weekrn+2 would be 3 weeks)


I noticed that when you equal to a certain week number it only found absences with exactly that about of weeks
now I have it that it is > a certain week number

the day_num -1 works fine with the above changed

I noticed that using top 1 was cutting days off ie it was taking the earlier dates therefore only counting to that day (therefore answers were always less and never more)

I have got an output of 33 students so I am just testing them now and I will also check > weekrn+2
to make sure that the same students and their answers  come out  correctly

will reply back asap
0
Mark WillsTopic AdvisorCommented:
Wow, kinda hard keeping up with the details above.

The idea of the outer apply was to choose the single best fit for end date.

What you have done is allow a lot more rows to be returned, so the results should be quite *interesting*

Will sit back and await your testing.

For what it is worth, the original code was doing the full day alignment checking because of the way in which the end date is being used in the "exists" statement. But can look at all that again.
0
lisa_mcAuthor Commented:
arragh you think its going well them bang you get an incorrect answer

the reason why I added day_num -1 is that because before the code didnt find someone who was off exactly 4 weeks and when I added this it found them

forget my last post that didnt work

ok back to your code and to my comment

ID: 35340242

did you try the different combinations of weekrn and day_num



0
lisa_mcAuthor Commented:
student MAR10055848 is off exactly 6 weeks

the output should be

72 classes missed
and 22 days absent

when I run this query weekrn+4 to show all students who have been off 4 or more weeks
0
lisa_mcAuthor Commented:
hi again

sorry dont mean to keep posting and i know its confusing but checking the results they are correct so far except for the student above

sorry have just realised the day_num-1 did have a big impact on the results (not a good one either)
so I have removed it

the results for the above student using weekrn+4

is 68 classes missed and 21 days absent

correct answer should be

72 classes and 22 days missed
0
Mark WillsTopic AdvisorCommented:
OK, have been playing with it and I think it should be :


outer apply (select top 1 date from #absences c where c.student_id = a.student_id and c.weekrn >= a.weekrn+3 and c.day_num >= a.day_num ) b  


or weekrn+4  either way I get the same results.
0
lisa_mcAuthor Commented:
ok im going to test a few students be back asap
0
lisa_mcAuthor Commented:
Hi

I have attached my code below and when I test it for that student

I am getting classes missed as 63
and days absent as 19

have i done something wrong
if object_id('tempdb..#absences','U') is not null drop table #absences

select dense_rank() over (partition by student_id order by week_no) as weekrn
      ,isnull(absence_code,'') as absence_ind
      ,*
into  #absences
from  nrc_eRegisters_allstudents_tb with (nolock)
where acad_period = '10/11'
and stage_code = 'ENR'
and student_id = 'mar10055848'

create index idx_temp_absences_1 on #absences (student_id, weekrn, absence_ind) 
create index idx_temp_absences_2 on #absences (student_id, date, absence_ind)

;with absences_cte as
( -- this will return the absence "blocks" of days - at the moment 4 school weeks using weekrn+4 indicates 4x7 = 28 "school" days not 28 calendar days
 select a.student_id, a.date as start_date, b.date as end_date  
 from #absences a 
outer apply (select top 1 date from #absences c where c.student_id = a.student_id and c.weekrn >= a.weekrn+4 and c.day_num >= a.day_num ) b
where a.absence_ind = 'O'
 and a.date < getdate()
 and b.date is not null
 and not exists (select NULL from #absences c where c.student_id = a.student_id and c.date between a.date and b.date and c.absence_ind <> 'O'   )
 group by a.student_id, a.date, b.date
)

select student_id, count(*) as classes_absent, count(distinct missed_date) as days_absent, last_date,last_start_time,last_end_time,last_module_code,last_module_desc
from (-- see note 2 below
      Select distinct c.student_id
            ,'missed  >>' as missed_ind, m.date as missed_date, m.start_time as missed_start,   m.end_time as missed_end,   m.module_code as missed_module
            ,'lastclass >>' as last_ind, isnull(l.date,0) as last_date,  isnull(l.start_time,0) as last_start_time,isnull(l.end_time,0) as last_end_time,isnull(l.module_code,'') as last_module_code,isnull(l.module_desc,'') as last_module_desc
      from absences_cte c
      inner join #absences m on m.student_id = c.student_id and m.date between c.start_date and c.end_date
      outer apply (select top 1 * from #absences na where na.student_id = c.student_id and na.date < c.start_date and na.absence_code not in ('A','C','E','H','O','P', 'V','W','S','T') order by student_id, date desc, end_time desc, start_time desc) l
     ) sq
group by student_id, last_date,last_start_time,last_end_time,last_module_code, last_module_desc

Open in new window

0
Mark WillsTopic AdvisorCommented:
Well, there is something different, but relax you are not going mad :)

When I run your code above over the spreadsheet supplied earlier, I get the 72 and 22

So... There is something different between the test data and your data.

Do you want to export again from : nrc_eRegisters_allstudents_tb  ? and I can compare it.

0
lisa_mcAuthor Commented:
Aw thats good im not mad yet :-)

Yeah I have copied in to a spreadsheet and it includes all the information from my database for this one student


example-students-1.xls
0
lisa_mcAuthor Commented:
I have also attached another student which shows them as being absent exactly 4 weeks (highlighted in blue)

yet when I run the code for weekrn+4 it doesnt show in my results example-students-2.xls
0
lisa_mcAuthor Commented:
I think I have fixed it change this line (line 19 in code above)

outer apply (select top 1 date from #absences c where c.student_id = a.student_id and c.weekrn >= a.weekrn+4 and c.day_num >= a.day_num ) b

to the  line below

outer apply (select top 1 date from #absences c where c.student_id = a.student_id and c.weekrn >= a.weekrn+4 and c.day_num <= a.day_num ) b

I tried the student MAR10055848 and I am getting 72,22 for both weekrn+3 and weekrn+4

wooo hoooo so excited when you get the problem (*fingers crossed* ) sorted
try it yourself and double check for me if you don't mind

the other spreadsheet I gave you (example students 2.xls) has the information of a student who is off exactly 4 weeks but the are still not showing up when I make the above change.

ill still continue to try and figure out why



0
Mark WillsTopic AdvisorCommented:
Sorry to put a rain on your moment, but it cant be <= daynum. Here's why :

Say a student has classes Wednesday, Thursday, Friday

On Wednesday we want to check for weeknum + 3 which is fine given the daynum (ie being the first day)
But lets say that Wednesday and Thursday were "attended" leaving Friday as the start of our three week absence.
So, in weeknum + 3 lets say Thursday and Friday were attended.
If we use daynum <= "Friday" then we pick up Wednesday and it would shouw as being absent - except that the Thursday (being the end of the three week period) was attended.

It really must be >= daynum.

Now, in the "exists" instead of saying "between" use :

(select NULL from #absences c where c.student_id = a.student_id and c.date >= a.date and c.date < b.date and c.absence_ind <> 'O'   )

The only other thing I think we need to do is to use an Order By clause (because we are selecting top 1 after all)

outer apply (select top 1 date from #absences c where c.student_id = a.student_id and c.weekrn >= a.weekrn+4 and c.day_num >= a.day_num order by c.date asc) b

The spreadsheet I load is in date sequence already, but your actual data may not be (there is no gaurantees unless using order by) and maybe thats why I get the right results.

0
Mark WillsTopic AdvisorCommented:
Just by way of a little more explanation above, what we want from the "b" query is an end date so we can check if there were any non-absent days.

So it is really important that we get the correct end date which really must be for the corresponding day in the 3rd (or 4th) week from "now" the error is most likely two things. first the Order By and second in the "exists" query to make sure we check up until (but not including) that end date as you have correctly pointed out, so I should not have been using the "between" because that would include that end date - it needs to be less than that. Does that make sense ?
0
Mark WillsTopic AdvisorCommented:
Oh, and at the very end, we also need to swap that "between" for >= and < :

   inner join #absences m on m.student_id = c.student_id and m.date >= c.start_date and m.date < c.end_date
0
Mark WillsTopic AdvisorCommented:
OK, so testing against those two spreadsheets and simply swapping the "between" code, the following does give me the correct results :

if object_id('tempdb..#absences','U') is not null drop table #absences

select dense_rank() over (partition by student_id order by week_no) as weekrn
      ,isnull(absence_code,'') as absence_ind
      ,*
into  #absences
from  nrc_eRegisters_allstudents_testing_tb with (nolock)
where acad_period = '10/11'
and stage_code = 'ENR'
--and student_id = 'mar10055848'           -- doesnt help when testing other students :)

create index idx_temp_absences_1 on #absences (student_id, weekrn, absence_ind) 
create index idx_temp_absences_2 on #absences (student_id, date, absence_ind)

;with absences_cte as
( -- this will return the absence "blocks" of days - at the moment 4 school weeks using weekrn+4 indicates 4x7 = 28 "school" days not 28 calendar days
 select a.student_id, a.date as start_date, b.date as end_date  
 from #absences a 
outer apply (select top 1 date from #absences c where c.student_id = a.student_id and c.weekrn >= a.weekrn+4 and c.day_num >= a.day_num order by c.date asc) b
where a.absence_ind = 'O'
 and a.date < getdate()
 and b.date is not null
 and not exists (select NULL from #absences c where c.student_id = a.student_id and c.date >= a.date and c.date < b.date and c.absence_ind <> 'O'   )
 group by a.student_id, a.date, b.date
)

-- temporary comment out the group by to see details

--select student_id, count(*) as classes_absent, count(distinct missed_date) as days_absent, last_date,last_start_time,last_end_time,last_module_code,last_module_desc
--from (
      Select distinct c.student_id
            ,'missed  >>' as missed_ind, m.date as missed_date, m.start_time as missed_start,   m.end_time as missed_end,   m.module_code as missed_module
            ,'lastclass >>' as last_ind, isnull(l.date,0) as last_date,  isnull(l.start_time,0) as last_start_time,isnull(l.end_time,0) as last_end_time,isnull(l.module_code,'') as last_module_code,isnull(l.module_desc,'') as last_module_desc
      from absences_cte c
      inner join #absences m on m.student_id = c.student_id and m.date >= c.start_date and m.date < c.end_date
      outer apply (select top 1 * from #absences na where na.student_id = c.student_id and na.date < c.start_date and na.absence_code not in ('A','C','E','H','O','P', 'V','W','S','T') order by student_id, date desc, end_time desc, start_time desc) l
--     ) sq
--group by student_id, last_date,last_start_time,last_end_time,last_module_code, last_module_desc

Open in new window

0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
lisa_mcAuthor Commented:
Hi again mark sorry for not replying sooner but at the mo I can only test this in work

>>>So it is really important that we get the correct end date which really must be for the corresponding day in the 3rd (or 4th) week from "now" the error is most likely two things. first the Order By and second in the "exists" query to make sure we check up until (but not including) that end date as you have correctly pointed out, so I should not have been using the "between" because that would include that end date - it needs to be less than that. Does that make sense ? <<<<

ok let me see if I understand this using the example student BUT08027059 who is off exactly 4 weeks

They are off from fri week 22 28/1/11 until thurs 3/3/11 week 27  

so a.date = 28/1/11 and b.date = 4/3/11 as this is the next occurance of a value that doesnt not equal to 'O'.  If we use between this will include the 4/3/11 and because the student was present on this day then they are not inclued in this ouput

whereas if we use c.date < b.date and check up until 3/3/11 (last day in block that student was absent) this will show that this student has been absent 4 weeks exactly

Have I understood this correctly?

I am just testing it now so will get back asap

thanks
0
Mark WillsTopic AdvisorCommented:
You have understood correctly.

And no problems, understand that you can only test when at work.
0
lisa_mcAuthor Commented:
Have tested all my results and everything works perfect

many thanks again
0
lisa_mcAuthor Commented:
excellent work as usual thanks mark
0
Mark WillsTopic AdvisorCommented:
Fantastic - very happy to hear that. Well done.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2005

From novice to tech pro — start learning today.