Solved

SQL Query

Posted on 2013-06-27
17
163 Views
Last Modified: 2013-07-02
I have been asked to compile data from some outcomes in the format:

Patient count for Day, Total Surveys, Percentage of Total on that day this would be caluclated for a finite number of days until 100% is attained.  Record 1 would indicate the results for Day 1, Record 2 would be the results for Day 2, etc.  In this example 100% is attained on day 21 so in theory I would have 21 records with percentages for each day.

Patient Count for Day is a running total of patients up to that day, for example day 7 would equal day 1 + day 2 + day 3 + day 4, etc.

Day        Count       Total    Percent
1                32            68         47.059
2                39            68         57.353
3                47            68         69.118
4                55            68         80.882
5                56            68         82.353
etc....

The Table is very simple...
OPID  <--PK
QuestionID <--Not needed for this query
Response <-- *Calculated field* The response of the Day.  As in patient stopped on this day.
NA <--Not needed for this query but if checked it was not in survey.
0
Comment
Question by:dbote
  • 11
  • 5
17 Comments
 
LVL 33

Expert Comment

by:knightEknight
ID: 39282392
>> Patient Count for Day is a running total of patients up to that day, for example day 7 would equal day 1 + day 2 + day 3 + day 4, etc.

It seems that this piece of the data is missing in this example.  Isn't there another column containing the number of patients for each given day, for example:

Day     Count(Day)   Count(Total)    Total    Percent
1          32                  32                     68         47.059
2            7                  39                     68         57.353
3            8                  47                     68         69.118
4            8                  55                     68         80.882
5            1                   56                    68         82.353
0
 
LVL 1

Author Comment

by:dbote
ID: 39282428
That count is the number of patients that responsed 1 day in the response column.   In this case it is 32 people that responded 1 in the response column.  On Day 2 it was day 1(32) + day 2(7) for a total of 39.

32 patients responded 1
7 patients responded 2
8 patients responded 3
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39283313
>>Patient Count for Day is a running total of patients up to that day
the VERSION of SQL Server is important for this bit
what version is it please?
0
What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

 
LVL 48

Expert Comment

by:PortletPaul
ID: 39283881
where is the date?
The Table is very simple...
OPID  <--PK
QuestionID <--Not needed for this query
Response <-- *Calculated field* The response of the Day.  
NA
What can be expected in the Response field?
what does a "patient stopped" mean (stopped what?)
Is it just a count of the records in this table (by day) or is there something else to consider?
How is the " for a finite number of days" defined (e.g. is this provided as input?)

and, what version of sql server please?

can you provide some sample data too please.
0
 
LVL 1

Author Comment

by:dbote
ID: 39285496
I am using SQL Management Studio 2008 R2 but this database is running at Compatibility level: SQL Server 2000 (80).

Thank you!
0
 
LVL 1

Author Comment

by:dbote
ID: 39285540
P,

The response field is of decimal type but the response is generally an integer.  For eample, One question might be How many days did you wear your sling?  So one patient stopped using a sling on day 21 another on day 18 etc.

The finite number of days is the patient with the maximum number of days using a sling.  So if the database has 3 patients....Patient1 stopped on day 3, Patient2 stopped on day 12 and Patient 3 stopped on day 16.  We know that we will track the percentage until day 16 when the percentage of sling stopping will be at 100% or all patients will have stopped using their slings.

In the above example the response column values are 3,12,16.  If we calculate out to day 16 we have a percentage for each day.  Obviously we have way more than 3 patients but they all tend to stop around the same couple of weeks.

I ended up creating a procedure that would hold variables and then stuff the result into a Temp table and export to excel for the user.  It just seemed like there would be a better solution with transact sql.  I got close but I could not "bring it all together" into one table.

Thank you for your help.  Since the project is "done" it is more of a curiousity now to see if it was done correcty.

D
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39286050
SQL Server 2000

sample data
QuestionID ,OPID,Response
How many days did you wear your sling?, P1, 3
How many days did you wear your sling?, P2, 12
How many days did you wear your sling?, P3, 16

expected result?
Day     Count(Day)   Count(Total)    Total    Percent
3          1                  1                     3         33.33
12        1                  2                     3        66.66
16        1                  3                     3        100.00
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39286088
Hi, I'm not familiar with using compatibility level 80 within 2008R2, so I may need help with what one can and cannot use. Does this disable the following analytic functions?
count() over (partition by ..)
sum() over (partition by ..)

Please note I've 'cheated' in the following by using SQL 2012, but I did this just to see if the overall logic was correct first ans 2012 makes it easier as a first step. I have built a sqlfiddle with the small data sample above that produces what I think is the expected result. Could you verify if it is the right direction? (in addition to question above)
http://sqlfiddle.com/#!6/a0af3/4

QUESTION      DAY      COUNT_DAY      COUNT_TOTAL      TOTAL_PERCENT
How many days did you wear your sling?      3      1      1      33.333333333333
How many days did you wear your sling?      12      1      2      66.666666666666
How many days did you wear your sling?      16      1      3      99.999999999999
SELECT
  q.Question
, [Day]
, count_day
, sum(count_day)   over (partition BY QuestionID ORDER BY [Day]) AS count_total
, sum(percent_day) over (partition BY QuestionID ORDER BY [Day]) AS total_percent
FROM (
        SELECT
                QuestionID
              , response AS [Day]
              , count(*) AS count_day
              , count(*) over (partition BY QuestionID) AS count_total
              , count(*) * 100.0 / count(*) over (partition BY QuestionID) AS [percent_day]
        FROM questionresponse
        GROUP BY
                QuestionID
              , response
     ) AS derived
INNER JOIN questions AS Q ON derived.QuestionID = q.id
	

Open in new window

note lines 5 & 6 use function features specific to sql 2012 - please ignore this for now - but I do need to know if lines 12 & 13 are permitted in your environment.
0
 
LVL 1

Author Comment

by:dbote
ID: 39290303
P,

Correct the overall query chokes on lines 5 & 6.  When I run the FROM  query on its own the query runs and produces results.  Is it safe to say that those lines are permitted in my environment?  It looks good and and produces a count and a percent.

Thank you for your continued assistance!

D
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39292048
>>Is it safe to say that those lines are permitted in my environment?
(excepting lines 5&6) >> yes, I believe so

replicating the functionality of line 5&6 will be slightly harder - but can be done.
will look into this soon (I hope).
0
 
LVL 48

Assisted Solution

by:PortletPaul
PortletPaul earned 500 total points
ID: 39292469
I'm hoping this variant will work in our environment (?). It uses a CTE and a "cross apply" to arrive at the running counts and percent. If this does not work please let me know.
/*
QUESTION	                           DAY  COUNT_DAY COUNT_TOTAL  TOTAL_PERCENT
How many days did you wear your sling?  3   1          1            33.333333333333
How many days did you wear your sling?  12  1          2            66.666666666666
How many days did you wear your sling?  16  1          3            99.999999999999
*/
;with
baseCTE as (
            SELECT
                    QuestionID
                  , response AS [Day]
                  , count(*) AS count_day
                  , count(*) over (partition BY QuestionID) AS count_total
                  , count(*) * 100.0 / count(*) over (partition BY QuestionID) AS [percent_day]
                  , row_number() over (partition by QuestionID order by response ) as row_seq
            FROM questionresponse
            GROUP BY
                    QuestionID
                  , response
           )
SELECT
        q.Question
      , baseCTE.[Day]
      , baseCTE.count_day
      , running.count_total
      , running.total_percent
FROM baseCTE
cross apply (
             select
                    sum(count_day)   AS count_total
                  , sum(percent_day) AS total_percent
             from baseCTE as C
             where baseCTE.QuestionID = C.QuestionID
             and baseCTE.row_seq >= C.row_seq
            ) AS running
INNER JOIN questions AS Q ON baseCTE.QuestionID = q.id
;

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39292479
mmm, maybe not "starting with SQL Server 2005, you can use the APPLY..." darn
ok, another one coming
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39292483
can you let me know if this works in your environment please?

;with CTE as ( select 1 as anInt )
select * from CTE
;
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 500 total points
ID: 39292489
here's an approach the does not use "cross apply" but still uses a CTE
;with
baseCTE as (
            SELECT
                    QuestionID
                  , response AS [Day]
                  , count(*) AS count_day
                  , count(*) over (partition BY QuestionID) AS count_total
                  , count(*) * 100.0 / count(*) over (partition BY QuestionID) AS [percent_day]
                  , row_number() over (partition by QuestionID order by response ) as row_seq
            FROM questionresponse
            GROUP BY
                    QuestionID
                  , response
           )
SELECT
        q.Question
      , baseCTE.[Day]
      , baseCTE.count_day
      , (
             select
                    sum(count_day)   AS count_total
             from baseCTE as C
             where baseCTE.QuestionID = C.QuestionID
             and baseCTE.row_seq >= C.row_seq
            ) AS count_total
      , (
             select
                    sum(percent_day) AS total_percent
             from baseCTE as C
             where baseCTE.QuestionID = C.QuestionID
             and baseCTE.row_seq >= C.row_seq
            ) AS total_percent
FROM baseCTE
INNER JOIN questions AS Q ON baseCTE.QuestionID = q.id
;

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39292494
both with CTE and
0
 
LVL 1

Author Closing Comment

by:dbote
ID: 39293043
Excellent!  Impressive work.  Both solutions work and produce the same result so I split the value of 250 for each.  I wish I could assign more points to this solution as you deserve it.

D
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39293061
use the higher one (it's a little more efficient). ID: 39292469

thanks for your kind words, I enjoyed it, so thanks for the question.

Cheers, Paul.
0

Featured Post

Three Reasons Why Backup is Strategic

Backup is strategic to your business because your data is strategic to your business. Without backup, your business will fail. This white paper explains why it is vital for you to design and immediately execute a backup strategy to protect 100 percent of your data.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I wrote this interesting script that really help me find jobs or procedures when working in a huge environment. I could I have written it as a Procedure but then I would have to have it on each machine or have a link to a server-related search that …
Why is this different from all of the other step by step guides?  Because I make a living as a DBA and not as a writer and I lived through this experience. Defining the name: When I talk to people they say different names on this subject stuff l…
Familiarize people with the process of retrieving data from SQL Server using an Access pass-thru query. Microsoft Access is a very powerful client/server development tool. One of the ways that you can retrieve data from a SQL Server is by using a pa…
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function

815 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now