Solved

Query syntax

Posted on 2013-06-03
17
190 Views
Last Modified: 2013-06-07
Hello,

I am looking for a few pointers rather than a complete solution.

I have a table (well indexed) that contains 100000+ rows that lists PatientId, TreatmentDate, and a number of columns that list the patients treatment.


CREATE TABLE [dbo].[PatientTreatments](
[id] [int] IDENTITY(1,1) NOT NULL,
[PatientId] [nvarchar](12) NOT NULL,
[TreatmentDate] [date] NULL,
[Treatment_None] [int] NULL,
[Treatment_PDT] [int] NULL,
[Treatment_Laser] [int] NULL,
[Treatment_Lucentis] [int] NULL,
[Treatment_Macugen] [int] NULL,
[Treatment_Avastin] [int] NULL,
[Treatment_Triamcinolon] [int] NULL,
[Treatment_AnecortAveac] [int] NULL,
[Treatment_Annan] [int] NULL,
[TreatmentType] [int] NULL
) ON [PRIMARY]

Open in new window


I need to count the number of treatments that patients have had over a certain period.

I guess that it is a simple group by TreatmentType however there is one other piece of logic that needs to be addressed.

There are 8 different treatment types each with its own column in the database.  I am only interested in patients that have had the same treatment type in all thier visits.

So if a patient has changed treatment type his/her treatments should be excluded from the totals.

How would the experts recommend doing this?

Maybe there is a simple statement but i was wondering if i should create a temporary table with a "total value" for treament.  For example the combination of the treatment types could be seen as a binary value.  If i create a table with that value then it would be easy to compare records against each other.  

Then maybe a not exists in sub query could look for patients who have changed treatment.
0
Comment
Question by:soozh
  • 6
  • 5
  • 3
  • +2
17 Comments
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39215962
I think counts would satisfy this requirement. A count(distinct <<whatever>>) = 1 would mean no change has occurred within the period and think it should be ok in case any of those fields are null e.g.
select
  patientid
, count(treatmentdate) count_treatments
, count(distinct treatment_pdt) count_pdt
, count(distinct treatment_laser) count_laser
, count(distinct treatment_lucentis) count_lucentis
, count(distinct treatment_macugen) count_macugen
, count(distinct treatment_avastin) count_avastin
, count(distinct treatment_triamcinolon) count_triamcinolon
, count(distinct treatment_anecortaveac) count_anecortaveac
, count(distinct treatment_annan) count_annan
, count(distinct treatmenttype) count_type
from PatientTreatments
where (treatmentdate >= '2013-01-01' and treatmentdate < '2013-06-01')
group by
  patientid
having
   count(distinct treatment_pdt) = 1
or count(distinct treatment_laser) = 1
or count(distinct treatment_lucentis) = 1
or count(distinct treatment_macugen) = 1
or count(distinct treatment_avastin) = 1
or count(distinct treatment_triamcinolon) = 1
or count(distinct treatment_anecortaveac) = 1
or count(distinct treatment_annan) = 1
;

select
*
from PatientTreatments as pt
inner join (
            select
              patientid
            , count(treatmentdate) tcount
            , count(distinct treatment_pdt) count_pdt
            , count(distinct treatment_laser) count_laser
            , count(distinct treatment_lucentis) count_lucentis
            , count(distinct treatment_macugen) count_macugen
            , count(distinct treatment_avastin) count_avastin
            , count(distinct treatment_triamcinolon) count_triamcinolon
            , count(distinct treatment_anecortaveac) count_anecortaveac
            , count(distinct treatment_annan) count_annan
            , count(distinct treatmenttype) count_type
            from PatientTreatments
            where (treatmentdate >= '2013-01-01' and treatmentdate < '2013-06-01')
            group by
              patientid
            ) as tcount
      on pt.patientid = tcount.patientid
where count_pdt=1

Open in new window

see: http://sqlfiddle.com/#!3/6da38/1

perhaps try this on some real data?
0
 
LVL 47

Expert Comment

by:Dale Fye (Access MVP)
ID: 39215987
can a patient receive multiple types of treatment during a single visit?

I'm not sure what the [TreatmentType] field is for, it appears that you have separate fields for each of the treatment types, so why do you need this field?

I see that those other treatment types columns are integers, what are the values associated with the fields?  If those values are 0/1, then I was thinking something that looks more like:

select
  patientid
, count(treatmentdate) count_treatments
, SUM(treatment_pdt) count_pdt
, SUM(treatment_laser) count_laser
, SUM(treatment_lucentis) count_lucentis
, SUM(treatment_macugen) count_macugen
, SUM(treatment_avastin) count_avastin
, SUM(treatment_triamcinolon) count_triamcinolon
, SUM(treatment_anecortaveac) count_anecortaveac
, SUM(treatment_annan) count_annan

from PatientTreatments
where (treatmentdate >= '2013-01-01' and treatmentdate < '2013-06-01')
group by
  patientid
having
SUM(treatment_pdt) count_pdt = count(treatmentdate)
OR SUM(treatment_laser) = count(treatmentdate)
OR SUM(treatment_lucentis) = count(treatmentdate)
OR SUM(treatment_macugen) = count(treatmentdate)
OR SUM(treatment_avastin) = count(treatmentdate)
OR SUM(treatment_triamcinolon) = count(treatmentdate)
OR SUM(treatment_anecortaveac) = count(treatmentdate)
OR SUM(treatment_annan) = count(treatmentdate)
0
 

Author Comment

by:soozh
ID: 39216023
sorry "TreatmentType" should be "treatment type".  Its not a database field.

And yes each treatment has its own column.

There can be muliple treatments.

They columsn contain 1 or 0 to indicated treatment or no treatment.
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39216049
>>They columns contain 1 or 0 to indicated treatment or no treatment.
ah
I had assumed they were foreign keys...
hence distinct has no great significance, and as fyed indicates, changing those counts to sums would work instead
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 39216090
How would the experts recommend doing this?
An expert would probably not design the table where a treatment is hard coded into separate columns.  That is a recipe for disaster, as you have discovered.
0
 

Author Comment

by:soozh
ID: 39216243
a real expert would realise that "hindsight is an exact science..." and leave it at that....
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 39216257
SELECT
    PatientId,
    SUM(Treatment_PDT) AS Treatment_PDT,
    SUM(Treatment_Laser) AS Treatment_Laser,
    SUM(Treatment_Lucentis) AS Treatment_Lucentis,
    SUM(Treatment_Macugen) AS Treatment_Macugen,
    SUM(Treatment_Avastin) AS Treatment_Avastin,
    SUM(Treatment_Triamcinolon) AS Treatment_Triamcinolon,
    SUM(Treatment_AnecortAveac) AS Treatment_AnecortAveac,
    SUM(Treatment_Annan) AS Treatment_Annan
FROM dbo.PatientTreatments
WHERE
    --for the past ~12 months only
    TreatmentDate >= '20120604' AND
    TreatmentDate < '20130604'    
GROUP BY
    PatientId
HAVING
    SUM(Treatment_PDT) = COUNT(*) OR
    SUM(Treatment_Laser) = COUNT(*) OR
    SUM(Treatment_Lucentis) = COUNT(*) OR
    SUM(Treatment_Macugen) = COUNT(*) OR
    SUM(Treatment_Avastin) = COUNT(*) OR
    SUM(Treatment_Triamcinolon) = COUNT(*) OR
    SUM(Treatment_AnecortAveac) = COUNT(*) OR
    SUM(Treatment_Annan) = COUNT(*)
ORDER BY
    PatientId
0
 

Author Comment

by:soozh
ID: 39217074
Thanks for the solutions.  However we are missing something.

The example above from ScottPletcher returns rows when a patient has multiple treatments at the same visit.

I have mentioned that a patient can have multiple treatments at any one visit, but also they can also have a visit without any treatment.  In this case they are still included in the result because they have not changed treatment.

What i am looking for is a count of the number of treatments a patient has had when they have not changed treatment.  The next step will be to look for patients that have only had a set of treatments.

Maybe this can not be done in a single statement?  Or maybe some dynamic sql?
0
Top 6 Sources for Identifying Threat Actor TTPs

Understanding your enemy is essential. These six sources will help you identify the most popular threat actor tactics, techniques, and procedures (TTPs).

 
LVL 47

Expert Comment

by:Dale Fye (Access MVP)
ID: 39217125
So, If I understand you correctly, if a patient has multiple treatments in the same visit, and one of those treatment types was given in each of the other treatments during a specified period, then that person should be returned in the result set.

But if they have a visit with no treatments, what is the point of the visit?  Try the following, I've added a criteria to the WHERE clause to exclude those records where no treatment was performed.  This would, however eliminate those patients who only had one visit during the desired period and where no treatment occurred.

select
  patientid
, count(treatmentdate) count_treatments
, SUM(treatment_pdt) count_pdt
, SUM(treatment_laser) count_laser
, SUM(treatment_lucentis) count_lucentis
, SUM(treatment_macugen) count_macugen
, SUM(treatment_avastin) count_avastin
, SUM(treatment_triamcinolon) count_triamcinolon
, SUM(treatment_anecortaveac) count_anecortaveac
, SUM(treatment_annan) count_annan

from PatientTreatments
where (treatmentdate >= '2013-01-01' and treatmentdate < '2013-06-01')
AND ([Treatment_Laser] + [Treatment_lucentis] + [Treatment_macugen] + [Treatment_avastin] + [Treatment_triamcinolon] + [treatment_anecortaveac] + [treatment_annan]) = 0
group by
  patientid
having
SUM(treatment_pdt) count_pdt = count(treatmentdate)
OR SUM(treatment_laser) = count(treatmentdate)
OR SUM(treatment_lucentis) = count(treatmentdate)
OR SUM(treatment_macugen) = count(treatmentdate)
OR SUM(treatment_avastin) = count(treatmentdate)
OR SUM(treatment_triamcinolon) = count(treatmentdate)
OR SUM(treatment_anecortaveac) = count(treatmentdate)
OR SUM(treatment_annan) = count(treatmentdate)
0
 

Author Comment

by:soozh
ID: 39217214
Thanks for the the response.

So, If I understand you correctly, if a patient has multiple treatments in the same visit, and one of those treatment types was given in each of the other treatments during a specified period, then that person should be returned in the result set.


No this is not correct.  I am looking for patients that have only had the same type of treatment in all the visits where they have been treated.  So if they have had PDT and only PDT then we should include them...but if the had some other treatment at any time (either on a visit with PDT, or on a visit with just the new treatment they should be excluded.

But if they have a visit with no treatments, what is the point of the visit?  


We dont know if we are going to give the patient any treatment until they have been examined at the visit.  If they are lucky we can tell them to go home.  However the doctor may decide to treat them - hence the visits without treatment.

We follow the patient even when they are not being treated - in this way we know the treatment has been and still is successfull.
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 39217361
I thought you might have meant that, but wasn't 100% sure from the q.

I think we can just change the ORs in the HAVING to ANDs (one reason I wrote the query specifically that way):


HAVING
    SUM(Treatment_PDT) = COUNT(*) AND
    SUM(Treatment_Laser) = COUNT(*) AND
    SUM(Treatment_Lucentis) = COUNT(*) AND
    SUM(Treatment_Macugen) = COUNT(*) AND
    SUM(Treatment_Avastin) = COUNT(*) AND
    SUM(Treatment_Triamcinolon) = COUNT(*) AND
    SUM(Treatment_AnecortAveac) = COUNT(*) AND
    SUM(Treatment_Annan) = COUNT(*)
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39217824
all those AND conditions will only return a match if all treatments are provided in each visit

i.e. if the SUM(of any treatment) = 0  then that patient gets excluded
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39217910
in the following: use of a CTE is purely to simplify the required filtering (i.e. the same could be achieved inside a having clause but it's just ugly that way).

please see it in action here: http://sqlfiddle.com/#!3/a961f/1
;WITH CTE1 AS (
                SELECT
                  patientid
                             /* 1.0 is to provide decimal results later */
                , 1.0 * count(CASE WHEN (treatment_pdt + treatment_laser + treatment_lucentis + treatment_macugen + treatment_avastin + treatment_triamcinolon + treatment_anecortaveac + treatment_annan) > 0 THEN 1 ELSE NULL END) count_treatments
                , sum(treatment_pdt + treatment_laser + treatment_lucentis + treatment_macugen + treatment_avastin + treatment_triamcinolon + treatment_anecortaveac + treatment_annan) sum_treatments
                , sum(treatment_pdt) sum_pdt
                , sum(treatment_laser) sum_laser
                , sum(treatment_lucentis) sum_lucentis
                , sum(treatment_macugen) sum_macugen
                , sum(treatment_avastin) sum_avastin
                , sum(treatment_triamcinolon) sum_triamcinolon
                , sum(treatment_anecortaveac) sum_anecortaveac
                , sum(treatment_annan) sum_annan
                FROM PatientTreatments

                   /* modify where clause to suit, e.g. data range */
                WHERE (treatmentdate >= '2013-01-01' AND treatmentdate < '2013-06-01')

                   /* nb excluding "no treatment" records here to avoid divide by zero later */
                AND (treatment_pdt + treatment_laser + treatment_lucentis + treatment_macugen + treatment_avastin + treatment_triamcinolon + treatment_anecortaveac + treatment_annan) > 0
                GROUP BY
                  patientid
             )
SELECT
*
FROM CTE1
WHERE sum_pdt       /count_treatments IN (0.0,1.0)
AND sum_laser       /count_treatments IN (0.0,1.0)
AND sum_lucentis    /count_treatments IN (0.0,1.0)
AND sum_macugen     /count_treatments IN (0.0,1.0)
AND sum_avastin     /count_treatments IN (0.0,1.0)
AND sum_triamcinolon/count_treatments IN (0.0,1.0)
AND sum_anecortaveac/count_treatments IN (0.0,1.0)
AND sum_annan       /count_treatments IN (0.0,1.0)
;

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39217916
mmm,
line 6 is optional
line 5 could probably just be

, 1.0 * count(*) as count_treatment

(it survived from an earlier iteration and I overlooked changing it)
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 250 total points
ID: 39218119
>>The next step will be to look for patients that have only had a set of treatments.
this may seem like overkill, but perhaps the following may set you up for easier analysis of treatment sets (by "unpivoting" the data). Note if you do pursue this route it may pay to move the unpivoted data into a (temp?) table with appropriate indexes - here I've just used CTEs to illustrate
;WITH
CTE1 AS (
            SELECT
                PatientId
              , TreatmentDate
              , TreatedWith
              , isTreated
            FROM PatientTreatments
            CROSS apply (
                          VALUES
                           ('pdt'         ,Treatment_PDT)
                          ,('laser'       ,Treatment_Laser)
                          ,('lucentis'    ,Treatment_Lucentis)
                          ,('macugen'     ,Treatment_Macugen)
                          ,('avastin'     ,Treatment_Avastin)
                          ,('triamcinolon',Treatment_Triamcinolon)
                          ,('anecortaveac',Treatment_AnecortAveac)
                          ,('annan'       ,Treatment_Annan)
                        ) AS CA1(TreatedWith, isTreated)

                     /* modify where clause to suit, e.g. data range */
             WHERE (treatmentdate >= '2013-01-01' AND treatmentdate < '2013-06-01')
             )
, CTE2 AS (
            SELECT
                PatientId
              , TreatedWith
              , count(DISTINCT TreatmentDate) AS count_dates
              , sum(isTreated)                AS sum_treatments
             FROM CTE1
             WHERE istreated = 1
             GROUP BY
                PatientId
              , TreatedWith
          )
SELECT
  PatientId
, min(count_dates)    as count_dates
, sum(sum_treatments) as sum_treatments
FROM CTE2
GROUP BY
  PatientId
HAVING sum(count_dates) = sum(sum_treatments)

Open in new window

see this at: http://sqlfiddle.com/#!3/2bd98/1
0
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 250 total points
ID: 39220069
CORRECTION:

Sorry, quite right, I rushed my previous HAVING code too much.  Please try this:


SELECT
    PatientId,
    SUM(Treatment_PDT) AS Treatment_PDT,
    SUM(Treatment_Laser) AS Treatment_Laser,
    SUM(Treatment_Lucentis) AS Treatment_Lucentis,
    SUM(Treatment_Macugen) AS Treatment_Macugen,
    SUM(Treatment_Avastin) AS Treatment_Avastin,
    SUM(Treatment_Triamcinolon) AS Treatment_Triamcinolon,
    SUM(Treatment_AnecortAveac) AS Treatment_AnecortAveac,
    SUM(Treatment_Annan) AS Treatment_Annan
FROM dbo.PatientTreatments
WHERE
    --for the past ~12 months only
    TreatmentDate >= '20120604' AND
    TreatmentDate < '20130604'    
GROUP BY
    PatientId
HAVING
    (SUM(Treatment_PDT) = COUNT(*) OR SUM(Treatment_PDT) = 0) AND
    (SUM(Treatment_Laser) = COUNT(*) OR SUM(Treatment_Laser) = 0) AND
    (SUM(Treatment_Lucentis) = COUNT(*) OR SUM(Treatment_Lucentis) = 0) AND
    (SUM(Treatment_Macugen) = COUNT(*) OR SUM(Treatment_Macugen) = 0) AND
    (SUM(Treatment_Avastin) = COUNT(*) OR SUM(Treatment_Avastin) = 0) AND
    (SUM(Treatment_Triamcinolon) = COUNT(*) OR SUM(Treatment_Triamcinolon) = 0) AND
    (SUM(Treatment_AnecortAveac) = COUNT(*) OR SUM(Treatment_AnecortAveac) = 0) AND
    (SUM(Treatment_Annan) = COUNT(*) OR SUM(Treatment_Annan) = 0)
0
 

Author Closing Comment

by:soozh
ID: 39228917
I decided to go down the route of unpivoting the data.  

I have a followup question which i will ask in a new thread.  

Thanks for the fantastic help.
0

Featured Post

Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

Join & Write a Comment

Confronted with some SQL you don't know can be a daunting task. It can be even more daunting if that SQL carries some of the old secret codes used in the Ye Olde query syntax, such as: (+)     as used in Oracle;     *=     =*    as used in Sybase …
PL/SQL can be a very powerful tool for working directly with database tables. Being able to loop will allow you to perform more complex operations, but can be a little tricky to write correctly. This article will provide examples of basic loops alon…
Here's a very brief overview of the methods PRTG Network Monitor (https://www.paessler.com/prtg) offers for monitoring bandwidth, to help you decide which methods you´d like to investigate in more detail.  The methods are covered in more detail in o…
You have products, that come in variants and want to set different prices for them? Watch this micro tutorial that describes how to configure prices for Magento super attributes. Assigning simple products to configurable: We assigned simple products…

705 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

13 Experts available now in Live!

Get 1:1 Help Now