Link to home
Start Free TrialLog in
Avatar of MarkJulie
MarkJulieFlag for United States of America

asked on

Oracle SQL LISTAGG() syntax problem

I am using a larger version of the following query. EDW_CM_ID defines a unique patient. ICD9_DX_CD is a one to many diagnosis list per patient. I need only those patients who have at least one of the ICD9_DX_CD codes in the prescribed numeric text list (like postal codes).

After finding all patients who have the range of desired ICD9_DX_CDs, I want to use LISTAGG() to create a field that concatenates all related ICD codes for that patient. Using this syntax, however, all LISTAGG() returns is the one code that met the HAVING criteria. I have put the HAVING criteria both in a HAVING block and in the WHERE clause with similar results.

How do I fix it to return all ICD9_DX_CDs?

 SELECT a.ACCT_NO
          ,  a.FCILTY_ID
          ,  b.ICD9_DX_CD
          ,  a.EDW_CM_ID
          ,  LISTAGG ( b.ICD9_DX_CD, ',') WITHIN GROUP ( ORDER BY b.ICD9_DX_CD ) AS ICD9_LIST

     FROM  CM.CASEMIX_SMRY                  a  
LEFT JOIN  CM.ENCNTR_ICD9_DX             b  ON  ( a.EDW_CM_ID          = b.EDW_CM_ID        )
      WHERE  a.FCILTY_ID IN = ( 128, 154, 139, 143, 116, 128)
                 
GROUP BY      
          ,  a.EDW_CM_ID
          ,  a.FCILTY_ID
          ,  b.ICD9_DX_CD
          ,  a.ACCT_NO
                       
HAVING      b.ICD9_DX_CD in
               (   '433', '43300', '43301', '43311', '43320'
               , '43321' , '43331', '43380', '43381'
               , '43390', '43391',   '434', '43400', '43401'
               , '43410', '43411', '43490', '43491',   '435'
               ,  '4350',  '4351',  '4352',  '4358',  '4359'
               ,   '430',   '431',  '4329',   '436',  '4353'
               )
ORDER BY a.EDW_CM_ID, a.FCILTY_ID,  a.ACCT_NO
;
Avatar of namethis
namethis

The fix to the having condition is simple.
And the column which is concatenated by listagg should not be used as a separate grouping column.

SELECT a.ACCT_NO
          ,  a.FCILTY_ID
--          ,  b.ICD9_DX_CD
          ,  a.EDW_CM_ID
          ,  LISTAGG ( b.ICD9_DX_CD, ',') WITHIN GROUP ( ORDER BY b.ICD9_DX_CD ) AS ICD9_LIST

     FROM  CM.CASEMIX_SMRY                  a  
LEFT JOIN  CM.ENCNTR_ICD9_DX             b  ON  ( a.EDW_CM_ID          = b.EDW_CM_ID        )
      WHERE  a.FCILTY_ID IN = ( 128, 154, 139, 143, 116, 128)
                 
GROUP BY      
          ,  a.EDW_CM_ID
          ,  a.FCILTY_ID
--          ,  b.ICD9_DX_CD
          ,  a.ACCT_NO
                        
HAVING   count( case when  b.ICD9_DX_CD in
               (   '433', '43300', '43301', '43311', '43320'
               , '43321' , '43331', '43380', '43381'
               , '43390', '43391',   '434', '43400', '43401'
               , '43410', '43411', '43490', '43491',   '435'
               ,  '4350',  '4351',  '4352',  '4358',  '4359'
               ,   '430',   '431',  '4329',   '436',  '4353'
               ) then 1 end)>0
ORDER BY a.EDW_CM_ID, a.FCILTY_ID,  a.ACCT_NO
;

Open in new window

Avatar of MarkJulie

ASKER

namethis,

That was the ideal fix for the ICD9_LIST field; however, in the process, I lose the value for b.ICD9_DX_CD as an independent value and the identifying criterion for each unique patient. What syntax will allow me to retain both b.ICD9_DX_CD and ICD9_LIST? An alias?
This should work.

SELECT a.ACCT_NO
          ,  a.FCILTY_ID
          ,  b.ICD9_DX_CD
          ,  a.EDW_CM_ID
          ,  LISTAGG ( b.ICD9_DX_CD, ',') WITHIN GROUP ( ORDER BY b.ICD9_DX_CD ) 
over (partition by a.ACCT_NO ,  a.FCILTY_ID ,  a.EDW_CM_ID) AS ICD9_LIST

     FROM  CM.CASEMIX_SMRY                  a  
LEFT JOIN  CM.ENCNTR_ICD9_DX             b  ON  ( a.EDW_CM_ID          = b.EDW_CM_ID        )
      WHERE  a.FCILTY_ID IN = ( 128, 154, 139, 143, 116, 128)
                 
GROUP BY      
          ,  a.EDW_CM_ID
          ,  a.FCILTY_ID
          ,  b.ICD9_DX_CD
          ,  a.ACCT_NO
                        
HAVING   count( case when  b.ICD9_DX_CD in
               (   '433', '43300', '43301', '43311', '43320'
               , '43321' , '43331', '43380', '43381'
               , '43390', '43391',   '434', '43400', '43401'
               , '43410', '43411', '43490', '43491',   '435'
               ,  '4350',  '4351',  '4352',  '4358',  '4359'
               ,   '430',   '431',  '4329',   '436',  '4353'
               ) then 1 end)>0
ORDER BY a.EDW_CM_ID, a.FCILTY_ID,  a.ACCT_NO
;

Open in new window

A follow-up question since I simplified the code for the purpose of the question:
If I actually have 32 SELECT fields (and 32 GROUP BY fields), do all of them need to be included in the OVER ( PARTITION BY... ) list?
ASKER CERTIFIED SOLUTION
Avatar of namethis
namethis

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
FYI, the single partition value returned an error which was eliminated by including all the nonaggregate fields in SELECT and GROUP BY.

Thanks for all the help.
Rapid response with answers easily applicable to my larger question.