How do you remove duplicate lines in a Teradata query?

I have a report I would like to remove duplicates "Claim_Id".   I end up having to export the report and remove duplicates in excel. Problem is I am generating way to much data because of the duplicates, how do I add code to my report to remove these when the report is executed?

clm.internal_id AS "Claim ID"      

Thank you
Culwatrnca11Data AnalystAsked:
Who is Participating?
 
PortletPaulfreelancerCommented:
A "duplicate" is an EXACT COPY, of EVERY detail.

I actually do not know what you mean by "seeing duplicates claim_IDs" because I cannot see anything. I think you mean something different to how I understand the word duplicate, but I honestly cannot be sure what you mean. You did not answer my first set of questions (especially "Is the second table closer to your need?).

Q1: Are whole rows  duplicated?
Q2: Or are you worried about repetition of claim_id "in a column"


The ONLY way I can help further is to suggest a method as I cannot hope to test this query as it relies on over 30 tables none of which I have.

Below is a version of your query with the fewest tables but that will still supply the where clause.
SELECT
      clm.internal_id                                                          AS "claim id"
    , ap.service_start_date                                                    AS "svc date"
    , ap.service_end_date                                                      AS "svc end date"
    , ap.date_received                                                         AS "date received"
    , (CURRENT_DATE - ap.date_received - 1)                                    AS "claim age"
    , ap.admission_type_c                                                      AS "admission type"
    , CASE ap.status_c WHEN '1' THEN 'New' WHEN '2' THEN 'Pending' 
                       WHEN '3' THEN 'Denied'
                       WHEN '4' THEN 'Clean' WHEN '5' THEN 'Void'
      ELSE ' ' END                                                             AS "claim status"
    , patient.PAT_MRN_ID                                                       AS pat_mrn
    , EXTRACT(YEAR FROM CURRENT_DATE) - EXTRACT(YEAR FROM patient.BIRTH_DATE) -
             CASE WHEN CURRENT_DATE (FORMAT 'MMDD') (CHAR(4)) <
                     patient.BIRTH_DATE (FORMAT 'MMDD')  (CHAR(4)) THEN 1
              ELSE 0 END                                                       AS AGE
    , ven.vendor_name   --When pulling for all Vendors
    , CASE      -- NCAL Only   Internal External
      WHEN (ven.rpt_grpr_03 = 'NC8'
                        OR ven.rpt_grpr_04 = 'RESOLUTE' 
                        OR ven.vendor_name LIKE '%AMERICAN%SPECIALTY%HEALTH%PLAN%OF%CA%') 
                        AND COALESCE(ap.tot_net_payable, 0) <= 0 
            THEN 'Encounters' ELSE 'Claims'
      END                                                                      AS intext_nc

FROM ap_claim ap
INNER JOIN clm_map clm ON ap.claim_id = clm.cid
INNER JOIN patient ON ap.pat_id = patient.pat_id
INNER JOIN clarity_vendor ven ON ap.vendor_id = ven.vendor_id
WHERE ap.SERV_DATE BETWEEN '10/01/2017' AND CURRENT_DATE    --'01/01/2017'  
AND ap.status_c IN ('1', '4', '5')
--AND INTEXT_NC = 'Encounters'
AND (ven.rpt_grpr_03 = 'NC8'
      OR ven.rpt_grpr_04 = 'RESOLUTE'
      OR ven.vendor_name LIKE '%AMERICAN%SPECIALTY%HEALTH%PLAN%OF%CA%')
AND ap.tot_net_payable <= 0

Open in new window


Q3: Are there "duplicated rows" in that result?

If you add the following into that query:
/*add to select list */
    , cml.adj_table_name   -- ADJUDICATION TABLE INFO

/* add to from clause, just above where clause */
INNER JOIN patient ON ap.pat_id = patient.pat_id
INNER JOIN PAT_ENC enc ON patient.PAT_ID = enc.pat_id
INNER JOIN (
            select distinct 
                  adj_table_cml_id, adj_table_name
            from Clarity_CML
            where adj_table_name IN ('PROVAHUC%', 'PROVPUC%')
            ) cml ON enc.ben_adj_table_id = cml.adj_table_cml_id     --pull in adjudication table

Open in new window


Q4: Does this now produce "duplicated rows"?

There are 4 questions, please answer each one.

nb: A "duplicate" is an EXACT COPY, of EVERY detail.
0
 
RobertSystem AdminCommented:
Don't know if this is helpful as I'm not familiar with Teradata  but to do this in SQL query you would just do
Select distinct clm.internal_id AS "Claim ID" FROM "Table Name" Where "Criteria" 

Open in new window

0
 
Culwatrnca11Data AnalystAuthor Commented:
Thanks for the input Robert, I am not sure how to work that into my query. this is my first time trying to remove duplicates.
0
Free Tool: Subnet Calculator

The subnet calculator helps you design networks by taking an IP address and network mask and returning information such as network, broadcast address, and host range.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

 
PortletPaulfreelancerCommented:
"duplicate" (very unfortunately) can mean different things to different people. SQL treats duplicates BY ROW so in the following table there are no duplicates:
+---------+---------+-----------------+--------------+
| ClaimID |  Col2   |      Col3       | NumberColumn |
+---------+---------+-----------------+--------------+
| abcd    | Value 2 | 123A            |         10.0 |
| abcd    | cols    | maybe some text |     -2,027.1 |
| abcd    | another | more here       |              |
+---------+---------+-----------------+--------------+

Open in new window

What I sense you are looking for is to suppress the repetition of values in a column so some value is visible only once (per grouping of data), something like the following:
+---------+---------+-----------------+--------------+
| ClaimID |  Col2   |      Col3       | NumberColumn |
+---------+---------+-----------------+--------------+
| abcd    | Value 2 | 123A            |         10.0 |
|         | cols    | maybe some text |     -2,027.1 |
|         | another | more here       |              |
+---------+---------+-----------------+--------------+

Open in new window

Is the second table closer to you need?
What product are you using to produce the report?


Please provide you current query.
0
 
Culwatrnca11Data AnalystAuthor Commented:
Paul,

Thanks for the feedback, I have attached my code.
TapUZData-Claims-Report-custom---exp.sql
0
 
PortletPaulfreelancerCommented:
Can you answer the 2 questions please.

Is table2 something like your objective? (i.e. you want to suppress value repetition in a column)
What product(s) are you using to produce the report?
0
 
Culwatrnca11Data AnalystAuthor Commented:
Paul,

The your first query is what is happening, my objective is to have just one column to determine a volume. Does that help?
0
 
Culwatrnca11Data AnalystAuthor Commented:
this table (last select column)     cml.adj_table_name       if left in will generate duplicate claim_IDs but I need the table left in but only want to see one line per claim_id
0
 
PortletPaulfreelancerCommented:
Please note that queries involving multiple joins and many columns in the select list are NOT good candidates for "select distinct", in fact the reverse is true.  That single extra word "distinct" can be the cause of extreme slowness in such a query.

If the column cml.adj_table_name is causing unwanted repetition in the query result, then find some way to reduce the rows coming from that table e.g.

LEFT OUTER JOIN (
            select distinct 
                  adj_table_cml_id, adj_table_name
            from Clarity_CML
            ) cml ON enc.ben_adj_table_id = cml.adj_table_cml_id     --pull in adjudication table

Open in new window

SELECT
      clm.internal_id                                                          AS "claim id"
    , cov.group_num
    , PATIENT.PAT_MRN_ID                                                       AS pat_mrn
    ,EXTRACT(YEAR FROM CURRENT_DATE) - EXTRACT(YEAR FROM PATIENT.BIRTH_DATE) -
             CASE WHEN CURRENT_DATE (FORMAT 'MMDD') (CHAR(4)) <
                     PATIENT.BIRTH_DATE (FORMAT 'MMDD')  (CHAR(4)) THEN 1
              ELSE 0 END                                                       AS AGE
    , CASE ap.status_c WHEN '1' THEN 'New' WHEN '2' THEN 'Pending' WHEN '3' THEN 'Denied'   		-- remove for alex group
      WHEN '4' THEN 'Clean' WHEN '5' THEN 'Void'   		    --remove for alex group
      ELSE ' ' END                                                             AS "claim status"
    , ven.vendor_name   --When pulling for all Vendors
    , CASE      -- NCAL Only   Internal External
      WHEN (ven.rpt_grpr_03 = 'NC8'
                        OR ven.rpt_grpr_04 = 'RESOLUTE' 
                        OR ven.vendor_name LIKE '%AMERICAN%SPECIALTY%HEALTH%PLAN%OF%CA%') 
                        AND COALESCE(ap.tot_net_payable, 0) <= 0 
            THEN 'Encounters' ELSE 'Claims'
      END                                                                      AS intext_nc
    , UPPER(spec.name)                                                         AS "specialty"
    , epp.benefit_plan_name                                                    AS "benefit plan"
    , epm.payor_name                                                           AS "payor"
    , pln.plan_grp_name                                                        AS "plan group"
    , ven.vendor_name                                                          AS "vendor"
    , lob.lob_name                                                             AS "lob name"
    , pos.pos_name                                                             AS "place of service"
    , ser.prov_type                                                            AS "provider type"
    , ap.service_start_date                                                    AS "svc date"
    , ap.service_end_date                                                      AS "svc end date"
    , ap.date_received                                                         AS "date received"
    , (CURRENT_DATE - ap.date_received - 1)                                    AS "claim age"
    , ap.admission_type_c                                                      AS "admission type"
    , COALESCE(zc_cf.NAME, 'UNKNOWN')                                          AS claim_type
    , eap1.eap_type_of_ser_c
    , cml.adj_table_name   -- ADJUDICATION TABLE INFO
FROM ap_claim ap
INNER JOIN clm_map clm ON ap.claim_id = clm.cid
LEFT OUTER JOIN ap_claim_dofr_chks dofr ON ap.claim_id = dofr.claim_id AND dofr.line = 1
LEFT OUTER JOIN zc_check_sts chk ON dofr.pmt_info_chk_sts_c = chk.check_status_c
LEFT OUTER JOIN ap_check ckr ON dofr.pmt_info_chk_id = ckr.check_id
LEFT OUTER JOIN clarity_vendor ven ON ap.vendor_id = ven.vendor_id
LEFT OUTER JOIN zc_specialty spec ON ap.assoc_spec_c = spec.specialty_c
/*LEFT OUTER JOIN ib_receiver ib ON ap.inbasket_message_id = ib.msg_id		
LEFT OUTER JOIN clarity_hip hip ON ib.registry_id = hip.registry_id	*/
LEFT OUTER JOIN ZC_CLAIM_FORMAT zc_cf ON ap.CLAIM_FORMAT_C = zc_cf.CLAIM_FORMAT_C
LEFT OUTER JOIN coverage cov ON ap.coverage_id = cov.coverage_id
LEFT OUTER JOIN clarity_epp epp ON ap.plan_id = epp.benefit_plan_id
LEFT OUTER JOIN clarity_epm epm ON ap.payor_id = epm.payor_id
LEFT OUTER JOIN plan_grp pln ON ap.plan_group_id = pln.plan_grp_id
LEFT OUTER JOIN clarity_lob lob ON ap.clm_lob_id = lob.lob_id
LEFT OUTER JOIN clarity_pos pos ON ap.loc_id = pos.pos_id
LEFT OUTER JOIN clarity_ser ser ON ap.prov_id = ser.prov_id
LEFT OUTER JOIN clarity_emp emp ON ap.entry_user_id = emp.user_id
LEFT OUTER JOIN ap_claim_2 ap2 ON ap.claim_id = ap2.claim_id
LEFT OUTER JOIN zc_meth_to_pay_clm zc_mtpc ON ap.meth_to_pay_clm_c = zc_mtpc.meth_to_pay_clm_c
LEFT OUTER JOIN patient ON ap.pat_id = patient.pat_id
LEFT OUTER JOIN ap_claim_px px1 ON ap.claim_id = px1.claim_id
LEFT OUTER JOIN ap_claim_dx dx1 ON ap.claim_id = dx1.claim_id
LEFT OUTER JOIN clarity_edg edg1 ON dx1.dx_id = edg1.dx_id
LEFT OUTER JOIN ap_claim_px_2 px2 ON ap.claim_id = px2.claim_id
LEFT OUTER JOIN clarity_eap eap1 ON px1.proc_id = eap1.proc_id
LEFT OUTER JOIN PAT_ENC enc ON patient.PAT_ID = enc.pat_id
LEFT OUTER JOIN clarity_prc prc ON enc.appt_prc_id = prc.prc_id      --pull in Benefit Group
LEFT OUTER JOIN (
            select distinct 
                  adj_table_cml_id, adj_table_name
            from Clarity_CML
            ) cml ON enc.ben_adj_table_id = cml.adj_table_cml_id     --pull in adjudication table
LEFT OUTER JOIN (
      SELECT
        pxx.POS_TYPE_C
      , pxx.CLAIM_ID
      , MAX (CASE WHEN pxx.LINE = 1 THEN c.PROC_CODE | | '-' | | c.PROC_NAME ELSE NULL END) AS CPT_CODE_1
      , MAX (CASE WHEN pxx.LINE = 2 THEN c.PROC_CODE | | '-' | | c.PROC_NAME ELSE NULL END) AS CPT_CODE_2
      , MAX (CASE WHEN pxx.LINE = 3 THEN c.PROC_CODE | | '-' | | c.PROC_NAME ELSE NULL END) AS CPT_CODE_3
      , MAX (CASE WHEN pxx.LINE = 4 THEN c.PROC_CODE | | '-' | | c.PROC_NAME ELSE NULL END) AS CPT_CODE_4
      , MAX (CASE WHEN pxx.LINE = 5 THEN c.PROC_CODE | | '-' | | c.PROC_NAME ELSE NULL END) AS CPT_CODE_5
      , MAX (CASE WHEN pxx.LINE = 6 THEN c.PROC_CODE | | '-' | | c.PROC_NAME ELSE NULL END) AS CPT_CODE_6
      , MAX (CASE WHEN pxx.LINE = 7 THEN c.PROC_CODE | | '-' | | c.PROC_NAME ELSE NULL END) AS CPT_CODE_7
      , MAX (CASE WHEN pxx.LINE = 8 THEN c.PROC_CODE | | '-' | | c.PROC_NAME ELSE NULL END) AS CPT_CODE_8
      , MAX (CASE WHEN pxx.LINE = 9 THEN c.PROC_CODE | | '-' | | c.PROC_NAME ELSE NULL END) AS CPT_CODE_9
      , MAX (CASE WHEN pxx.LINE = 10 THEN c.PROC_CODE | | '-' | | c.PROC_NAME ELSE NULL END) AS CPT_CODE_10
      FROM AP_CLAIM_PX pxx
      INNER JOIN AP_CLAIM b ON pxx.CLAIM_ID = b.CLAIM_ID --AND B.STATUS_C = 2
      INNER JOIN CLARITY_EAP c ON pxx.PROC_ID = c.PROC_ID
      GROUP BY
            pxx.CLAIM_ID
          , pxx.POS_TYPE_C
      ) eap ON ap.CLAIM_ID = eap.CLAIM_ID
LEFT JOIN (
      SELECT
            pxx.CLAIM_ID
          , MAX (CASE WHEN pxx.LINE = 1 THEN f.REF_BILL_CODE | | ' - ' | | f.DX_NAME ELSE NULL END) AS CLM_DX_1
      , MAX (CASE WHEN pxx.LINE = 2 THEN f.REF_BILL_CODE | | ' - ' | | f.DX_NAME ELSE NULL END) AS CLM_DX_2
      , MAX (CASE WHEN pxx.LINE = 3 THEN f.REF_BILL_CODE | | ' - ' | | f.DX_NAME ELSE NULL END) AS CLM_DX_3
      , MAX (CASE WHEN pxx.LINE = 4 THEN f.REF_BILL_CODE | | ' - ' | | f.DX_NAME ELSE NULL END) AS CLM_DX_4
      , MAX (CASE WHEN pxx.LINE = 5 THEN f.REF_BILL_CODE | | ' - ' | | f.DX_NAME ELSE NULL END) AS CLM_DX_5
      , MAX (CASE WHEN pxx.LINE = 6 THEN f.REF_BILL_CODE | | ' - ' | | f.DX_NAME ELSE NULL END) AS CLM_DX_6
      , MAX (CASE WHEN pxx.LINE = 7 THEN f.REF_BILL_CODE | | ' - ' | | f.DX_NAME ELSE NULL END) AS CLM_DX_7
      , MAX (CASE WHEN pxx.LINE = 8 THEN f.REF_BILL_CODE | | ' - ' | | f.DX_NAME ELSE NULL END) AS CLM_DX_8
      , MAX (CASE WHEN pxx.LINE = 9 THEN f.REF_BILL_CODE | | ' - ' | | f.DX_NAME ELSE NULL END) AS CLM_DX_9
      , MAX (CASE WHEN pxx.LINE = 10 THEN f.REF_BILL_CODE | | ' - ' | | f.DX_NAME ELSE NULL END) AS CLM_DX_10
      FROM AP_CLAIM_DX pxx
      INNER JOIN AP_CLAIM b ON pxx.CLAIM_ID = b.CLAIM_ID        --AND B.STATUS_C = 2
      INNER JOIN CLARITY_EDG f ON pxx.DX_ID = f.DX_ID
      GROUP BY
            pxx.CLAIM_ID
      ) dx ON ap.CLAIM_ID = dx.CLAIM_ID
LEFT OUTER JOIN ZC_POS_TYPE pt ON eap.POS_TYPE_C = pt.POS_TYPE_C  ---WW
LEFT OUTER JOIN CL_UB_REV_CODE rev ON px1.UBC_REVENUE_CODE_ID = rev.UB_REV_CODE_ID
WHERE ap.SERV_DATE BETWEEN '10/01/2017' AND CURRENT_DATE    --'01/01/2017'  
AND ap.status_c IN ('1', '4', '5')
AND cml.adj_table_name LIKE ANY ('PROVAHUC%' , 'PROVPUC%')
AND INTEXT_NC = 'Encounters'
ORDER BY
      clm.internal_id ASC;

Open in new window

Personally I would never use "select distinct" on the top of a query using >30 joins. I would suggest you remove unnecessary "noise" from your query, and then "start small"  to work out what joins I really do need and which ones of those need to be converted into a subquery that produces less rows.

Regarding this: "I have a report I would like to remove duplicates "Claim_Id""

Your query starts and ends with:
SELECT clm.internal_id  AS "claim id"
ORDER BY clm.internal_id ASC;

The repetition of clm.internal_id on every row is NECESSARY to produce the wanted order
0
 
Culwatrnca11Data AnalystAuthor Commented:
Made your suggested changes. still seeing duplicate claim_IDs
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.