Link to home
Start Free TrialLog in
Avatar of tgatif
tgatif

asked on

Query Help needed! Subquery on one table in Sql server 2000.** Kind of urgent **

I have one table and i want a subquery or a scalar subquery within same table:

ID, CODE, effdt                 COLLEGE  COMMENTS
1   0     2005-09-19               S        1K
1   0     2006-03-19             W        2K
2   0     2005-10-19             M        3K
2   1     2005-11-19             S        3K
3   0     2005-10-20             M        4SK
3   1     2006-01-19              S        4SK
3   2     2006-01-10              S       4XK
3   2     2005-12-19             M        GK

id, code, effdt and college are keys

I am looking to write a query which will tell me identify dupliacete code in multiple colleges for one emplid.
in this case one example is ID 1. It has code of 0 in both college S and college W. In my query I want to identify only those records which have duplicate codes in different college with effective dated.

So when i write my query i should see

ID, CODE, effdt                 COLLEGE  COMMENTS
1   0     2006-03-19               W        2K
3   2     2006-01-10             S       4xK


So in above, I got 2 Ids which had multiple code values in two differnet colleges. I got the most recent (effdt) out of those with the college name, code, ID and the comments

I have written sub blurbs that give me the IDs but it doesnt give me effdt.
A sample effdt query is written as

select flda, fldb, effdt from table 1
where effdt = (select max(a.effdt) from table 1 a where
a.flda = flda and a.fldb = fldb)



Avatar of hans_vd
hans_vd

Try this query:


select flda, fldb, effdt, college, comments
from table1
where (flda, fldb, effdt) in (
    select flda, fldb, min(effdt)
    from table1
    group by flda, fldb
    having count(*) > 1
)
Avatar of Guy Hengel [angelIII / a3]
select *
from yourtable t
where exists ( select null from yourtable c
                      where c.id = t.id
                      and c.code = t.code
                      group by c.id, c.code
                      having count(*) > 1
                    )
and effdt = ( select max ( i.effdt ) from yourtable i
                      where c.id = t.id
                      and c.code = t.code
                  )
 
       
Avatar of tgatif

ASKER

HI, can u please use the above names i am using from the table. thats would help a lot. like id, code etc..
to ensure just the 1 row per id and code then i believe it actually needs to become this...
 for an id and code get the max effectivedate
         with the row having the max college and max coment for that college
 
select Distinct t.*
from yourtable t
where exists ( select null from yourtable c
                      where c.id = t.id
                      and c.code = t.code
                      group by c.id, c.code
                      having count(distinct college) > 1
                    )
and effdt = ( select max ( i.effdt ) from yourtable i
                      where c.id = t.id
                      and c.code = t.code
                  )
 and college = (select max(college) from yourtable x
                         where t.id = x.id
                        and t.effdt=x.effdt
                        and t.code=x.code
                       )
and comments = (select max(comments) from yourtable as x
                         where t.id = x.id
                        and t.effdt=x.effdt
                        and t.code=x.code
                        and t.college = x.college
                       )
Avatar of tgatif

ASKER

this is what i have so far after putting into the live system with real tables.
Out of this table.

CUSTID, ACAD_CAREER, STDNT_CAR_NBR, EFFDT, EFFSEQ are keys.
When irun this code, it  bring thousands of data. doesnt seem right....

select *
from ACAD T
where exists ( select null from ACAD C
                      where C.CUSTID = T.CUSTID
                      AND C.STDNT_CAR_NBR = T.STDNT_CAR_NBR
                   AND C.ACAD_CAREER = T.ACAD_CAREER
                  AND C.EFFSEQ = T.EFFSEQ
                  group by C.CUSTID, C.ACAD_CAREER, C.STDNT_CAR_NBR
                      having count(*) > 1
                    )
and T.EFFDT = ( select max ( I.EFFDT ) from ACAD I
                      where T.CUSTID = I.CUSTID
                      and T.STDNT_CAR_NBR = I.STDNT_CAR_NBR
                   AND T.ACAD_CAREER = I.ACAD_CAREER
                        AND T.EFFSEQ = I.EFFSEQ
                  )
Avatar of tgatif

ASKER

Lowfatspread

I also have to doa cross check and see if the code exists for the other college or not.
Meaning that return only the rows which have STDNT_CAR_NBR the same for multiple COLLEGES

SO DATA in the dabase is like

ID      STDNT_CAR_NBR      ACAD_PROG      EFFDT        EFFSEQ      COLLEGE
1            0            DEGR        2005-10-19   0               S
1            0            DEGR        2006-10-19   0               M
1            0            DEGR        2002-10-19   0               S
Avatar of tgatif

ASKER

ID      STDNT_CAR_NBR      ACAD_PROG      EFFDT        EFFSEQ      COLLEGE
1            0                          DEGR        2005-10-19   0               S
1            0                          DEGR        2006-10-19   0               M
1            0                          DEGR        2002-10-19   0               S
2            0                          DEGR        2003-10-19   0               M
2            1                          DEGR        2001-10-19   0               S                        


SO AS ABOVE I WANT
the second row to be selected for ID 1 only. As ID 2 has 2 different stdnt_car_nbr, it shouldnt be in the query results.
line 2 of custid 1 is the one because its effective dated out of all 3 matching ones
Please try this; sorry about col names, will re-work soon if needed:


SELECT acad.*
FROM acad
INNER JOIN (
    SELECT flda, fldb, MIN(effdt) AS minEffdt
    FROM acad
    GROUP BY flda, fldb
    HAVING COUNT(DISTINCT college) > 1
) AS acadDups ON acadDups.flda = acad.flda AND acadDups.fldb = acad.fldb AND
    acad.minEffdt <> acad.effdt
CORRECTED column names:

SELECT acad.*
FROM acad
INNER JOIN (
    SELECT id, stdnt_car_nbr, MIN(effdt) AS minEffdt
    FROM acad
    GROUP BY id, stdnt_car_nbr
    HAVING COUNT(DISTINCT college) > 1
) AS acadDups ON acadDups.id = acad.id AND acadDups.stdnt_car_nbr = acad.stdnt_car_nbr AND
    acadDups.minEffdt <> acad.effdt
Seems to work.  

Probably a better alias is "acadGood" [rather than acadDups], since the subquery/derived table is finding the one to *keep* not the one(s) to get rid of -- the outer query determines that and lists them.
Avatar of tgatif

ASKER

i MODIFIED AND THIS IS MY FINAL


SELECT ACAD.*
FROM ACAD_PROG ACAD
INNER JOIN (
    SELECT CUSTID, STDNT_CAR_NBR, MIN(EFFDT) AS minEffdt, ACAD_CAREER
    FROM ACAD_PROG
    GROUP BY CUSTID, STDNT_CAR_NBR, ACAD_CAREER
    HAVING COUNT(DISTINCT COLLEGE) > 1
) AS acadDups ON acadDups.CUSTID = ACAD.CUSTID AND acadDups.STDNT_CAR_NBR = ACAD.STDNT_CAR_NBR AND
    acadDups.minEffdt <> ACAD.EFFDT AND ACAD.ACAD_CAREER = acadDups.ACAD_CAREER
ORDER BY ACAD.EMPLID

When I run the one you gave, it gives me still 14000 records. When i added rest of the keys like acad_career, it now has narrowed down my results to 117. more reasonable now.

but the problem is that its bringing max effdt results per college. So when i run the report for ID 2300 (lets assume) and that is was in two colleges lets assume (W and M), The current query will bring two rows of data such as

ID     STDNT_CAR_NBR     ACAD_PROG     EFFDT            EFFSEQ     COLLEGE
2300            0                          DEGR        2005-10-19       0               S
2300            0                          DEGR        2006-10-19       0               M

instead it should bring me only row 2 because it has the highest effective date.
Avatar of tgatif

ASKER

     
Comment from hans_vd
Date: 04/05/2006 11:07AM MST
      Comment       Accept

Try this query:


select flda, fldb, effdt, college, comments
from table1
where (flda, fldb, effdt) in (
    select flda, fldb, min(effdt)
    from table1
    group by flda, fldb
    having count(*) > 1
)

Comment from angelIII
Date: 04/05/2006 11:16AM MST
      Comment       Accept

select *
from yourtable t
where exists ( select null from yourtable c
                      where c.id = t.id
                      and c.code = t.code
                      group by c.id, c.code
                      having count(*) > 1
                    )
and effdt = ( select max ( i.effdt ) from yourtable i
                      where c.id = t.id
                      and c.code = t.code
                  )
 
       

Comment from tgatif
Date: 04/05/2006 11:17AM MST
      Your Comment       

HI, can u please use the above names i am using from the table. thats would help a lot. like id, code etc..

Comment from Lowfatspread
Date: 04/05/2006 11:27AM MST
      Comment       Accept

to ensure just the 1 row per id and code then i believe it actually needs to become this...
 for an id and code get the max effectivedate
         with the row having the max college and max coment for that college
 
select Distinct t.*
from yourtable t
where exists ( select null from yourtable c
                      where c.id = t.id
                      and c.code = t.code
                      group by c.id, c.code
                      having count(distinct college) > 1
                    )
and effdt = ( select max ( i.effdt ) from yourtable i
                      where c.id = t.id
                      and c.code = t.code
                  )
 and college = (select max(college) from yourtable x
                         where t.id = x.id
                        and t.effdt=x.effdt
                        and t.code=x.code
                       )
and comments = (select max(comments) from yourtable as x
                         where t.id = x.id
                        and t.effdt=x.effdt
                        and t.code=x.code
                        and t.college = x.college
                       )

Comment from tgatif
Date: 04/05/2006 11:38AM MST
      Your Comment       

this is what i have so far after putting into the live system with real tables.
Out of this table.

CUSTID, ACAD_CAREER, STDNT_CAR_NBR, EFFDT, EFFSEQ are keys.
When irun this code, it  bring thousands of data. doesnt seem right....

select *
from ACAD T
where exists ( select null from ACAD C
                      where C.CUSTID = T.CUSTID
                      AND C.STDNT_CAR_NBR = T.STDNT_CAR_NBR
                 AND C.ACAD_CAREER = T.ACAD_CAREER
                AND C.EFFSEQ = T.EFFSEQ
                group by C.CUSTID, C.ACAD_CAREER, C.STDNT_CAR_NBR
                      having count(*) > 1
                    )
and T.EFFDT = ( select max ( I.EFFDT ) from ACAD I
                      where T.CUSTID = I.CUSTID
                      and T.STDNT_CAR_NBR = I.STDNT_CAR_NBR
                 AND T.ACAD_CAREER = I.ACAD_CAREER
                        AND T.EFFSEQ = I.EFFSEQ
                  )

Comment from tgatif
Date: 04/05/2006 11:52AM MST
      Your Comment       

Lowfatspread

I also have to doa cross check and see if the code exists for the other college or not.
Meaning that return only the rows which have STDNT_CAR_NBR the same for multiple COLLEGES

SO DATA in the dabase is like

ID     STDNT_CAR_NBR     ACAD_PROG     EFFDT       EFFSEQ     COLLEGE
1            0            DEGR        2005-10-19   0               S
1            0            DEGR        2006-10-19   0               M
1            0            DEGR        2002-10-19   0               S

Comment from tgatif
Date: 04/05/2006 11:54AM MST
      Your Comment       

ID     STDNT_CAR_NBR     ACAD_PROG     EFFDT       EFFSEQ     COLLEGE
1            0                          DEGR        2005-10-19   0               S
1            0                          DEGR        2006-10-19   0               M
1            0                          DEGR        2002-10-19   0               S
2            0                          DEGR        2003-10-19   0               M
2            1                          DEGR        2001-10-19   0               S                        


SO AS ABOVE I WANT
the second row to be selected for ID 1 only. As ID 2 has 2 different stdnt_car_nbr, it shouldnt be in the query results.
line 2 of custid 1 is the one because its effective dated out of all 3 matching ones

Comment from ScottPletcher
Date: 04/05/2006 11:58AM MST
      Comment       Accept

Please try this; sorry about col names, will re-work soon if needed:


SELECT acad.*
FROM acad
INNER JOIN (
    SELECT flda, fldb, MIN(effdt) AS minEffdt
    FROM acad
    GROUP BY flda, fldb
    HAVING COUNT(DISTINCT college) > 1
) AS acadDups ON acadDups.flda = acad.flda AND acadDups.fldb = acad.fldb AND
    acad.minEffdt <> acad.effdt

Comment from ScottPletcher
Date: 04/05/2006 12:02PM MST
      Comment       Accept

CORRECTED column names:

SELECT acad.*
FROM acad
INNER JOIN (
    SELECT id, stdnt_car_nbr, MIN(effdt) AS minEffdt
    FROM acad
    GROUP BY id, stdnt_car_nbr
    HAVING COUNT(DISTINCT college) > 1
) AS acadDups ON acadDups.id = acad.id AND acadDups.stdnt_car_nbr = acad.stdnt_car_nbr AND
    acadDups.minEffdt <> acad.effdt

Comment from ScottPletcher
Date: 04/05/2006 12:05PM MST
      Comment       Accept

Seems to work.  

Probably a better alias is "acadGood" [rather than acadDups], since the subquery/derived table is finding the one to *keep* not the one(s) to get rid of -- the outer query determines that and lists them.

Comment from tgatif
Date: 04/05/2006 12:31PM MST
      Your Comment       

i MODIFIED AND THIS IS MY FINAL


SELECT ACAD.*
FROM ACAD_PROG ACAD
INNER JOIN (
    SELECT CUSTID, STDNT_CAR_NBR, MIN(EFFDT) AS minEffdt, ACAD_CAREER
    FROM ACAD_PROG
    GROUP BY CUSTID, STDNT_CAR_NBR, ACAD_CAREER
    HAVING COUNT(DISTINCT COLLEGE) > 1
) AS acadDups ON acadDups.CUSTID = ACAD.CUSTID AND acadDups.STDNT_CAR_NBR = ACAD.STDNT_CAR_NBR AND
    acadDups.minEffdt <> ACAD.EFFDT AND ACAD.ACAD_CAREER = acadDups.ACAD_CAREER
ORDER BY ACAD.CUSTID
Avatar of tgatif

ASKER

shoot!
my post disspeared

This is the final I have
SELECT ACAD.*
FROM ACAD_PROG ACAD
INNER JOIN (
    SELECT CUSTID, STDNT_CAR_NBR, MIN(EFFDT) AS minEffdt, ACAD_CAREER
    FROM ACAD_PROG
    GROUP BY CUSTID, STDNT_CAR_NBR, ACAD_CAREER
    HAVING COUNT(DISTINCT COLLEGE) > 1
) AS acadDups ON acadDups.CUSTID = ACAD.CUSTID AND acadDups.STDNT_CAR_NBR = ACAD.STDNT_CAR_NBR AND
    acadDups.minEffdt <> ACAD.EFFDT AND ACAD.ACAD_CAREER = acadDups.ACAD_CAREER
ORDER BY ACAD.CUSTID


Problem. Its still bringing results for two colleges. its bringing max effdt results per college. lets say custid 2300 has this data

ID     STDNT_CAR_NBR     ACAD_PROG     EFFDT       EFFSEQ     COLLEGE
23            0                          DEGR        2005-10-19   0               S
23          0                            DEGR        2006-10-19   0               M
23          0                            DEGR        2001-10-19   0               S
23          0                            DEGR        2002-10-19   0               S


It should bring the second row as its the highest max effdt of all. but instead it bring two rows. one for max effdt when college is M and one when 'S' (row 1)

My query is to get the college name and IDs from this table as which college had last modified the record (based on effdt) and which had duplicate stdn record number in other colleges
ASKER CERTIFIED SOLUTION
Avatar of Scott Pletcher
Scott Pletcher
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial