Link to home
Start Free TrialLog in
Avatar of kong01
kong01

asked on

SQL query

How do I get this:
TESTER             Latest Record         Design
WTES31PA        June 20, 2008        1G-GH70

From this table:
TESTER               MAX(MAXST DESIGN
-------------------- --------- --------------------
WTES31PA             20-JUN-08 1G-GH70
WTES31PA             08-JUN-08 1G-GV70
WTES31PA             19-JUN-08 512M-D80
WTES31PA             04-JUN-08 512M-GV70
Avatar of Marcjev
Marcjev
Flag of Belgium image

SELECT TESTER, STARTTIME, DESIGN
FROM TABLE1
JOIN (SELECT TESTER, MAX(STARTTIME)  MAXST FROM TABLE1) AS TABLE2
ON TABLE2.TESTER = TABLE1.TESTER AND TABLE2.STARTTIME = TABLE1.STARTTIME
ASKER CERTIFIED SOLUTION
Avatar of Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3]
Flag of Luxembourg image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
As tables grow larger mine will be faster as the inner select will be only evaluated once to determine the max per tester. This little table is then used in the join.
Although the second sollution might seem more readadle, a select will occure once per tester. If a table contains many many rows (millions?) you will see this occuring.
And why not learn to use the better performing syntax right from the start?
first of all, you might want to review the query you posted, it has 3 errors (I leave it to you to post the corrections).

in the meantime, I will run some tests to reconfirm that my code is (at least) as well as yours in terms of performance.
I tested with a table of 600K records with no index, your query (ie the corrected one) took 5 seconds, and mine took 5 seconds also (average over 10 executions).
so, in case there is no index, the 2 syntax produce the same performance !!!

then, I created a index on TESTER + STARTTIME (took 5 seconds).
effects on the performance:
both queries now run subsecond (and using EXACTLY the same execution plan!!!!)

tested on a sql server 2005 in a VMware machine, Windows 2003 Server, 2GB ram, 1 CPU, external firewire disk.
but be assured that the results would be the same on oracle or any other database...
SELECT TABLE1.TESTER, TABLE1.STARTTIME, TABLE1.DESIGN
FROM TABLE1
JOIN (
      SELECT      TESTER,
                  MAX(STARTTIME)  MAXST
                  FROM TABLE1
                  GROUP BY TESTER
        ) AS TABLE2
ON TABLE2.TESTER = TABLE1.TESTER AND TABLE2.MAXST = TABLE1.STARTTIME

OK. my mistake
Avatar of kong01
kong01

ASKER

Angle111--- What is the t and i in your response?  
the "t" and "i" are aliases for the table, as the table is used 2 times in the query/subquery.
that method allows to correlate the table in the subquery:
 WHERE t.STARTTIME = ( SELECT MAX(i.STARTTIME) FROM yourtable i WHERE i.TESTER = t.TESTER )

the subquery will return the max(startime) for the tester in the main query.
Avatar of kong01

ASKER

angelIII----  thanks for the update; How could I apply this same concept to the following query.

select wafer_flow.wafer.tester, wafer_flow.wafer.stage,wafer_flow.lot.design as design,
 max( wafer_flow.lot.start_time) as maxtime from wafer_flow.wafer,wafer_flow.lot
where wafer_flow.lot.start_time= wafer_flow.wafer.start_time and tester='WTES22PA' group by
tester,stage,design order by tester;
The result I get from this query doesn't give me the last entry see below result set.  I only want the jun-22 result. I am trying to apply your suggestion from above to the result set below but I am not sure how.
Return from this query:
TESTER               STA DESIGN               MAXTIME
-------------------- --- -------------------- ---------
WTES22PA             F   304M-Z11S1           04-JUN-08
WTES22PA             P   512M-T11             22-JUN-08



 
as you use a JOIN, it's a bit more code, but still the same concept.
however, you seem to "join" the tables only on the start_time?
I must think that there must be missing "something"?
Avatar of kong01

ASKER

I am joining on the start time because that time will the same in both tables but I only want the latest start time not all of them.
Avatar of kong01

ASKER

Angel111------ These are the two table I am pulling from.....I am wanting to pull the tester,stage, and design. Only the latest record update based on the start_time column.

Name                                      Null?    Type
----------------------------------------- -------- --------------------------
START_TIME                                NOT NULL DATE
END_TIME                                  NOT NULL DATE
DESIGN                                    NOT NULL VARCHAR2(20)
LOT                                       NOT NULL VARCHAR2(20)
SUB_DESIGN                                NOT NULL VARCHAR2(10)
PRODUCT                                            VARCHAR2(10)

QL> describe wafer_flow.wafer
Name                                      Null?    Type
----------------------------------------- -------- --------------------------
START_TIME                                NOT NULL DATE
END_TIME                                  NOT NULL DATE
TESTER                                             VARCHAR2(20)
STATION                                            VARCHAR2(1)
LOT                                       NOT NULL VARCHAR2(20)
WAFER                                     NOT NULL NUMBER(2)
TEST_PROGRAM                              NOT NULL VARCHAR2(20)
STAGE                                     NOT NULL VARCHAR2(3)
PROBECARD                                          NUMBER(6)
ID                                                 NUMBER(12)
TOTAL                                              NUMBER(6)
should the "join" not be on a field like LOT (which is common) ? ...
I must assume that the 2 column start_time in the 2 tables are not really for join?
maybe some sample data to clarify? or --- what are the primary and foreign keys in the 2 tables?
Avatar of kong01

ASKER

Using the lot will not allow me to find the latest entry.  Start time is equal to start time one time for each lot.. so if I can find the max start time from wafer_flow.lot table I can find the tester name and design name that is currently processing.  Not sure if this makes sense to you but if I get the max start_times from wafer_flow.lot that match the start_time in wafer_flow.wafer then I can get the latest record for tester and design.
sorry, but the explanation does not seem "enough" for me.
I ask you to post some sample data of records that are relevant and some that are not relevant.
Avatar of kong01

ASKER

Sample date from wafer_flow.lot:
START_TIM END_TIME  DESIGN               LOT                  SUB_DESIGN
--------- --------- -------------------- -------------------- ----------
PRODUCT
----------
22-JUN-08 22-JUN-08 1G-GH70              QL829114.2           04
39600322

22-JUN-08 22-JUN-08 1G-GH70              QL829149.2           04
39600322

22-JUN-08 22-JUN-08 64M-Z11S1            D70134.1             04
39105A04


START_TIM END_TIME  DESIGN               LOT                  SUB_DESIGN
--------- --------- -------------------- -------------------- ----------
PRODUCT
----------
22-JUN-08 22-JUN-08 512M-T11             D69924.1             DD1G
35512N1G

22-JUN-08 22-JUN-08 512M-T80G            D67310.1             03
39222D03

22-JUN-08 22-JUN-08 256M-D11             D70148.1             DD1L
31259C1L


START_TIM END_TIME  DESIGN               LOT                  SUB_DESIGN
--------- --------- -------------------- -------------------- ----------
PRODUCT
----------
22-JUN-08 22-JUN-08 1G-GH70              QL833052.2           04
39600322

22-JUN-08 22-JUN-08 1G-GH70              QL829052.2           04
39600322



Sample data from wafer_flow.wafer:
START_TIM END_TIME  TESTER               S LOT                       WAFER
--------- --------- -------------------- - -------------------- ----------
TEST_PROGRAM         STA  PROBECARD         ID      TOTAL
-------------------- --- ---------- ---------- ----------
22-JUN-08 22-JUN-08 atlas                  QL829144.2                   16
E02.40               3            0      44477

22-JUN-08 22-JUN-08 KGDT16HA             1 D44713.1                      5
V18.00               8          379      44478

22-JUN-08 22-JUN-08 WTES20PA             1 D70131.1                     13
W04.01               P          266      44480

so, you have 3 records with the same date in wafer, and 8 rows with the same date in lot.
I don't see any other column to "join", but the date alone cannot be it? or am I wrong? ...
Avatar of kong01

ASKER

I didnt expand the date all the way the date is down to the seconds.
interesting choice, anyhow :)
SELECT w.tester, w.stage , l.design as design, l.start_time  as maxtime
  FROM wafer_flow.wafer w
  JOIN wafer_flow.lot l
    ON l.start_time = w.start_time
 WHERE w.start_time = ( SELECT MAX(i.start_time)
                          FROM wafer_flow.wafer i 
                         WHERE i.TESTER = w.TESTER 
                       )

Open in new window

Avatar of kong01

ASKER

This is the result I get from using the query from above: It has gotten the latest entry for the three tester but did not give me all the testers.

TESTER               STA DESIGN               MAXTIME
-------------------- --- -------------------- ---------
KGDT21HA             P   512M-T11             22-JUN-08
WTES02RI             P   304M-Z11S1           20-JUN-08
WTES25PA             P   512M-T11             22-JUN-08
if there is no "lot" for the maxtime, then this would explain that behaviour.

solution would be a left join:

 
SELECT w.tester, w.stage , l.design as design, l.start_time  as maxtime
  FROM wafer_flow.wafer w
  LEFT JOIN wafer_flow.lot l
    ON l.start_time = w.start_time
 WHERE w.start_time = ( SELECT MAX(i.start_time)
                          FROM wafer_flow.wafer i 
                         WHERE i.TESTER = w.TESTER 
                       )

Open in new window

Avatar of kong01

ASKER

Using the query above I get this result.....All tester ID but not design
TESTER               STA DESIGN               MAXTIME
-------------------- --- -------------------- ---------
WTES39PA             5
KGDT13HA             7
KGDT15HA             8
WTES27PA             P
drang                3
KGDT16HA             8
KGDT22HA             P
KGDT23HA             P
KGDT06HA             P
WTES23PA             P
WTES30PA             1

TESTER               STA DESIGN               MAXTIME
-------------------- --- -------------------- ---------
KGDT08HA             8
WTES24PA             P
WTES25PA             P
KGDT18HA             P
WTES41PA             2
WTES36PA             1
WTES38PA             5
WTES22PA             P
KGDT20HA             P
WTES21PA             P
WTES02RI             P   304M-Z11S1           20-JUN-08

TESTER               STA DESIGN               MAXTIME
-------------------- --- -------------------- ---------
WTES33PA             6
WTES37PA             3
WTES01RI             1
KGDT14HA             7
KGDT19HA             P
KGDT12HA             7
KGDT17HA             7
sturm                3
WTES19PA             2
WTES32PA             5
WTES40PA             2

TESTER               STA DESIGN               MAXTIME
-------------------- --- -------------------- ---------
WTES34PA             5
KGDT11HA             8
WTES20PA             P   64M-Z11S1            22-JUN-08
KGDT21HA             P
WTES26PA             1
Avatar of kong01

ASKER

This query gets me almost everything I want but the design is not latest record in some cases. If you could get me the max design based on the wafer_flow.lot.end_time = wafer_flow.wafer.end_time it would work by mine is returning the max value of the design not the max value of design based on the dates matching.

select wafer_flow.wafer.TESTER, max(wafer_flow.wafer.stage),max(wafer_flow.lot.design)
from wafer_flow.wafer,wafer_flow.lot where wafer_flow.lot.end_time = wafer_flow.wafer.end_time
group by test er order by tester;

Ouput from above query:
TESTER               MAX MAX(WAFER_FLOW.LOT.D
-------------------- --- --------------------
KGDT06HA             P   512M-T11
KGDT07HA             8   128M-P11
KGDT08HA             8   128M-P11
KGDT09HA             P   256M-D11
KGDT10HA             P   64M-Z11S1
KGDT11HA             8   512M-GH90
KGDT12HA             7   128M-P11
KGDT13HA             7   128M-P11
KGDT14HA             7   128M-P11
KGDT15HA             8   64M-PS11
KGDT16HA             8   128M-P11

TESTER               MAX MAX(WAFER_FLOW.LOT.D
-------------------- --- --------------------
KGDT17HA             7   64M-PS11
KGDT18HA             P   256M-Z11S1
KGDT19HA             P   304M-Z11S1
KGDT20HA             P   64M-Z11S1
KGDT21HA             P   512M-T11
KGDT22HA             P   256M-D11
KGDT23HA             P   64M-Z11S1
WTES01RI             2   512M-T80
WTES02RI             P   64M-Z11S1
WTES18PA             F   304M-Z11S1
WTES19PA             F   304M-Z11S1

TESTER               MAX MAX(WAFER_FLOW.LOT.D
-------------------- --- --------------------
WTES20PA             P   64M-Z11S1
WTES21PA             P   512M-T11
WTES22PA             P   512M-T11
WTES23PA             P   512M-T11
WTES24PA             P   512M-T11
WTES25PA             P   512M-T11
WTES26PA             5   512M-T80G
WTES27PA             P   64M-Z11S1
WTES28PA             P   64M-Z11S1
WTES29PA             P   64M-Z11S1
WTES30PA             5   512M-T80G

TESTER               MAX MAX(WAFER_FLOW.LOT.D
-------------------- --- --------------------
WTES31PA             5   512M-GV70
WTES32PA             6   512M-T80G
WTES33PA             6   512M-T80G
WTES34PA             5   512M-T80G
WTES35PA             5   512M-T80G
WTES36PA             5   64M-CD80
WTES37PA             5   1G-LD70
WTES38PA             5   512M-T80
WTES39PA             5   512M-T80G
WTES40PA             5   64M-CD80
WTES41PA             5   512M-GH80
I fear that the problem is that for the "join", there is no "lot" record for the same max(start_time) per tester in wafer table...

try to find some record in LOT table that matches the maxtime (including the time portion) from the below query:
SELECT w.tester, w.stage , l.design as design, w.start_time  as maxtime
  FROM wafer_flow.wafer w
  LEFT JOIN wafer_flow.lot l
    ON l.start_time = w.start_time
 WHERE w.start_time = ( SELECT MAX(i.start_time)
                          FROM wafer_flow.wafer i 
                         WHERE i.TESTER = w.TESTER 
                       )

Open in new window

what about some analytics ?

this gives you for every record the max starttime and the last design

SELECT w.tester, w.stage , l.design as design, max(w.start_time) over (partition by w.tester, w.stage order by w.start_time) as maxtime, last_value(l.design) over (partition by w.tester, w.stage order by w.start_time) lastdesign
  FROM wafer_flow.wafer w
  LEFT JOIN wafer_flow.lot l
    ON l.start_time = w.start_time
Avatar of kong01

ASKER

Using the query below I can query the following for each(only one tester shown for space sake) tester:
TESTER               DESIGN               MAX(MAXTI
-------------------- -------------------- ---------
WTES31PA             1G-GH70              26-JUN-08
WTES31PA             1G-GV70              08-JUN-08
WTES31PA             512M-D80             19-JUN-08
WTES31PA             512M-GV70            04-JUN-08

**Shouldn't I be able to perform another select statement on this query to get it down to this:
WTES31PA            1G-GH70  26-JUN-08

Query I used:
select tester, design,max(maxtime) from (select wafer_flow.wafer.tester, wafer_flow.wafer.stage

,wafer_flow.lot.design as design,
max( wafer_flow.lot.start_time) as maxtime from wafer_flow.wafer,wafer_flow.lot
where wafer_flow.lot.start_time = wafer_flow.wafer.start_time and tester='WTES31PA' group by
tester,stage,design order by tester) group by tester,design order by tester;
Avatar of kong01

ASKER

Why do I get more than one recod when I try to query DESIGN, TESTER, and MAX from the example above?  If I only query TESTER and MAX TIME I get the max time and WTES31PA returned, but if I query for the DESIGN, TESTER, and MAX TIME I get every record not just the one with max date.  Why does a query work differently when querying more than one column?