Solved

SQL query subselect

Posted on 2014-02-10
5
320 Views
Last Modified: 2014-02-20
I have a staging table with data that gets fed in from flat files.  I need to insert that data into another table, but my query is not working. It works, it just doesn't insert all the coorect records, so I need to fix the query.  
The staging table gets records inserted and then updated data causes duplicate records to be inserted.  I need the most current records.  Here is a sample of data to help illustrate my issue.  For this sample data I will need records 2 and 4 inserted, however, my query is now only selecting record 4.  Please help.  Thank you.  Here is my query.
INSERT INTO MY_TBL2
SELECT * FROM MY_TBL1 W
WHERE LOADDATE = (SELECT MAX(LOADDATE)
            FROM MY_TBL2 V
            WHERE W.BCODE = V.BCODE)
            order by BCODE

BCODE SAMPLEID  ACTIVITY  MEASUREMENT  ITEMNAME  LOADDATE
99999   1                  act1          meas1                 Item1           10/3/2012
99999   1                  act1          meas1                 Item1           10/4/2012
99999   2                  act2          meas2                 Item2           10/3/2012
99999   2                  act2          meas2                 Item2           10/4/2012
0
Comment
Question by:newtoperlpgm
  • 2
  • 2
5 Comments
 
LVL 73

Accepted Solution

by:
sdstuber earned 250 total points
ID: 39848588
I'm not sure why your subquery was against tbl2.  That seems to ensure that each time you load data, you'll reinsert the same data you already read before.

Instead, just read table1 one time and sort it to pick out the rows you need

INSERT INTO my_tbl2
    SELECT bcode,
           sampleid,
           activity,
           measurement,
           itemname,
           loaddate
      FROM (SELECT t.*, ROW_NUMBER() OVER(PARTITION BY bcode,sampleid ORDER BY loaddate DESC) rn
              FROM my_tbl1 t)
     WHERE rn = 1
order by bcode
0
 

Author Comment

by:newtoperlpgm
ID: 39848629
Hi, that was a typo, it should have said tbl1, but that doesn't matter anymore, because the query you supplied worked great.  Can you explain the query, though, what it does, because I spent many hours trying to find a query that would obtain the desired results, that is, the rows that were the most recently updated for the bcode and sampleid, but couldn't come up with one.  If I understand the query, I can apply that knowledge I just learned to subsequent sql tasks.
Thanks very much.
0
 
LVL 35

Assisted Solution

by:Mark Geerlings
Mark Geerlings earned 250 total points
ID: 39848660
I think you already identified the problem: "the rows that were the most recently updated for the bcode and sampleid", but your query only looked at: bcode.

If you adjust your original query like this, I think that will also work:

INSERT INTO MY_TBL2
SELECT * FROM MY_TBL1 W
WHERE LOADDATE = (SELECT MAX(LOADDATE)
            FROM MY_TBL1 V
            WHERE V.BCODE = W.BCODE
               AND V.SAMPLID = W.SAMPLEID)
            order by BCODE
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 39848684
>>>> Can you explain the query, though, what it does, because I spent many hours trying to find a query

work it from the inside out..

This is the inner most query...


(SELECT t.*, ROW_NUMBER() OVER(PARTITION BY bcode,sampleid ORDER BY loaddate DESC) rn
              FROM my_tbl1 t)


this pulls everything from your table, but adds one new column to the results.  
This numbers your data in groups (partitions)

each bcode/sampleid  is taken as one group.
The rows within each group are numbered by date in descending order (so the latest is 1, next is 2 and so on)


BCODE SAMPLEID  ACTIVITY  MEASUREMENT  ITEMNAME  LOADDATE   RN
99999   1                  act1          meas1                 Item1           10/3/2012   2
99999   1                  act1          meas1                 Item1           10/4/2012   1
99999   2                  act2          meas2                 Item2           10/3/2012   2
99999   2                  act2          meas2                 Item2           10/4/2012    1


Then,  in the outer query, I keep only the rows that have rn=1,  so that's the latest for each group/partition.
0
 

Author Comment

by:newtoperlpgm
ID: 39861448
Thank you for the explanation.
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
VB.Net - Oracle BulkCopy from CSV Date Format 7 59
Oracle and DateTime math 6 37
constraint check 2 40
PL/SQL Two changes 7 27
Cursors in Oracle: A cursor is used to process individual rows returned by database system for a query. In oracle every SQL statement executed by the oracle server has a private area. This area contains information about the SQL statement and the…
This post first appeared at Oracleinaction  (http://oracleinaction.com/undo-and-redo-in-oracle/)by Anju Garg (Myself). I  will demonstrate that undo for DML’s is stored both in undo tablespace and online redo logs. Then, we will analyze the reaso…
This video shows how to Export data from an Oracle database using the Original Export Utility.  The corresponding Import utility, which works the same way is referenced, but not demonstrated.
Via a live example, show how to restore a database from backup after a simulated disk failure using RMAN.

813 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now