Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 298
  • Last Modified:

MySQL query - Merge statment

Hi Experts,

I am new to MySQL scripting and I am facing error while merging Insert and Update query into single query.

Please suggest me how to merge attached query into single query with Merge option (Insert+Update).

Thanks in advance.
For-Insert.sql
For-Update.sql
0
sqldba2013
Asked:
sqldba2013
  • 9
  • 5
  • 3
  • +1
1 Solution
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
sqldba,
 
  MySQL has implement the "merge" feature as INSERT ... ON DUPLICATE KEY, see the technical reference here:
  http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
  it will require a primary key or unique index on the relevant columns
0
 
sqldba2013Author Commented:
Thanks for your reply.

I have tried with above URL, still I am not able to write correct query and scripting is very horrible for DBA's :)

TX_TASK_ID column is primary column in table TIMESHEET_UPDATE_HOURS
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
I am not really sure where the issue is ...

INSERT INTO yourtable T ( columns ...)
SELECT SQ FROM ( <YOUR SELECT FROM INSERT> ) SQ
ON DUPLICATE KEY UPDATE
  SET T.USER = SQ.USER
   ,T.DAYONE = SQ.DAYONE
  , etc ....
0
Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

 
hnasrCommented:
The sql you provided, is a select statement.

We expect the insert and update statements, as provided by the link Guy provided.

INSERT statement If record is a new record:
INSERT INTO table (a,b,c) VALUES(1,2,3);

UPDATE statement if a record with key a=1 exists:
UPDATE table SET c=c+1 WHERE a=1;

Merging Insert and Update, to insert the record if key a=1 does not exist, and updates c if exists.
INSERT INTO table (a,b,c) VALUES(1,2,3)
ON DUPLICATE KEY UPDATE SET c=c+1;

Open in new window

0
 
PortletPaulCommented:
You haven't provided the right information for a merge which needs:
1. some inserts
2. an update query

You have only supplied select queries for-update-for-insert.png
0
 
sqldba2013Author Commented:
Thanks to all for your suggestions/help.

I have created script myself as per my requirement. However, the attached select statement is part of my script and it was  taking very long time (10 min for 80 rows). Please suggest on which line code I have to modify to speed up the attached query.
Select-Query.txt
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
to speed up is a different question, actually.
in short, you need to use the explain plan for the query (only for the select part), and build up the needed indexes.
note that 80 rows "output" is not the relevant number, but the number of rows that are in the source tables, before filtering, that are relevant
0
 
PortletPaulCommented:
I would say it is the multi-column use of IN()
and (task_id, ENTERED_BY_USER, DATE_ADD(DATE_WORKED, INTERVAL -(DAYOFWEEK(DATE_WORKED) - 2) DAY)) not in
(select TX_TASK_ID, USER, cast(start_date as char) from TIMESHEET_SUMMARY  where ENTERED_BY_USER not like 'eUS_N%')

Open in new window

but it is a complete guess without any ability to test it. I would suggest finding an alternative method to exclude data, perhaps using a join instead.
0
 
sqldba2013Author Commented:
We have already created and tested given select script with indexes and we didn't get any improvements.

@Portletpaul : Could you please suggest alternative method for IN operator for above snippet
0
 
PortletPaulCommented:
I already did - a join (a left join to TIMESHEET_SUMMARY, then exclude matches)

but I cannot attempt this tonight (it's late evening for me) and also would not try it without all table definitions and sample data.

I cannot be certain this is your performance issue - but it looks like it might be.
0
 
sqldba2013Author Commented:
Thanks PortletPaul and find herewith attached table definitions.
Table-Defn.txt
0
 
PortletPaulCommented:
and some data?
0
 
PortletPaulCommented:
Here is one thing to try that doesn't alter your existing query too much, see line 12 below. In your original you reference `ENTERED_BY_USER` in that location which is making it into a correlated subquery as well as an IN() comparison.
        AND (
               task_id
             , ENTERED_BY_USER
             , DATE_ADD(DATE_WORKED, INTERVAL - (DAYOFWEEK(DATE_WORKED) - 2) DAY)
             ) NOT IN (
                       SELECT
                              TX_TASK_ID
                            , USER
                            , cast(start_date AS CHAR)
                       FROM TIMESHEET_SUMMARY
          
                       WHERE USER NOT LIKE 'eUS_N%'
/* change in line above      ^^^^                           */    
                       )
       GROUP BY
              ENTERED_BY_USER
            , task_id
            , DATE_WORKED

Open in new window

0
 
PortletPaulCommented:
Here is an approach using NOT EXISTS () instead of NOT IN ()
SELECT
       TX_TASK_ID
     , USER
     , SUM(DAYONE) DAYONE
     , SUM(DAYTWO) DAYTWO
     , SUM(DAYTHREE) DAYTHREE
     , SUM(DAYFOUR) DAYFOUR
     , SUM(DAYFIVE) DAYFIVE
     , SUM(DAYSIX) DAYSIX
     , SUM(DAYSEVEN) DAYSEVEN
     , NETWORK_CODE
     , ACTIVITY_CODE
     , ACTIVE
     , SUM(TOTAL)
     , START_DATE
     , START_DAY
     , END_DATE
FROM (
       SELECT
              task_id TX_TASK_ID
            , ENTERED_BY_USER USER
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 1 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYONE
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 2 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYTWO
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 3 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYTHREE
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 4 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYFOUR
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 5 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYFIVE
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 6 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYSIX
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 0 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYSEVEN
            , NULL NETWORK_CODE
            , NULL ACTIVITY_CODE
            , 1 ACTIVE
            , sum(HOURS_WORKED) TOTAL
            , DATE_ADD(DATE_WORKED, INTERVAL - (DAYOFWEEK(DATE_WORKED) - 2) DAY) AS START_DATE
            , DATE_ADD(DATE_WORKED, INTERVAL - (DAYOFWEEK(DATE_WORKED) - 2 + 7) DAY) AS END_DATE
            , NULL START_DAY
       FROM TIMESHEET_UPDATE_HOURS tuh
       WHERE ACTUAL_DATE_ENTERED >= '2014-07-31'
       AND ENTERED_BY_USER NOT LIKE 'eUS_N%'
       AND NOT EXISTS (
                        SELECT NULL
                        FROM TIMESHEET_SUMMARY ts
                        where ts.TX_TASK_ID = tuh.task_id
                        AND ts.USER = tuh.ENTERED_BY_USER
                        AND cast(ts.start_date AS CHAR) = DATE_ADD(tuh.DATE_WORKED, INTERVAL - (DAYOFWEEK(tuh.DATE_WORKED) - 2) DAY)
                       )
  
       GROUP BY
              ENTERED_BY_USER
            , task_id
            , DATE_WORKED
     ) f
GROUP BY
     USER
   , TX_TASK_ID
   , NETWORK_CODE
   , ACTIVITY_CODE
   , ACTIVE
   , START_DATE
   , END_DATE
   , START_DAY
;

Open in new window

note you should propagate use of the alias tuh which is needed in the correlated subquery
0
 
PortletPaulCommented:
I should point out that that this will work against performance here:

comparing

         cast(ts.start_date AS CHAR)
to
         DATE_ADD(tuh.DATE_WORKED, INTERVAL - (DAYOFWEEK(tuh.DATE_WORKED) - 2) DAY)

Is it absolutely necessary to  cast(ts.start_date AS CHAR)  ????

---

and don't get me started on storing dates as varchars!
never a good thing to do IMO
0
 
PortletPaulCommented:
and, here is an approach using a left join:
SELECT
       TX_TASK_ID
     , USER
     , SUM(DAYONE) DAYONE
     , SUM(DAYTWO) DAYTWO
     , SUM(DAYTHREE) DAYTHREE
     , SUM(DAYFOUR) DAYFOUR
     , SUM(DAYFIVE) DAYFIVE
     , SUM(DAYSIX) DAYSIX
     , SUM(DAYSEVEN) DAYSEVEN
     , NETWORK_CODE
     , ACTIVITY_CODE
     , ACTIVE
     , SUM(TOTAL)
     , START_DATE
     , START_DAY
     , END_DATE
FROM (
       SELECT
              task_id TX_TASK_ID
            , ENTERED_BY_USER USER
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 1 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYONE
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 2 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYTWO
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 3 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYTHREE
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 4 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYFOUR
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 5 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYFIVE
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 6 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYSIX
            , CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 0 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYSEVEN
            , NULL NETWORK_CODE
            , NULL ACTIVITY_CODE
            , 1 ACTIVE
            , sum(HOURS_WORKED) TOTAL
            , DATE_ADD(DATE_WORKED, INTERVAL - (DAYOFWEEK(DATE_WORKED) - 2) DAY) AS START_DATE
            , DATE_ADD(DATE_WORKED, INTERVAL - (DAYOFWEEK(DATE_WORKED) - 2 + 7) DAY) AS END_DATE
            , NULL START_DAY
       FROM TIMESHEET_UPDATE_HOURS tuh
  
       LEFT JOIN (
                        SELECT
                               TX_TASK_ID
                             , USER
                             , cast(start_date AS CHAR) as start_date_char
                        FROM TIMESHEET_SUMMARY ts
                        ) ts  ON ts.TX_TASK_ID = tuh.task_id
                                   AND ts.USER = tuh.ENTERED_BY_USER
                                   AND start_date_char = DATE_ADD(tuh.DATE_WORKED, INTERVAL - (DAYOFWEEK(tuh.DATE_WORKED) - 2) DAY)

       WHERE ACTUAL_DATE_ENTERED >= '2014-07-31'
       AND ENTERED_BY_USER NOT LIKE 'eUS_N%'
       AND ts.TX_TASK_ID IS NULL
/*     ^^^^^^^^^^^^^^^^^^^^^^^^^   */
       GROUP BY
              ENTERED_BY_USER
            , task_id
            , DATE_WORKED
     ) f
GROUP BY
     USER
   , TX_TASK_ID
   , NETWORK_CODE
   , ACTIVITY_CODE
   , ACTIVE
   , START_DATE
   , END_DATE
   , START_DAY
;

Open in new window

0
 
sqldba2013Author Commented:
Its working perfect...Thanks a lot for your help.
0
 
PortletPaulCommented:
No problem, glad I could help. Cheers, Paul
0

Featured Post

Free Tool: Subnet Calculator

The subnet calculator helps you design networks by taking an IP address and network mask and returning information such as network, broadcast address, and host range.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

  • 9
  • 5
  • 3
  • +1
Tackle projects and never again get stuck behind a technical roadblock.
Join Now