sqldba2013
asked on
MySQL query - Merge statment
Hi Experts,
I am new to MySQL scripting and I am facing error while merging Insert and Update query into single query.
Please suggest me how to merge attached query into single query with Merge option (Insert+Update).
Thanks in advance.
For-Insert.sql
For-Update.sql
I am new to MySQL scripting and I am facing error while merging Insert and Update query into single query.
Please suggest me how to merge attached query into single query with Merge option (Insert+Update).
Thanks in advance.
For-Insert.sql
For-Update.sql
ASKER
Thanks for your reply.
I have tried with above URL, still I am not able to write correct query and scripting is very horrible for DBA's :)
TX_TASK_ID column is primary column in table TIMESHEET_UPDATE_HOURS
I have tried with above URL, still I am not able to write correct query and scripting is very horrible for DBA's :)
TX_TASK_ID column is primary column in table TIMESHEET_UPDATE_HOURS
I am not really sure where the issue is ...
INSERT INTO yourtable T ( columns ...)
SELECT SQ FROM ( <YOUR SELECT FROM INSERT> ) SQ
ON DUPLICATE KEY UPDATE
SET T.USER = SQ.USER
,T.DAYONE = SQ.DAYONE
, etc ....
INSERT INTO yourtable T ( columns ...)
SELECT SQ FROM ( <YOUR SELECT FROM INSERT> ) SQ
ON DUPLICATE KEY UPDATE
SET T.USER = SQ.USER
,T.DAYONE = SQ.DAYONE
, etc ....
The sql you provided, is a select statement.
We expect the insert and update statements, as provided by the link Guy provided.
INSERT statement If record is a new record:
INSERT INTO table (a,b,c) VALUES(1,2,3);
UPDATE statement if a record with key a=1 exists:
UPDATE table SET c=c+1 WHERE a=1;
Merging Insert and Update, to insert the record if key a=1 does not exist, and updates c if exists.
We expect the insert and update statements, as provided by the link Guy provided.
INSERT statement If record is a new record:
INSERT INTO table (a,b,c) VALUES(1,2,3);
UPDATE statement if a record with key a=1 exists:
UPDATE table SET c=c+1 WHERE a=1;
Merging Insert and Update, to insert the record if key a=1 does not exist, and updates c if exists.
INSERT INTO table (a,b,c) VALUES(1,2,3)
ON DUPLICATE KEY UPDATE SET c=c+1;
ASKER
Thanks to all for your suggestions/help.
I have created script myself as per my requirement. However, the attached select statement is part of my script and it was taking very long time (10 min for 80 rows). Please suggest on which line code I have to modify to speed up the attached query.
Select-Query.txt
I have created script myself as per my requirement. However, the attached select statement is part of my script and it was taking very long time (10 min for 80 rows). Please suggest on which line code I have to modify to speed up the attached query.
Select-Query.txt
to speed up is a different question, actually.
in short, you need to use the explain plan for the query (only for the select part), and build up the needed indexes.
note that 80 rows "output" is not the relevant number, but the number of rows that are in the source tables, before filtering, that are relevant
in short, you need to use the explain plan for the query (only for the select part), and build up the needed indexes.
note that 80 rows "output" is not the relevant number, but the number of rows that are in the source tables, before filtering, that are relevant
I would say it is the multi-column use of IN()
and (task_id, ENTERED_BY_USER, DATE_ADD(DATE_WORKED, INTERVAL -(DAYOFWEEK(DATE_WORKED) - 2) DAY)) not in
(select TX_TASK_ID, USER, cast(start_date as char) from TIMESHEET_SUMMARY where ENTERED_BY_USER not like 'eUS_N%')
but it is a complete guess without any ability to test it. I would suggest finding an alternative method to exclude data, perhaps using a join instead.
ASKER
We have already created and tested given select script with indexes and we didn't get any improvements.
@Portletpaul : Could you please suggest alternative method for IN operator for above snippet
@Portletpaul : Could you please suggest alternative method for IN operator for above snippet
I already did - a join (a left join to TIMESHEET_SUMMARY, then exclude matches)
but I cannot attempt this tonight (it's late evening for me) and also would not try it without all table definitions and sample data.
I cannot be certain this is your performance issue - but it looks like it might be.
but I cannot attempt this tonight (it's late evening for me) and also would not try it without all table definitions and sample data.
I cannot be certain this is your performance issue - but it looks like it might be.
ASKER
Thanks PortletPaul and find herewith attached table definitions.
Table-Defn.txt
Table-Defn.txt
and some data?
Here is one thing to try that doesn't alter your existing query too much, see line 12 below. In your original you reference `ENTERED_BY_USER` in that location which is making it into a correlated subquery as well as an IN() comparison.
AND (
task_id
, ENTERED_BY_USER
, DATE_ADD(DATE_WORKED, INTERVAL - (DAYOFWEEK(DATE_WORKED) - 2) DAY)
) NOT IN (
SELECT
TX_TASK_ID
, USER
, cast(start_date AS CHAR)
FROM TIMESHEET_SUMMARY
WHERE USER NOT LIKE 'eUS_N%'
/* change in line above ^^^^ */
)
GROUP BY
ENTERED_BY_USER
, task_id
, DATE_WORKED
Here is an approach using NOT EXISTS () instead of NOT IN ()
SELECT
TX_TASK_ID
, USER
, SUM(DAYONE) DAYONE
, SUM(DAYTWO) DAYTWO
, SUM(DAYTHREE) DAYTHREE
, SUM(DAYFOUR) DAYFOUR
, SUM(DAYFIVE) DAYFIVE
, SUM(DAYSIX) DAYSIX
, SUM(DAYSEVEN) DAYSEVEN
, NETWORK_CODE
, ACTIVITY_CODE
, ACTIVE
, SUM(TOTAL)
, START_DATE
, START_DAY
, END_DATE
FROM (
SELECT
task_id TX_TASK_ID
, ENTERED_BY_USER USER
, CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 1 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYONE
, CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 2 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYTWO
, CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 3 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYTHREE
, CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 4 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYFOUR
, CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 5 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYFIVE
, CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 6 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYSIX
, CASE WHEN (DAYOFWEEK(DATE_WORKED) - 1) = 0 THEN sum(HOURS_WORKED) ELSE 0.0 END DAYSEVEN
, NULL NETWORK_CODE
, NULL ACTIVITY_CODE
, 1 ACTIVE
, sum(HOURS_WORKED) TOTAL
, DATE_ADD(DATE_WORKED, INTERVAL - (DAYOFWEEK(DATE_WORKED) - 2) DAY) AS START_DATE
, DATE_ADD(DATE_WORKED, INTERVAL - (DAYOFWEEK(DATE_WORKED) - 2 + 7) DAY) AS END_DATE
, NULL START_DAY
FROM TIMESHEET_UPDATE_HOURS tuh
WHERE ACTUAL_DATE_ENTERED >= '2014-07-31'
AND ENTERED_BY_USER NOT LIKE 'eUS_N%'
AND NOT EXISTS (
SELECT NULL
FROM TIMESHEET_SUMMARY ts
where ts.TX_TASK_ID = tuh.task_id
AND ts.USER = tuh.ENTERED_BY_USER
AND cast(ts.start_date AS CHAR) = DATE_ADD(tuh.DATE_WORKED, INTERVAL - (DAYOFWEEK(tuh.DATE_WORKED) - 2) DAY)
)
GROUP BY
ENTERED_BY_USER
, task_id
, DATE_WORKED
) f
GROUP BY
USER
, TX_TASK_ID
, NETWORK_CODE
, ACTIVITY_CODE
, ACTIVE
, START_DATE
, END_DATE
, START_DAY
;
note you should propagate use of the alias tuh which is needed in the correlated subquery
I should point out that that this will work against performance here:
comparing
cast(ts.start_date AS CHAR)
to
DATE_ADD(tuh.DATE_WORKED, INTERVAL - (DAYOFWEEK(tuh.DATE_WORKED ) - 2) DAY)
Is it absolutely necessary to cast(ts.start_date AS CHAR) ????
---
and don't get me started on storing dates as varchars!
never a good thing to do IMO
comparing
cast(ts.start_date AS CHAR)
to
DATE_ADD(tuh.DATE_WORKED, INTERVAL - (DAYOFWEEK(tuh.DATE_WORKED
Is it absolutely necessary to cast(ts.start_date AS CHAR) ????
---
and don't get me started on storing dates as varchars!
never a good thing to do IMO
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Its working perfect...Thanks a lot for your help.
No problem, glad I could help. Cheers, Paul
MySQL has implement the "merge" feature as INSERT ... ON DUPLICATE KEY, see the technical reference here:
http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
it will require a primary key or unique index on the relevant columns