Solved

Interbase optimizing a stored procedure, remove index?

Posted on 2002-05-02
8
485 Views
Last Modified: 2006-11-17
I really need help on this one.

I have a database that has several tables and stored procedures to perform some data storage.

Each day the stored procedure below is called to pack data in some tables. This is done by packing some rows into one row and delete some data that is no longer needed after the time specified as an argument to "PASSIVE_PURGE"

My problem is that the procedure takes far to long time, and since it's a stored procedure, the calling application freezes while it waits for the procedure to complete.

I need to optimize the following stored procedure in Interbase, or, in some way process windows messages on the host.

The database I am using has the following indexies.

CREATE INDEX MEW1 ON MEASURE (WashID, DateTime);
CREATE INDEX MED1 ON DRYMEASURE (DryID, DateTime);
CREATE INDEX WAS1 ON WASHES (WashID, MachineID, DateTime);
CREATE INDEX DRY1 ON DRYINGS (DryID, MachineID, DateTime);

Will the procedure spped up if there is no index that needs to be updated during the call?

Or is there any other way I can spped up this procedure?

---

/* PASSIVE_PURGE - Removes and packs data older than PDATE */
CREATE PROCEDURE PASSIVE_PURGE (PDATE Date)
AS
DECLARE VARIABLE WashID Integer;
DECLARE VARIABLE MachineID Smallint;
DECLARE VARIABLE DateTime Date;
DECLARE VARIABLE TotalHourCnt Integer;
DECLARE VARIABLE IdleHourCnt Integer;
DECLARE VARIABLE RunHourCnt Integer;
DECLARE VARIABLE ServiceHourCnt Integer;
DECLARE VARIABLE PrmNum Smallint;
BEGIN
 FOR SELECT MachineID, :PDATE DateTime, SUM(TotalHourCnt) TotalHourCnt, SUM(IdleHourCnt) IdleHourCnt,
 SUM(RunHourCnt) RunHourCnt, SUM(ServiceHourCnt) ServiceHourCnt, PrmNum FROM HOURDATA WHERE DateTime < :PDATE GROUP BY MachineID, PrmNum  
 INTO :MachineID, :DateTime, :TotalHourCnt, :IdleHourCnt, :RunHourCnt, :ServiceHourCnt, :PrmNum
 DO
 BEGIN
  DELETE FROM HOURDATA WHERE MachineID = :MachineID AND PrmNum = :PrmNum AND DateTime < :DateTime;
  INSERT INTO HOURDATA VALUES (:MachineID, :DateTime, :TotalHourCnt, :IdleHourCnt, :RunHourCnt, :ServiceHourCnt, :PrmNum);
 END
 FOR SELECT WashID FROM WASHES WHERE DateTime < :PDATE INTO :WashID
 DO
 BEGIN
  DELETE FROM MEASURE WHERE WashID = :WashID;
  DELETE FROM PVMEASURE WHERE WashID = :WashID;
 END
 DELETE FROM WASHES WHERE DateTime < :PDATE;
 EXIT;
END ##

---
0
Comment
Question by:Probie
  • 2
  • 2
  • 2
  • +2
8 Comments
 
LVL 27

Expert Comment

by:kretzschmar
Comment Utility
add indexes to your datefields
0
 
LVL 1

Author Comment

by:Probie
Comment Utility
What do you mean?

I have the following indexes
CREATE INDEX MEW1 ON MEASURE (WashID, DateTime);
CREATE INDEX WAS1 ON WASHES (WashID, MachineID, DateTime);

Do you think more indexes will increase the performance?

I read somewere that indexes decreases the speed during deletion or insertions. It that correct?
0
 
LVL 27

Expert Comment

by:kretzschmar
Comment Utility
nope, more like

CREATE INDEX WHATEVERNAME ON HOURDATA(DateTime);
CREATE INDEX WHATEVERNAME ON WASHES (DateTime);

a none unique index,
because in your for-selects
it is this field in your whereclauses

>I read somewere that indexes decreases
>the speed during deletion or insertions. It that correct?
yes, thats true, because the server has to do more maintainance

well, i guess, it takes long time to select the records,
because if there is no index on the datetime-field,
the server must lookup each record

do a try and see what happens,
if there is no faster performance,
you can easily drop the indexes and
i'm wrong

meikl ;-)



0
 
LVL 9

Accepted Solution

by:
ornicar earned 100 total points
Comment Utility
If like me, you are lazy and don't want to change the whole thing. Just think about moving the delete statements outside the loops: Dont the records be deleted using the PDATE parameter? You don't filter on MachineID, after all.

then, just a suggestion, I know nothing about Interbase specifics, but on a SQL point of view, the slowness is due to the fact that you are processing in a loop something that finally looks like a record by record update, invoking select statements with where clause on each loop: this is the slowest and the most resource consuming way to achieve this.
You should review the whole thing and assume that you can update as many records as possible at one time.

For instance your second loop:

FOR SELECT WashID FROM WASHES WHERE DateTime < :PDATE INTO :WashID
DO
BEGIN
 DELETE FROM MEASURE WHERE WashID = :WashID;
 DELETE FROM PVMEASURE WHERE WashID = :WashID;
END

could be preformed by the following statement(s):

delete from measure
where washid in (select washid from washes
                    where datetime < :pdate)
delete from pvmeasure
where washid in (select washid from washes
                    where datetime < :pdate)

and even better by using a temp table:

create table temp_wash_delete as
create index temp_wash_delete_idx on temp_wash_delete (washid)
select washid from washes
      where datetime < :pdate)
delete from measure
where washid in (select washid from temp_wash_delete)
delete from pvmeasure
where washid in (select washid from temp_wash_delete)
drop index temp_wash_delete_idx
drop table temp_wash_delete

Same idea for the first loop: create a temp table with the group by query, index it, then work with it using only raw sql statements.

I feel by finally eliminating the loops, your proc will run much faster.

Does Interbase understands the WHERE EXISTS clause? Its faster than the IN clause.

0
How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

 
LVL 1

Author Comment

by:Probie
Comment Utility
Very interesting!
Yes, Interbase supports the WHERE EXISTS clause.

Why is this temp table such a good idea?

Can you please explain the second part (with the temp tables) a little more?

The first part of my procedure is supose to sumerize all "*hour*" columns in the HOURDATA table grouped on MachineID and PrmNum before a certain time. Then it is suposed to replace all the rows for each PrmNum och each MachineID with just one record that holds the sum of the previous records for a MachineID/PrmNum before a date (PDATE)... The procedure basicly lowers the resolution of the records after a certain time. Rows before time PDATE is packed into one record, (One record for each machines program)

I was thinking on replacing the first part of my procedure with:

INSERT INTO HOURDATA SELECT MachineID, :PDATE DateTime, SUM(TotalHourCnt), SUM(IdleHourCnt), SUM(RunHourCnt), SUM(ServiceHourCnt), PrmNum FROM HOURDATA WHERE DateTime < :PDATE GROUP BY MachineID, PrmNum;
DELETE FROM HOURDATA WHERE DateTime < :PDATE;

But there might be a chance that there already exsits a record with the exact time as PDATE, which will cuase an error since my HOURDATA table looks like this:

CREATE     TABLE HOURDATA
(
     MachineID          Smallint Not Null,
     DateTime          Date Not Null,
     TotalHourCnt          Integer,
     IdleHourCnt          Integer,
     RunHourCnt          Integer,
     ServiceHourCnt          Integer,
     PrmNum                  Smallint Not Null,
     
     PRIMARY KEY (MachineID,DateTime,PrmNum)
);

Any ideas?
0
 
LVL 9

Expert Comment

by:ornicar
Comment Utility
For the first part, what you think about looks fine and is straightforward.

For the temp table, I suggested it because you need to execute it twice. With the temp table, you can use twice the result of only one calculation.

It will be useful to create a temp table with your sum() results. There are three tables to clean based on what has been calculated.

so:

CREATE TABLE TEMPRESULTS AS
SELECT MachineID, :PDATE DateTime, SUM(TotalHourCnt), SUM(IdleHourCnt), SUM(RunHourCnt), SUM(ServiceHourCnt), PrmNum FROM HOURDATA WHERE DateTime < :PDATE GROUP BY MachineID, PrmNum;

DELETE FROM HOURDATA HD
WHERE EXISTS (
SELECT * FROM TEMPTABLE TT WHERE TT.MachineID=HD.MachineID
                             AND TT.DateTime=HD.DateTime
                             AND TT.PrmNum=HD.PrmNum)

0
 
LVL 54

Expert Comment

by:nico5038
Comment Utility

No comment has been added lately, so it's time to clean up this TA.
I will leave a recommendation in the Cleanup topic area that this question is:
 - Answered by: ornicar  
Please leave any comments here within the
next seven days.

PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER !

Nic;o)
0
 
LVL 5

Expert Comment

by:Netminder
Comment Utility
Per recommendation, force-accepted.

Netminder
CS Moderator
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

In today’s complex data management environments, it is not unusual for UNIX servers to be dedicated to a particular department, purpose, or database.  As a result, a SAS® data analyst often works with multiple servers, each with its own data storage…
Shadow IT is coming out of the shadows as more businesses are choosing cloud-based applications. It is now a multi-cloud world for most organizations. Simultaneously, most businesses have yet to consolidate with one cloud provider or define an offic…
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

5 Experts available now in Live!

Get 1:1 Help Now