jagishiv

asked on

Batch Performance

This question is related to an earlier question I posed, but I have more details now:

Here is the old question:
I have a client who needs help tuning a batch process. The batch is run every weekend and finishes Monday morning. The client feels that is too long for the batch to take. The batch process was developed by a third party who is not very cooperative about rectifying the situation. Where do we start? Is there an established methodology for tuning a batch process? The developer also hosts the environment for the client, and they keep telling the client that there is nothing wrong with it. They have added an additional processor and memory on the server, but that has not rectified the situation. Any help with a defined process and the right questions to ask the developer would be appreciated.

New Information:

The batch is a set of stored procedures, with about 15 jobs scheduled one after the other. The DB is SQL 7 on a Wintel quad-processor box with 2 GB of RAM. The disks reside on a SAN with fibre connectivity. The database is 115 GB with 318 tables. The client says the performance is very slow. The batch is run every fortnight, starting Friday evening, and completes by Tuesday morning. The party which developed it asked the client to move to a new server (the current Wintel box), but the problems persist and get worse day by day. There are a million transactions processed in every batch.

There seems to be a lot of referential integrity, which according to the developers slows things down. They are not using any tools like SQL Profiler to monitor the system for network or I/O bottlenecks. They also mention that it has to be upgraded to SQL 2000. Are there any merits to that? What are the advantages from the performance side? Should we ask them to denormalize the tables? There are also some backups in the batch, and they take a lot of time. I have started to suspect I/O issues. Can some of you advise? The updates are all done through views; do you think that should be changed?

Job names:
Wait Job 1
CAPSFeed OFF
Dist XRef File
Distributor Parent BSB Extract
Daily Dist Details File
Initial Backup Database
Wait Job 2
RPM Extracts
Commission Split
Uploads
Process Exclusion
Academy Recalc
Remuneration Recalc
Override Recalc
backup after overrides and Before Recalc
SOA Recalc
Payments
OLAS AP
Re index b4 OR recalc
Backup after Payments
SOA Load
Backup after SOA Load
SOA Control Totals
OLAS GL
Dist Transaction File
DCS DistTran Extract
Non Product Commission
SOA Print
SOA Create PDFs
SOA Create CSVs
Final Backup
CAPSFeed ON


Can some of you throw some light on likely approaches to the problem? I have got an evaluation copy of a tool called DBArtisan from Embarcadero. Is it good?
arbert

"there seem to be lot of referential integrity which according to the developers slows down. They are not using any tools like SQL profiler to monitor the system for network, i/o bottlencks. They also mention that it has to be upgraded to SQL 2000. Are there any merits? what are the advantages from the performance side. Do we ask them to denormalize the tables? There are also some backups and it also takes a lot of time. i have started to suspect i/o issues. Can some of u advise. the updates are all through views; Do u think it should be changed?"

What's the system used for?  Reporting?  Updates?  If strictly reporting, then ya, I would probably consider denormalizing the data to some extent.

Definitely start running SQL Profiler and Perfmon to look at the I/O and memory usage. What RAID type is the SAN configured for? If RAID 5, you will take a BIG hit on all write operations. Is the log file located on the same RAID group/disks as the data? That is also a bottleneck to get rid of.
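A quick way to check the log file placement question before the RAID details come back is to ask the server itself. A minimal sketch, assuming only the standard system procedures that ship with SQL 7/2000 (the database name is a placeholder):

USE ClientDatabase            -- placeholder; substitute the real database name
-- Lists every file in the current database: logical name, physical path,
-- size and growth settings. If the .LDF lives on the same volume/RAID
-- group as the .MDF files, log writes compete with data I/O.
EXEC sp_helpfile

-- Log space used per database; a log that keeps growing during the
-- batch is another sign of heavy write activity.
DBCC SQLPERF(LOGSPACE)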

Embarcadero is a good tool, but personally, I would start with the built-in MS products first--you need to get a handle on where you think the problems are....

Also, the job names tell us nothing--post some of the actual procs.....
ASKER CERTIFIED SOLUTION
Jay Toops
[Accepted solution available to Experts Exchange members only.]
jagishiv (Asker)
Do you suggest some metrics that should be monitored here? The batch timings are borderline against the SLA, and it has started to affect the online performance as well. Could the views be a potential problem, as they are in memory? The system is used for both reporting and updates. The SOA Create PDFs job creates all the bank statements. I know the job names mean nothing on their own; I will try to get the SQL behind some of the longer-running ones. I will have more information in the next couple of hours.
I have pulled together the timings for the jobs run over the last two years and picked out the slower-performing ones. The problem is that the transaction volume is not constant. Do we assume a linear relationship? I have tried to plot total job time / transaction volume; this should normalize the data. There are a lot of writes (through the views).
You can get the best advice if you post all the code, but since you have so many stored procedures, I suggest you post the code for each one as a separate question.

If you take arbert & jltoops' suggestion (which I agree with) and determine how long each stored procedure takes, you could skip any that complete in only a few minutes.
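If the third party will not instrument the jobs, the timings can be captured crudely from the SQL side; a minimal sketch of a timing table and wrapper pattern (every object name here is invented for illustration, not taken from the client's batch):

-- One-off: a table to collect per-step timings across batch runs.
CREATE TABLE dbo.BatchStepTiming (
    RunDate   datetime      NOT NULL DEFAULT GETDATE(),
    StepName  varchar(100)  NOT NULL,
    StartTime datetime      NOT NULL,
    EndTime   datetime      NOT NULL
)

-- Wrap each scheduled step the same way, e.g. around the SOA Recalc job:
DECLARE @t datetime
SET @t = GETDATE()

EXEC dbo.usp_SOA_Recalc    -- hypothetical name for the real procedure

INSERT INTO dbo.BatchStepTiming (StepName, StartTime, EndTime)
VALUES ('SOA Recalc', @t, GETDATE())

With a couple of runs logged, step duration divided by that run's transaction volume gives the normalized figure mentioned above.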

As to general principles for improving stored procedure performance, here are some that may be applicable:
1. Manage indexes well (adding and/or removing indexes can both help, depending on circumstances)
2. Convert cursors to set-based operations, where possible (see the sketch below)
3. Convert READ_ONLY cursors to FAST_FORWARD cursors where possible
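For point 2, the contrast is a row-at-a-time loop versus a single statement that touches every qualifying row; a minimal illustration only, with table and column names invented rather than taken from the client's schema:

-- Set-based alternative to a cursor loop: one UPDATE recalculates every
-- row in the run instead of fetching and updating them one by one.
UPDATE c
SET    CommissionAmount = c.SaleAmount * r.CommissionRate
FROM   dbo.Commission c
JOIN   dbo.CommissionRate r ON r.ProductCode = c.ProductCode
WHERE  c.RunDate = '20040101'    -- illustrative filter for one batch run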
Cursors are not used. I will try to post the SQL.

The fill factor for the indexes used is 90%. What does that mean? Does it suggest performance issues?
Straight from Books Online:

"When creating an index, you can specify a fill factor to leave extra gaps and reserve a percentage of free space on each leaf level page of the index to accommodate future expansion in the storage of the table's data and reduce the potential for page splits. The fill factor value is a percentage from 0 to 100 that specifies how much to fill the data pages after the index is created. A value of 100 means the pages will be full and will take the least amount of storage space. This setting should be used only when there will be no changes to the data, for example, on a read-only table. A lower value leaves more empty space on the data pages, which reduces the need to split data pages as indexes grow but requires more storage space. This setting is more appropriate when there will be changes to the data in the table."


If you have a lot of updates/inserts/deletes on a table, you might want to lower this. A value closer to 100 will cause page splits (and slow things down) with more updates....
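If the heavily updated tables are splitting pages, rebuilding their indexes with a lower fill factor is a small change; a minimal sketch using the SQL 7/2000-era DBCC commands (the table name is invented for illustration):

-- Rebuild all indexes on the table ('' = every index) with 80% fill,
-- leaving 20% free space per leaf page to absorb inserts and updates.
DBCC DBREINDEX ('dbo.Commission', '', 80)

-- Check fragmentation before and after; scan density near 100% is good.
-- SQL 2000 accepts the table name; on SQL 7 pass OBJECT_ID('Commission').
DBCC SHOWCONTIG ('Commission')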

Do you know anything about your RAID setup and log file placement?
I will have the RAID information this afternoon

I don't mean to state the obvious, but have you actually been monitoring the server hardware performance?

How many rows are you trying to process during this batch operation? You have 2 GB of memory and a database that is 115 GB, which means most of your operations will be disk-bound, as SQL Server cannot hold much of the database in memory. That would be a disk/memory bottleneck if you are processing a lot of data updates.
I would also advise looking at the hardware counters using Performance Monitor (perfmon) in Windows, if possible.
These should be a good start:
Logical Disk / Current Disk Queue Length
Memory
Network IO
Processor
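Alongside the Windows counters, SQL Server exposes its own counters through master..sysperfinfo in SQL 7/2000, so the same numbers can be sampled from a query while the batch runs; a minimal sketch:

-- Buffer cache hit ratio and page I/O from inside SQL Server.
-- The ratio is reported as value/base, so divide the first counter by
-- the second to get a percentage; with 2 GB of RAM against a 115 GB
-- database, expect it to drop while the big recalc steps run.
SELECT object_name, counter_name, cntr_value
FROM   master..sysperfinfo
WHERE  counter_name IN ('Buffer cache hit ratio',
                        'Buffer cache hit ratio base',
                        'Page reads/sec',
                        'Page writes/sec')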

You tend to get faster bang per buck if you upgrade the hardware and get that right; software code changes often take longer to implement, though they can realize some large time savings.
Before you approve any purchases, get this data from the hosting company. If they cannot provide it, the question is: would you trust them to host your system?

You would need to upgrade from SQL 7 to SQL 2000 Enterprise Edition, which also gives you MS optimisations that are not present in SQL 7 Standard Edition (in other words, it runs faster). SQL 2000 is faster than SQL 7, and SQL 2000 EE is the edition suited to an installation with more than 2 GB of memory.
From your post it looks like you are attempting to process 17 million records on the system in a weekend; is this correct?

Yes, they use RAID 5, and they store the log file on the same disks as the data. The tempdb size is set to 20 MB with autogrow. I read that this was bad. Can somebody say why?
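On the tempdb point: at 20 MB with autogrow, tempdb has to grow in many small increments while the batch runs, and each growth pauses whatever work triggered it. Pre-sizing it once avoids that; a minimal sketch, assuming tempdb still uses its default logical file names (tempdev, templog) and with sizes that are illustrative rather than measured:

-- Pre-size tempdb so it never has to autogrow mid-batch.
-- Base the real sizes on observed usage during a full run.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 2000MB)
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, SIZE = 500MB)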
SOLUTION
[Solution available to Experts Exchange members only.]
Hi, could you award points for this, or post a new question? I'd really appreciate it.

Jay
Have we helped you to solve your problem? What were the results of the monitoring, and the average times for each step? Have you identified the long-running jobs so we might be able to help optimize those procedures? Until you know what is slow, and why, it's hard to start tuning.
Recommended split: 70% arbert, 30% jltoops.

Jay