Batch Performance

This question is related to an earlier question I posed, but I now have more details.

Here is the old question:
I have a client who needs help tuning a batch process. The batch runs every weekend and finishes Monday morning, and the client feels that is too long. The batch process was developed by a third party who is not very cooperative about rectifying the situation. Where do we start? Is there any accepted methodology for tuning a batch process? The developer also hosts the environment for the client, and they keep telling the client that nothing is wrong with it. They have added additional processors and memory on the server, but it has not rectified the situation. Any help with a defined process and the right questions to ask the developer would be appreciated.

New Information:

The batch is a set of stored procedures: about 15 jobs scheduled one after another. The DB is SQL 7 on a Wintel quad-processor box with 2 GB RAM. The disks reside on a SAN with fibre connectivity. The database is 115 GB with 318 tables. The client says the performance is very slow. The batch runs every fortnight, starting Friday evening and completing by Tuesday morning. The party that developed it asked the client to move to a new server (the current Wintel box), but the problem persists and gets worse day by day. About a million transactions are processed in every batch.

There seems to be a lot of referential integrity enforcement, which the developers say slows things down. They are not using tools like SQL Profiler to monitor the system for network or I/O bottlenecks. They also mention that it has to be upgraded to SQL 2000. Are there any merits to that? What are the advantages on the performance side? Should we ask them to denormalize the tables? There are also some backups in the batch, and they take a lot of time. I have started to suspect I/O issues. Can some of you advise? The updates are all done through views; do you think that should be changed?

Wait Job 1
Dist XRef File
Distributor Parent BSB Extract
Daily Dist Details File
Initial Backup Database
Wait Job 2
RPM Extracts
Commission Split
Process Exclusion
Academy Recalc
Remuneration Recalc
Override Recalc
backup after overrides and Before Recalc
SOA Recalc
Re index b4 OR recalc
Backup after Payments
SOA Load
Backup after SOA Load
SOA Control Totals
Dist Transaction File
DCS DistTran Extract
Non Product Commission
SOA Print
SOA Create PDFs
SOA Create CSVs
Final Backup

Can some of you throw some light on likely approaches to the problem? I have obtained an eval copy of a tool called DBArtisan from Embarcadero. Is it good?

"There seems to be a lot of referential integrity enforcement, which the developers say slows things down. They are not using tools like SQL Profiler to monitor the system for network or I/O bottlenecks. They also mention that it has to be upgraded to SQL 2000. Are there any merits to that? What are the advantages on the performance side? Should we ask them to denormalize the tables? There are also some backups in the batch, and they take a lot of time. I have started to suspect I/O issues. Can some of you advise? The updates are all done through views; do you think that should be changed?"

What's the system used for?  Reporting?  Updates?  If strictly reporting, then yes, I would probably consider denormalizing the data to some extent.

Definitely start running SQL Profiler and Perfmon to look at the I/O and memory usage.  What RAID type is the SAN configured for?  If RAID 5, you will take a BIG hit on all write operations.  Is the log file located on the same RAID group/disks as the data?  That is also a bottleneck to get rid of.
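You can answer the file-placement question from the SQL side as well; a quick sketch using system procedures that exist in SQL 7 (the database name is illustrative):

```sql
-- List the physical path of every data and log file for the current database.
-- If the .mdf and .ldf paths point at the same disks/RAID group, that is
-- exactly the contention described above.
USE ClientDb
EXEC sp_helpfile

-- Check how large the transaction log is and how full it gets during the batch.
DBCC SQLPERF(LOGSPACE)
```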

Embarcadero makes a good tool, but personally I would start with the built-in MS products first--you need to get a handle on where you think the problems are....

Also, the job names tell us nothing--post some of the actual procs.....
Jay Toops commented:
First, find out how long each job takes.

I suspect you might be running some rather large update queries somewhere; these can be very slow. But first identify how long each job takes: create a table and insert before and after date/times.
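A minimal sketch of that timing table (all names here are illustrative, not from the actual batch):

```sql
-- One row per job execution; write StartTime at the top of each
-- stored procedure and EndTime at the bottom.
CREATE TABLE dbo.BatchJobLog (
    JobName   varchar(100) NOT NULL,
    StartTime datetime     NOT NULL DEFAULT GETDATE(),
    EndTime   datetime     NULL
)
GO

-- At the start of a job:
INSERT INTO dbo.BatchJobLog (JobName) VALUES ('SOA Recalc')

-- At the end of the same job:
UPDATE dbo.BatchJobLog
SET EndTime = GETDATE()
WHERE JobName = 'SOA Recalc' AND EndTime IS NULL

-- After the batch, rank the jobs by elapsed time:
SELECT JobName, DATEDIFF(minute, StartTime, EndTime) AS ElapsedMin
FROM dbo.BatchJobLog
ORDER BY ElapsedMin DESC
```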



jagishiv (Author) commented:
Do you suggest some metrics to monitor here? The batch timings are borderline against the SLA, and the batch has started to affect online performance as well. Will views be a potential problem, as they are in memory? The system is used for both reporting and updates. The "SOA Create PDFs" job creates all the bank statements. I know the job names mean nothing; I will try to get the SQL behind some of the longer-running ones. I will have more information in the next couple of hours.
jagishiv (Author) commented:
I have narrowed down the run times for the jobs over the last 2 years and picked out the slower-performing ones. The problem is that the transaction volume is not constant. Do we assume a linear relationship? I have tried to plot total job time / transaction volume; this should normalize the data. There are a lot of writes (through the views).
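That normalization can be computed in SQL too; a sketch assuming a hypothetical per-run stats table holding elapsed time and transaction counts:

```sql
-- Seconds of run time per transaction, per job and run. If this ratio
-- trends upward over the 2 years, the slowdown is growing data or
-- fragmentation rather than transaction volume.
SELECT RunDate, JobName,
       ElapsedSec * 1.0 / TranCount AS SecPerTran
FROM dbo.BatchRunStats
ORDER BY JobName, RunDate
```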
You will get the best advice if you post all the code, but since you have so many stored procedures, I suggest you post the code for each stored procedure as a separate question.

If you take arbert's & jltoops's suggestion (which I agree with) and determine how long each stored procedure takes, you can skip any that complete in only a few minutes.

As to general principles for improving stored procedure performance, here are some that may be applicable:
1. Manage indexes well (adding and/or removing indexes can both help, depending on circumstances)
2. Convert cursors to set-based operations, where possible
3. Convert READ_ONLY cursors to FAST_FORWARD cursors where possible
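To illustrate point 2 (table and column names are hypothetical, not from the client's schema):

```sql
-- Row-by-row cursor logic such as "for each pending transaction, add its
-- amount to the account balance" can usually be replaced by a single
-- set-based UPDATE ... FROM against a pre-aggregated derived table,
-- which the optimizer handles as one operation instead of N round trips.
UPDATE a
SET Balance = a.Balance + s.Total
FROM dbo.Accounts a
JOIN (SELECT AccountId, SUM(Amount) AS Total
      FROM dbo.PendingTrans
      GROUP BY AccountId) s
  ON s.AccountId = a.AccountId
```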
jagishiv (Author) commented:
There is no use of cursors. I will try to post the SQL.

jagishiv (Author) commented:
The fill factor for the indexes used is 90%. What does that mean? Does it suggest performance issues?
Straight from Books Online:

"When creating an index, you can specify a fill factor to leave extra gaps and reserve a percentage of free space on each leaf level page of the index to accommodate future expansion in the storage of the table's data and reduce the potential for page splits. The fill factor value is a percentage from 0 to 100 that specifies how much to fill the data pages after the index is created. A value of 100 means the pages will be full and will take the least amount of storage space. This setting should be used only when there will be no changes to the data, for example, on a read-only table. A lower value leaves more empty space on the data pages, which reduces the need to split data pages as indexes grow but requires more storage space. This setting is more appropriate when there will be changes to the data in the table."

If you have a lot of updates/inserts/deletes on a table, you might want to lower this. A value closer to 100 will cause more page splits (and slow things down) as updates occur....
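For example, rebuilding an index with more free space per leaf page (table and index names are illustrative):

```sql
-- Rebuild an existing index at 80% fill, leaving 20% free per leaf page
-- to absorb the batch's inserts and updates without page splits.
DBCC DBREINDEX ('dbo.Commission', 'IX_Commission_DistId', 80)

-- Or set the fill factor when re-creating the index directly:
CREATE INDEX IX_Commission_DistId
ON dbo.Commission (DistId)
WITH FILLFACTOR = 80, DROP_EXISTING
```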

Do you know anything about your RAID setup and log file placement?
jagishiv (Author) commented:
I will have the RAID information this afternoon.

I don't mean to state the obvious, but have you actually been monitoring the server hardware performance?

How many rows are you trying to process during this batch operation? You have 2 GB of memory and a database that is 115 GB, which means most of your operations will be disk-based, as the server cannot hold much of the database in memory. This would be a disk/memory bottleneck if you are processing a lot of data updates.
I would also advise looking at the hardware counters using Performance Monitor in Windows, if possible.
These should be a good start:
Logical Disk / Current Disk Queue Length
Network IO
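From the SQL side, a couple of built-in items give a quick cumulative I/O snapshot you can compare before and after a batch run (both available in SQL 7):

```sql
-- Cumulative physical reads/writes and I/O-busy ticks since server startup;
-- sample before and after the batch and diff the numbers.
SELECT @@TOTAL_READ  AS PhysicalReads,
       @@TOTAL_WRITE AS PhysicalWrites,
       @@IO_BUSY     AS IoBusyTicks

-- Summary report of CPU, I/O, and network activity since the last call:
EXEC sp_monitor
```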

You tend to get faster bang per buck if you upgrade the hardware and get that right; software code changes often take longer to implement, though they can realise some large time savings.
Before you approve any purchases, get this data from the hosting company. If they cannot provide it, the question is: would you trust them to host your system?

You would need to upgrade from SQL 7 to SQL 2000 Enterprise Edition, which also gives you MS optimisations that are not present in SQL 7 Standard Edition (in other words, it runs faster). SQL 2000 is faster than SQL 7, and SQL 2000 EE is suitable for an installation with more than 2 GB of memory.
From your post it looks like you are attempting to process 17 M records on the system in a weekend; is this correct?

jagishiv (Author) commented:
Yes, they use RAID 5, and they store the log file on the same disks as the data. The tempdb size is set to 20 MB with autogrow. I read that this was bad. Can somebody say why?
First, it's really bad to have your logs on the same drives/RAID group as your data, not only for performance reasons but especially from a recoverability standpoint. RAID 5 is a KILLER for updates....

Autogrow isn't really bad; you just usually want to make sure you pre-allocate your database space with room to grow. If the database has to grow while you're processing, it will slow things down and cause your data to become more fragmented....
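Pre-sizing can be done once during a quiet window; a sketch with illustrative logical file names and sizes:

```sql
-- Grow the data file ahead of time so the batch never triggers autogrow.
ALTER DATABASE ClientDb
MODIFY FILE (NAME = ClientDb_Data, SIZE = 140000MB)

-- The same applies to tempdb: a 20 MB starting size forces constant
-- autogrow during heavy sorts and worktable activity, so pre-size it too.
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, SIZE = 500MB)
```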

Jay Toops commented:
Hi, could you award points for this, or post a new question?
I'd really appreciate it.

Ken Selvia commented:
Have we helped you solve your problem? What were the results of the monitoring, and the average times for each step? Have you identified the long-running jobs so we can help optimize those procedures? Until you know what is slow, and why, it's hard to start tuning.
Jay Toops commented:
Recommended split: 70% arbert / 30% jltoops
