Solved

Metrics to diagnose ETL Jobs load performance in Informatica

Posted on 2014-10-12
2
904 Views
Last Modified: 2014-10-20
Hi,

I am managing an ETL team now and I learnt that it took informatica 27 Hours to load 85 Million rows. In my opinion this is unacceptable.

What metrics (and from what tool or repository of informatica) I can get more diagnostic information about

1. The source database is slow
2. The target database is slow
3. Missing indexes issues
4. Transforms taking time
5. Staging database is slow
6. Job was queued in the system ( so the actual time can be 5 hour but for 22 hours it was in a queue)

Anything else you might think of. My point is simple I want the team to look for all kinds of optimization opportunities to speed up this job or work with the vendor to get this fixed

thanks
-anshu
0
Comment
Question by:anshuma
2 Comments
 
LVL 45

Accepted Solution

by:
aikimark earned 400 total points
ID: 40376766
Contact Informatica immediately.

In such cases, I usually advise my clients to isolate the problems.  For instance, just for the extract phase, I would ask...
* is the source database busy when you want to run the extract?
* is the extract job (network) 'close' to the source database server?
* what is the speed and capacity of the network path from the source database to the extract job?
* is the extract job running on a busy server?
* is the extract job running on a resource constrained server?
* is the extract job using local storage for its temp/working files?
* what is the fastest possible extract of data from the source, independent of the Informatica software?

All of these questions are independent of the configuration and properties your team has assigned to the ETL job, which should be double-checked.  I've seen cases, in other system utilities, where an option was selected to minimize system resource use.  The trade-off was slow run times.

In another shop, the vendor's tape management system used the first character of the job name to index the job's data in the utility's internal linked-list data structure.  Since the production job names began with "P", there was a very uneven distribution of data and all activities (insert, search, update, delete) took a very loooong.  Rather than replacing the tape management system, the jobs were renamed.
0
 
LVL 74

Assisted Solution

by:sdstuber
sdstuber earned 100 total points
ID: 40376892
You should be able to look in the workflow monitor logs and see which steps are taking a long time.

Then determine what the corresponding SQL statements are and give those to your dba team to investigate.
0

Featured Post

Netscaler Common Configuration How To guides

If you use NetScaler you will want to see these guides. The NetScaler How To Guides show administrators how to get NetScaler up and configured by providing instructions for common scenarios and some not so common ones.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Using SQL Scripts we can save all the SQL queries as files that we use very frequently on our database later point of time. This is one of the feature present under SQL Workshop in Oracle Application Express.
When it comes to protecting Oracle Database servers and systems, there are a ton of myths out there. Here are the most common.
This video shows how to Export data from an Oracle database using the Datapump Export Utility.  The corresponding Datapump Import utility is also discussed and demonstrated.
This video explains what a user managed backup is and shows how to take one, providing a couple of simple example scripts.

808 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question