Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1088
  • Last Modified:

Metrics to diagnose ETL Jobs load performance in Informatica

Hi,

I am managing an ETL team now and I learnt that it took informatica 27 Hours to load 85 Million rows. In my opinion this is unacceptable.

What metrics (and from what tool or repository of informatica) I can get more diagnostic information about

1. The source database is slow
2. The target database is slow
3. Missing indexes issues
4. Transforms taking time
5. Staging database is slow
6. Job was queued in the system ( so the actual time can be 5 hour but for 22 hours it was in a queue)

Anything else you might think of. My point is simple I want the team to look for all kinds of optimization opportunities to speed up this job or work with the vendor to get this fixed

thanks
-anshu
0
anshuma
Asked:
anshuma
2 Solutions
 
aikimarkCommented:
Contact Informatica immediately.

In such cases, I usually advise my clients to isolate the problems.  For instance, just for the extract phase, I would ask...
* is the source database busy when you want to run the extract?
* is the extract job (network) 'close' to the source database server?
* what is the speed and capacity of the network path from the source database to the extract job?
* is the extract job running on a busy server?
* is the extract job running on a resource constrained server?
* is the extract job using local storage for its temp/working files?
* what is the fastest possible extract of data from the source, independent of the Informatica software?

All of these questions are independent of the configuration and properties your team has assigned to the ETL job, which should be double-checked.  I've seen cases, in other system utilities, where an option was selected to minimize system resource use.  The trade-off was slow run times.

In another shop, the vendor's tape management system used the first character of the job name to index the job's data in the utility's internal linked-list data structure.  Since the production job names began with "P", there was a very uneven distribution of data and all activities (insert, search, update, delete) took a very loooong.  Rather than replacing the tape management system, the jobs were renamed.
0
 
sdstuberCommented:
You should be able to look in the workflow monitor logs and see which steps are taking a long time.

Then determine what the corresponding SQL statements are and give those to your dba team to investigate.
0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Tackle projects and never again get stuck behind a technical roadblock.
Join Now