Josh Price

MOSS_DELTAIMPORT stage very slow after User Profile Service synchronization reset

I realize it's entirely possible that this kind of behavior is normal, but I'd appreciate confirmation or correction...

Our synchronization was no longer picking up new AD users, and after many hours of tinkering, I followed this article to reset the sync. I reconnected AD (leaving a BDC connection out) and ran a full sync. 36 hours later, it had completed with all of the new users we were missing.

However, the next incremental sync is where I saw issues. According to the MIIS client, the process hit the MOSS_DELTAIMPORT step and slowed down incredibly. It was averaging about 6 objects per minute, which is not good with 200,000 objects to process.
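
To put that rate in perspective, here is a rough back-of-the-envelope sketch. It only uses the two figures quoted above and assumes the rate stays constant for the whole run, which may not hold in practice; the function name `eta_days` is just an illustrative helper, not part of any SharePoint or FIM tooling.

```python
# Rough estimate of how long MOSS_DELTAIMPORT would take at the observed rate.
# Assumption: the rate stays constant for the entire run.
TOTAL_OBJECTS = 200_000   # objects to process (figure from the post)
OBJECTS_PER_MINUTE = 6    # observed average rate (figure from the post)


def eta_days(total_objects: int, objects_per_minute: float) -> float:
    """Return the estimated run time in days at a constant processing rate."""
    minutes = total_objects / objects_per_minute
    return minutes / 60 / 24  # minutes -> hours -> days


if __name__ == "__main__":
    print(f"{eta_days(TOTAL_OBJECTS, OBJECTS_PER_MINUTE):.1f} days")  # prints "23.1 days"
```

At roughly three weeks per incremental run, the step would never finish before the next scheduled sync is due.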

I let it run for days and it never sped up. The timer job entered the 'Pausing' state, and a few days later I had had enough. I stopped it, restarted the services, and started the incremental sync over. Once again, it hit the MOSS_DELTAIMPORT step and essentially stopped. Days later, a mere 2,500 objects had been processed and the timer job was in the 'Pausing' state again. Rinse and repeat a couple more times.

One constant (aside from the speed) was that, according to the statistics in the MIIS client, most of the objects/users appeared as both Deletes and Adds. I assume this happens because the sync service needs to reconcile the users already in the profile store with the AD data in the metaverse. However, I would not expect this step to take about 100x longer than any other step in the sync process.

I've looked through logs and I've been tracking performance data on our SQL server and the application servers. The FIM service and the UPS service rarely exceed 5% CPU and DB IO is in the KB/s range. Aside from the timer job slipping into the 'Pausing' state, there's no evidence to suggest that the process isn't continuing to run.

I don't know what information I need to provide to get some insight here, but I will do what I can to offer what I have. I wanted to provide a little background and touch on the troubleshooting/monitoring I've been doing for over a month now. Is this normal behavior, or do I need to investigate further into logs to find a possible cause for this?

Again, any insight anyone could provide would be wonderful. I have nowhere to go next and this is a growing concern for our organization.
Justin Smith
What is your farm build number? You can find it in Central Administration under System Settings > Manage servers in this farm.
Josh Price
It is happening in all three of our environments (DEV, QA, PRD). The build numbers are:

DEV/QA: 14.0.6137.5002 (April 2013 CU)
PRD: 14.0.6112.5000 (October 2011)
Josh Price
I did not receive any further input on this question, and I failed to find any useful answers on TechNet or on EE. I hate having to recreate something to fix a problem - sometimes it's like using a nuke on pesky ground moles.