• Status: Solved

SQL Server 2005 - Fuzzy Grouping Task in SSIS causing TempDB to grow to ridiculously large sizes... need help

First off, I'm a SQL Server newbie and I may have gotten in over my head a little here, but you gotta learn somehow.

I have read everything I can get my hands on about the new Fuzzy Grouping feature in SSIS, and I have created a package that looks for duplicates in one of my DB tables.  The table has 6 fields and about half a million rows.  I need the package to use Fuzzy Grouping to look for "near duplicates" in the table and copy them to another table, where they can be reviewed and eventually have their IDs resolved so that only one entry for each actual "entity" exists in the table.
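To make "near duplicates" concrete, here is a minimal Python sketch of the general idea on a tiny sample. It is not how the SSIS Fuzzy Grouping transformation works internally: it uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, and the `similarity` and `group_near_duplicates` helpers, the 0.85 threshold, and the sample rows are all assumptions made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_near_duplicates(rows, threshold=0.85):
    """Group (id, name) rows whose names are similar to a group's first member.

    Returns only the groups with more than one id, i.e. the candidates
    that would be copied to a review table. The threshold is an assumed
    value for this sketch, not an SSIS default.
    """
    groups = []  # list of (canonical_name, [ids])
    for row_id, name in rows:
        for canonical, ids in groups:
            if similarity(name, canonical) >= threshold:
                ids.append(row_id)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append((name, [row_id]))
    return [ids for _, ids in groups if len(ids) > 1]


# Hypothetical sample data standing in for the real table.
sample = [
    (1, "Acme Corporation"),
    (2, "Acme Corporaton"),
    (3, "Globex Inc"),
    (4, "ACME Corporation"),
]
duplicates = group_near_duplicates(sample)
print(duplicates)
```

Each row is compared against every existing group here, which is fine for a handful of rows but would be far too slow for half a million; it only shows what the review table is meant to contain.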

The package I created works great in my test environment (much smaller table), but when it is run on the production server with the large table, it takes almost a day to run, and the last time I ran it the combined size of the TempDB files was several hundred gigabytes!

I read on MSDN that the size of TempDB can become "quite large", but that's about as descriptive as they get.  I'm sure there is some basic step that I am missing that will keep the size of TempDB from growing out of control, but like I said, I'm new at this stuff, and I may have tried to "run before I really knew how to walk", so to speak.  Regardless, I need to make this work somehow and if anyone can offer some advice I would greatly appreciate it.

1 Solution
Vadim Rapp commented:
I'm sure you are not missing anything. From what BOL (SQL Server Books Online) says about Fuzzy Grouping, and from how often disk space is mentioned, including the recommendation not to run it on a production server (obviously because it can eat up all the disk space), it's clear that what you saw is typical. Remember, it's a brand-new feature, so this is not very surprising.
