• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 3747
  • Last Modified:

SSIS Script Cache Connection Manager

I'm trying to script a cache connection manager in my package. Mainly because I need to reuse the same connection manager many times throughout the execution of the package and I keep getting the error similar to "cannot initialize [name] since it is already in use". I tried setting "retain data = false" and that didn't do the trick either. Hence my choice to scrip it so I can control when it's built/disposed.

That being said...I get my code to run through the package in a script component just fine, but it doesn't put a connection manager anywhere in my package (I have a breakpoint on the transformation right after the script so I can see). Any clues as to what I need to add?

Thanks!!
ConnectionManager conManager;
            conManager = Dts.Connections.Add("CACHE");
            conManager.Name = "test";
            conManager.Description = "please work";

            IDTSConnectionManagerCache100 cacheConnectionManager =
                       (IDTSConnectionManagerCache100)Dts.Connections["test"].InnerObject;

            IDTSConnectionManagerCacheColumn100 ID = cacheConnectionManager.Columns.Add();
            IDTSConnectionManagerCacheColumn100 Age = cacheConnectionManager.Columns.Add();
            IDTSConnectionManagerCacheColumn100 Year = cacheConnectionManager.Columns.Add();
            IDTSConnectionManagerCacheColumn100 LastName = cacheConnectionManager.Columns.Add();
            
            ID.DataType = DataType.DT_I4;
            ID.IndexPosition = 0;

            Age.DataType = DataType.DT_I4;
            Age.IndexPosition = 0;

            Year.DataType = DataType.DT_I4;
            Year.IndexPosition = 0;

            LastName.CodePage = 1252;
            LastName.DataType = DataType.DT_STR;
            LastName.IndexPosition = 1;
            LastName.Length = 100;

Open in new window

0
rmm2001
Asked:
rmm2001
  • 6
  • 4
  • 4
1 Solution
 
vdr1620Commented:
you can reuse the same Connection Though out the package by changing the properties of the Connection Manager

Select the ConnectionManager ---> Properties --> RetainSameConnection = TRUE
0
 
vdr1620Commented:
I forgot to mention..You might to do this in Data flow Task NOT Control Flow Task and use Script Component as a Source...You can also define the connections being used in the connections tab of the script component..
0
 
rmm2001Author Commented:
I think I need to rephrase my question differently.

This is using a cache connection manager. When I run it in a for loop (using it more than once) I get this error :
Error: 0xC0010200 at Cache WH, Cache WH [38]: The component "Cache WH" (38) cannot write to the cache because component "Cache WH" (38) has already written to it.

Both with RetainData true and false.

The flow goes like this:

For Loop ->
    1) Data Flow Task to cache the data
    2) DFT to split that data into two different chunks
    3) DFTs 4, 5, 6, 7, 8 to use the cached data from the to files from DFT2
And repeat

See above error message for what happens the first time the loop goes through where "Cache WH" is DFT1.
0
Never miss a deadline with monday.com

The revolutionary project management tool is here!   Plan visually with a single glance and make sure your projects get done.

 
vdr1620Commented:
Did you try using OLE DB Source and Cache transform instead of the script.. I know you want more control on what you are doing.. but i would suggest you to try that
0
 
rmm2001Author Commented:
Yes. I can created OLEDB Source/Cache End just fine. But please read my posts. I need to make it dynamically because it loops. And when it loops i get :Error: 0xC0010200 at Cache WH, Cache WH [38]: The component "Cache WH" (38) cannot write to the cache because component "Cache WH" (38) has already written to it.

Did that make sense? If not I'll try again :)

0
 
vdr1620Commented:
I thought even the 1st attempt fails and so was asking you to try different things..

In a single package, only one Cache Transform can write data to the same Cache connection manager. If the package contains multiple Cache Transforms, the first Cache Transform that is called when the package runs, writes the data to the connection manager. The write operations of subsequent Cache Transforms fail.

check this article.. i believe that would answer your question
http://msdn.microsoft.com/en-us/library/bb895264.aspx
0
 
rmm2001Author Commented:
Yes - I know that and I've read the article. That is why I am here asking the question. As in the first post :

I'm trying to script a cache connection manager in my package. Mainly because I need to reuse the same connection manager many times throughout the execution of the package and I keep getting the error similar to "cannot initialize [name] since it is already in use". I tried setting "retain data = false" and that didn't do the trick either. Hence my choice to scrip it so I can control when it's built/disposed.

^ That is what I am trying and need to do. Yes I know that only one transform can write data to one cache transform, which is why i need to create the ccm dynamically so I can rebuilt it for each of my iterations. Here's my code again. Can you see where I need to add/remove something in order for my ccm that i created to appear in my package?

Thanks!
ConnectionManager conManager;
            conManager = Dts.Connections.Add("CACHE");
            conManager.Name = "test";
            conManager.Description = "please work";

            IDTSConnectionManagerCache100 cacheConnectionManager =
                       (IDTSConnectionManagerCache100)Dts.Connections["test"].InnerObject;

            IDTSConnectionManagerCacheColumn100 ID = cacheConnectionManager.Columns.Add();
            IDTSConnectionManagerCacheColumn100 Age = cacheConnectionManager.Columns.Add();
            IDTSConnectionManagerCacheColumn100 Year = cacheConnectionManager.Columns.Add();
            IDTSConnectionManagerCacheColumn100 LastName = cacheConnectionManager.Columns.Add();
            
            ID.DataType = DataType.DT_I4;
            ID.IndexPosition = 0;

            Age.DataType = DataType.DT_I4;
            Age.IndexPosition = 0;

            Year.DataType = DataType.DT_I4;
            Year.IndexPosition = 0;

            LastName.CodePage = 1252;
            LastName.DataType = DataType.DT_STR;
            LastName.IndexPosition = 1;
            LastName.Length = 100;

Open in new window

0
 
Reza RadCommented:
Hi,
Using the Cache Connection managers in Script component is poorly documented.
there are few number of persons who tried to read cache connection data in script component, but mostly unsuccessful.
I also tried this before and had no luck.
this TechNET link is a sample of working with cache connection managers in script component:
http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/452a6316-c552-46dd-bead-325d033824e3

as you see there are no clues , no starting points anywhere. and hard to finding out the way by yourself.

Now, I think a work around to use another way instead of cache connection.
please tell us why you want to use cache connection? and what you want to do exactly?
maybe your aim be reachable with simpler ways.


0
 
rmm2001Author Commented:
Ok what's going on is this..

I have a moderate database (A) (~5 million rows).
I need to compare these rows to the master database (B) ( ~20 million) to see which are new and which are old.
New records are piped through one set of tasks and old records are sent to a different task.

My control flow is this
For Loop ->
    1) Data Flow Task to cache the data (A)
    2) DFT to split that data into two different chunks (Lookup done to (B) to see what is new and sending new/old into two different caches)
    3) DFTs 4, 5, 6, 7, 8 to use the cached data from the to files from DFT2
And repeat

My intentions were to break up (A) into smaller parts (hence the for loop that will loop around and grab records that have IDs that match the value that the loop is on (if that made sense).

Then using the lookup task to go and find new records/old records ...then putting new/old records into their own caches so they can be split up and work simultaneously.

I'm definitely up for a work around though!
0
 
Reza RadCommented:
OK, amout of data is large , and using lookup needs to be carefully done.

I have question:
does A and B is on single server? are they linked?
I am asking this because I think if you can do part of data flow with T-SQL in database side will be better for large amount of data
0
 
rmm2001Author Commented:
They're linked yes. I can definitely go down the road of tsql and see what that leads to..
0
 
Reza RadCommented:
so I suggest to use T-SQL for this.

In the other hand, using 20 milions of records in Lookup can be done if you do it carefully.
for example suppose you have an integer field and varchar field in lookup query. this may need 4+10 byte =14 byte
so when this multiply by 20M records , result is : 280 MB
this is not very much memory ,
But note that you should be careful to use it. and also be careful about other transformation in Data flow, for example a simple Sort transformation can double this amount to more than 500MB.

and note that this is just lookup part, you have a source table too, which is 5M records, you should fetch it carefully, just select fields you want, every field you fetch can raise the memory usage size multiple 5M. do you get what I mean?

also you can use Cache mode when you work with lookup to raise performance, this is great article by SSIS team blog which I recommend it to everyone who wants to work with Lookup Transformation:
http://blogs.msdn.com/b/mattm/archive/2008/10/18/lookup-cache-modes.aspx

and at last word, try a linked server and T-SQL firstly. because when you work with T-SQL, all transfer done in Server memory. but in SSIS first A data fetch to SSIS server, next B data fetch and do some transformation there and then fill result. with T-SQL there is no need to third place ( I mean SSIS package execution environment)


0
 
rmm2001Author Commented:
So I did get the code to create a new ccm...it just overwrote everything that was in my entire package except for the new manager. So I'd consider that a no-go. I'm in the middle writing it all (and I think it's a better way to go). I'm using a lot of tsql with a little of the cached stuff to hopefully speed things up while accomplishing something

Thanks for the push in the right direction!
0
 
Reza RadCommented:
glad to help,
Regards,
Reza
0

Featured Post

Free Tool: SSL Checker

Scans your site and returns information about your SSL implementation and certificate. Helpful for debugging and validating your SSL configuration.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

  • 6
  • 4
  • 4
Tackle projects and never again get stuck behind a technical roadblock.
Join Now