I am always looking to improve the performance of my SSIS packages. I try to do as much work within the package as possible, only going to external stores (databases, flat/raw files) as little as possible. There are a number of reasons for this. Raw files are incredibly fast and very good when you need to have multiple stages within your package. I use Raw Files to store about 20 GB of data which is broken out among 8 threads. This was necessary for a number of reasons one if them being the ability to reload the tables from this raw data if we ran out of time during the initial load window.
In SSIS you can store the result set of a stored procedure in a System.object, which is translated into an ADO object. I noticed the only SSIS component that can use this source is the Foreach Loop. I can loop through the variable and process the results. At the bottom of this post I linked to the request to add the ability for the Data Flow ADO.NET source/destination to use a System.Object.
I have a few situations where I have a flat file and would like to read that into memory for processing. Currently reading in a flat file the traditional way, with a Data Flow task, only allows me to push that into a data source. I dug a little deeper into the SSIS script component and found a way to read the flat file from within a Script Component into an ADO component, which can then be read by a Foreach Loop.
If I get enough people to vote for the enhancement this technique can be used to eventually build in-memory data sets, start to finish.
Let's dissect the C# component I wrote.
Make sure you include using System.Data in your script. This is where all of the ADO.NET objects live. Now, you need to create a table structure to store the data. According to the API you can name your table, but I have yet to find a need for it:
DataTable dt = new DataTable("Parent_Child");
Now you need to create columns for your table. In my case I need a two columns, one for the parentID and one for the ChildID. Since I am reading in text fields and figured I can always case on insert into SQL Server.
Now you have your table all ready to receive data, you need to lock your SSIS variable so you can access this data outside of the script. I happen to have named my variable test_obj, which is of course of type System.Object.
The script image above shows the technique to load up the local object, DataTable dt. Once that is completed you pass the local object back out to the SSIS variable and unlock the variables like so:
variablesList["User::test_obj"].Value = dt;
Remember to unlock your list AFTER you set the value of your variables. At this point you have loaded all of the data pulled from the flat file in a simple two column table that resides in an SSIS object variable that you can now loop through, like so:
Please click on the link below and vote to enhance SSIS to enable ADO source/destination to accept variables.