Much has been done to improve the Lookup Component between VS2005 and VS2008. The ability to create and use a cache connection managerand import raw files instead of having to go out to a database is brilliant and welcomed. What I believe is lacking in the component is the ability to allow records to 'pass' through on the same path. Right now you have 3 choices for your output;Lookup Match Output, Lookup No Match Output, and Error Output. The Lookup Match Output redirects rows that match the lookup down that pipeline, while the Lookup No Match Output redirects rows that have no match down a different pipeline. What if I want both?
I have a situation where I need to translate an ID into the corresponding English equivalent. An example of this is ID:101 = 'Job Complete' (folks would prefer to see the English phrase vs some arbitrary machine number). Here comes the catch. We have records for what are considered parent jobs. These jobs are used as headers and do nothing more than group actual jobs into a more meaningful set. These parent jobs have no job status, it's actually NULL and I want to preserve that. I would love to have an option that says 'Match or no match allow everything to flow through the same pipeline' This might break some sort of ETL rule, but I find this situation to be very common. The way I used to solve this, which is the most straight forward way, looks like this:
I don't find the above code desirable. You can no longer table lock , since you have multiple paths to the table (4 now) and you have an additional Lookup Component. I prefer the simplest and cleanest solution to a given problem and that's where Union All comes in.
The code above can be simplified with a Union All for each Lookup, like so:
Code is clean and elegant. Until the SSIS team enables me to ignore a mismatch to preserve the original value, this is the way to go.
on 2010-02-04 03:01 by Josef Richberg
Chris Randall posted a comment regarding Donald Farmer's post a few years back. I did a quick check and turns out the one test system I was using was a bit slow, then end result being somewhat erroneous. In the first example, you can have a table lock on all 4 tables and have a block size = the default. SSIS is able to process all of the inserts 'side-by-side' as explained by Chris.