In this week long series we are going to go over the fundamentals of an SSIS package. By the end of this week, you will have the knowledge necessary to write your own package and begin using SSIS for your ETL needs.
The image above represents what might be considered a small, but standard SSIS package. You have all of the basic components. You have Data Transformation Tasks, File System Tasks, Execute SQL Tasks,a For Loop Task, a few Script Tasks tossed in, and finally user notification (via Send Mail Task).
Today we will give brief explanations of the components you see in the above picture and dig into the File Watcher and For Loop tasks. Throughout the week we will go into the different sections in more detail.
The File Watcher Task is a nice task which sits and waits for a file to be placed in a specified folder. You can supply a simple wildcard file name, supply a timeout and the the result of the watcher will fill a variable with the filename with full path(which we use later). I did find out something interesting, which I want to let you know about. If you have multiple FileWatchers in the same package, you must set the "Timeout Warning" to true. If you don't and the FileWatcher times out, your package will receive an error and bomb out. If you think "I'll just set the timeout to never end", the package will never end :). Next up the For Loop.
In this package there are 4 FileWatchers each looking for the exact same file, in different directories. This is because there are 4 people responsible for this file and if there are errors, we want to know who to send them to. I used the drop box method to solve this problem. The For Loop enables me to constantly loop through all the FileWatcher tasks as each one expires ( I set the timeout at 1 minute). You'll notice that I have all of them (you can only see the first two) connected via the logical &&. This simply says 'If all of the variables are empty, loop again", at which point the tasks will check to see if the file exists, sleep on wait, send a warning if file fails to appear before timeout, check loop again. Now once a file is placed in one of the directories, the corresponding FileWatcher will wake up, the loop will 'fail' and we move on to the task in the list.
The Execute SQL Task enables you to run any type of sql (insert/update/deletes, truncates, store procedures, etc) and if a result set exists, use it. The result of the statement can be no records, a single record, or multiple records. I can store the results in variables (single record) or an object (multiple records). We will get into more detail later in the week.
The File System Task can be used for simple things like copying/deleting/moving files. It does just like the name implies, manipulate files.
The Data Flow Task is really the heart of SSIS. This is where you import data, manipulate it to suit your needs, and then push it into a repository of some sort. You can import from a variety of sources and export to a variety of sources. There are endless ways to manipulate the data with stock components or if you don't see one you need, components you can build yourself. This will take a whole day unto itself where we will go over not only how to develop a Data Flow Task, but how to tune them to get the utmost of performance out of them.
The Send Mail Task is flexible and powerful. You can not only email users when something goes wrong, but you can send them information as well. I can attach a file to my outgoing mail, dynamically create a 'To:' list of users (which I do in the example package), even dynamically modify the message I send.
Take a bit to digest the information for Day One. Tomorrow we will dig into the Script Tasks and Script Components. You will learn the difference between the two and how they are used.