Dynamic ETL Framework

If you have bunch of tables to copy from application Source DB(s) to landing place(or say ODS layer),even you need to move the data from ODS to stage and stage to DWH. This framework may help you some to a certain degree.

The purpose of this framework is to mitigate the work in creating almost-duplicate ETL packages. Instead, you can have more time to smooth the business requirement.

Take a closer look. Supposing there are a1, a2,…an tables waiting for you to load them into ODS. Normally, you are requested to create one package for all or you create one package for one table accordingly to load the data from source to target. If this is the reality you are experiencing, check out this framework.

The idea of framework is easy to understand. If we have a table to maintain the source table name and its destination place/name even source columns to destination columns, then this dynamic framework can go through every table mapping and load the data according to your configuration in the config table. What you gain is to save the time in developing/unittesting/maintaining the packages. What you need to focus isto understand the business requirement and configure the mapping relationship in the right place and tell the framework which loading strategy you wanna implement flush-fill or SCD I…

Here are two possible ways of implementation. One is to use the SQL server procedures, which is pretty SQL intensive. The other is to leverage the SSIS interface in .Net framework to dynamic generate the packages and execute them.

The first version(SQL intensive, single thread) is in coding and will be released as estimated on Nov. 25th, 2011.
Tables schema attached. Feel free to check this out.