SSIS-Lookup-Partial Catch Sucks!

It is obvious that I am lazy. Not any update on my blog in the past one month.

Well, here I only wanna remind all SSIS developers to take consideration very seriously to use the “partial catch” in Lookup component in SSIS.

My experience learned from the project informs me this fact. It costs 9 hours to finish loading 12 million records to go through some 40 lookup component, one of which uses “partial catch”. By contrast, if changing to full catch mode, it only takes 40 minutes! Amazing~~it is true. I am extremely surprised by this.

Technically, why I would try the “partial catch”? Here is the case. There is a dimension table with a low boundary(min value) and a high boundary(max value) combined as a business key. As it goes, you wanna fetch the surrogate key for your fact data which flows into the pipeline. So what can we do? if using full catch, you lose the chance to modify the TSQL and you cannot do a range compare. One of options is to use the partial catch then modify the TSQL something like this select * from (select * from...) where ?>=minValue and ?=O.MinNumberOfDays and H.num<=O.MaxNumberOfDaysHowever, this is not a good option for the enterprise data warehouse.

Alternatively, it is highly suggested to use the full catch. How? using CTE or a “numbering” table to expand the range as row by row. Looks like below TSQL.(Personally, I like the way of the table one–to create a physical table to store sufficient numbers.)
select num=convert(int,num ), OrderAging_Key from
(select top 10000 rOW_NUMBER() over (order by O1.object_id)-2000 as 'Num'
from sys.objects O1 , sys.objects O2, sys.objects O3, sys.objects O4) as H inner join dbo.Dim_OrderAging O
on H.num>=O.MinNumberOfDays and H.num<=O.MaxNumberOfDays

Hope this thread helps.

Advertisements

Dynamic ETL Framework

If you have bunch of tables to copy from application Source DB(s) to landing place(or say ODS layer),even you need to move the data from ODS to stage and stage to DWH. This framework may help you some to a certain degree.

The purpose of this framework is to mitigate the work in creating almost-duplicate ETL packages. Instead, you can have more time to smooth the business requirement.

Take a closer look. Supposing there are a1, a2,…an tables waiting for you to load them into ODS. Normally, you are requested to create one package for all or you create one package for one table accordingly to load the data from source to target. If this is the reality you are experiencing, check out this framework.

The idea of framework is easy to understand. If we have a table to maintain the source table name and its destination place/name even source columns to destination columns, then this dynamic framework can go through every table mapping and load the data according to your configuration in the config table. What you gain is to save the time in developing/unittesting/maintaining the packages. What you need to focus isto understand the business requirement and configure the mapping relationship in the right place and tell the framework which loading strategy you wanna implement flush-fill or SCD I…

Here are two possible ways of implementation. One is to use the SQL server procedures, which is pretty SQL intensive. The other is to leverage the SSIS interface in .Net framework to dynamic generate the packages and execute them.

The first version(SQL intensive, single thread) is in coding and will be released as estimated on Nov. 25th, 2011.
Tables schema attached. Feel free to check this out.

Move DB files

For some reasons, you may need to change DB file location from one to another.
As i can think of, there are at least 3 wways.
1.detach DB, move files, attach
2.backup/restore(indicate a new location when restoring)
3.This is a way I prefer to use but seems it does not have a formal name.
scripts for your reference
use USSut003
go
–tell SQL server, you wanna change the physical place for one file(could be log or data)
ALTER DATABASE USSut003
MODIFY FILE (NAME = ‘USSut003_PG_DAT_1’, FILENAME = ‘J:\BIILAND\MSSQL\DATA\USSut003.MDF’)

use master
go

ALTER DATABASE USSut003
SET offline

–Physical move data from one place to another

ALTER DATABASE USSut003
SET online

All done.

How to repeat the column header for tablix in SSRS2008 R2

Issue: “When creating a tablix control with Repeat Column Header set to true the Column Headers are not repeated”.

In the tablix control, when click row header handle then right click it to set the “Repeat header column in each page” in the “tablix property”, as we all ses the option name is quite self-explanation but unfortunately it does not work in most cases.

Below the steps are listed to solve this.

1) Select the Tablix that you want to repeat column headers for by clicking on it.

2) At the bottom of the screen, find the “Row Groups” and “Column Groups” section.
3) Click the small drop-down-arrow on the right side of that section.
4) In the “Row Groups” section, locate the top-outermost “static” row and click on it.
5) In the properties pane, you need to set the following TWO properties:
a) RepeatOnNewPage = True
b) KeepWithGroup = After

SQL Server (BI) Training Log

Only for log….

 

Year of 2010

  • July 17, 2010– SQL Server 2008 R2 (Course 6231, 6232A), Public training, Shanghai
  • Aug 28, 2010–SQL Server 2005 BI (Courses: 2791, 2792, 2793),Client-Customized Training, Acxiom Corporation, Shanghai
  • Oct 23, 2010–SQL Server 2008 R2 Training (Course 6231, 6232A), Public training, Shanghai
  • Dec 1, 2010–SQL Server APO Training (SQL 2008 R2 new features), Public training, Shanghai
  • Dec 27, 2010–SQL Server 2008 R2 BI (Courses: 6234A, 6235A, 6236A),Client-Customized Training, Microsoft Corporation, Beijing

 

Year of 2011

  • Mar 25, 2011–SQL Server Hands-On-Lab (SQL 2008 R2 new features), Public training, Shanghai
  • Apr 16~19, 2011–SQL Server 2008 R2 (Course 6231, 6232A), Public training, Shanghai
  • May 19~22, 2011–SQL Server 2008 R2 BI (Courses: 6234A, 6235A, 6236A), Public Training, Shanghai
  • July 3, 9, 10, 2011–SQL Server 2008 R2 BI (Courses: 6234A, 6235A, 6236A),Client-Customized Training, Metlife Corporation, Beijing
  • July 4~8, 2011–SQL Server 2008 R2 BI (Courses: 6234A, 6235A, 6236A), Public Training, Beijing
  • Aug. 17~19, 2011–SQL Server 2008 R2 HOL (Hands On Lab, Exam Instruction 70-432 and 70-450), Public Training to MS partners, Shanghai
  • Aug. 31~Sep.2, 2011–SQL Server 2008 R2 HOL (Hands On Lab, Exam Instruction 70-432 and 70-450), Public Training to MS partners, Chengdu

Hello world!

Welcome to WordPress.com. After you read this, you should delete and write your own post, with a new title above. Or hit Add New on the left (of the admin dashboard) to start a fresh post.

Here are some suggestions for your first post.

  1. You can find new ideas for what to blog about by reading the Daily Post.
  2. Add PressThis to your browser. It creates a new blog post for you about any interesting  page you read on the web.
  3. Make some changes to this page, and then hit preview on the right. You can alway preview any post or edit you before you share it to the world.