Clearly I have been lazy: no updates on this blog in the past month.
Well, here I just want to remind all SSIS developers to think very carefully before using "partial cache" mode in the SSIS Lookup component.
A recent project taught me this the hard way. It took 9 hours to load 12 million records through some 40 Lookup components, one of which used "partial cache" mode. After switching that one lookup to full cache mode, the same load took only 40 minutes! Amazing, but true. I was extremely surprised by this.
Technically, why would I try partial cache in the first place? Here is the case. There is a dimension table where a low boundary (min value) and a high boundary (max value) together form the business key. As fact data flows into the pipeline, you want to fetch the surrogate key for each row. So what can we do? With full cache you lose the chance to modify the T-SQL, so you cannot do a range comparison. One option is to use partial cache and modify the T-SQL to something like this:
select * from (select * from ...) as T where ? >= T.minValue and ? <= T.maxValue
However, this is not a good option for an enterprise data warehouse.
Alternatively, I highly suggest sticking with full cache. How? Use a CTE or a "numbers" table to expand each range into one row per value, with T-SQL like the query below. (Personally, I prefer the table approach: create a physical table that stores a sufficient range of numbers.)
select num = convert(int, H.Num), O.OrderAging_Key
from (select top 10000 ROW_NUMBER() over (order by O1.object_id) - 2000 as Num
      from sys.objects O1, sys.objects O2, sys.objects O3, sys.objects O4) as H
inner join dbo.Dim_OrderAging O
  on H.Num >= O.MinNumberOfDays and H.Num <= O.MaxNumberOfDays
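Since I mentioned preferring the physical-table approach, here is a minimal sketch of that variant. The table name dbo.Numbers and the range of -2000 to 7999 are my own assumptions, chosen only to match the derived-table query above; adjust them to cover the min/max boundaries in your own dimension.

-- One-time setup: build a physical numbers table (assumed name dbo.Numbers).
create table dbo.Numbers (Num int not null primary key);

insert into dbo.Numbers (Num)
select top 10000 ROW_NUMBER() over (order by O1.object_id) - 2000
from sys.objects O1, sys.objects O2, sys.objects O3, sys.objects O4;

-- Full-cache lookup source: one row per value inside each dimension range.
select N.Num, O.OrderAging_Key
from dbo.Numbers N
inner join dbo.Dim_OrderAging O
  on N.Num >= O.MinNumberOfDays and N.Num <= O.MaxNumberOfDays;

With this as the Lookup source query, the equality match on Num works under full cache, so the lookup never round-trips to the database per row.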
Hope this post helps.