Monday, August 04, 2014

SSIS Best Practices 3 - Handling and Logging Data Load Errors

Normally you don't want a ETL load to fail because of any data errors. For instance, if you have a file with 1000 rows with 10 rows having some issues, you don't want the load to fail. Instead the package must successfully load the 990 rows and the 10 error rows need to be redirected to a separate table to identify the issue and fix them. If it is a flat file then it is worth while to generate and include a Row Number which will help in identifying the row in the flat file.

The error table will have all columns with data type of varchar(max) across all columns with additional columns like ErrorCode, ErrorColumn and ErrorDescription.

Error rows are handled in the Data Flow through the use of the Error Row Configuration properties. These properties tell the Data Flow Components what to do when a row fails an operation, such as data conversion or truncation.

Redirect Row Event Handler denotes that when a row reaches an error for any column marked as redirected, the entire row is sent out the red error path output which normally connects to the Error table destination. If the red error path output is not used then the row will be ignored out of the Data flow.

public override void Input0_ProcessInputRow(Input0Buffer Row) 
{ 
Row.ErrDesc = ComponentMetaData.GetErrorDescription(Row.ErrorCode); 
}

No comments:

Post a Comment