SSIS: DataFlow Transformations

Transformations are defined as a core component in the Data flow of a package in SSIS. It is that part of the data flow to which we apply our business logic to manipulate and modify the input data into the required format before loading it to the destination.

Different types of Transformations available in SSIS.

1. AGGREGATE: The Aggregate transformation applies aggregate functions to column values and copies the results to the transformation output. Besides aggregate functions, the transformation provides the GROUP BY clause, which you can use to specify groups to aggregate across.
The Aggregate Transformation supports following operations:
GROUP BY, SUM, AVERAGE, COUNT, COUNT DISTINCT, MINIMUM, MAXIMUM

Example of Aggregate Transformation

2. AUDIT: Adds Package and Task level Metadata - such as Machine Name, Execution Instance, Package Name, Package ID, etc..
3. CHARACTER MAPWhen it comes to string formatting in SSIS, Character Map transformation is very useful, used to convert data lower case, upper case.

4. CONDITIONAL SPLITused to split the input source data based on condition.

5. COPY COLUMN: Add a copy of column to the output, we can later transform the copy keeping the original for auditing.
6. DATA CONVERSION: Converts columns data types from one to another type. It stands for Explicit Column Conversion.
7. DATA MINING QUERY: Used to perform data mining query against analysis services and manage Predictions Graphs and Controls.
8. DERIVED COLUMN: Create a new (computed) column from given expressions.

9. EXPORT COLUMN: Used to export a Image specific column from the database to a flat file.
11. FUZZY GROUPING: Groups the rows in the dataset that contain similar values.

12. FUZZY LOOKUP: Used for Pattern Matching and Ranking based on fuzzy logic.

13. IMPORT COLUMN: Reads image specific column from database onto a flat file.
14. LOOKUP: Performs the lookup (searching) of a given reference object set against a data source. It is used to find exact matches only.

15. MERGE:Merges two sorted data sets of same column structure into a single output.

16. MERGE JOIN:Merges two sorted data sets into a single dataset using a join.

17. MULTI CAST - is used to create/distribute exact copies of the source dataset to one or more destination datasets.

18. ROW COUNT: Stores the resulting row count from the data flow / transformation into a variable.
19. ROW SAMPLING: Captures sample data by using a row count of the total rows in dataflow specified by rows or percentage.

20. UNION ALL: Merge multiple data sets into a single dataset.

21. PIVOT: Used for Normalization of data sources to reduce analomolies by converting rows into columns
22. UNPIVOT: Used for demoralizing the data structure by converts columns into rows incase of building Data Warehouses.

Different Types of Transformation with examples
In Row Transformation, each value is manipulated individually. In this transformation, the buffers can be re-used for other purposes like following:
OLEDB Datasource,         OLEDB Data Destinations
Other Row transformation within the package,     Other partially blocking transformations within the package.
Examples of Row Transformations: Copy Column, Audit, Character Map

2. Partially Blocking Transformation:
These can re-use the buffer space allocated for available Row transformation and get new buffer space allocated exclusively for Transformation.
Examples: Merge, Conditional Split, Multicast, Lookup, Import, Export Column

3. Fully Blocking Transformation:
It will make use of their own reserve buffer and will not share buffer space from other transformation or connection manager.
Examples: Sort, Aggregate, Cache Transformation

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.