SSIS: Transaction Options, Checkpoints


The TransactionOption property exists at the package level, container level (For Loop, Foreach Loop, Sequence), as well as any Control Flow task (e.g. Execute SQL task, Data Flow task, etc.). 
 
TransactionOption can be set to one of the following:
1.Required: If a transaction already exists then join it else start a new transaction.
2. Supported: If a transaction exists then join it (this is the default).
3. NotSupported: Neither creates a new transaction nor join an existing transaction.

IsolationLevel property by default set to Serializable.

The built-in transaction support in SSIS makes use of the Distributed Transaction Coordinator (MSDTC) service which must be running.

Transactions in SSIS allows you to commit the group of data flow tasks (or) allows you to roll back the group of dataflow tasks as single logical unit of work.

Transactions supported by every executable (i.e. package level, task level, container level).


For this 2 properties have to be set for executable
1. Transaction option
2. Isolation level

If any tasks are executed, then tasks are executed under transaction if transaction option for executable is set to supported.


Suppose sequence container having 3 Execute SQL tasks and transaction option =required, Isolation level =Serializable(default).
Then it creates new transaction when executed the sequence container.


All the tasks in Sequence container have Transaction option set to supported means these tasks run under transaction created by sequence container.

 
If the tasks have Transaction option=Required, it will join the existing transaction , if no transaction is there, it will create new transaction and executes.

In SSMS, create a test table called ASSOCIATES.

CREATE Table Associates
(
 AssociateID INT,
 AssociateName Varchar(50),
 Designation Varchar(50)
)

 

In SSDT, create a simple package that has 4 Execute SQL Tasks.





 
 
 


 
 
 
 
 
 
 
Leave the default value TransactionOption =Supported and execute the package.

















Press F5 and check the package status. It is not a surprise why the package failed, it is for the reason division by zero.
 
 
 

As you can see AssociateID 1001 and 1002 were changed because Task 1 and Task 2 changed before Task 3 that causes the package to fail but AssociateID 3 is not changed because it was after Task 3 and the package failed before it got a change to be executed.
 Now let’s remove all data from ASSOCIATES table then reassign the previous values.
 
 
 
 
 
 
 
 



 
Now let us go back to management studio change the TransactionOption property to Required then rerun the package once again.
 
 
 
 
 Re-execute the package.























Let’s go back to Management studio for one last time and check what changes were made to the table.  No records were updated for Transaction option Required.





 
 
 
 
Check Points:
Integration Services can restart failed packages from the point of failure, instead of rerunning the whole package. If a package is configured to use checkpoints, information about package execution is written to a checkpoint file.

When the failed package is rerun, the checkpoint file is used to restart the package from the point of failure. If the package runs successfully, the checkpoint file is deleted, and then re-created the next time the package is run.
 
 
Using checkpoints in a package can provide the following benefits.
1.Avoid repeating the downloading and uploading of large files.
For example, a package that downloads multiple large files by using an FTP task for each download can be restarted after the downloading of a single file fails and then download only that file.

2.Avoid repeating the loading of large amounts of data.
For example, a package that performs Bulk inserts into dimension tables in a data warehouse using a different Bulk Insert task for each dimension can be restarted if the insertion fails for one dimension table, and only that dimension will be reloaded.

3.Avoid repeating the aggregation of values.
For example, a package that computes many aggregates, such as averages and sums, using a separate Data Flow task to perform each aggregation, can be restarted after computing an aggregation fails and only that aggregation will be recomputed.

CheckpointFileName:
Specifies the full path and filename of your checkpoint file.

CheckpointUsage:
Specifies when to use checkpoints. The property supports the following 3 options:
• Never: A checkpoint file is not used.
• IfExists: A checkpoint file is used if one exists. This option is the one most commonly used if enabling checkpoints on a file.
• Always: A checkpoint file must always be used. If a file doesn’t exist, the package fails.

SaveCheckpoints:
Specifies whether the package saves checkpoints. Set to True to enable checkpoints on the package.

Create table in database using following script for EMP table
CREATE TABLE EMP
(
 EmpID INT,
 EName Varchar(50),
 Dept INT
)

Drag and drop 2 Execute SQL Tasks:



















Add the following script to Execute SQL Tasks.


 
 
Note: without creating DEPT table, trying to insert values into DEPT table in Task 2.

Set the FailPackageOnFailure = True on both the tasks

.
 
 
 
 


 

 
 
 
 

Set the Checkpoint properties for Package.
 

 



Package output would be as follows:
 
 
 
 
 
 


 

 
 
 
 
 


Two records are inserted into the EMP table.
 
 
 
 
 
 
 
 
 
 
 







Go to the checkpoint file location and see the details in text file.
 
 
 


Now create a DEPT table using below script and rerun the package.
CREATE TABLE Dept
(
 DeptNo INT,
 DName Varchar(50)
)

 
 
 
 
 
 
 
 
 
 
 
 






EMP table has only 2 records which were inserted for first time and not inserted any records when second time of package runs. Now DEPT table has 2 records which were inserted when second time of package runs.


 


 
 
 
 
 


 

 

 

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.