I recently started using Azure Data Factory to move data between systems. Until recently I have been writing PowerShell scripts, using SSIS job and various other methods to move and transform data. But Azure Data Factory meets my current needs much better.
One of the first newbie questions I had was how can I truncate a Microsoft SQL table to clear out the data before copying some data in?
It turns out there are a few methods depending on your requirements.
Method 1 – Pre Copy Script
If you want to always truncate the table immediately before performing a copy activity, then using a pre-copy script is the easiest method.
- Open the “Sink” tab for your copy activity
- Populate the “Pre copy script” field, for example
TRUNCATE TABLE MyTable
Method 2 – Using a Lookup Activity
This method is very useful if you don’t want to always truncate the table before a copy. In one of my Pipelines, for example, I have a parameter for a full refresh, if set to true the pipeline will truncate the table to empty it before continuing else the pipeline just appends to the table.
Don’t let the name Lookup fool you the Lookup activity is actually pretty powerful. When using a lookup activity you can actually run any sort of SQL query, not just a select statement.
- Open the “Settings” tab for your lookup activity
- Set the “Use query” option to Query
- Populate the “Query” field with you truncate statement.
Note: You would think you could just enter the truncate statement i.e. TRUNCATE TABLE MyTable However there is a little more to it, read below the screenshot.
Azure Data Factory expects a lookup activity to return some date, if you just enter a truncate statement you will get a failure when triggering the pipeline.
Failure happened on ‘Source’ side. ErrorCode=SqlInvalidDbQueryString,’Type=Microsoft.Data Transfer.Common.Shared.HybridDeliveryException, Message= The specified SQL Query is not valid. It could be caused by that the query doesn’t return any data. Invalid query:Error 2100
‘TRUNCATE TABLE MyTable’Source=Microsoft.DataTransfer.ClientLibrarv.’
The solution to this error is to add a Select statement after your Truncate statement so some data is returned from SQL, for example;
TRUNCATE TABLE MyTable SELECT COUNT(*) FROM MyTable
Method 3 – Stored Procedure
The final method is to create a stored procedure on your SQL database to perform the truncate. You can then trigger your stored procedures from either the settings tab of a lookup activity or from the Sink tab of the copy activity.