One of appropriate test strategy for testing extract, transform and load (ETL) procedures is to use automation. In this post I will explain what on this field Microsoft has to offer.
When you need to test ETL procedure, you usually have a large set of test data. This is first indicator that automation could help you a lot. Microsoft offers, as part of SQL server, its extension of SQL, tSQL. Based on that, community around Microsoft SQL server developed tSQLt, database unit testing framework. It has features that are expected of every modern testing framework on database access object level. The most important one is transactional support, which means that during your test you can modify your database, perform your checks, and test will automatically revoke all database changes when it is done with execution.
As an IDE I recommend Microsoft SQL management studio express which gives you Microsoft Visual Studio features for database development.
You can start by taking quick start guide, which is actually quick. I do not recommend installing tSQLt to your development database. Write tests in tSQLt_Example database and access your database using database_name.database_table notation available in Microsoft SQL Server management studio.
In order to be able to use tSQLt, you need to now any kind of database stored procedure SQL extension. By following links in Quick Start guide, you will find all needed resources to start testing your ETL procedures.
And one lesson from BBST foundation course. tSQLt is database unit testing framework. What is unit in this context?
Please write your answers as comments of this post.
In my context I used tSQLt in following manner. I tested ETL procedure written in C# and run from command line. I explored input and output databases, and tried to discover business rules of ETL procedure. I coded those rules as tSQLt tests, and discuss them with business analyst using source of tSQLt tests as requirements documentation.