Testing Within the Database

April 14, 2020

Specifically, unit testing. As per Hamish Watson (@thehybriddba)’s topic for this month’s T-SQL Tuesday.

Hamish wants to develop a conversation about unit testing within database because he recognises that the lack of unit testing is a significant problem. It’s quite commonplace in the world of iterative code, of C#, Java, and those kinds of languages, but a lot less commonplace in the world of data. I’m going to look at two of the reasons why I think this is.

Modularisation

When I was first taught to write code, I was taught about Conditionals and Looping. Conditions are the ‘IF’ statements, and Looping is the FOR or WHILE statements. You can write just about anything using these. But I was also taught that Modularisation was just as important, so that blocks of code could be called when appropriate, rather than having to repeat them each time. Everything else, including the building blocks of any paradigm like object-oriented programming or logic programming, was secondary to these core concepts.

And then years later I entered the database world, where coding was different. Queries used joins and predicates, not looping and conditionals, and although these things were possible within stored procedures, it was better include the logic within the query itself. As for modularisation, scalar functions were frowned upon, although procedures that called other procedures were okay.

Without modules, the units to be tested are a little less obvious, so fewer people think of unit testing in the database development world as they do in iterative languages.

Statelessness

‘State’ is how about the world looks when a particular set of code is called. A set of parameters that is passed into a module is not typically referred to as the ‘state’, but that’s pretty much all that isn’t. The concept of ‘statelessness’ is that anything that might be needed to allow a module to behave in particular way should be passed in as a parameter, and not pulled in from anywhere else. After all, that ‘anywhere else’ could be different tomorrow, and the module might behave differently.

Statelessness is what makes a module deterministic. When a function is called with a particular set of parameters, the result should be the same as any other time it was called with those same parameters. This is one of the fundamental premises of unit testing. We know that if we call an ‘Add’ function with the values 3 and -2, the result should always be 1. If it’s anything else, then something’s not right.

But the world of databases is all about state. The database describes the world of that system as it is in that moment, and it’s unlikely to ever be in that same exact state again.

So do we skip unit testing for databases?

No! It’s just that things might need to be tested in different ways.

A test harness for a database environment needs to include a database, but this should an opportunity not a hindrance. The database used for testing can include scenarios that you would hope never to find in a production database. You wouldn’t want anyone’s name to include strange characeters or potentially malicious code, but this could be included in a database used for testing. Products with negative prices, or prices that are so high that they would cause overflow problems, are excellent for test harnesses. It’s how to make sure that errors are handled smoothly, and that alerts about these conditions are being flagged.

Databases have a lot of things that need testing, because it’s not just about ensuring that modules of code have the right impact. It’s important to know and test the impact of SSIS processes, to be confident that conditions that break constraints fail gracefully, and to be reassured that reports will still be produced even if the data is unexpected or missing.

So for unit testing within the database, yes it’s very much a thing. I don’t particularly care how you define the units you’re testing, whether they be queries, statements, procedures, functions, packages, reports, execution plans, or anything else. You can test them for correctness, for performance, for grace in failure, for whatever you like. Think of the edge cases you want to put into the testing database. Think of the conditions you want to the system to be in when testing.

Just test.

@rob_farley