A T-SQL Tip: Working calculations

T-SQL Tuesday again and this month is on T-SQL Tips (thanks Allen!). In some ways it’s a tough topic, because there are things I don’t really consider tips that other people do, and vice-versa. This one’s legitimate though: Using CROSS APPLY for working columns in calculations.

Let me give you an example. TSQL2sDay150x150

Back in 2009, I wrote a blog post on Julian (YYDDD) dates. http://msmvps.com/blogs/robfarley/archive/2009/03/25/converting-to-and-from-julian-format-in-t-sql.aspx.

Someone asked me recently about using this for time periods, for example, someone’s age when they play a football match. For example, Cesc Fàbregas holds the record as the youngest Arsenal player, aged 16 years and 177 days. Given someone’s birthdate and date of interest, this seems like it should be quite easy to work out, but can get frustratingly tricky.

The idea is quite straight forward – you just count the number of days from their most recent birthday. So let’s have a think.

Counting the number of years between the startdate and the enddate is a nice place to start. I could count the number of days, but that might give me rounding error based on leap years. But counting the years simply compares the year component, such as 1987 and 2003, and the player might not have had his birthday this year yet. In that case, we’d want to go with the year before. Anyway – once we’ve figured out how old the player is, we can figure out when their last birthday was and count the days since then. Easy.

But so easy to make a mistake somewhere.

And this is where CROSS APPLY can come into its own, by allowing us to use working columns.

Let’s start with a few players. Some young ones, and Thierry Henry, just because. I’m using the VALUES method, but you could have a Players table just as easily.

image

First, I want to make it clear which column is my startdate and which is my enddate. It’s too easy to be thinking “startdate” and pick up “Debut” here, because this sounds very much like the same thing. There’s the startdate of their playing career, and the startdate of the calculation. To avoid confusion, I’m going to do some simple column-renaming. This gives me more reusable code, and APPLY even means I never have to worry about whether I already have these column names referring to something else, because they’re going to get their own table alias too.

image

Might seem like a bit of a waste to you, but it means so much to me. Really.

Now I want to count how many years there are between my startdate and enddate, and work out when the startdate is this year. I could do this in one step, sure, but I want to be able to check my working really thoroughly.

image

Now I can easily test to see if I need to subtract a year or not.

image

I’m sure you can see where this is going. I now have their latest birthday before the date I’m looking for, and I can easily turn this into a Julian Date format.

image

The great thing about this is that I can be very confident of my working, being able to check each step of the calculation along the way. I can even turn it into a simple TVF, whilst keeping the calculations just as verifiable:

I know it’s really simple, but it’s an amazingly useful tip for making sure that you do things right.

Oh, and as for the plan – SQL just rolls it all up into a single Compute Scalar as if we’d written it out the long way, happily showing us that doing it step-by-step ourselves is no worse at all.

image

@rob_farley