Corruption, corruption, more corruption

No, not in my systems.

Corruption does happen from time to time. At LobsterPot Solutions we get calls from people now and then who have corruption in their databases, that want help getting them repaired. Generally, it’s not too much of a problem, although I’ve seen some doozies over time. (And if you need help, get in touch!)

Interestingly, I don’t think many DBAs practise solving corruption issues very often. They might test restores, and even test getting data back from a salvaged LDF file, but rarely would they test any of the stranger corruption scenarios. I’ve never put it on a high priority – I know lots of ways to get data out of tables, and know how to get data out of pages using DBCC PAGE… but I wouldn’t say that I practise solving corruption very often. There are so many different ways that a database can become corrupted, and 99% of the time, the data can be salvaged using some very standard approaches.

So it was good to see Steve Stedman (@SQLEmt) run a series of Corruption Challenges – ten corrupt databases, all of which could be repaired without data loss, but never using the standard DBCC CHECKDB repair option. I noticed the first one part way into the allotted time, and thought it would be a useful exercise to see if I could solve them all. Let’s face it – when someone gets in touch because they have a problem, they want to be reassured that they’re dealing with someone who knows what they’re doing.

All ten challenges have now appeared, and not only have I solved every one of them – a feat only shared by Andre Kamman of the Netherlands and Neil Abrahams of the UK – but I’ve also won three of the events, more than anyone else, allowing me to finish top of the scoreboard!

Winning the competition was never my intention – I’m just pleased to have been able to know that there were no challenges that beat me, which hopefully means that my customers are still in capable hands.

My challenge to you is to grab the ten databases that Steve has made available, and without looking at the solutions that are all posted, see how you go. At the very least it will be good practice for when you have to deal with the real thing.

@rob_farley

Check the settings when installing SQL Server

Next, next, next, next, next… you know the drill.

Except that when installing SQL, it’s simply not good enough. The defaults might not work for you, and that makes this post qualify for this month’s T-SQL Tuesday, hosted by Andy Yun (@sqlbek).

TSQL2sDay150x150

Most things are fine, but there is one page which you really shouldn’t ignore. And it’s not even obvious. But if you’re just clicking Next, next, next, then you might easily miss it.

When you reach this page:

image

please make sure you don’t just hit Next without looking at the Collation tab. It looks like this:

image

Now, I don’t particularly care what you decide to put here. But you shouldn’t just set it blindly. If you are installing a new server and you need it to be consistent with what you’ve used before, go and look at the collation setting on your old instance. This is ridiculously important.

You see, people are lazy. And when I say people, I mean developers. And when I say developers, I mean bad developers. (There was a thing recently that said that “Women’s Soccer” should now be called simply “Soccer”. I agree with this. There shouldn’t have to be a differentiation between a game played by a particular type of people, except perhaps “American Football”, which is obviously called “football” because they use their feet so much. Oh right, about developers. I hope as time passes, developers become better. But for now, when I say “Developers”, I mean bad developers.)

So when a developer creates a temporary table, they do something like:

And then they populate it with some data, and then they use it in another query. Like this:

This is code that has worked for years. But if you have ignored the collation setting and the instance collation is different to the database collation, you will get an error. The system won’t know whether two strings are supposed to be the same or not. Is ‘abc’ the same as ‘àbç’, if one tells you to ignore accents and the other says not to? The system can’t decide. It’s even more complex than that, because two strings might be identical, but it won’t know how to look them up if the alphabet orders letters differently. It can’t decide and gives an error. It goes in the too-hard basket.

Of course, a nice responsible developer will have created the temporary table like this, and then the problem never occurs:

But let’s face it – this is rare. Most people write database code without thinking about the collation settings, and that’s a problem. Most of us are bad developers.

The error you get is the one described here: https://connect.microsoft.com/SQLServer/feedback/details/324910/collation-error-behaviour-option. This is a Connect item which I raised in January 2008 (7.5 years ago), which has 47 upvotes, and which was closed in 2011 as “Won’t Fix”. It was looking likely for a while (at least when using tempdb), but then they must’ve realised it wasn’t particularly important. Ultimately, it could give unexpected results if you’re not confident about which language you’re using, and it’s potentially better to give an explicit error than to let your query work but give the wrong results.

@rob_farley