A whole site reviewing SQL Server MVP Deep Dives

This book just keeps amazing me, not only as I read through some chapters for the first time and others for the second and third time, but also as I read reviews of it written by other people.

The guys over at http://sqlperspectives.wordpress.com are a prime example. They’ve been going through each chapter, each writing a review on it, and often getting a guest blogger to write something as well – and they’re clearly getting a lot out of this brilliant book. When I first heard they were doing this, I offered to be involved, and recently did an interview with them about my chapters (chapter seven and chapter forty). That interview can be found at http://sqlperspectives.wordpress.com/2010/03/20/interview-with-rob-farley/ – and covers how I got into databases, and how I think database roles in the IT industry are changing.

If you don’t have a copy of SQL Server MVP Deep Dives yet, why not get one from http://www.sqlservermvpdeepdives.com (or persuade your local bookstore to get some copies in), and read through the chapters with these guys? Treat it like a book club, discussing each chapter with others (guest blogging, perhaps?), and you’ll probably end up getting even more out of it. Remember that the proceeds of the book go to charity (instead of the authors – we get nothing), so you don’t need to feel that you’re splashing out on a treat for yourself. Think of the kids helped by War Child instead.

Re-running SSRS subscription jobs that have failed

Sometimes an SSRS subscription fails for some reason. It can be annoying, particularly as the appropriate response can be hard to see immediately. If the mail server is down one morning, there may be a long list of failed jobs, and working out a way of running each one again can be painful. It’s almost an argument for making heavy use of shared schedules, except that there are bound to be other things on a shared schedule that you wouldn’t want re-run.

Luckily, there’s a table in the ReportServer database called dbo.Subscriptions, which is where the LastStatus of each subscription is stored. Once you’ve found the subscriptions you’re interested in, though, tracking down the SQL Agent jobs that correspond to them can be frustrating.

Fortunately, the job step command contains the SubscriptionID, so it’s possible to look the jobs up based on that. And of course, once the jobs have been found, they can be executed easily enough. In the example below, I produce a list of the commands to run the jobs, which I can copy out of the results pane and execute.
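A sketch of the kind of query I mean is below. It assumes the default ReportServer database name and the standard msdb job tables, and the LastStatus filter is just a placeholder for whatever failure text you’re chasing – adjust both for your environment.

-- Generate sp_start_job commands for the SQL Agent jobs whose job step
-- references a subscription that last failed.
SELECT 'EXEC msdb.dbo.sp_start_job @job_name = N''' + j.name + ''';' AS run_job_command
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.sysjobsteps AS js
    ON js.job_id = j.job_id
JOIN ReportServer.dbo.Subscriptions AS s
    ON js.command LIKE '%' + CAST(s.SubscriptionID AS varchar(40)) + '%'
WHERE s.LastStatus LIKE 'Failure%';   -- placeholder filter; match the failures you care about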

Another option would be to return the job step commands directly (js.command in the query above), but my preference is to run the job that contains the step.

T-SQL Tuesday – Query Cost

In SQL Server, the cost of a particular plan is based largely on I/O, which makes this post a good candidate for this month’s T-SQL Tuesday, hosted by Mike Walsh who, like me, walks a straight path.

In considering I/O – the movement of data generally In and Out of disk and memory – my thoughts turn very much to query cost. If I run SET STATISTICS IO ON, I can see the reads and writes, both logical and physical, of the queries I run, and I can use this information to gauge the impact of my query on the system.
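For example (there’s nothing special about the query here – any query against a convenient table will do):

-- Show I/O statistics for queries in this session; the figures appear
-- on the Messages tab in Management Studio.
SET STATISTICS IO ON;

SELECT COUNT(*) FROM sys.objects;   -- any query will do

SET STATISTICS IO OFF;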

I can also see that the Estimated Subtree Cost indicated in the Execution Plan seems to grow with the number of logical reads. This makes sense, as the number of logical reads required by a query can really make a query costly. A logical read only involves examining a page of data that is already in RAM, but it should always be remembered that if the required page is not already in RAM, it will need to be pulled in off disk first, showing up as a physical read.

The standard line on measuring cost in terms of I/O comes down to the impact of a physical read on the system. When a physical read occurs, data has to come off the actual disk(s), which I liken to the process of picking a track on an old vinyl record player (the records were vinyl, not the record players). Of course, the disk controller lines up the tracks far quicker than I ever could, but the principle still applies. Compared to shifting bits around in RAM, it’s incredibly slow. Much less so with Solid State Disks, but that’s another matter entirely.

But the impact of I/O is certainly not the only factor in query cost. I remember having a discussion about string splitting with a friend of mine some time ago. I had seen some posts discussing the different ways of splitting strings, with the conclusion being that CLR functions are the best because SQL Server generally isn’t very good at string manipulation, and it led to an interesting discussion.

Take one of the standard approaches to splitting strings, such as using a table of numbers. For the purposes of the experiment, I have a table called dbo.nums(num int primary key), populated with the numbers 1 to at least 50.
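A sketch of the approach looks something like this (the comma delimiter and the sample strings here are placeholders, not the strings I actually used):

-- Split comma-delimited strings using the table of numbers.
-- The strings in the CTE are placeholders only.
WITH strings(string) AS
(
    SELECT 'apple,banana,cherry' UNION ALL
    SELECT 'red,green,blue,yellow'
)
SELECT s.string,
       SUBSTRING(s.string, n.num,
                 CHARINDEX(',', s.string + ',', n.num) - n.num) AS part
FROM strings AS s
JOIN dbo.nums AS n
    ON n.num <= LEN(s.string)
   AND SUBSTRING(',' + s.string, n.num, 1) = ',';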

There are obviously other methods that can be used – this is just one. But now compare it to something along these lines (again just a sketch, using the same placeholder strings):
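-- Build one big query out of the string values themselves (each value becomes
-- its own SELECT, joined with UNION ALL), then execute it with dynamic SQL.
-- The strings in the CTE are placeholders only.
DECLARE @sql nvarchar(max);

WITH strings(string) AS
(
    SELECT 'apple,banana,cherry' UNION ALL
    SELECT 'red,green,blue,yellow'
)
SELECT @sql = COALESCE(@sql + ' UNION ALL ', '')
            + 'SELECT ''' + REPLACE(string, ',', ''' AS part UNION ALL SELECT ''') + ''' AS part'
FROM strings;

EXEC sp_executesql @sql;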

Here I’m using dynamic SQL to construct a single query that performs a UNION ALL across all the split values. And when I look at the cost of a batch that runs both these methods (by asking Management Studio to show me the plans), I find that the second one is incredibly cheap. In fact, if I turn on STATISTICS IO, I can see that there is no I/O at all for the second method, whereas the first requires 12 logical reads – presumably to get at my table of numbers.

Clearly my dynamic SQL method of splitting strings is fantastically fast, and this is completely true. It is fantastic – a thing of fantasy. It’s actually really awful.

You see, as I mentioned earlier, SQL Server isn’t particularly good at string manipulation. So much so that creating this piece of dynamic SQL is very costly, won’t scale, and should never be implemented in a real environment. You can have a look at it to verify what I’m saying, but if you try to do this on a large set of data, you will very quickly find out just how poor it is.

To test it, I created a table called table_of_strings (string varchar(1000)), populated it with those four strings, and then doubled its size ten times.
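The doubling looked something like this (a sketch – it assumes the table has already been created and populated as described above; the four strings themselves aren’t shown):

-- Assumes dbo.table_of_strings already exists and holds the four original strings.
-- GO 10 is an SSMS/sqlcmd trick that runs the preceding batch ten times,
-- so 4 rows becomes 4 * 2^10 = 4096 rows.
INSERT INTO dbo.table_of_strings (string)
SELECT string FROM dbo.table_of_strings;
GO 10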

Using the table of numbers to split the strings involved over 12,000 logical reads, but the elapsed time showed 289 milliseconds to provide the 13,312 rows of results.

The dynamic SQL method used a total of just 26 logical reads, but took 34 seconds to run on my (old) machine.

I’m not suggesting you ignore the I/O-based cost of execution plans, but you should most definitely consider more than just that – particularly as Solid State storage becomes more and more commonplace.