Plan Operator Tuesday round-up

Eighteen posts for T-SQL Tuesday #43 this month, discussing Plan Operators.

I put them together and made the following clickable plan.

Let me explain this plan for you (people’s names are the links to the articles on their blogs – the same links as in the plan above).

It was clearly a SELECT statement. Wayne Sheffield (@dbawayne) wrote about that, so we start with a SELECT physical operator, leveraging the logical operator Wayne Sheffield.

The SELECT operator calls the Paul White operator, discussed by Jason Brimhall (@sqlrnnr) in his post. The Paul White operator is quite remarkable, and can consume three streams of data. Let’s look at those streams.

The first pulls data from a Table Scan – Boris Hristov (@borishristov)’s post – using parallel threads (Bradley Ball, @sqlballs) that pull the data eagerly through a Table Spool (Oliver Asmus, @oliverasmus). A scalar operation is also performed on it, thanks to Jeffrey Verheul (@devjef)’s Compute Scalar operator.

The second stream of data applies Evil (I figured that must mean a procedural TVF, but could’ve been anything), courtesy of Jason Strate (@stratesql). It performs this Evil on the merging of parallel streams (Steve Jones, @way0utwest), which suck data out of a Switch (Paul White, @sql_kiwi). This Switch operator is consuming data from up to four lookups, thanks to Kalen Delaney (@sqlqueen), Rick Krueger (@dataogre), Mickey Stuewe (@sqlmickey) and Kathi Kellenberger (@auntkathi). Unfortunately Kathi’s name is a bit long and has been truncated, just like in real plans.

The last stream performs a join of two others via a Nested Loop (Matan Yungman, @matanyungman). One pulls data from a Spool (my post, @rob_farley) populated from a Table Scan (Jon Morisi). The other applies a catchall operator (the catchall is because Tamera Clark (@tameraclark) didn’t specify any particular operator, and a catchall is what gets shown when SSMS doesn’t know what to show. Surprisingly, it’s showing the yellow one, which is about cursors. Hopefully that’s not what Tamera planned, but anyway…) to the output from an Index Seek operator (Sebastian Meine, @sqlity).

Lastly, I think everyone put in 110% effort, so that’s what all the operators cost. That didn’t leave anything for me, unfortunately, but that’s okay. Also, because he decided to use the Paul White operator, Jason Brimhall gets 0%, and his 110% was given to Paul’s Switch operator post.

I hope you’ve enjoyed this T-SQL Tuesday, and have learned something extra about Plan Operators. Keep your eye out for next month’s one by watching the Twitter Hashtag #tsql2sday, and why not contribute a post to the party? Big thanks to Adam Machanic as usual for starting all this.


Spooling in SQL execution plans

Sewing has never been my thing. I barely even know the terminology, and when discussing this with American friends, I even found out that half the words that Americans use are different to the words that English and Australian people use.

That said – let’s talk about spools! In particular, the Spool operators that you find in some SQL execution plans. This post is for T-SQL Tuesday, hosted this month by me! I’ve chosen to write about spools because they seem to get a bad rap (even in my song I used the line “There’s spooling from a CTE, they’ve got recursion needlessly”). I figured it was worth covering some of what spools are about, and hopefully explain why they are remarkably necessary, and generally very useful.

If you have a look at the Books Online page about Plan Operators and do a search for the word ‘spool’, you’ll notice there are 46 matches. 46! Yeah, that’s what I thought too…


Spooling is mentioned in several operators: Eager Spool, Lazy Spool, Index Spool (sometimes called a Nonclustered Index Spool), Row Count Spool, Spool, Table Spool, and Window Spool (oh, and Cache, which is a special kind of spool for a single row, but as it isn’t used in SQL 2012, I won’t describe it any further here).

Spool, Table Spool, Index Spool, Window Spool and Row Count Spool are all physical operators, whereas Eager Spool and Lazy Spool are logical operators, describing the way that the other spools work. For example, you might see a Table Spool which is either Eager or Lazy. A Window Spool can actually act as both, as I’ll mention in a moment.

In sewing, cotton is put onto a spool to make it more useful. You might buy it in bulk on a cone, but if you’re going to be using a sewing machine, then you quite probably want to have it on a spool or bobbin, which allows it to be used in a more effective way. This is the picture that I want you to think about in relation to your data.

I’m sure you use spools every time you use your sewing machine. I know I do. I can’t think of a time when I’ve got out my sewing machine to do some sewing and haven’t used a spool. However, I often run SQL queries that don’t use spools. You see, the data that is consumed by my query is typically in a useful state without a spool. It’s like I can just sew with my cotton despite it not being on a spool!
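For instance, a simple join like this (a hypothetical sketch against the AdventureWorks sample database, not the exact query from the original screenshot) typically produces a plan with no spool at all:

```sql
-- A plain join: the data is already in a useful state, so no spool is needed.
SELECT p.Name, th.TransactionDate
FROM Production.Product AS p
JOIN Production.TransactionHistory AS th
    ON th.ProductID = p.ProductID;
```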


Many of my favourite features in T-SQL do like to use spools though.
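The OVER clause is one of them. Something along these lines (again a sketch, assuming AdventureWorks) adds a column with the total number of rows, and typically introduces a Table Spool into the plan:

```sql
-- Same join as before, plus a windowed count of all rows in the result set.
SELECT p.Name, th.TransactionDate,
       COUNT(*) OVER () AS NumRows
FROM Production.Product AS p
JOIN Production.TransactionHistory AS th
    ON th.ProductID = p.ProductID;
```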


This looks like a very similar query to before, but includes an OVER clause to return a column telling me the number of rows in my data set. I’ll describe what’s going on in a few paragraphs’ time.

So what does a Spool operator actually do?

The spool operator consumes a set of data, and stores it in a temporary structure, in the tempdb database. This structure is typically either a Table (ie, a heap (Edit: Ok, it's not actually stored as a heap, but it may as well be, because it always scans)), or an Index (ie, a b-tree). If no data is actually needed from it, then it could also be a Row Count spool, which only stores the number of rows that the spool operator consumes. A Window Spool is another option if the data being consumed is tightly linked to windows of data, such as when the ROWS/RANGE clause of the OVER clause is being used. You could maybe think about the type of spool being like whether the cotton is going onto a small bobbin to fit in the base of the sewing machine, or whether it’s a larger spool for the top.
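For example, a running aggregate with a ROWS frame (a sketch, assuming AdventureWorks) is the kind of query where you’d expect a Window Spool:

```sql
-- The ROWS frame ties each output row to a small window of nearby rows,
-- which is what the Window Spool holds on to.
SELECT th.ProductID, th.TransactionDate,
       SUM(th.Quantity) OVER (PARTITION BY th.ProductID
                              ORDER BY th.TransactionDate
                              ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS RecentQty
FROM Production.TransactionHistory AS th;
```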

A Table or Index Spool is either Eager or Lazy in nature. Eager and Lazy are Logical operators, which talk more about the behaviour, rather than the physical operation. If I’m sewing, I can either be all enthusiastic and get all my cotton onto the spool before I start, or I can do it as I need it. “Lazy” might not be the best word to describe a person – in the SQL world it describes the idea of either fetching all the rows to build up the whole spool when the operator is called (Eager), or populating the spool only as it’s needed (Lazy).

Window Spools are both physical and logical. They’re eager on a per-window basis, but lazy between windows.

And when is it needed?

The way I see it, spools are needed for two reasons.

1 – When data is going to be needed AGAIN.

2 – When data needs to be kept away from the original source.

If you’re someone that writes long stored procedures, you are probably quite aware of the second scenario. I see plenty of stored procedures being written this way – where the query writer populates a temporary table, so that they can make updates to it without risking the original table. SQL does this too. Imagine I’m updating my contact list, and some of my changes move data to later in the book. If I’m not careful, I might update the same row a second time (or even enter an infinite loop, updating it over and over). A spool can make sure that I don’t, by using a copy of the data. This problem is known as the Halloween Effect (not because it’s spooky, but because it was discovered in late October one year). As I’m sure you can imagine, the kind of spool you’d need to protect against the Halloween Effect would be eager, because if you’re only handling one row at a time, then you’re not providing the protection… An eager spool will block the flow of data, waiting until it has fetched all the data before serving it up to the operator that called it.

In the query below I’m forcing the Query Optimizer to use an index which would be upset if the Name column values got changed, and we see that before any data is fetched, a spool is created to load the data into. This doesn’t stop the index being maintained, but it does mean that the index is protected from the changes that are being done.
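The query was roughly of this shape (a reconstruction, not the original – the index hint names AK_Product_Name, the unique index on Product.Name in AdventureWorks):

```sql
-- Forcing access through an index keyed on Name, while updating Name itself.
-- The Eager Spool reads all the qualifying rows before any update happens,
-- protecting the index from the Halloween Effect.
UPDATE p
SET p.Name = p.Name + '!'
FROM Production.Product AS p WITH (INDEX (AK_Product_Name))
WHERE p.Name LIKE 'S%';
```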


There are plenty of times, though, when you need data repeatedly. Consider the query I put above. A simple join, but then counting the number of rows that came through. The way that this has executed (be it ideal or not) is to ask that a Table Spool be populated. That’s the Table Spool operator on the top row. That spool can produce the same set of rows repeatedly. This is the behaviour that we see in the bottom half of the plan, where a join is being done between the rows that are being sourced from the spool – one being aggregated and one not – producing the columns that we need for the query.

Table v Index

When considering whether to use a Table Spool or an Index Spool, the question that the Query Optimizer needs to answer is whether there is sufficient benefit to storing the data in a b-tree. The idea of having data in indexes is great, but of course there is a cost to maintaining them. Here we’re creating a temporary structure for data, and there is a cost associated with populating each row into its correct position according to a b-tree, as opposed to simply adding it to the end of the list of rows in a heap. Using a b-tree could even result in page-splits as the b-tree is populated, so there had better be a reason to use that kind of structure. That all depends on how the data is going to be used in other parts of the plan. If you’ve ever thought that you could use a temporary index for a particular query, well this is it – and the Query Optimizer can do that if it thinks it’s worthwhile.

It’s worth noting that just because a Spool is populated using an Index Spool, it can still be fetched using a Table Spool. Whether a Spool used as a source shows as a Table Spool or an Index Spool depends more on whether a Seek predicate is used than on the underlying structure.

Recursive CTE

I’ve already shown you an example of spooling when the OVER clause is used. You might see them being used whenever you have data that is needed multiple times, and CTEs are quite common here.

With the definition of a set of data described in a CTE, if the query writer is leveraging this by referring to the CTE multiple times, and there’s no simplification to be leveraged, a spool could theoretically be used to avoid reapplying the CTE’s logic. Annoyingly, this doesn’t happen. Consider this query, which really looks like it’s using the same data twice. I’m creating a set of data (which is completely deterministic, by the way), and then joining it back to itself. There seems to be no reason why it shouldn’t use a spool for the set described by the CTE, but it doesn’t.
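Here’s the kind of query I mean (a sketch along the same lines, assuming AdventureWorks – the original used its own tables):

```sql
-- The CTE is deterministic and referenced twice, yet the plan
-- evaluates the aggregation twice rather than spooling it.
WITH ProductTotals AS
(
    SELECT th.ProductID, SUM(th.Quantity) AS TotalQty
    FROM Production.TransactionHistory AS th
    GROUP BY th.ProductID
)
SELECT t1.ProductID, t1.TotalQty, t2.TotalQty AS NextTotalQty
FROM ProductTotals AS t1
JOIN ProductTotals AS t2
    ON t2.ProductID = t1.ProductID + 1;
```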


On the other hand, if we don’t pull as many columns back, we might see a very different plan.


You see, CTEs, like all sub-queries, are simplified out to figure out the best way of executing the whole query. My example is somewhat contrived, and although there are plenty of cases when it’s nice to give the Query Optimizer hints about how to execute queries, it usually doesn’t do a bad job, even without spooling (and you can always use a temporary table).

When recursion is used, though, spooling should be expected.

Consider what we’re asking for in a recursive CTE. We’re telling the system to construct a set of data using an initial query, and then use that set as a source for another query, piping this back into the same set and back around. It’s very much a spool. The analogy of cotton is long gone here, as the idea of having a continual loop of cotton feeding onto a spool and off again doesn’t quite fit, but that’s what we have here. Data is being fed onto the spool, and getting pulled out a second time when the spool is used as a source.
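A classic example (sketched here, assuming the older AdventureWorks schema) is walking an organisational hierarchy:

```sql
-- The anchor query seeds the spool with the top-level row; the recursive
-- part reads rows back off the spool and feeds new ones onto it.
WITH EmployeeTree AS
(
    SELECT e.EmployeeID, e.ManagerID, 0 AS EmpLevel
    FROM HumanResources.Employee AS e
    WHERE e.ManagerID IS NULL
    UNION ALL
    SELECT e.EmployeeID, e.ManagerID, t.EmpLevel + 1
    FROM HumanResources.Employee AS e
    JOIN EmployeeTree AS t
        ON t.EmployeeID = e.ManagerID
)
SELECT EmployeeID, ManagerID, EmpLevel
FROM EmployeeTree;
```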


(This query is running on AdventureWorks, which has a ManagerID column in HumanResources.Employee, not AdventureWorks2012)

The Index Spool operator is sucking rows into it – lazily. It has to be lazy, because at the start, there’s only one row to be had. However, as rows get populated onto the spool, the Table Spool operator on the right can return rows when asked, ending up with more rows (potentially) getting back onto the spool, ready for the next round. (The Assert operator is merely checking to see if we’ve reached the MAXRECURSION point – it vanishes if you use OPTION (MAXRECURSION 0), which you can try yourself if you like).

Spools are useful. Don’t lose sight of that. Every time you use temporary tables or table variables in a stored procedure, you’re essentially doing the same – don’t get upset at the Query Optimizer for doing so, even if you think the spool looks like an expensive part of the query.

I hope you’re enjoying this T-SQL Tuesday. Why not head over to my post that is hosting it this month to read about some other plan operators? At some point I’ll write a summary post – once I have, you should find a comment below pointing to it.


Running goals – an update

Back in January, I wrote about some of my “running goals”. It’s time to update those of you who aren’t connected to me on Twitter or Facebook (or weren’t looking on the right days).

I mentioned that I wanted to get a better time in a half marathon.

Yes. I did that. I ran two half marathons in Adelaide this year, with times of 2:04:58 and 2:03:57.


I mentioned that I wouldn’t mind trying a full marathon.

Last week, I did that. It was in the Barossa Valley – the wine region just north of Adelaide. My official time was 5:18:24. I probably could’ve been faster, but I’d had injuries along the way during my training. The South Australian Road Runners’ Club had a mentor program for people interested in doing a marathon, and I got involved. I didn’t make it to many (none) of the Sunday morning training runs, but I lined up anyway, and ran a marathon.

The date of the marathon was significant for me – it was the day before the tenth anniversary of my back injury. In hindsight, I would never recommend running a marathon the day before a significant day like that. I was pleased I’d finished the marathon (I wasn’t aiming for a particular time, and was just pleased to finish – being hit by a car around 31km in didn’t really help my time, but I managed to get around), and I hadn’t really expected the impact of the injury-anniversary to affect me so much. I got physically and emotionally sick for a few days, and it was horrible. Ten years is a long time – more than a quarter of my life – and I know that it’s been tough not just on me but on those around me. Completing a marathon hasn’t made me feel like I’ve conquered my injury, it just helps me feel like I know I can keep going despite it.

I’m smiling because someone’s pointing a camera at me. And because I can see the finish line.

I mentioned I wanted to keep doing some cardio every day, and lose more weight.

This fell off in mid-Feb when I got the first injury of my marathon training. I picked up a thing called “hip bursitis”. That led to increased back pain, and doing something every day was just beyond me. I got below 200lb, but only just. I’m roughly that now still, and I’m okay about that. I might try an “every day” thing again soon, but I’ll see how I go.

I wanted to run some distance during 2013. 750 miles? 900? 1000?

Well, five months in, I’ve run 341. That tracks to a bit over 800. But also I’ve run a marathon. And right now, a week later, part of me thinks I’ve achieved enough running goals, and it would be good to NOT run for a while. So forgive me if I don’t manage to run 1000 miles during 2013. I’d have to do over 20 miles every week from now to reach 1000 – my injuries just aren’t compatible with that.

Running is tough. It’s not completely good for my back, and I have mixed emotions about it. As people have pointed out, I’m not exactly the right build for running… but that’s not the point. The point is that I have a back injury, and I need to work with my physio to continue to see it improve. Running might be part of that, but there are lots of things that I still don’t have in my life that I would like to be able to have again. I’d like to be able to dance. I’d like to be able to play sport with my sons. I’d like to be able to continue to pick up my daughter (who’s five, and getting towards the kind of weight that I can’t actually lift). One day, I plan to carry my wife over the threshold of the house we built. Any of those things is going to take a lot of work – but a year ago, I couldn’t run either.

What, what? A car?

Yes – I got hit by a car during the Barossa Marathon. It wasn’t like it took my legs out and I rolled across the windscreen. It’s more that it hit my hands.

A marshal had stopped this car at a point where we had to cross the road we were running on. It was at the 31km point – almost 20 miles into the 26-mile route. The driver had wound down her window and was talking to the marshal. That was all fine. I took it fairly wide, and crossed in front of the car. The marshal was saying to her “Don’t pull off yet”, but she did – I guess she was still looking at the marshal as she took her foot off the clutch. I was in front of her at that point, and as she started forward, I wasn’t moving fast enough to jump out of the way. I turned and put my hands (and weight) on her car, and she stopped. I almost fell over, but didn’t.

Annoyingly, I’d stopped running then. Later in the run, Erin Stellato (@erinstellato) tweeted to me that it’s easier to keep going if you never stop. Well, I had trouble getting going again. My legs were sore, of course. My back had just got a jolt I wasn’t expecting, as had my adrenalin. I was sweating and leaning forward (so my eyes were starting to suffer). It took me an embarrassingly long time to finish crossing the road. Thankfully there was a drink station there, where I washed my eyes out, and I kept going, about two minutes per mile slower than I’d been going beforehand. I’m not saying I would’ve broken five hours, but I would recommend if time is important to you that you don’t have an altercation with a vehicle part way round.

Massive thanks to the people that sent the 187 tweets during my run (read to me by the RunMeter app). It helped. Now go and give money to Yanni Robel’s fundraising. I run in a Team In Training shirt to honour the effort that she’s putting in, so please – give money to her fund. Also, if you’re reading this in the first 8-ish hours after I’ve published it, send Yanni a message on Twitter (@yannirobel), because she’s about to run a marathon too!

T-SQL Tuesday #43 – Hello, Operator?

June 11th is next week. It’s a Tuesday, the second Tuesday of the month, making it T-SQL Tuesday! This is your opportunity to write on a specific topic, along with countless* people from around the SQL community (* at least until the end of the day, when it will be quite easy to count how many people joined in). This month marks 3.5 years since it started, with 42 events held so far.

This month, for number 43, I’m the host. That means that I set the topic, and this blog post is the one that you need to get a comment into so that I can find your post before I collate them.

The topic is Plan Operators. If you ever write T-SQL, you will almost certainly have looked at execution plans (if you haven’t, go look at some now. I mean really – you should be looking at this stuff). As you look at these things, you will almost certainly have had your interest piqued by some, and tried to figure out a bit more about what’s going on.

That’s what I want you to write about! One (or more) plan operators that you looked into. It could be a particular aspect of a plan operator, or you could do a deep dive and tell us everything you know. You could relate a tuning story if you want, or it could be completely academic. Don’t just quote Books Online at me, explain what the operator means to you. You could explore the Compute Scalar operator, or the many-to-many feature of a Merge Join. The Sequence Project, or the Lazy Spool. You’re bound to have researched one of them at some point (if you never have, take the opportunity this week), and have some wisdom to impart. This is a chance to raise the collective understanding about execution plans!


So, the T-SQL Tuesday thing…

If you haven’t heard about T-SQL Tuesday before, then do this exercise with me. Do a quick search for “T-SQL Tuesday”. If you glance down the results, you’ll see a bunch of posts either inviting people to a T-SQL Tuesday, contributing to one (the ones that contribute link back to the host), or summarising the posts. The ‘host post’ (which this month is this one!) will have a bunch of comments and trackbacks, pointing to all the contributing posts (and hopefully to the summary too). All-in-all, it makes a terrific resource about that particular subject.

So here’s what you do!

1. Some time next Tuesday (GMT) – that’s June 11th 2013 – publish your blog post. If you’re in Australia like me, GMT Tuesday runs from mid-morning on the Tuesday through to mid-morning on Wednesday. If you’re in the US, then it’ll run from Monday evening through to Tuesday evening.

2. Make sure that your post includes, somewhere near the top, the T-SQL Tuesday image (from above), with a link to this post. This helps people find the other posts on the topic, and hopefully encourages others to jump in with their own posts. If it helps, just switch to source/HTML mode, and put the following snippet in there:
<a href="" target="_blank"><img alt="TSQL Tuesday" align="right" border="0" src=""/></a>

3. Come back to this post, and write a comment, providing a link to your post. That’s probably the only way I’ll know it’s there. You can tweet, sure (use the hashtag #tsql2sday), but I still might not find it.

4 (optional, but maybe worthwhile). Keep your eye out for other people’s posts, and for when I publish the summary…

That’s about it – happy writing!

Remember: 1: Jun 11 GMT; 2: Image with a link; 3: Comment below.