SQL Server CPU Usage by Session

Troubleshooting high CPU usage on a SQL Server database is an art, but there is a defined methodology to follow to find the root cause. This involves breaking down the overall server CPU usage to a more granular level – first discovering that it's SQL Server that's the problem (because way too often it's something else!), then drilling down to specific plan operators in a particular problematic query. Finding that problematic query – identifying the high CPU consumer – means identifying the CPU usage by session.

sys.dm_exec_requests shows the CPU time, but it's cumulative – it doesn't give the CPU consumption by each session at the current time. You can see how much CPU time a request has used since it started, but not what's going on right now. To explore that, we need to query sys.dm_exec_requests repeatedly and look for the differences – collecting the CPU usage over a time interval to identify the high CPU consumers.

Here is a query which does that across a one-second interval. You can go as small or as large as you like by changing the WAITFOR value – but the closer you want to get to what's going on "right now", the smaller you want that interval to be. I find one second to be a good starting point. The query populates snapshots of sys.dm_exec_requests into a table variable called @sessionsCPU, which I then query to do some analysis. I could do this with LAG, but my code will work on pre-2012 instances too, and I can't see a reason to change it.
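
The shape of it is something like this – a simplified sketch with a trimmed-down column list, rather than the exact script: take a snapshot of sys.dm_exec_requests into @sessionsCPU, wait for the interval, take a second snapshot, then diff the cpu_time values.

-- Minimal sketch: two snapshots of sys.dm_exec_requests, one second apart,
-- then diff cpu_time to find the sessions burning CPU right now.
DECLARE @sessionsCPU TABLE
(
    snapshot_id int,
    session_id  smallint,
    cpu_time    int,
    sql_handle  varbinary(64)
);

INSERT INTO @sessionsCPU (snapshot_id, session_id, cpu_time, sql_handle)
SELECT 1, session_id, cpu_time, sql_handle
FROM sys.dm_exec_requests
WHERE session_id <> @@SPID;            -- ignore this monitoring session

WAITFOR DELAY '00:00:01';              -- the sampling interval; smaller = closer to "right now"

INSERT INTO @sessionsCPU (snapshot_id, session_id, cpu_time, sql_handle)
SELECT 2, session_id, cpu_time, sql_handle
FROM sys.dm_exec_requests
WHERE session_id <> @@SPID;

-- A self-join rather than LAG, so this also runs on pre-2012 instances
SELECT  s2.session_id,
        s2.cpu_time - s1.cpu_time AS cpu_ms_in_interval,
        t.[text] AS query_text
FROM @sessionsCPU AS s1
JOIN @sessionsCPU AS s2
    ON  s2.session_id = s1.session_id
    AND s1.snapshot_id = 1
    AND s2.snapshot_id = 2
OUTER APPLY sys.dm_exec_sql_text(s2.sql_handle) AS t
ORDER BY cpu_ms_in_interval DESC;

Sessions whose requests finish between the two snapshots won't show up in the diff, which is fine when the goal is to spot the sessions that are heavy right now.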

Here’s an example of what I get. Obviously, this isn’t from a client machine – it’s just my local environment. But you’ll see that I can tell that right now, the query that’s causing me the most difficulty is on session 71.

I can see the query, and this lets me delve further into the problem. I don’t have an answer yet, but I’m an awful lot closer to finding the culprit and solving the high CPU issue than I was before I ran this query.

I hope this query helps you in troubleshooting high CPU issues and gets you a step closer to resolving them.

Happy Learning,
Manu

Twenty years on, how I made reporting run quicker

Twenty years ago, in January 1998, was when I touched SQL Server for the first time as a consultant – that was SQL Server 6.0, and I’d be involved in an upgrade to 6.5 at that same customer only a couple of months later. I would go on to spend quite a bit of time with this customer, and helped them implement a lot of things over time. I was a programmer back then, and saw the database as just part of the application. It was only later that I started to see applications as peripheral to the data rather than the other way around.

One major problem that was happening at this customer was the reporting. Every month they would need to provide reports to a part of the government, and these reports were simply painful to produce. Let me explain…

The basic system was designed to monitor the health of machines that were dotted all around the city – machines that would send a signal every few moments to indicate their status. Normally these were “nothing wrong” messages, but occasionally there would be a problem with a particular part of the machine, and our application would manage getting the message out to someone who could go and fix it. Later, a “fixed” message would come through – I’m sure you get the picture. There were also manual reports of downtime that didn’t have automated signals – for those times when people would phone up and say there was a problem – and these needed to be included in the mix too.

The reporting was about the overall uptime of each machine, including whether there was a machine available at each location at a given time, whether the combination of components that were unavailable meant that the entire machine was considered down, or whether so many machines at a location were down that the location itself needed to be marked as unavailable, and so on. The data for a single machine was like:

Machine   Time                 Message
100       19980109 12:33:05    OK
100       19980109 12:34:23    HEAT WARNING
100       19980109 12:34:26    COMP1 ERROR
100       19980109 12:34:29    TOUCHPAD ERROR
100       19980109 12:35:12    COMP1 NOERROR
100       19980109 12:35:19    HEAT NORMAL
100       19980109 12:30:00    Report – TOUCHPAD ERROR
100       19980109 12:35:00    Report – TOUCHPAD FIXED

…and so on. Timestamps were available for when phone calls came in, but apparently the reporting needed to show the time that the person said they saw it was unavailable, not the time the alert came through. Go figure.

The reporting needed to look at this data and figure out that the touchpad was unavailable between 12:30 and 12:35, and that Component 1 was unavailable for 46 seconds. There would’ve been tables available to show that the touchpad being down meant the machine was considered only 50% available, that Component 1 being down meant 60% available, but that a combination of both touchpad and Component 1 meant completely down. So this data would mean: 4 minutes 26 seconds of 50% available, 34 seconds of 0% available, and 12 seconds of 60% available. The reports were in place before I went there – they just didn’t run well.

They ran by querying all the data, and then looping through it (in VB3) to figure out the gaps, and to try to keep track of what was available and what wasn’t. If the report were at the machine level, it would pull back everything for that machine during the time period of interest and then try to figure it all out, but if it were at the location level, or even city-wide, it was dealing with a very different amount of data. It would take a few minutes to run a report for a single machine for a week, and many reports that were available had simply never been run. Fixing this was one of the biggest challenges I had in my first few months of consulting.

I didn’t know until years later that what I did was to create a data warehouse of sorts.

I could see that the moment when someone hit the button to say “Show me the report for machine 100 last week” was the wrong time to do the reporting, because no one wanted to watch a computer for a few minutes – or a few HOURS if they wanted a larger amount of time or more than a handful of machines. That meant the work needed to be done ahead of time.

So I relaxed many of the rules of database design I’d learned, and came up with some denormalised tables that were purely designed for analysing the data. These would store the amount of downtime per machine or per location, but be populated ahead of time – updated with maybe an hour or so of latency, long enough to allow for the phoned-in alerts. I’d store the number of seconds since the last machine event and what the availability proportion was during that time period, and include extra dummy events for the start of each hour, so that the report could handle exact time periods. I spread the heavy lifting across time, pulling the new messages in as regularly as I could, to avoid doing that work when time was more urgent. This was ETL, although I just considered it reporting preparation, and didn’t learn what ETL meant until much later.

Once that data was in place, I could just sum the product of the number of seconds and the availability, and the reports could be produced quickly and easily, no matter how large they were. There was no longer a need for a cursor at report runtime – just regular aggregations.
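
For illustration only – the table and column names here are made up rather than the original schema – the report-time query ended up being no more complicated than this:

-- Hypothetical shape of the pre-calculated table: one row per machine per period
-- of constant availability, populated ahead of time by the regular load process.
CREATE TABLE dbo.MachineAvailability
(
    MachineID       int          NOT NULL,
    PeriodStart     datetime     NOT NULL,
    SecondsInPeriod int          NOT NULL,  -- seconds since the previous event (or hour boundary)
    Availability    decimal(5,4) NOT NULL   -- 1.0 = fully available, 0.5 = 50%, 0 = down
);

-- Report time is then just an aggregation - no cursor, no VB loop
SELECT MachineID,
       SUM(SecondsInPeriod * Availability) / SUM(SecondsInPeriod) AS UptimeProportion
FROM dbo.MachineAvailability
WHERE PeriodStart >= '19980101' AND PeriodStart < '19980201'
GROUP BY MachineID;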

For me, this trick of doing the work at a time when it’s less urgent is key to a lot of things in the database world (and also in the rest of life). I use indexes to keep a sorted copy of data so that aggregations and joins can run faster. I use filtered indexes to separate out lists of new data from old data. I use data warehouses and cubes to handle history and data quality and complex calculations, so that business users can explore data to their hearts’ content. If I can put up with doing the work later, then maybe it’s just fine to not have those indexes in place – perhaps I’m totally okay with scanning a table of 1000 rows every so often. But if the cost of maintaining a separate copy (index, warehouse, whatever) isn’t going to be significant when spread out over time, but that copy is going to make life much easier when it comes to the crunch, then I’m going to do it.
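
As a small, hypothetical example of the “separate out the new data” idea: a filtered index means only the rows still waiting to be processed carry the ongoing maintenance cost, so the frequent “what’s new?” query stays cheap while the table keeps growing.

-- Hypothetical names: a sorted copy of just the unprocessed rows
CREATE INDEX ix_Messages_Unprocessed
ON dbo.Messages (MessageTime)
INCLUDE (MachineID, MessageText)
WHERE IsProcessed = 0;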

I often think back to some of the customers / projects that I’ve had over the years and roll my eyes at what I had to overcome. Generally though, the worse they were, the more I learned and the stronger I became.

It’s all just practice for whatever is going to come in the future, right?

@rob_farley

This post was written in response to this month’s T-SQL Tuesday, hosted by Arun Sirpal.


My Learning Goals for 2018

Some months it’s a good thing to be prompted to post by T-SQL Tuesday, because it gets me to talk about something I should’ve talked about already, but haven’t. This is one of those months, thanks to Malathi Mahadevan (@sqlmal)’s topic around Learning Goals.

I don’t generally write about my Learning Goals, because I tend to base what I’m learning around what my clients’ needs are. If I feel like I need to brush up on Azure Data Lake because I have a project around that, then I’ll take some time for that. I like to keep on top of new things as they come through, but then hone my skills when I have a proper opportunity. I don’t tend to study much for certifications – I didn’t prepare much for the MCM exams that I took (and passed – just!) five years ago, nor the couple of exams I took recently to keep hold of my ‘Partner Seller’ status.

That said, it’s always good to see things that are coming up and dive a bit deeper into them. And for me, at the moment, that’s the edX courses in the Data Science space. I’m not sure whether I’ll get through to the end of them all, because I’ll happily let myself get distracted by something else. The Advanced Analytics area is one of which I’m particularly fond – it’s a nice contrast to the Pure Maths subjects I did at university, which shaped my approach to relational data. So far, I’ve gone through the courses quickly, picking up the odd bit that I didn’t remember. The “Essential Statistics for Data Analysis using Excel” course was one that rang bells, but also reminded me how long it’s been since I’ve studied this kind of material.

Where do I think I’ll be in 12 months? Well, like the answer to all good SQL questions, it depends. I’m sure I’ll still be analysing data and helping businesses with their data-focused ambitions, and that I will have learned quite a bit during that time.

@rob_farley

My inspirational team

This month the T-SQL Tuesday theme is “Who has made a meaningful contribution to your life in the world of data?”, and is hosted by Ewald Cress (@sqlonice).

It was good to reflect on this over the past week, and there are (of course!) a lot of people that I could list. I could list some of the people that were prominent in the community when I first discovered it all those years ago. I could list those people that I turn to when I have questions about things.

But there’s one group of people that for me stands out above all the rest. So much so that I’ve hired them.

I’ve had my company for nine years now, and during that time I’ve had up to seven people in the team. Well over a dozen different people have represented the LobsterPot Solutions brand, and they’ve all been amazing. Here are a few stats about our past and present LobsterPot team:

  • There have been MVPs, of course. Six MVPs, in fact, have been employed here. One became an MVP after leaving us, and two no longer have MVP status, but still… six!
  • There have been user group leaders. Seven (or maybe eight) of us lead or have led groups.
  • A few former employees now lead their own businesses.
  • One former employee is completing a PhD.
  • One former employee is a Senior Program Manager at Microsoft.
  • One former employee attends the PASS Summit on her vacation time.

You see, they are all leaders. They almost all give presentations at events. They all make meaningful contributions to the data community.

And they have all made a meaningful contribution to my own life. I may have paid them a salary, but they’ve all left an impact on me personally.

@rob_farley

Migrating to Azure SQL DB using Transactional Replication

Azure SQL DB is a PaaS offering from Microsoft for hosting a SQL database on Azure. This may not be ideal for all environments, but if you have a database which can work in isolation, handling requests from one or more applications, Azure SQL DB may be the right fit for you. One common use-case is for ISVs who host multiple tenant databases.

One of the challenges faced is how to migrate to Azure with minimal downtime, especially when your database is already in production use. There are a couple of methods to migrate the database from your production environment to Azure SQL DB.

One method is by exporting your database as a BACPAC file and importing it into Azure SQL DB. There are a few points to note when using this method.

  • The production database should not change from the point you start the export until the database has been imported into Azure SQL DB. This increases the production downtime.
  • The process involves exporting the database as a BACPAC file and then importing it into Azure SQL DB using tools like SqlPackage.exe.
  • The rollback process is simple, allowing you to revert to the existing database (if your business can afford to lose any changes made in Azure SQL DB). The amount of data loss depends on the stage at which you decide to roll back.

To achieve minimal downtime, I will detail a second method of migration: using Transactional Replication.

Transactional replication to Azure SQL DB was announced in 2015 as being in Public Preview, and on 23 October 2017 the feature became generally available. To set up transactional replication from an on-prem SQL Server to Azure SQL DB, you will need to install the latest version of SSMS (v17.3). The process of migration involves the following steps.

Testing/POC

  1. Run the Microsoft Data Migration Assistant (DMA) to check if your database is ready to migrate to Azure. This tool will help you assess if your database has any compatibility issues to migrate to Azure SQL DB. More information about DMA is available here.
  2. Fix any issues identified in step 1 to prepare your database for migration, and repeat the assessment.
  3. Create an Azure SQL DB shell (empty database) using the Azure portal. You can follow the step by step guidelines as described here.
  4. Set up transactional replication with the following configuration:
    1. Configure an on-prem SQL Server as the distributor. You can choose the SQL Server hosting the production database to be the distributor, or any other SQL Server.
    2. Configure a new publication for the production database you wish to migrate to Azure SQL DB. Choose all the objects you want to migrate – in general, choose everything, unless there are specific tables you don’t wish to migrate, e.g. archive or history tables.
    3. Add a push subscriber by connecting to the Azure SQL DB you created in step 3. When connecting to the subscriber, use the details provided when creating the database in Azure (an indicative T-SQL sketch of this step appears after this list).

  5. After successfully adding the subscriber, run the snapshot agent and complete the initialization.
  6. Once the database is in sync (and being kept in sync), choose a time when you want to test your application connectivity to Azure SQL DB.
  7. Remove replication.
  8. Make changes in test application configuration to connect to Azure SQL DB.
  9. Complete end-to-end testing of the application. This should be exhaustive and detailed to ensure business continuity.
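
For step 4.3, adding the push subscription itself looks much like any other push subscription. The sketch below is indicative only, with placeholder names – the subscriber is the Azure SQL DB logical server, and SQL authentication (the credentials created with the Azure SQL DB) is required.

-- Indicative sketch for step 4.3 (placeholder names) - run at the on-prem publisher,
-- in the publication database, after the publication has been created.
EXEC sp_addsubscription
    @publication = N'MigrationPublication',
    @subscriber = N'yourserver.database.windows.net',  -- Azure SQL DB logical server
    @destination_db = N'YourAzureDB',                  -- the empty Azure SQL DB from step 3
    @subscription_type = N'Push',
    @sync_type = N'automatic';

EXEC sp_addpushsubscription_agent
    @publication = N'MigrationPublication',
    @subscriber = N'yourserver.database.windows.net',
    @subscriber_db = N'YourAzureDB',
    @subscriber_security_mode = 0,                     -- SQL authentication is required for Azure SQL DB
    @subscriber_login = N'azure_admin_login',
    @subscriber_password = N'your_password_here';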

Production Migration

  1. Complete steps 1 to 4 as performed for testing.
  2. Once the Azure SQL DB is in sync, choose a time to cut over the production database to Azure SQL DB.
  3. Shut down the applications and make sure the replication is in sync.
  4. Remove replication and shut down the on-prem production instance.
  5. Make changes in the application configurations so that they will connect to Azure SQL DB.
  6. Start the applications and complete functional testing.

Rollback Plan

There are two cases in which to consider rollback.

The first case is when functional testing fails and the Azure SQL DB has not yet been updated with any valid production data.

  1. In this case, the rollback is simple. Reconfigure the applications to point to the on-prem database.
  2. Start the on-prem SQL Server and start the applications.
  3. Complete functional testing to ensure applications are behaving as expected using on-prem database.

The second case is when functional testing passes, and the application runs against Azure SQL DB for a few hours or days before you realise that there are functional issues. Hopefully this is a rare scenario, and it mostly arises due to poor testing.

  1. In this case, it becomes critical to recover the production data that has been inserted into Azure SQL DB.
  2. The method to sync data back from Azure SQL DB to the on-prem SQL Server database is provided by Azure Data Sync. This will be covered in detail in a future article.

Downtime

The only downtime needed for migration using transactional replication to Azure SQL DB is during the cutover. This depends on how quickly you can stop your applications, shut down the on-prem SQL Server, and change the application configurations to point to the migrated Azure SQL DB. All these tasks can be automated with PowerShell or other tools, so that the downtime can be minimal.

Conclusion

Azure SQL DB is a tailored service that will suit most applications that connect to a single database. With transactional replication now GA, and the tools available in the latest SSMS, the door has been opened to more users for easier migrations. Transactional Replication is the easiest and most flexible method to migrate to Azure SQL DB with minimal downtime and lowest impact on business continuity.

Happy Learning,

Manu

The effort of relocating blog sites

Hopefully you’ve realised that I’m not posting at sqlblog.com any more. There’s still some excellent content there, but it has come up time and time again that I should be posting at a company blog site – so the move has now been made. I’ve also brought across the material that I wrote at msmvps.com, which had also been Community Server until a few years ago when it became WordPress.

Adam Machanic (@AdamMachanic) had put together some C# code for moving posts off Community Server (which is what sqlblog uses) onto WordPress, and combined with a regular WordPress Export + Import from msmvps.com, I had most of my content moved over. I don’t code in C# very often these days, but it felt nice. I spent some time in PowerShell and XML tweaking dates in the WordPress export file to make sure they matched the time zone I’d originally used, which introduced some frustrating character-mapping issues that needed fixing in MySQL – so all in all, I felt like I was moving around a variety of toolsets that I don’t often swim in.

A big thanks again to Barb and Susan who host msmvps.com still – they (particularly Barb) have helped a lot with sorting out some of my content from the old site. Some things are still broken from years back, but they did find the picture of me with Desmond Tutu, so I’m happy. At some point I’ll be going through old posts and seeing what doesn’t work.

I no longer use Categories – I lost the msmvps.com categories when they moved to WordPress, and the sqlblog.com ones didn’t seem to want to come across either. I don’t know that I ever did categories particularly well, so perhaps it’s a good opportunity to stop pretending that I do. Not everything should be within a ‘sql’ category.

I discovered that I have quite a bit of content that needed ‘powershell’ formatting. There is still a bunch of formatting on old posts that I won’t get to for some time though (there are almost 500 posts, so I’ll take a bit of a run at the rest another day).

I had to install some plugins to get a few things to work. SyntaxHighlighter was one, but also RWD’s Responsive Image Maps to get an image map from an old T-SQL Tuesday round-up working. I tried a stats plugin, only to find that I needed a later version of PHP to support it. Luckily I don’t think I was showing an error for too long, but I’m really not keen on the error messages that WordPress gives.

Getting the “Popular Posts” to look similar to the “Recent Posts” was a bit of fun with CSS. I ended up just finding a way to have the Popular Posts widget use the same CSS class as the Recent Posts.

And it turns out I do like the Segoe font-face. I know it’s Microsoft’s one, and perhaps that’s what makes it feel right for me – I spend so long looking at Microsoft web pages that it feels quite natural. Since we deal almost entirely in the Microsoft space, it’s quite appropriate too. We’ll probably use the same when we do a rebranding of our company site.

@rob_farley

New blog site!

It’s about time I moved content to a more central site – one that I own, rather than controlled by others. One that is part of the company, which can help demonstrate the capabilities of the company, and where the other skilled people within the team can also post content.

So I’ve moved the content that I had written at sqlblog.com across (big thanks to Peter and Adam for having me blog there for so long), and the content from msmvps.com (where I’d blogged from the time I first became a SQL MVP, even before the company was set up). I’ll still write for sqlperformance.com when I have something they’ll be interested in, and I’ll post something here to let you know that I’ve done that.

Feel free to let me know what you think of it all – whether I should be using WordPress differently, for example – you can ping me via email, or DM me on Twitter at @rob_farley.

I’ve redirected feedburner, but also feel free to follow this site in general.

The BigData Legacy

Trends come along, and trends pass. Some hang around for quite a while, and then move on, and some seem to disappear quickly. Often we’re glad that they’ve gone, but we still bear scars. We live and work differently because they were there. In the world of IT, I feel like this is all too common.

When ORMs became trendy, people were saying that writing T-SQL would be a thing of the past. LINQ was another way that people reassured the developer community that writing database queries would never again be needed. The trend of avoiding T-SQL through ORMs has hung around a bit, but many developers have recognised that ORMs don’t necessarily create the best database experiences.

And yet when we consider what’s happening with Azure SQL Data Warehouse (SQL DW), you find yourself querying the data through an interface. Sure, that interface looks like another database, but it’s not where the data is (the data is in the 60 databases that sit behind it), and it has to translate our query into a series of other queries that actually run. And we’re fine with this. I don’t hear anyone complaining about the queries that appear in SQL DW’s explain plans.
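
If you’re curious to see that translation for yourself, SQL DW lets you put EXPLAIN in front of a statement to get the distributed plan back as XML – showing the per-distribution queries and data movement steps without running anything. (The table here is just an example name.)

-- Returns the distributed query plan as XML rather than executing the statement
EXPLAIN
SELECT ProductKey, SUM(SalesAmount) AS TotalSales
FROM dbo.FactInternetSales
GROUP BY ProductKey;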

When CLR came in, people said it was a T-SQL killer. I remember a colleague of mine telling me that he didn’t need to learn T-SQL, because CLR meant that he would be able to do it all in .Net. Over time, we’ve learned that CLR is excellent for all kinds of things, but it’s by no means a T-SQL killer. It’s excellent for a number of reasons – CLR stored procedures or functions have been great for things like string splitting and regular expressions – and we’ve learned its place now.

I don’t hear people talking about NoSQL like they once did, and it’s been folded somehow into BigData, but even that seems to have lost a little of its lustre from a year or two ago when it felt like it was ‘all the rage’. And yet we still have data which is “Big”. I don’t mean large, necessarily, just data that satisfies one of the three Vs – volume, velocity, variety.

Of these Vs, Volume seems to have felt like a misnomer. Everyone thinks what they have is big, but if you compared it to others, it probably wouldn’t actually be that big. Generally, if people are thinking “BigData” just because they think their data is big, then they need a reality check – and can then deal with it like all their regular data.

Velocity is interesting. If your system can’t respond to things quickly enough, then perhaps pushing your data through something like Stream Analytics could be reasonable, to pick up the alert conditions. But if your data is flowing through to a relational database, then is it really “BigData”?

And then we have Variety. This is about whether your data is structured or not. I’m going to suggest that your data probably is structured – and BigData solutions wouldn’t disagree with this. It’s just that you might not want to define the structure when the data is first arriving. To get data into a structured environment (such as a data table), types need to be tested, the data needs to be converted appropriately, and if you don’t have enough control over the data that’s coming in, the potential for something to break is high. Deferring that sorting-out until you need to query the data back again means that you have a larger window in which to deal with it.

So this is where I think BigData is leaving its legacy – in the ability to accept data even if it doesn’t exactly fit the structure you have. I know plenty of systems that will break if the data arriving is in the wrong structure, which makes change and adaptability hard to achieve. A BigData solution can help mitigate that risk. Of course, there’s a price to pay, but for those times when the structure tends to change overly regularly, BigData’s ideology can definitely help.

We see this through the adoption of JSON within SQL Server, which is even less structured than XML. We see PolyBase’s external tables define structure separately from the collection of the data. Concepts that were learned in a void of relational data have now become part of our relational databases.
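
As a tiny illustration of that idea (my own example, not tied to any particular system): the JSON below is stored as-is, and a shape is only imposed when we read it back with OPENJSON (SQL Server 2016 and later).

-- Structure is applied at query time, not at load time
DECLARE @payload nvarchar(max) = N'[
    {"machine":100, "status":"OK",           "ts":"2017-10-09T12:33:05"},
    {"machine":100, "status":"HEAT WARNING", "ts":"2017-10-09T12:34:23"}
]';

SELECT j.machine, j.[status], j.ts
FROM OPENJSON(@payload)
     WITH (
         machine  int          '$.machine',
         [status] nvarchar(50) '$.status',
         ts       datetime2(0) '$.ts'
     ) AS j;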

Don’t dismiss fads that come through. Look into them, and try to spot those things which have more longevity. By adopting those principles, you might find yourself coming through as a stronger professional.

@rob_farley

This post was put together for T-SQL Tuesday 95, hosted by Derik Hammer (@sqlhammer). Thanks Derik!

Interviews and niches

T-SQL Tuesday turns this month to the topic of job interviews. Kendra Little (@kendra_little) is our host, and I really hope her round-up post is in the style of an interview. I’m reminded of a T-SQL Tuesday about three years ago on a similar topic, but I’m sure there will be plenty of new information this time around – the world has moved on.

I’m not sure when my last successful job interview was. I know I went through phases when I guess I was fairly good in job interviews (because I was getting job offers), and phases when I was clearly not very good in job interviews (because I would get interviews but not be able to convert them into job offers), and at some point I reached a point where I stopped doing interviews completely. That’s the phase I’m still in.

I hit that point when I discovered my niche (which sounds like “neesh” in my language, not “nitch”). For me it was because I realised that I had a knack for databases, and started exploring that area more – writing, presenting, helping others – until people noticed and started approaching me. That’s when interviewing stops being a thing. It doesn’t necessarily mean starting your own business, or even changing jobs – it just means that people know who you are and come to you. You no longer have to sit in front of a panel and prove your worth, because they’ve already decided they want you.

So now people approach me for work through LobsterPot Solutions, and although there is sometimes a bidding phase when we need to compete against other companies, there is no ‘interview’ process in the way that there was when I was an employee.

What’s your niche? And are you spending time developing that?

There’s career advice that talks about the overlap between something you enjoy doing, something you’re good at, and something that people are prepared to pay for. The thing is that people won’t pay you for it unless they know that you’re the person they need, rather than someone else. So get yourself out there. Prove yourself. Three years ago I asked “When is your interview?” and said that you need to realise that even before your interview they’ve researched you, considered your reputation, and all of that. Today I want to ask you how your niche is going. Have you identified that thing you enjoy, and that people will pay for? And are you developing your skills in that area?

Your career is up to you. You can respond to job ads and have interviews. Or you can carve your own space.

Good luck.

@rob_farley

Learning the hard way – referenced objects or actual objects

This month’s T-SQL Tuesday is about lessons we’ve learned the hard way. Which, of course, is the way you learn best. It’s not the best way to learn, but if you’ve suffered in your learning somewhat, then you’re probably going to remember it better. Big thanks to Raul Gonzalez (@sqldoubleg) for dragging up these memories.

Oh, I could list all kinds of times I’ve learned things the hard way, in almost every part of my life. But let’s stick to SQL.

This was a long while back… 15-20 years ago.

There was a guy who needed to get his timesheets in. It wasn’t me – I just thought I could help… by making a copy of his timesheets in a separate table, so that he could prepare them there instead of having to use the clunky Access form. I’d gone into the shared Access file that people were using, made a copy of it, and then proceeded to clear out all the data that wasn’t about him, so that he could get his data ready. I figured once he was done, I’d just drop his data in amongst everyone else’s – and that would be okay.

Except that right after I’d cleared out everyone else’s data, everyone else started to complain that their data wasn’t there.

Heart-rate increased. I checked that I was using the copy, not the original… I closed it, opened the original, and saw that sure enough, only his data was there. Everyone else’s (including my own) data was gone.

And then it dawned on me – these tables were linked back to SQL in the back end. I’d copied the reference, but it was still pointing at the same place. All that data I’d deleted was gone from the actual table. I walked over to the boss and apologised. Luckily there was a recent backup, but I was still feeling pretty ordinary.

These kinds of problems can hurt in all kinds of situations, even if you’re not using Access as a front-end. Other applications, views within SQL, Linked Servers, linked reports – plenty of things contain references rather than the actual thing. When you delete something, or change something, or whatever, you had better be sure that you’re working in the right environment.

I don’t even know the best way to have confidence that you’re safe on this. You can help by colouring Prod tabs differently in SSMS with SSMS Tools Pack, but it’s not going to guarantee that you’re okay. You need to be a little paranoid about it. Learn to check and double-check. Because ultimately, data is too valuable to make that kind of mistake.

@rob_farley