Azure SQL Database

Björn Peters challenges us this month to write about Azure SQL Database, describing all our experiences and opinions about it. I’m not sure either you, the reader, or I, the writer, have enough time for that, so I’m going to give you a quick version.

…and that is – yes, you should be using it.

Yes, because I think you should embrace Azure SQL Database. For all kinds of reasons.

Not least, because this week the SQL to probably my favourite movie is being released. I haven’t seen it yet, but from the trailers, I’m going to assume that Mr Incredible feels like he’s been left on the shelf while Elastigirl (not even branded as “Mrs Incredible”) goes out to save the world. That’s all I know (okay, except that Jack-Jack is discovering his powers), and I won’t be updating this post with any further information once I’ve seen it.

You see, in the first movie, the supers have been forced to retire and join the real world. As their secret identities become their only identities, they lose touch with who they are. Edna tells Helen to show Bob that she remembers who he is, because he feels like the world has forgotten. The kid on the tricycle doesn’t know what he’s waiting for from Bob – something amazing, he guesses. Bob is too…

Of course, Bob gets hired by Mirage, and gets to relive the glory days, discovering purpose again briefly. But it’s short-lived as it turns out Syndrome is developing technology to give everyone superpowers. And when everyone’s super, no one will be.

This is a fear that I think many database professionals have been dealing with in recent times, as the cloud has become increasingly prevalent. Infrastructure as a Service takes away the need to buy hardware. Platform as a Service takes away the need to install SQL Server. As for NoSQL, well, that’s about as blunt as it gets. Those superpowers that database professionals have developed over the years are being taken away as the world comes to realise that it’s better to put those duties in the hands of cloud providers like Microsoft.

I’m not saying that Microsoft is like Syndrome – keen to be a sidekick but only until they can grow into a nemesis, destined to try to steal our children until we throw a car at them (okay, I think that analogy falls down fairly quickly) – I’m saying that in this time of rapid technological advancement, which isn’t about to slow down, trying to hold onto the past is pointless. In the movie, if Syndrome didn’t develop that technology, someone else would have done. Hopefully they wouldn’t have used it to make the supers redundant, but rather to develop the supers (which I suspect is a theme in Incredibles II). I think it’s interesting that at the climax of the first movie, the family leverage technology to beat the giant evil robot. They can’t do it by themselves, but need to leverage the technology to be stronger.

We need to leverage technology, and not get depressed about the fact that the world is changing. Learn to use Azure SQL Database, rather than shying away from it. Understand the great things that you can do with it, and how it will enhance your environment. This technology is not there to make you redundant; it’s there to help you beat problems you didn’t have to fight before.

So remember who you are. Confront your problem. Fight. Win. And call me when you get back, darling, I enjoy our visits.

@rob_farley

Getting started as a presenter

For me, the trick to getting started in presenting in the technical community was to find something which made sense to me, but that other people found hard. I think from memory, the first thing that I presented was a .Net thing – maybe AJAX? – and I explored the things that people typically found hard, showing why it was easier than they expected.

…and it was no different to telling a story to a group of people at a barbecue, or around the water cooler.

Let me use Sargability as an example, and show you how the water cooler conversation goes:

 

Person 1: Rob, why did you change my query this morning?

Me: Because it was performing badly, and all you needed to do to fix that was to apply a dateadd on getdate(), instead of doing a datediff between your column and getdate().

Person 1: But why should that make a difference?

Me: Well, you have an index on your column, not on the difference between your column and now.

Person 2 (see – there’s a group!): Yeah nah, I saw the plan, and that index was being used.

Me: Sure, but it was a Scan, not a Seek. It was having to work out the datediff on every row and see if it was small enough.
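
To make that water-cooler version concrete, here’s a sketch of the kind of rewrite being described. The table and column names are invented for illustration; the point is that the second predicate leaves the column bare, so an index on OrderDate can be used for a Seek (note the day-boundary semantics of the two forms differ slightly, so test before swapping them in real code).

    -- Non-sargable: the column is wrapped in a function, so the index on OrderDate
    -- can only be scanned, working out the datediff for every row
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE DATEDIFF(day, OrderDate, GETDATE()) < 7;

    -- Sargable: the arithmetic moves to the other side, the column stands alone,
    -- and an index on OrderDate can be used for a Seek
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE OrderDate > DATEADD(day, -7, GETDATE());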

 

…and you see you have the basis of a presentation. If you can explain it to a few people, and find ways to help them understand it, you can explain it to a larger group.

You might be fine about demonstrating a concept to a co-worker on your computer – so there’s the demo section of your presentation. You don’t need to use slides (I don’t!) to present to a crowd, you just need to see how to explain things in a way that they can connect to – addressing their problems and giving them confidence in your solution.

I think the trick to being confident on stage is to know that you’re the expert who has the answer they want. When you have an abstract that explains what scenario you’re trying to solve, people are coming along to listen just like they’d gather around your desk to listen. They want to know the answer, and they’re on your side. They’re not trying to beat you down – they genuinely want to learn. You might have the odd person in the crowd who wants to argue that they’re the real expert, but if you know your stuff, you can disarm them with things like “I love that you’ve pointed out that the index does get used, because you’re right, it does! Just having that index helps a lot – but I want to show you how to use that index in a better way.”

You can work off the people in the room the same way you can work off the people around your desk. When someone at your desk says “Hang on, what did you do there?”, you know that you should be more careful when explaining that to a bigger room, and invite them to come along the journey with you.

Presenting is a great way to give back to the community, and that’s why I’ve written this piece for T-SQL Tuesday this month. The community wants to learn from you – so let them do just that!

@rob_farley

The tools in my day

Another month, another writing assignment for T-SQL Tuesday – although I did have a post go live over at sqlperformance.com, where I wrote about ways that I put queries together when I have separate date and time columns.

This month, with the #tsql2sday theme being tools, hosted by Jens Vestergaard (@vestergaardj), I get to write about other methods I use to do my job.

In case you hadn’t realised, I’m a consultant, and tend to be on-site at my customers’ offices, as I help them with their data. Sometimes I’m helping them with some advanced analytics things, such as making predictions with Machine Learning. Sometimes I’m in standard BI/DW projects, as they’re trying to make sense of their data. Sometimes I’m troubleshooting performance issues, corruption issues, or whatever they need. Or I’m teaching them, about Power BI, querying, tuning, whatever. And more often than not, I’m using a machine that they’ve allocated to me – and using it alongside the customers.

I don’t always (or even typically) get to install new software on the machines I use, so the tools I get to use the most are things like SSMS, ML Studio, Visual Studio, DAX Studio, Power BI Desktop, PowerShell Studio, and so on – the things that they already have on their machine – along with various scripts that I bring with me. If I’m doing a salvage operation on a corrupt database, I might be able to start thinking about other tools such as the HxD hex editor, or Plan Explorer (actually, the whole Sentry One suite is a great set of tools, which I love finding at customer sites when I do).

I want to wax lyrical a little about one of the tools that I feel like I’ve got to know better over the past couple of years, since SQL 2014 came around.

I wrote about Live Query Statistics within SSMS a while back – and even presented at conferences about how useful it is for understanding how queries run…

…but what I love is that at customers where I have long-running queries to deal with, I can keep an eye on the queries as they execute. I can see how the Actuals are forming, and quickly notice whether the iterations in a Nested Loop are going to be unreasonable, or whether I’m happy enough with things. I don’t always want to spend time tuning a once-off query, but if I run it with LQS turned on, I can easily notice if it’s going to be ages before I see any rows back, see any blocking operators that are going to frustrate me, and so on. Using TOP might introduce row goals that twist my plan more than I’d like, but using LQS lets me feel how long a query will take overall, as well as giving me good insight into it without too much cost. I figure the cost of having the server store the execution profile (I wouldn’t do this on a production server) is easier to wear than staring at an estimated plan for a while and assessing whether I should do a bit of a rewrite or introduce some indexes.
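
Under the covers, LQS reads from the sys.dm_exec_query_profiles DMV (SQL Server 2014 onwards), so you can also watch the actuals forming yourself from a second session. A minimal sketch, assuming the session being watched has profiling enabled (for example by running with SET STATISTICS PROFILE ON, or by watching it through LQS itself), and that 52 is a placeholder for the session_id you care about:

    -- Actual vs estimated rows per plan operator, while the query runs
    SELECT   node_id,
             physical_operator_name,
             SUM(row_count)          AS actual_rows_so_far,
             MAX(estimate_row_count) AS estimated_rows
    FROM     sys.dm_exec_query_profiles
    WHERE    session_id = 52        -- hypothetical spid of the query being watched
    GROUP BY node_id, physical_operator_name
    ORDER BY node_id;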

LQS is a great teaching tool, but it also serves as a progress bar for me on queries. I recognise blocking operators, such as Hash Aggregate or Top N Sort, and I can tell when an estimate on an operator is way off – to the point that I can quickly get an idea about whether I should let the query keep going or not. To be able to sit at someone’s desk and show them that their query will start returning rows fairly soon (or not) is excellent, and having a better guess about whether it’s going to be another five minutes or another hour is also tremendously useful.

…because that way I can figure out whether to get some more coffee.

@rob_farley

 


T-SQL Tuesday #100

When T-SQL Tuesday started back in late 2009, I have to admit that I wasn’t sure why Adam used three digits. I didn’t know whether it would reach 10, let alone 100. 2018 seemed so far away – if you’d asked me if I thought T-SQL Tuesday would still be going in 2018, I wouldn’t’ve had a clue. And I definitely wouldn’t’ve thought I’d still be contributing, having never missed one. I’m pleased I haven’t missed any though, because it keeps me writing. I enjoy having a writing assignment each month – it forces my hand, even though it’s self-inflicted.

Anyhow…

In his invitation post, our host and T-SQL Tuesday owner Adam Machanic (@adammachanic) makes some interesting points about the types of topics that have made up the list so far – that there has been a significantly high proportion of topics that are not technical in nature, showing that there is a large interest in personal development as well as technical. I suspect this implies that people want to know what people are thinking, not just how to solve problems. People can ask for technical help on a variety of sites, but perhaps T-SQL Tuesday is unique in its ability to ask people for their opinions. I do think that the next hundred topics will be less technical and more personal.

Adam asks what things will be like after another hundred months’ time, and it’s an excellent question.

I think people will continue to become more and more interested in their personal development rather than their technical development. And I think this is increasingly the way the world is shifting. Many years ago, we saw robots replace factory lines, and today we see self-driving cars. As time continues, I’m confident that this trend will continue, and that “regular folk” like us will need to solve fewer and fewer technical problems. Performance tuning seems to be on the way out already, with some of the advancements that we’re seeing in Azure SQL Database. AI advancements allow us to create systems that are smarter than ever before, and database consultants are becoming business consultants.

And this is because I see ‘business consultancy’ as being one of those ‘hard’ tasks. In school we learned about P/NP problems – P being the set of problems that were solvable in useful time (the real word is ‘polynomial’, but I figure the real question is whether it can be solved quickly enough to be useful), and NP being the set of problems that weren’t, despite it being quick to demonstrate that a particular solution was a valid one (this is just my paraphrase of it all – don’t beat me up for not being exactly correct). Obviously you can just test every possible answer to see if it’s a valid solution, but that’s not quick enough to be useful. The world of computer science is still finding ways to turn problems from being NP into P, and we shouldn’t have to spend time thinking about P problems.

If I assume that consulting will always be needed, why isn’t it possible to turn it into a P problem? I suppose that’s like the question of whether a computer can ever produce beauty on a reliable basis. And I think it can. It should be possible to analyse a given situation, and produce a strong-enough plan of attack to give sensible advice. A self-driving car has a very restricted set of parameters in which it can make decisions, but ten years ago we would’ve all considered that too hard a problem – too ‘NP’ – to become useful. Medical consulting has been replaceable (to a certain extent at least) by ‘expert’ systems for many years. So I suspect that business consulting, and certainly data consulting, ought to be able to go the same way.

…maybe not in the next 100 months though. And hopefully not until I’m ready to retire. I might still be writing a blog post every month, mind you, unless that can be automated as well.

@rob_farley


Revisiting GROUP BY v DISTINCT

It’s been nearly eleven years since I compared GROUP BY and DISTINCT in a blog post. So when Aaron Bertrand (@aaronbertrand) challenged people to review his list of bad SQL habits, and I saw a post from last year at the top, I thought I’d have to have a look.

And I loved his post! It used almost exactly the same example that I had, and came up with the same results, ten years later (I don’t mean that his query took ten years to return results, just that his post is from 2017 and mine is 2007). It’s great to see the ‘next generation’ of blog posts coming up with the same conclusions. (I get to pick on Aaron like this because he was an MVP almost ten years before me (despite being born in the same year as me!), and I therefore consider him a different generation of SQL expert. An ‘elder statesman’ if you like.)

What I want to explore in this post is the particular example that we both used… to bring out an important point that could be missed because of the similarity of our examples.

You see, we both happened to use a FOR XML concatenation query, looking back at the same table. We did this to simulate a practical GROUP BY – somewhere that you might feel like GROUP BY is useful, but the ‘aggregate’ you want (string concatenation) isn’t available as a built-in aggregate function like SUM or MAX. Ok, Aaron could’ve used the really new STRING_AGG, but for an old-timer like me, having to use SQL Server 2005, that wasn’t available.

The upshot was that we both used a sub-query, one example using GROUP BY…

…and one using DISTINCT instead.

And, we both demonstrated that GROUP BY was quicker, because the QO didn’t need to do all the string concatenation for every row, only for every unique WhateverWeWereGroupingBy.
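
Neither of the original queries is reproduced in this text, but the shape of the thing is a correlated FOR XML PATH sub-query doing the concatenation, with the outer query using either GROUP BY or DISTINCT to dedupe. A sketch against AdventureWorks (not the exact query from either post):

    -- GROUP BY: the concatenation sub-query runs once per CustomerID
    SELECT soh.CustomerID,
           (SELECT CAST(sod.ProductID AS varchar(10)) + ','
            FROM Sales.SalesOrderHeader AS soh2
            JOIN Sales.SalesOrderDetail AS sod ON sod.SalesOrderID = soh2.SalesOrderID
            WHERE soh2.CustomerID = soh.CustomerID
            FOR XML PATH('')) AS ProductList
    FROM Sales.SalesOrderHeader AS soh
    GROUP BY soh.CustomerID;

    -- DISTINCT: same results, but the sub-query tends to be evaluated for every row
    -- before the duplicates are removed
    SELECT DISTINCT soh.CustomerID,
           (SELECT CAST(sod.ProductID AS varchar(10)) + ','
            FROM Sales.SalesOrderHeader AS soh2
            JOIN Sales.SalesOrderDetail AS sod ON sod.SalesOrderID = soh2.SalesOrderID
            WHERE soh2.CustomerID = soh.CustomerID
            FOR XML PATH('')) AS ProductList
    FROM Sales.SalesOrderHeader AS soh;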

Go read those posts to see the results. I’m not going to repeat them here.

Back again? Ok, cool.

Now, this isn’t particularly different to any other sub-query – like this one in the AdventureWorks database – with the corresponding plans shown below.
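
(The query itself hasn’t survived in this text; a sketch of the kind of sub-query meant – looking up the Product Name for each ProductID in Sales.SalesOrderDetail – is below. The GROUP BY and DISTINCT forms differ only in the last clause.)

    -- GROUP BY: the Name lookup happens once per distinct ProductID
    SELECT od.ProductID,
           (SELECT p.Name
            FROM Production.Product AS p
            WHERE p.ProductID = od.ProductID) AS ProductName
    FROM Sales.SalesOrderDetail AS od
    GROUP BY od.ProductID;

    -- DISTINCT: the lookup happens for every row, and the dedupe comes afterwards
    SELECT DISTINCT
           od.ProductID,
           (SELECT p.Name
            FROM Production.Product AS p
            WHERE p.ProductID = od.ProductID) AS ProductName
    FROM Sales.SalesOrderDetail AS od;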

[Query plan image]

Both posts demonstrate that using DISTINCT, the Query Optimizer comes up with a worse plan, because it doesn’t seem to know that the sub-query will return the same values on subsequent calls, figuring that it should see what comes back each time before checking for distinct values. It’s like it worries that the DISTINCT option could produce different results. Using GROUP BY it only calls the sub-query for the distinct values in the GROUP BY clause, which can be a massive saving.

Now, I’m sure you’re thinking “Hang on – I could’ve written that query another way…”, and you’re right. You probably would’ve written it like this:
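
(Again, the original query isn’t shown in this text – presumably it was something like this, with the lookup written as a join rather than a sub-query:)

    SELECT DISTINCT
           od.ProductID,
           p.Name
    FROM Sales.SalesOrderDetail AS od
    JOIN Production.Product AS p ON p.ProductID = od.ProductID;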

…and the query plan for this one looks like this, which is the same shape as the GROUP BY query – despite using DISTINCT.

[Query plan image]

But if I use a LEFT JOIN instead of an INNER JOIN, then the behaviour changes significantly, and it goes back to aggregating the results of the join, rather than joining the results of the aggregate.
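
(That is, presumably the same query with only the join type changed – again a sketch, not the original:)

    SELECT DISTINCT
           od.ProductID,
           p.Name
    FROM Sales.SalesOrderDetail AS od
    LEFT JOIN Production.Product AS p ON p.ProductID = od.ProductID;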

[Query plan image]

I’m a big fan of LEFT JOIN – it makes it very clear that I’m not filtering my base table when I do the join. So I don’t want to hear that the performance is worse.

Luckily, this sounds like something GROUP BY should be able to sort out. After all, I only want to group by od.ProductID – there can only ever be one Product Name per ProductID. And yet,

[Query plan image]

it doesn’t help at all.

Hmm.

There is definitely an advantage to using GROUP BY over DISTINCT when using sub-queries. But the story isn’t actually as clear as all that, and is affected by a bunch of other scenarios.

The trick is to look at where the aggregation is going on, and to remember that an aggregate operator reduces the number of rows in play. Anything which reduces the number of rows being pulled through the plan should be pushed as far to the right as possible. If your aggregate will do that the most, do what you can (query rewrites, indexes, constraints, preferably not hints) to get that operator as far right as possible. But you could do the same with a join operator, or a Filter operator, whatever is reducing the rows.

Understand that GROUP BY is generally better than DISTINCT, because it’s a technique for pushing the operator to the right hand side of a join. But remember it’s only part of the story.

@rob_farley

PS… I feel like I want to expand on this a bunch more, but I need to get this published tonight for T-SQL Tuesday. It’s the 99th one, and I don’t want to miss one now, just because I could type about this for hours.

SQL Server CPU Usage by Session

Troubleshooting high CPU usage on a SQL Server database is an art, but there is a defined methodology to follow to find the root cause of high CPU. This can involve breaking down the overall server CPU usage to a more granular level, first discovering that it’s SQL Server that’s the problem (because way too often it’s something else!), then drilling down to specific plan operators in a particular problematic query. Finding that problematic query – identifying the high CPU consumer – means identifying the CPU usage by session.

sys.dm_exec_requests shows the CPU time, but it’s cumulative – it doesn’t give the CPU consumption by each session at the current time. You can see how much CPU usage a session has had since it started, but it doesn’t show you what’s going on right now. To explore that, we need to query sys.dm_exec_requests repeatedly, and look for the differences. We need to collect the CPU usage for a time interval to identify the high CPU consumers.

Here is a query which does that across a one second interval. You can go as small or as large as you like by changing the ‘WAITFOR’ value – but the closer you want to what’s going on “right now”, the smaller you want that interval to be. I find one second to be a good starting point. The query populates snapshots of sys.dm_exec_requests into a table variable called @sessionsCPU, which I then query to do some analysis. I could do this with LAG, but my code will work on pre-2012 instances too, and I can’t see a reason to change it.
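
Manu’s exact script isn’t reproduced here, but based on the description, a sketch of the approach looks something like this (the column choices and aliases are my own; the shape – two snapshots of sys.dm_exec_requests into @sessionsCPU, separated by a WAITFOR, then a self-join – is what the text describes):

    DECLARE @sessionsCPU TABLE
    (
        capture    tinyint       NOT NULL,  -- 1 = first snapshot, 2 = second snapshot
        session_id smallint      NOT NULL,
        cpu_time   int           NOT NULL,  -- cumulative ms, per sys.dm_exec_requests
        sql_handle varbinary(64) NULL
    );

    -- First snapshot
    INSERT @sessionsCPU (capture, session_id, cpu_time, sql_handle)
    SELECT 1, session_id, cpu_time, sql_handle
    FROM sys.dm_exec_requests
    WHERE session_id <> @@SPID;

    -- The sampling interval: make it smaller to get closer to "right now"
    WAITFOR DELAY '00:00:01';

    -- Second snapshot
    INSERT @sessionsCPU (capture, session_id, cpu_time, sql_handle)
    SELECT 2, session_id, cpu_time, sql_handle
    FROM sys.dm_exec_requests
    WHERE session_id <> @@SPID;

    -- CPU consumed by each session during the interval, highest first
    SELECT s2.session_id,
           s2.cpu_time - s1.cpu_time AS cpu_ms_in_interval,
           t.text                    AS query_text
    FROM @sessionsCPU AS s1
    JOIN @sessionsCPU AS s2
        ON  s2.session_id = s1.session_id
        AND s1.capture = 1
        AND s2.capture = 2
    OUTER APPLY sys.dm_exec_sql_text(s2.sql_handle) AS t
    ORDER BY cpu_ms_in_interval DESC;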

Here’s an example of what I get. Obviously, this isn’t from a client machine, it’s just on my local environment. But you’ll see that I can tell that right now, the query that’s causing me the most difficulty is the one on session 71.

I can see the query, and this lets me delve further into the problem. I don’t have an answer yet, but I’m an awful lot closer to finding the culprit and solving the high CPU issue than I was before I ran this query.

Hope this query helps you in troubleshooting high CPU issues, and gets you a step closer to resolving them.

Happy Learning,
Manu

Twenty years on, how I made reporting run quicker

Twenty years ago, in January 1998, was when I touched SQL Server for the first time as a consultant – that was SQL Server 6.0, and I’d be involved in an upgrade to 6.5 at that same customer only a couple of months later. I would go on to spend quite a bit of time with this customer, and helped them implement a lot of things over time. I was a programmer back then, and saw the database as just part of the application. It was only later that I started to see applications as peripheral to the data rather than the other way around.

One major problem that was happening at this customer was the reporting. Every month they would need to provide reports to a part of the government, and these reports were simply painful to produce. Let me explain…

The basic system was designed to monitor the health of machines that were dotted all around the city – machines that would send a signal every few moments to indicate their status. Normally these were “nothing wrong” messages, but occasionally there would be a problem with a particular part of the machine, and our application would manage getting the message out to someone who could go and fix it. Later, a “fixed” message would come through – I’m sure you get the picture. There were also manual reports of downtime that didn’t have automated signals, which would need to be included in the mix, for those times when people would phone up and say that there was a problem.

The reporting was about the overall uptime of each machine, including whether there was a machine available at each location at a given time, whether the combination of components that were unavailable meant that the entire machine was considered down, or whether so many machines at a location were down that the location itself needed to be marked as unavailable, and so on. The data for a single machine was like:

Machine Time Message
100 19980109 12:33:05 OK
100 19980109 12:34:23 HEAT WARNING
100 19980109 12:34:26 COMP1 ERROR
100 19980109 12:34:29 TOUCHPAD ERROR
100 19980109 12:35:12 COMP1 NOERROR
100 19980109 12:35:19 HEAT NORMAL
100 19980109 12:30:00 Report – TOUCHPAD ERROR
100 19980109 12:35:00 Report – TOUCHPAD FIXED

 

…and so on. Timestamps were available for when phone calls came in, but apparently the reporting needed to show the time that the person said they saw it was unavailable, not the time the alert came through. Go figure.

The reporting would need to have looked at this data, and figured out that the touchpad was unavailable between 12:30 and 12:35, and that Component 1 was unavailable for 48 seconds. There would’ve been tables available to show that the touchpad being down meant the machine would’ve been considered only 50% available, and that Component 1 being down meant 60% available, but a combination of both touchpad and component 1 meant completely down. So this data would mean: 4 minutes 26 seconds of 50% available, 34 seconds of 0% available, and 12 seconds of 60% available. The reports were in place before I went there – they just didn’t run well.

They ran by querying all the data, and then looping through it (in VB3) to figure out the gaps, and to try to keep track of what was available and what wasn’t. If the report were at the machine level, it would pull back everything for that machine during the time period of interest and then try to figure it all out, but if it were at the location level, or even city-wide, it was dealing with a very different amount of data. It would take a few minutes to run a report for a single machine for a week, and many reports that were available had simply never been run. Fixing this was one of the biggest challenges I had in my first few months of consulting.

I didn’t know until years later that what I did was to create a data warehouse of sorts.

I could see that the moment when someone hit the button to say “Show me the report for machine 100 last week” was the wrong time to do the reporting, because no one wanted to watch a computer for a few minutes – or a few HOURS if they wanted a larger amount of time or more than a handful of machines. That meant the work needed to be done ahead of time.

So I relaxed many of the rules of database design I’d learned, and came up with some denormalised tables that were purely designed for analysing the data. This would store the amount of downtime per machine or per location, but be populated ahead of time – updating it with maybe an hour or so latency, long enough to allow for the phoned-in alerts. I’d store the number of seconds since the last machine event, and what the availability proportion was during that time period, and include extra dummy events for the start of each hour, so that the report could handle exact time periods. I spread the heavy-lifting across time, pulling the new messages in as regularly as I could, to avoid doing that heavy lifting when time was more urgent. This was ETL, although I just considered it was reporting-preparation, and didn’t learn what ETL meant until much later.

Once that data was in place, I could just sum the product of seconds and availability, and the reports could be produced quickly and easily, no matter how large they were. There was no longer a need for a cursor at report runtime – just regular aggregations.
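
The original schema is long gone, but the idea translates into something like this (table and column names invented for illustration):

    -- Populated ahead of time by the reporting-preparation job:
    -- one row per machine per interval, with how long it lasted and how available it was
    -- (e.g. 1.0 = fully up, 0.5 = touchpad down, 0.6 = component 1 down, 0.0 = both down)
    --
    -- dbo.MachineAvailability (MachineID int, IntervalStart datetime,
    --                          DurationSecs int, Availability decimal(5,4))

    -- Report time is then just an aggregation, no cursors required
    SELECT MachineID,
           SUM(DurationSecs * Availability) / SUM(DurationSecs) AS AvgAvailability
    FROM dbo.MachineAvailability
    WHERE IntervalStart >= '19980101'
      AND IntervalStart <  '19980201'
    GROUP BY MachineID;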

For me, this trick of doing the work at a time when it’s less urgent is key to a lot of things in the database world (and also in the rest of life). I use indexes to keep a sorted copy of data so that aggregations and joins can run faster. I use filtered indexes to separate out lists of new data from old data. I use data warehouses and cubes to handle history and data quality and complex calculations, so that business users can explore data to their hearts’ content. If I can put up with doing the work later, then maybe it’s just fine to not have those indexes in place – perhaps I’m totally okay about scanning a table of 1000 rows every so often. But if the cost of maintaining a separate copy (index, warehouse, whatever) isn’t going to be significant when spread out over time, but that copy is going to make life much easier when it comes to the crunch, then I’m going to do it.

I often think back to some of the customers / projects that I’ve had over the years and roll my eyes at what I had to overcome. Generally though, the worse they were, the more I learned and the stronger I became.

It’s all just practice for whatever is going to come in the future, right?

@rob_farley

This post was written in response to this month’s T-SQL Tuesday, hosted by Arun Sirpal.


My Learning Goals for 2018

Some months it’s a good thing to be prompted to post by T-SQL Tuesday, because it gets me to talk about something I should’ve talked about already, but haven’t. This is one of those months, thanks to Malathi Mahadevan (@sqlmal)’s topic around Learning Goals.

I don’t generally write about my Learning Goals, because I tend to base what I’m learning around what my clients’ needs are. If I feel like I need to brush up on Azure Data Lake because I have a project around that, then I’ll take some time for that. I like to keep on top of new things as they come through, but then hone my skills when I have a proper opportunity. I don’t tend to study much for certifications – I didn’t prepare much for the MCM exams that I took (and passed – just!) five years ago, nor the couple of exams I took recently to keep hold of my ‘Partner Seller’ status.

That said, it’s always good to see things that are coming up and dive a bit deeper into them. And for me, at the moment, that’s the edX courses in the Data Science space. I’m not sure whether I’ll get through to the end of them all, because I’ll happily let myself get distracted by something else. The Advanced Analytics area is one of which I’m particularly fond – it’s a nice contrast to the Pure Maths subjects I did at university, which shape my approach to relational data. So far, I’ve gone through the courses quickly, but picking up the odd bit that I didn’t remember. The “Essential Statistics for Data Analysis using Excel” course was a particular highlight – it rang bells, but also reminded me how long it’s been since I’ve studied this kind of material.

Where do I think I’ll be in 12 months? Well, like the answer to all good SQL questions, it depends. I’m sure I’ll still be analysing data and helping businesses with their data-focused ambitions, and that I will have learned quite a bit during that time.

@rob_farley

My inspirational team

This month the T-SQL Tuesday theme is “Who has made a meaningful contribution to your life in the world of data?”, and is hosted by Ewald Cress (@sqlonice).

It was good to reflect on this over the past week, and there are (of course!) a lot of people that I could list. I could list some of the people that were prominent in the community when I first discovered it all those years ago. I could list those people that I turn to when I have questions about things.

But there’s one group of people that for me stands out above all the rest. So much so that I’ve hired them.

I’ve had my company for nine years now, and during that time I’ve had up to seven people in the team. Well over a dozen different people have represented the LobsterPot Solutions brand, and they’ve all been amazing. Here are a few stats about our past and present LobsterPot team:

  • There have been MVPs, of course. Six MVPs, in fact, have been employed here. One became an MVP after leaving us, and two no longer have MVP status, but still… six!
  • There have been user group leaders. Seven (or maybe eight) of us lead or have led groups.
  • A few former employees now lead their own businesses.
  • One former employee is completing a PhD.
  • One former employee is a Senior Program Manager at Microsoft.
  • One former employee attends the PASS Summit on her vacation time.

You see, they are all leaders. They almost all give presentations at events. They all make meaningful contributions to the data community.

And they have all made a meaningful contribution to my own life. I may have paid them a salary, but they’ve all left an impact on me personally.

@rob_farley

Migrating to Azure SQL DB using Transactional Replication

Azure SQL DB is a PaaS offering from Microsoft for hosting a SQL database on Azure. This may not be ideal for all environments, but if you have a database which can work in isolation, handling requests from one or more applications, Azure SQL DB may be the right fit for you. One common use-case is for ISVs who host multiple tenant databases.

One of the challenges faced is how to migrate to Azure with minimal downtime, especially when your database is already in production use. There are a couple of methods to migrate the database from your production environment to Azure SQL DB.

One method is by exporting your database as a BACPAC file and importing it into Azure SQL DB. There are a few points to note when using this method.

  • The production database should not change from the point you start the export till the database is imported into Azure SQL DB. This increases the production downtime.
  • The process involves exporting a BACPAC file, then using tools like SqlPackage.exe to import it into Azure SQL DB.
  • The rollback process is simple, allowing you to revert to the existing database (if your business can afford to lose any changes made on Azure SQL DB). The amount of data loss depends on the stage at which you decide to roll back.

To achieve minimal downtime, I will detail a second method of migration: using Transactional Replication.

Transactional replication to Azure SQL DB was announced in 2015 as being in Public Preview. On 23 October 2017 this feature was announced as generally available. To set up transactional replication from an on-prem SQL Server to Azure SQL DB, you will need to install the latest version of SSMS (v17.3). The process of migration involves the following steps.

Testing/POC

  1. Run the Microsoft Data Migration Assistant (DMA) to check if your database is ready to migrate to Azure. This tool will help you assess if your database has any compatibility issues to migrate to Azure SQL DB. More information about DMA is available here.
  2. Fix any issues identified in step 1 to prepare your database for migration and repeat.
  3. Create an Azure SQL DB shell (empty database) using the Azure portal. You can follow the step by step guidelines as described here.
  4. Set up transactional replication with the following configuration (a T-SQL sketch of this step appears after this list):
    1. Configure an on-prem SQL Server as the distributor. You can choose the SQL Server hosting the production database to be the distributor, or any other SQL Server.
    2. Configure a new publication for the production database you wish to migrate to Azure SQL DB. Choose all the objects you want to migrate. In general, choose everything unless there are specific unwanted tables you don’t wish to migrate, e.g., archive history tables.
    3. Add a push subscriber by connecting to the Azure SQL DB you created in step 3. When connecting to the subscriber, connect using the details provided when creating the database in Azure.

  5. After successfully adding the subscriber, run the snapshot agent and complete the initialization.
  6. Once the database is in sync (and being kept in sync), choose a time when you want to test your application connectivity to Azure SQL DB.
  7. Remove replication.
  8. Make changes in test application configuration to connect to Azure SQL DB.
  9. Complete end-to-end testing of the application. This should be exhaustive and detailed to ensure business continuity.
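
For reference, here’s a minimal T-SQL sketch of step 4. The original article configures this through SSMS rather than script, the distributor is assumed to be configured already, and every server, database, publication and login name below is a placeholder.

    -- On the publisher, against the production database (all names are placeholders)
    USE MyProductionDB;

    EXEC sp_replicationdboption
         @dbname = N'MyProductionDB', @optname = N'publish', @value = N'true';

    EXEC sp_addpublication
         @publication = N'AzureMigrationPub',
         @status = N'active',
         @independent_agent = N'true';

    -- Create the snapshot agent job for the publication (it gets run in step 5)
    EXEC sp_addpublication_snapshot
         @publication = N'AzureMigrationPub';

    -- One article per object to migrate (repeat as needed)
    EXEC sp_addarticle
         @publication = N'AzureMigrationPub',
         @article = N'Customers',
         @source_owner = N'dbo',
         @source_object = N'Customers';

    -- Azure SQL DB can only be a push subscriber, using SQL authentication
    EXEC sp_addsubscription
         @publication = N'AzureMigrationPub',
         @subscriber = N'myserver.database.windows.net',
         @destination_db = N'MyAzureDB',
         @subscription_type = N'Push',
         @sync_type = N'automatic';

    EXEC sp_addpushsubscription_agent
         @publication = N'AzureMigrationPub',
         @subscriber = N'myserver.database.windows.net',
         @subscriber_db = N'MyAzureDB',
         @subscriber_security_mode = 0,      -- SQL authentication
         @subscriber_login = N'azureadminlogin',
         @subscriber_password = N'<password>';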

Production Migration

  1. Complete steps 1 to 4 as performed for testing.
  2. Once the Azure SQL DB is in sync, choose a time to cut over the production database to Azure SQL DB.
  3. Shut down the applications and make sure the replication is in sync.
  4. Remove replication and shut down the on-prem production instance.
  5. Make changes in the application configurations so that they will connect to Azure SQL DB.
  6. Start the applications and complete functional testing.

Rollback Plan

There are two cases to consider for rollback.

The first case is when the functional testing fails and the Azure SQL DB has not yet been updated with any valid production data.

  1. In this case, the rollback is simple. Reconfigure the applications to point to the on-prem database.
  2. Start the on-prem SQL Server and start the applications.
  3. Complete functional testing to ensure applications are behaving as expected using on-prem database.

The second case is when the functional testing passes, the application runs using Azure SQL DB for a few hours or days, and then you realise that there are functional issues. Hopefully this is a rare scenario, and one that mostly arises due to poor testing.

  1. In this case, it becomes critical to get back the production data inserted into Azure SQL DB.
  2. The method to sync data back from Azure SQL DB to the on-prem SQL Server database is provided by Azure Data Sync. This will be covered in detail in a future article.

Downtime

The only downtime needed for migration using transactional replication to Azure SQL DB is during the cutover. This depends on how quickly you can stop your applications, shut down the on-prem SQL Server, and make the configuration changes in the applications to point to the migrated Azure SQL DB. All these tasks can be automated with PowerShell or other tools, so that the downtime can be minimal.

Conclusion

Azure SQL DB is a tailored service that will suit most applications that connect to a single database. With transactional replication now GA, and the tools available in the latest SSMS, the door has been opened to more users for easier migrations. Transactional Replication is the easiest and most flexible method to migrate to Azure SQL DB with minimal downtime and lowest impact on business continuity.

Happy Learning,

Manu