I mean really? Why should you spend some of your training budget to go to this thing?
Suppose you're someone in the PASS Community who mainly looks after people's data. That could involve database administration, performance tuning, helping developers write queries, that kind of thing. What part of "Advanced Analytics and Insights", "Big Data Innovations and Integration", "Data Analytics and Visualization", "Information Delivery and Collaboration" or "Strategy and Architecture" is relevant to you? It sounds all well and good for the BI team, who's thinking about cubes and models and report subscriptions and Power something, but that's not you.
The problem is that as data professionals, we're no longer just database administrators. The world has become more demanding than that. Maybe it's because of the financial difficulties that the western world has been facing. Maybe it's because we've out-grown our jobs as database administrators. Maybe we're simply being asked for more than we were before.
Now more than ever before, if you're a data professional, you need to be thinking about more than just transaction logs, corruption checking, and backups. You need to be thinking about the overall data story. You can tune your databases to cope with the large amount of data that's pouring into them, as more and more systems produce consumable data. You can work with your developers to help them understand the significance of indexes to be able to get the data out faster. But is this really enough?
Today, we need to be strategic about the data. An increasing number of companies are moving their data to the cloud, where the need for database administrators is not quite the same as it has been in the past. There are a number of tools out there to allow you to manage hundreds, even thousands of database servers, putting pressure on you to provide more in your role.
And then you get asked into meetings! People more senior than you asking about what can be done with the data. Can you offer more than just a comment about how much they can trust you to make sure their data is available?
This is why you need to be looking at things like the Business Analytics Conference. It's because you need to know how to make the data that you look after more relevant to the organisation who entrusts you with it. You need to know how to get insight from that data. You need to know how to visualise it effectively. You need to know how to make it visible through portals such as SharePoint.
And you need to know WHY these things are important.
Either that, or you need to call in external consultants, who can provide these kinds of services. You know how to get in touch. 😉
PS: I should mention that I'm on the PASS board, so I see a lot of stuff about this conference. I'm not part of the organising committee at all though, and have been remarkably separate from the whole process. I do consider that this conference is about helping people achieve more within the data space, and that's something I think more people should be taking advantage of.
Every year, PowerShell increases its stranglehold on the Windows Server system and the applications that run upon it – with good reason too. Its consistent mechanisms for interaction between its scripting interface and the underlying systems make it easy for people to feel comfortable, and there is a discoverability that has been lacking in many other scripting environments.
But what's going on behind the scenes? Does PowerShell present a threat to those amongst us who will always prefer T-SQL? Does PowerShell give us new options that are not available any other way? Well, let's have a bit of a look, especially since this month's T-SQL Tuesday (hosted by Wayne Sheffield who tweets as @DBAWayne) is on the topic of PowerShell.
So we know PowerShell is useful. However we spin it up, we can quickly jump into writing commands, whether it be interacting with WMI, hooking into some .Net assembly we've loaded up, or simply browsing the file system. I've developed a tendency to use it to start whichever SQL instances I'm needing for the day – by default I have all of them turned off, since I don't know which one I'll be wanting most.
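That habit looks something like this (a sketch — the service name here is for a named instance called SQL2008R2; yours will match your own instance names):

```powershell
# Named instances run as services called MSSQL$<InstanceName>
# (the default instance is just MSSQLSERVER).
Get-Service 'MSSQL$SQL2008R2' | Start-Service

# Or list what's installed first, and pick from there:
Get-Service MSSQL* | Format-Table Name, Status
```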
If we're going to be interacting with SQL, then it's easiest to either load up the SQLPS environment directly (there's a PowerShell shortcut within Management Studio), or else (as I do), start a PowerShell window with the Snapin loaded. I prefer this latter option, as the SQLPS environment is a slightly cut-back version of PowerShell. But either way – the stuff I'll continue on with is essentially the same whichever environment you use.
If you've talked about SQL with me long enough, you'll have come across the fact that I often use SQL Profiler when I'm curious about where to find information. My former colleague Jim McLeod (@Jim_McLeod) blogged about this a few months ago, with an example that I remember looking through with him four or five years ago. It's a great technique that works on all kinds of things, even across different versions of SQL Server. It also serves as a terrific reminder that Management Studio is not a special application, it simply knows how to ask for the pieces of information that it shows.
But PowerShell (or SMO, for that matter), that's in the .Net world. Surely that would be able to bypass the clunky T-SQL stuff that Management Studio does… I mean, Management Studio has to be flexible enough to work across remote servers, talking through firewalls that only allow T-SQL interfaces. Surely PowerShell has access to a deeper magic.
Well, no. PowerShell still lets you talk to remote servers, and ends up using the same methods.
Let's prove it.
Spin up Profiler, and start a trace against your favourite instance. I like to watch for SQL:BatchCompleted, SP:StmtCompleted and RPC:Completed events when doing this kind of thing. I'm using an instance that isn't doing anything else, but you could apply a ColumnFilter to filter the events to things with an ApplicationName starting with SQLPS if you prefer.
With that running, I jump into PowerShell and do something like:
PS SQLSERVER:\sql\localhost\sql2008r2> dir Databases | ft name
This lists the names of the databases on my SQL2008R2 instance. You don't need to see the results, you can imagine them for yourself.
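The SQLSERVER: provider is just a veneer over SMO, so the same thing can be done against the object model directly — a sketch, assuming a local named instance:

```powershell
# The provider sits on top of SMO; this is the equivalent call made
# against the SMO object model directly.
[void][Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.Smo')
$server = New-Object Microsoft.SqlServer.Management.Smo.Server 'localhost\sql2008r2'
$server.Databases | Format-Table Name
```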
If PowerShell were using some secret interface, it's unlikely we'd see something in Profiler. But it's not, and we see a bunch of stuff.
We see a bunch of Remote Procedure Calls, each with a Stored Procedure Statement Completed event showing the same information. And look – we see queries against master.sys.databases, asking for the name of each of the databases, passing in the name as a parameter. Brilliant! Notice just a bit earlier though, there's a SQL:BatchCompleted call. This means that a query has been passed in directly. It's this:
SELECT CAST( serverproperty(N'Servername') AS sysname) AS [Server_Name], dtb.name AS [Name] FROM master.sys.databases AS dtb WHERE (CAST(case when dtb.name in ('master','model','msdb','tempdb') then 1 else dtb.is_distributor end AS bit)=0) ORDER BY [Name] ASC
So it grabs the list of database names first, and then makes extra calls to be able to fetch the list of names again, one by one.
The reason why it's grabbing the list of names one by one isn't because it's stupid and is asking to be ridiculed. It's because we've asked to see that property, and I guess the PowerShell people figured that no matter what property you ask for, it'll go and fetch it to show you.
When I asked for the CompatibilityLevel property instead, I got some different rows thrown in. Interestingly though, it still asked for the name each time.
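That test was just a variation on the earlier command, run from the same instance path:

```powershell
# Asking for a different property changes which queries SMO sends
# behind the scenes (watch the trace while this runs).
dir Databases | ft Name, CompatibilityLevel
```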
Also interestingly, when I asked for the CompatibilityLevel a subsequent time, the calls for "SELECT dtb.compatibility_level…" weren't in there. They'd been cached by the PowerShell environment – important to note if you ever come across PowerShell giving you old values.
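If those stale cached values ever bite you, SMO objects expose a Refresh() method that throws away the cache and queries the server again — for example:

```powershell
# Force SMO to re-read properties from the server rather than
# serving them from its cache.
$db = Get-Item SQLSERVER:\sql\localhost\sql2008r2\Databases\AdventureWorks
$db.Refresh()
$db.CompatibilityLevel
```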
So what about asking something more interesting? Let's try asking about the IndexSpaceUsage in AdventureWorks.
PS SQLSERVER:\sql\localhost\sql2008r2> gi Databases\AdventureWorks | ft IndexSpaceUsage
The result tells me it's 62576. Yeah, but today I'm not interested in that, just what happened in the trace.
Four entries. An SP:StmtCompleted with matching RPC:Completed, and two SQL:BatchCompleted.
The SP:StmtCompleted and RPC:Completed were this statement, passing in the parameter value 'AdventureWorks'. Clearly incredibly informative.
SELECT CAST(0 AS float) AS [IndexSpaceUsage], dtb.name AS [DatabaseName] FROM master.sys.databases AS dtb WHERE (dtb.name=@_msparam_0)
This is getting the value zero. Wow. Brilliant stuff.
The last entry – the second of the two SQL:BatchCompleted events is:
select convert(float,low/1024.) from master.dbo.spt_values where number = 1 and type = 'E'
If you run this in Management Studio, you'll discover it gives the value 8. Ok.
The other entry is more interesting.
SELECT SUM(CASE WHEN a.type <> 1 THEN a.used_pages WHEN p.index_id < 2 THEN a.data_pages ELSE 0 END) AS [DataSpaceUsage], SUM(a.used_pages) AS [IndexSpaceTotal] FROM sys.allocation_units AS a INNER JOIN sys.partitions AS p ON (a.type = 2 AND p.partition_id = a.container_id) OR (a.type IN (1,3) AND p.hobt_id = a.container_id)
This is more like it! We run this in Management Studio, and we see two values. DataSpaceUsage is 13682, IndexSpaceTotal is 21504. Neither are our value 62576. But we do have clues in the column names, and in that value 8 that came back too. We can easily deduce that it's actually (IndexSpaceTotal-DataSpaceUsage)*8, and we have ourselves a nice little method for working out the IndexSpaceUsage ourselves now if we need it.
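Putting the pieces together, a sketch of that method — this simply combines the two queries the trace showed (the 8 being the KB-per-page value from the spt_values lookup), so it should agree with what SMO reports:

```powershell
# Work out IndexSpaceUsage ourselves: (IndexSpaceTotal - DataSpaceUsage) * 8.
$query = @"
SELECT (SUM(a.used_pages)
      - SUM(CASE WHEN a.type <> 1 THEN a.used_pages
                 WHEN p.index_id < 2 THEN a.data_pages
                 ELSE 0 END)) * 8 AS IndexSpaceUsageKB
FROM sys.allocation_units AS a
INNER JOIN sys.partitions AS p
    ON (a.type = 2 AND p.partition_id = a.container_id)
    OR (a.type IN (1,3) AND p.hobt_id = a.container_id)
"@
Invoke-Sqlcmd -ServerInstance 'localhost\sql2008r2' -Database AdventureWorks -Query $query
```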
Or we can just ask PowerShell next time as well.
Incidentally – if you're considering doing the MCM Lab exam some time, then you might find that a familiarity with PowerShell comes in really handy. I'm not saying there are PowerShell questions on the exam at all – I'm just suggesting that you may find that PowerShell becomes a really useful way of getting at some of the information that you're looking for. If you're stumbling around the list of DMVs trying to remember which one it is that stores some particular thing, remember that you might be able to get the data out more easily if you use PowerShell instead.
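For instance, rather than trying to recall which DMV holds table size information, you can just browse for it (a sketch, using AdventureWorks as the example database):

```powershell
# SMO surfaces a lot of DMV-backed information as plain properties.
cd SQLSERVER:\sql\localhost\sql2008r2\Databases\AdventureWorks
dir Tables | ft Schema, Name, RowCount, DataSpaceUsed, IndexSpaceUsed
```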
So can we discover secret things about SQL from PowerShell? Are there things we can do in PowerShell that are impossible through other mechanisms? Hooks that let us break the rules even?
Recently, Kendal van Dyke asked a question about this kind of thing on Twitter. He was wondering if you could have a default constraint on a column in a view. The reason for his wondering was that he saw a property on a view column in PowerShell that made him wonder. The answer is no though, and there's a simple reason.
PowerShell is a programmatic interface. It involves classes and properties and methods. It does things row by row, which is why much of what you see in that trace feels amazingly pedantic – asking about things which shouldn't have to be that complicated. The implication of this though, is that PowerShell reuses the concept of a column, regardless of whether it's a column in a table, a view, or anywhere else it decides to need a column. The fact that columns in tables have some extra properties isn't reason enough to abandon that class re-use. If we try to set a Default constraint for a column in a view though, we get an error, just like if we tried to do it any other way.
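You can try it for yourself — a sketch, where the view, column, and default object names are just AdventureWorks-flavoured examples:

```powershell
# The Column class is shared between tables and views, so the Default
# property is there - but the server rejects it for a view column.
$db  = Get-Item SQLSERVER:\sql\localhost\sql2008r2\Databases\AdventureWorks
$col = $db.Views['vEmployee', 'HumanResources'].Columns['JobTitle']
$col.Default = 'dbo.MyDefault'   # assumes a default object made with CREATE DEFAULT
$col.Alter()                     # fails - the generated sp_bindefault call errors
```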
The code that ended up getting called was to the stored procedure sp_bindefault (despite it being deprecated). Naturally, trying to execute this against a view column gives an error regardless of what wrappers you have put around it – PowerShell or not.
So PowerShell is very useful, and it provides a handy way of getting to a lot of things that could be otherwise hard. But looking below the surface, it isn't able to circumvent the basic restrictions of SQL Server, because it still ends up doing its work using T-SQL.