
CXCONSUMER Is Harmless? Not So Fast, Tiger.


In Theory

Let’s say you’ve got a query, and the point of that query is to take your largest customer/user/whatever and compare their activity to smaller whatevers.

You may also stumble upon this issue if your query happens to capture populations like that by accident.

For our purposes, let’s look at a query that compares Jon Skeet to 1000 users with the lowest activity. To make this easier, I’m not going to go through with the comparison, I’m just going to set up the aggregations we’d need to do that.

I’m going to do this on the smaller 10GB version of the Stack Overflow database, and running the query on my beefy desktop, in case you’re wondering. I need a few indexes first:

CREATE INDEX ix_users ON dbo.Users(Id, Reputation, CreationDate DESC) INCLUDE (DisplayName);
CREATE INDEX ix_comments ON dbo.Comments(UserId) INCLUDE (Score);
CREATE INDEX ix_posts ON dbo.Posts(OwnerUserId) INCLUDE (Score);
CREATE INDEX ix_badges ON dbo.Badges (UserId);

MAXDOP is set to four, and CTFP is set to 50. Now for the query:

WITH hi_lo AS
(
	SELECT TOP 1 u.Id, u.DisplayName, 1 AS sort_order
	FROM dbo.Users AS u
	ORDER BY u.Reputation DESC

	UNION ALL

	SELECT TOP 1000 u.Id, u.DisplayName, 2 AS sort_order
	FROM dbo.Users AS u
	WHERE EXISTS (SELECT 1/0 FROM dbo.Posts AS p WHERE u.Id = p.OwnerUserId AND p.Score > 0)
	AND u.Reputation > 1
	ORDER BY u.Reputation ASC, u.CreationDate DESC
)
SELECT u.DisplayName, 
       SUM(CONVERT(BIGINT, p.Score)) AS PostScore, 
	   SUM(CONVERT(BIGINT, c.Score)) AS CommentScore, 
	   COUNT_BIG(b.Id) AS BadgeCount
FROM hi_lo AS u
JOIN dbo.Posts AS p
ON p.OwnerUserId = u.Id
JOIN dbo.Comments AS c
ON c.UserId = u.Id
JOIN dbo.Badges AS b
ON b.UserId = u.Id
GROUP BY u.DisplayName, sort_order
ORDER BY u.sort_order

In Parallel

This query hits a dead end really quickly. If I sample wait stats with sp_BlitzFirst, they tell an interesting story:

EXEC dbo.sp_BlitzFirst @Seconds = 30, @ExpertMode = 1;

At the query level (yes, that’s 26 minutes in), we’ve been waiting on CXCONSUMER nearly the whole time.

You have died of consumption

The wait stats generated for a 30 second window are no less appealing. Nine total waits, each lasting 136 SECONDS on average.

Just Ignore Me

In this sample there are absolutely no waits whatsoever on CXPACKET.

They are nonexistent.

If you were hoping to find out that they were way crazy out of control (call a priest, ditch your bar tab, we’re driving on the train tracks), you’ll be awfully disappointed.

There just aren’t any.

There’s only one core in use for nearly the entire duration, aside from some blips.

Jammin On The Ten

Here’s the plan for it, so far. This is only an estimated plan.

In Serial

Adding a MAXDOP 1 hint to that query reduces query time to about 55 seconds.
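
The hint just gets tacked onto the end of the statement. Here’s a minimal sketch of the shape on a throwaway query (not the query from this post):

SELECT TOP (1000) u.Id, u.DisplayName
FROM dbo.Users AS u
ORDER BY u.Reputation DESC
OPTION (MAXDOP 1); /* limit this statement to a single scheduler */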

It’s also possible to get faster queries by supplying join hints, though not faster than limiting DOP to 1.

Here’s the plan for it.

The wait stats for it are pretty boring. Some SOS_SCHEDULER_YIELD, some MEMORY_ALLOCATION_EXT.

Stuff you’d expect, for amounts of time you’d expect (lots of very short waits).

In Closing

This isn’t a call to set MAXDOP to 1, or tell you that parallelism is bad.

Most of the time, I feel the opposite way. I think it’s a wonderful thing.

However, not every plan benefits from parallelism. Parallel plans can suffer from skewed row distribution, exchange spills, and certain spooling operators.

Today, it’s hard to track stuff like this down without capturing the actual plan or specifically monitoring for it. This information isn’t available in cached plans.

This also isn’t something that’s getting corrected automatically by any of SQL Server’s cool new robots. It requires a person to do all the legwork.

One other way is to use sp_BlitzFirst/sp_BlitzWho to look at wait stats. If you see queries running that are spending long periods of time waiting on CXCONSUMER, you just might have a thread skew issue.
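
If you want to sample that yourself, this is roughly what I run (the parameter values are just the ones I tend to reach for, not requirements):

EXEC dbo.sp_BlitzWho @ExpertMode = 1;                    /* what's running right now, with waits */
EXEC dbo.sp_BlitzFirst @Seconds = 30, @ExpertMode = 1;   /* 30 second sample of wait stats */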

If you blindly follow random internet advice to ignore this wait, you might be missing a pretty big performance problem.

In Updating

This query is now about 17 hours into running. Through the magic of live query plans, I can see that it’s stuck in one particular nut:

Look at you with your problems

I got paranoid about missing cumulative wait stats, so I started logging sp_WhoIsActive to a table.
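
Here’s roughly how that logging gets set up, assuming you schedule the second call in an Agent job (the table name is made up):

DECLARE @s NVARCHAR(MAX);

/* Ask sp_WhoIsActive for a CREATE TABLE statement that matches its output */
EXEC dbo.sp_WhoIsActive @get_plans = 1, @return_schema = 1, @schema = @s OUTPUT;
SET @s = REPLACE(@s, '<table_name>', 'dbo.WhoIsActive_Log');
EXEC (@s);

/* Then log a snapshot on whatever schedule you like */
EXEC dbo.sp_WhoIsActive @get_plans = 1, @destination_table = 'dbo.WhoIsActive_Log';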

Here’s what that looks like, for the most recent rows.

Into The Future

Right now it’s July 20th. This post is scheduled to go live on the 24th.

Will it finish before then?!

In Updating More

This query ran for over 45 hours.

Squinting

The full plan for it is here.

Somewhat laughably, the wait stats for this query show up like this:

<WaitStats>
    <Wait WaitType="PAGEIOLATCH_SH" WaitTimeMs="1" WaitCount="15" />
    <Wait WaitType="RESERVED_MEMORY_ALLOCATION_EXT" WaitTimeMs="1" WaitCount="2919" />
    <Wait WaitType="CXPACKET" WaitTimeMs="304" WaitCount="44" />
    <Wait WaitType="SOS_SCHEDULER_YIELD" WaitTimeMs="33575" WaitCount="41044718" />
</WaitStats>

The CPU time for it looks like this:

<QueryTimeStats CpuTime="164336421" ElapsedTime="164337994" />

And the last few minutes of CXCONSUMER waits look like this:

Consumed

I missed the last 30 seconds or so of the query running, which is why the CXCONSUMER waits here don’t quite line up with the total query CPU time, but they’re very close. Why doesn’t that wait show up in the query plan? I have no idea.

What really gummed things up was the final Nested Loops Join to Posts.

That’s a 13 digit number of rows for a database that doesn’t even have 50 million total rows in it.

Insert comma here

Bottom Line: Don’t go ignoring those CXCONSUMER waits just yet.

Thanks for reading!

Brent says: we were writing new parallelism demos for our PASS Summit pre-con and our Mastering Server Tuning class to show EXECSYNC, and we kept coming across wait stats results that just didn’t line up with what Microsoft has reported about “harmless” waits. This is going to be a really fun set of demos to share in class.


Does your backup strategy achieve RPO and RTO goals of the business?


When deciding on a backup strategy for a database, there are various things we must consider:

  • Does this database need point-in-time recovery?
  • What are the RPO (data loss) and RTO (downtime) goals? No one likes losing data or encountering unplanned downtime, but the business must decide on these goals so that we can set up an environment that can meet those goals. You want no data loss and no downtime? Okay, who is going to write the check for this?
  • How much backup retention is needed short term and long term?
  • How fast are the backups and how quickly can I restore?
  • Should I use a SAN snapshot or similar if my database is a VLDB?

Sometimes we overlook things

Given my backup strategy, can I hit the RTO goal when needing to do a restore?

What if I need to restore lots and lots of transaction log backups?

There’s a script for that

I’ve got various scripts in my toolbox; one such script reads a folder and writes out the RESTORE LOG commands. This is handy when I need to set up an Availability Group, Database Mirroring, or Log Shipping and need to get the secondary server caught up with the transaction log chain. It can also be used for an unplanned restore. The script assumes that the files are in order when sorted alphabetically.
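
The gist of it looks something like this (a sketch, not my production script; the path is a placeholder, and xp_dirtree is undocumented):

IF OBJECT_ID('tempdb..#log_backups') IS NOT NULL
    DROP TABLE #log_backups;
CREATE TABLE #log_backups (file_name NVARCHAR(260), depth INT, is_file INT);

/* Grab the file listing from the backup folder */
INSERT #log_backups (file_name, depth, is_file)
EXEC master.sys.xp_dirtree N'\\BackupShare\MyDatabase\LOG\', 1, 1;

/* Generate one RESTORE LOG per file, in alphabetical order */
SELECT N'RESTORE LOG MyDatabase FROM DISK = N''\\BackupShare\MyDatabase\LOG\' + file_name + N''' WITH NORECOVERY;'
FROM #log_backups
WHERE is_file = 1
  AND file_name LIKE N'%.trn'
ORDER BY file_name;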

I recently came across transaction log backups that were named in such a way that my script didn’t work. This system had Sun, Mon, Tue, Wed, Thu, Fri or Sat in the backup file names, so Monday files were listed before Sunday files. I was fumbling to fix the script, but then I just decided to manually work around it so that we could make progress on the task at hand.

While I was trying to make the script work, I thought to myself, “There’s no way I could hit the RTO goal here if this were an unplanned restore.”

Be prepared for an unplanned restore

Given your environment, you need to be ready to do a restore that can meet the RTO goal.

This must be practiced. If you can’t achieve the business’s RTO goal, your backup strategy needs to be changed.

Brent says: Like Mike Tyson said, everybody has a plan until they get punched in the cluster.

Do I Have A Query Problem Or An Index Problem?


Party Up

When someone says “this query is slow”, and you can rule out contextual stuff like blocking, odd server load, or just an underpowered server, what’s the first thing you look at? There’s a lot of potential culprits, and they could be hiding in lots of different places.

Pick a card, any card.

After several minutes of thinking about it, I decided to call my method QTIP, because I like to look at the:

  • Query Plan
  • Text of the query
  • Indexes used
  • Parameters used

Query Plan

I really like to understand the scope of what I’m dealing with. Queries can hide a lot of complexity from you, and if someone is working on a stored procedure with many statements in it, knowing where to start can be daunting.

Even though costs are estimates, they can often be a pretty good starting place to direct your tuning efforts.

There may also be something very obvious hiding in the query plan that looking at the query wouldn’t give you any insight into, like a spill, spool, expensive optional operators, or operators with many executions.

Text of the query

If I see something in the query plan I don’t like, or if something related to the query text gets flagged in sp_BlitzCache then my next step is to look at the text of the query.

Yeah nah

As a simple example, non-SARGable predicates were flagged here, because we have a column wrapped in a function.

Granted, you could spot this by hovering over tool tips in the query plan, but in a complicated enough plan, it would be easy to miss.
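
If you haven’t seen the pattern before, it’s the difference between these two hypothetical queries against the Stack Overflow Users table:

/* Non-SARGable: the function has to run against every row, so no index seek */
SELECT u.Id
FROM dbo.Users AS u
WHERE YEAR(u.CreationDate) = 2013;

/* SARGable rewrite: the column is left alone, so a seek is possible */
SELECT u.Id
FROM dbo.Users AS u
WHERE u.CreationDate >= '20130101'
  AND u.CreationDate <  '20140101';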

Indexes used by the query

Looking at the indexes generally makes sense to do after poking around the plan and the text.

You may spot something suspicious in the plan, like a heap, or no nonclustered indexes being used — this goes out the window if you’re using a data warehouse, of course — or you may just want to see what indexes are available.

We see a lot of apps that were produced with an initial set of indexes that helped the initial set of queries, but many releases and code changes later, they’re far less useful than they used to be.

Index tuning is a lot like losing weight: The longer you wait to do it, the harder it is. You’ll have to sift through a lot of deduplicating, ditching unused indexes, fixing heaps, and evaluating missing index requests before you start to see those spray tanned abs.
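
A quick way to take stock of what’s already there is sp_BlitzIndex (the database name is just an example; @Mode = 2 returns the itemized index inventory):

EXEC dbo.sp_BlitzIndex @DatabaseName = N'StackOverflow', @Mode = 2;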

Parameters used by the query

Anyone who has dealt with parameter sniffing issues knows how much different values can change query plans and execution times.

You have to dig up the parameters the plan was cached with, the parameters it was executed with when it was slow, and the client connection settings.

I tried to make this a little easier with sp_BlitzCache, by adding the cached execution plan parameters column:

I still hate XML

If you click on it, you’ll get something like this:

One Stop Shop

This doesn’t help you with what parameters were used when the plan was executed most recently — nothing will, aside from a dedicated monitoring tool — but it’s a pretty good start.

Party On

I know this post is going to frustrate some people — it’s a big outline without much detail. It would be really difficult to go into necessary detail on each of these here, but it’s a good starting place.

And besides, maybe I’m planning on turning it into a presentation.

Thanks for reading!

Brent says: I love the QTIP approach because it reminds me not to get bogged-down in one thing. It’s so easy to just focus on the query plan without looking at the indexes, or to focus too much on getting exactly the right parameters. You’ve gotta use all the parts of the QTIP.

[Video] “Actual” Execution Plans Lie for Branching Queries


In which I show why you can’t trust “actual” execution plans for branching stored procedures, and you’re better off using sp_BlitzCache to figure out what branches you need to tune:

Demo code:

CREATE TABLE dbo.TableA (ID INT);
CREATE TABLE dbo.TableB (ID INT);
CREATE TABLE dbo.TableC (ID INT);
CREATE TABLE dbo.TableD (ID INT);
CREATE TABLE dbo.TableE (ID INT);
CREATE TABLE dbo.TableF (ID INT);
GO

CREATE OR ALTER PROCEDURE dbo.ManyBranches AS
BEGIN
    SELECT TOP 1 'This shows up' AS aFact;

    IF DATEPART(SECOND, GETDATE()) < 10
        SELECT * FROM dbo.TableA;
    ELSE IF DATEPART(SECOND, GETDATE()) < 20
        SELECT * FROM dbo.TableB;
    ELSE IF DATEPART(SECOND, GETDATE()) < 30
        SELECT * FROM dbo.TableC;
    ELSE IF DATEPART(SECOND, GETDATE()) < 40
        SELECT * FROM dbo.TableD;
    ELSE IF DATEPART(SECOND, GETDATE()) < 50
        SELECT * FROM dbo.TableE;
    ELSE IF DATEPART(SECOND, GETDATE()) < 60
        SELECT * FROM dbo.TableF;
    END

    SELECT 'This does not show up' AS anOpinion;
GO

/* Turn on actual execution plans: */
EXEC dbo.ManyBranches;
GO


/* Turn off actual plans: */
DBCC FREEPROCCACHE;
GO
EXEC dbo.ManyBranches;
GO
sp_BlitzCache

This is one of the reasons you won’t catch me using the “Edit Query Text” in query plans during this week’s Mastering Query Tuning class.

[Video] Office Hours 2018/7/25 (With Transcriptions)


This week, Erik and Richie discuss whether it’s relevant to specify data in logs in SQL cloud environment, licensing, using canary tables on Availability Groups, how Entity Framework limits tuning, reusing older databases, and more.

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

If you prefer to listen to the audio:

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours – 7-25-18

 

In the cloud, should I separate the log files?

Erik Darling: Brad asks, “Is it relevant to specify separate drives for the data and logs in a SQL cloud environment since I/O is not the limiting factor?” So the cloud is tough because you don’t know where your disks are, who they’re hanging out with, what they’re doing. So separate drives, I’m not sure that’s going to buy you much, unless you specify – so different cloud vendors do it differently, so different sized files, different sized drives can sometimes get you better speeds. So basically, the more you pay for your drives, the better the throughput is on them.

So if you have a higher tier of drives or if you have write specific drives that you want to put tempdb or log files on, that would make sense to me. But just separating them for the sake of separating them, I don’t think that’s going to buy you much, especially because in the cloud, it’s so easy to up instance sizes, it might make a whole lot more sense to solve your storage problems by adding memory rather than add more storage or separate things out. How about you, Richie; any thoughts on cloud drives I/O things?

Richie Rump: No. I mean, it’s such a different world up in the cloud. I didn’t think separating will really buy you much, but then again, I don’t deal with a lot of the SQL-ness in the cloud. I’ve been doing a lot of managed providers where you don’t even have a choice of where the logs go; they just go.

Erik Darling: Yeah, you create a database and they go where they damn well please.

 

How does Development Edition licensing work?

Erik Darling: Julie asks, “Can you explain the licensing for 2014-16 SQL Server dev, especially the not for use with production data? Does that mean you can’t copy production data to the dev box for testing and development?” No, what that means is, as with all things licensing, please check with your licensing representative, whoever you buy stuff from, to make sure on this. But generally, what it means is that you cannot be serving up a production workload from that server.

So, like, you can’t have customers going in and doing stuff, but if you have your developers going in and testing new code and new indexes on there, then that’s considered development. You can use production data for that, but should you?

Richie Rump: No. You absolutely should not. There’s a lot of reasons why you shouldn’t use production data. One off, that happened to me many, many years ago, where you had production emails, or actually emails of users in there and now you start testing email functionality and emails start going out to customers from dev boxes and things like that. But, you know, the question is, do you want actual production data in the hands of your developers? Wouldn’t it be better if you actually had a set of data that you hand-crafted to go off and test certain things that doesn’t usually occur in your production data or may not be in your current production set?

There’s something to be said about having a known set of data outside of production so that you can do really some valid unit tests or valid integration, system integration, tests with. So I don’t usually recommend copying production over into a test or dev environment. Maybe if you’re doing some system testing or some other type of speed test or whatever, but even then, it gets kind of wonky because you’re probably not on the same hardware as you are on production anyway. So I don’t recommend just copying over production and saying hey, dev, go for it, you know. You need to think about it a little bit more and have dev create some data that will go through some tests. So when you are running it, hey, what about this situation or that situation or what if the data gets out of whack here? How’s the system going to go about that, especially if there’s some sort of regression in the software?

Erik Darling: There’s a lot of tools and products out there that can help you, sort of, scrub data or, like, mung stuff up so that it’s using – like you take a production dataset, you change, you flip things around, you kind of make it, you know, illegible to normal people. But that kind of comes back to another thing where it’s like, you know, you have to be very comfortable if you’re going to give developers access to that data. With the data you’re giving them access to – because for everything that people worry about with developers walking out the door with intellectual property, whatever else, walking out the door with customer data is arguably even worse because you take that to a competitor – you know, it’s a much bigger edge to have a customer list rather than, like, some stored procedure that any dingdong could write. So be careful with that too. But further to that, if you’re already giving your developers access to production to do whatever developer nonsense they have to do, you’re running into a whole different set of problems. So aside from the fact that you shouldn’t be using production data in dev, you also shouldn’t be giving your developers access to crazy production data in production. That’s how I would sum that up. I got off on a little bit of a tangent there.

Richie Rump: Well you know, that’s how I roll.

Erik Darling: So, you’re wearing a Cubs shirt. Are you looking forward to more baseball post-season fun this year?

Richie Rump: Oh, no we’re totally going to tank. We’re awful…

Erik Darling: Totally going to tank? Okay, just making sure.

Richie Rump: We lost, like the last two or last three or whatever and we’re terrible. So if we win the next two or something, we’re going to the world series, but as of right now, after last night’s game, [crosstalk].

Erik Darling: Yeah, Mets suck this year too. Just like trading people – no, go ahead, go have fun…

Richie Rump: The Mets were amazing because they were going to the World Series after 11 games, you know. They won 11 straight or something like that, then all of a sudden, everyone went to the hospital and just never came out…

Erik Darling: Aw, like grandparents.

 

What happens when a cluster poops the bed?

Erik Darling: Daryl says – so Daryl actually has a bunch of questions about Brent’s senior DBA class videos. I apologize that Brent is not here to answer them, so I would say, Daryl, if you have questions that you want Brent to answer about his videos, I would leave them either as comments on the videos or drop him an email with everything in there. I haven’t watched the senior DBA class videos in a while, so I don’t know that I would be able to answer them terribly well. The first one, though, is sort of answerable. Brent talks about the current vote and how it floats between both nodes. Daryl has one for each of current nodes. So the big question for me is, do you have an even number of votes or an odd number of votes? Do you have dynamic quorum or dynamic witness? Like, for me, there’s a lot more than just is each node having a vote a good idea? Like, you want to make sure that you have an odd number of votes so that, you know, the cluster will stay up if one thing kind of poops the bed in a not fun way. So I would need a little bit more information about that to give you further advisements.

Erik Darling: Tammy says, “Poops the bed in a non-fun way, as opposed to pooping the bed in a fun way.” Yes, Tammy, there are different ways to poop the bed that are varying degrees of fun. It’s a wild world out there. Let’s see, long questions, short questions, all sorts of questions…

 

I have these circular logic permissions problems…

Erik Darling: Ooh, Brian has a question that I think would be good for dba.stackexchange.com, “I have a circular logic issue with database ownership permissions.” Yeah, so that’s a tough problem for me to solve right here, so I would say post that on dba.stackexchange.com with as much information, as much, hopefully, obfuscated code as you’re willing to share about the issue and hopefully you will get an answer from someone who has been through that before. Any other thoughts on that?

Richie Rump: My mind’s blank on that stuff.

Erik Darling: We specifically avoid security anything in our client work because it is such a hassle and liability. And sometimes I feel even dirty when Blitz is like, these people are all sysadmins. I’m like, I don’t want to see their names, what they’re doing…

 

Should I have canary tables in an AG?

Erik Darling: Let’s see, Daryl asks, “Should I have canary tables on my AGs?” Only if you care about knowing how up to date they are when they failover, so yes. Most people have AGs thinking that they’re not going to have any data loss, or very little data loss. So I would say generally, canary tables are a good idea.

 

Does Entity Framework limit my tuning capabilities?

Erik Darling: Pablo says, “My dev team always blames ORM for bad performance. How does Entity Framework limit tuning?” That sounds like a Richie question.

Richie Rump: Yeah, the real problem is that it’s kind of obfuscated. So when you write a query for Entity Framework, you have to use their proprietor, or whatever, ORM and you’re using that language which is then translated into SQL which is then translated by SQL to do the plan and do all the other stuff. So yeah, it’s more of an art than I could tell you, hey, if you’re looking for this then you do that and it’s all one, two, three. You have to take a look at what’s going on in your plan, if you have any – sp_BlitzCache is probably the one that I would use first and see if you’ve got some really bad stuff going on there. And then kind of have to trace it back to the actual Entity Framework piece of code and then rework those queries in entity framework so that those are a little bit better to the SQL Server. There’s a lot of stuff that’s going on out there. I think I’ve got a few posts on brentozar.com about some Entity Framework stuff. There may be another one coming out. I think Brent may have released the hounds on another one soon.

Erik Darling: It was sitting around as a draft for like a year and a half. Like, was Richie done with that?

Richie Rump: Turns out I was and I just never looked. I mean, that wasn’t published, who knew? I’ve been busy on this constant thing…

Erik Darling: Yeah, constantly busy working on this constant thing.

Richie Rump: So yeah, it’s not because of Entity Framework, but a lot of times, it’s the way the developers crafted the code in their LINQ syntax and the way that gets translated by the Entity Framework itself. There are ways you could go and write your LINQ code so the Entity Framework can write a better query, but it’s a trial and error thing in a lot of ways. So I would say, find your bad queries using sp_BlitzCache, trace them back to the LINQ query, change your LINQ query and then kind of do that back and forth so that you can actually start tuning those queries just a little better. Or, you could just drop it and say, I’m going to write a SQL statement for it or a stored procedure, and just do it that way.

Erik Darling: All sage advice. That’s advice I’d follow too…

Richie Rump: If you ever touch Entity Framework…

 

How can I build my dev databases faster?

Erik Darling: On that note, I’m going to mess this name up because I have never seen a name that looks like this – Terje – I’m sorry man, or woman, I don’t know how that one goes. If you want to phonetically give it to me in chat, I’ll say your name right. “Our development team complains about slow build times.” So basically, they have this server where they spin up a whole bunch of new databases to do new builds of the app, it sounds like. Sometimes the server has a whole bunch of old databases on it that need to get cleaned up, it bloats out a bunch of stuff, it’s not fun. See, in my head, I’m thinking why are you reusing a server for this? Why wouldn’t you just spin up a VM and have your new stuff all roll out to that VM? But maybe I’m missing something. Richie, what do you think?

Richie Rump: I’m not even sure exactly what the question is over all that.

Erik Darling: The question is, “Wouldn’t it be better to reuse some older databases and just run schema changes instead of building brand new databases each time?”

Richie Rump: Oh, boy, okay. In a build scenario, you typically would want to build it from scratch, okay, and you have all the data and you want to reload it, so just so you make sure that everything is kind of working and there’s no – you’re not worrying about deltas. So when I run the test, the test runs and it works every time, so you want to start from scratch. So yeah, if you’ve got a slow database machine, it could be slow. I’d look for other ways to make that a little bit faster. I feel your pain. Every time that I run a build or check in something into Git for ConstantCare, it automatically builds a new database for Postgres and then it loads data and it runs, you know, 500 different, maybe even more at this point, different SQL tests and it checks a whole bunch of stuff. So I feel your pain.

Ours don’t take as long as you. I think ours take about ten minutes, but it does take a little bit of time. And the more you add, the longer it’s going to take. So if you could start looking for different servers or different ways, maybe creating a VM or having a VM ready to go, I don’t know. There are different ways, but having build servers and stuff like that, that’s a whole different ballgame than what we’re normally used to here at Brent Ozar. You’re talking about building it each and every time. Now, are you creating the server every time is really the big question, because if you’re doing that, install, yeah, that’s going to take forever. But if the server’s up and running and you’re just creating databases, yeah – I’m not sure if I answered the question or not, but…

Erik Darling: Even if you can just spin up a VM that’s already imaged to have SQL Server on it configured a certain way and then just build stuff inside that VM, I think you’d be a lot better off than trying to just abuse this one poor machine over and over again.

Richie Rump: Which is – actually, we use a system called AppVeyor. It’s in the cloud. That’s exactly what it does. It has a standard image. We add some node packages. I think we actually uninstall a node version, we install a different node version and then Postgres is already there. So we’re not installing Postgres ourselves. It’s there and we’re just saying, create this database and create these tables, create these schemas, create all this other stuff and then run some tests.

Erik Darling: Alright, that is all the questions that we have for this week. Thanks, everyone for coming, hanging out, showing up. We will see you next week. Maybe Brent will even show up from a U-Haul, we don’t know yet.

Richie Rump: More from my parent’s house next week.

Erik Darling: Hopefully I’ll still be home, so I don’t know, get the whole thing. Alright, take care, y’all.

Wanna attend the next Office Hours podcast taping live?

Common Entity Framework Problems: N + 1


I wanna dance with common problems

One of the most common issues that I’ve seen with Entity Framework isn’t technically an Entity Framework problem at all. The N + 1 problem is an anti-pattern that is a problem with ORMs in general, which most often occurs with lazy loading. There was a lot going on in that paragraph, so let’s break it down.

The N + 1 problem occurs when an application gets data from the database, and then loops through the result of that data. That means we call to the database again and again and again. In total, the application will call the database once for every row returned by the first query (N) plus the original query ( + 1).

All of those calls could have been accomplished in one simple call. Let’s look at an example in code:

using (var context = new StackOverflowContext())
{
    var posts = context.Posts
        .Where(t => t.PostTags.Any(pt => pt.Tag == "sqlbulkcopy"))
        .Select(p => p);

    foreach (var post in posts)
    {
        foreach (var linkPost in post.LinkedPosts)
        {
            // Do something important.
        }
    }
}

Here’s the SQL generated from this code:

SELECT 
    [Extent1].[Id] AS [Id], 
    /* All columns from the Post table are in the SELECT. Extra columns removed for brevity */
    [Extent1].[TagsVarchar] AS [TagsVarchar]
    FROM [dbo].[Posts] AS [Extent1]
    WHERE  EXISTS (SELECT 
        1 AS [C1]
        FROM [dbo].[PostTags] AS [Extent2]
        WHERE ([Extent1].[Id] = [Extent2].[PostId]) AND (N'sqlbulkcopy' = [Extent2].[Tag])
    )

In this example, we’re getting data from the Posts table, and the PostTags table where the Tag equals “sqlbulkcopy”. The problem starts to occur in this line:

foreach (var linkPost in post.LinkedPosts)

Do you see it?

The problem is that in our original query we’re not getting data from the LinkedPosts entity, just data from Posts and PostTags. Entity Framework knows that it doesn’t have the data for the LinkedPosts entity, so it very kindly gets the data from the database for each row in the query results.

Whoops!

Obviously, making multiple calls to the database instead of one call for the same data is slower. This is a perfect example of RBAR (row by agonizing row) processing.

This is the SQL generated from our code:

exec sp_executesql N'SELECT 
[Extent1].[Id] AS [Id], 
[Extent1].[CreationDate] AS [CreationDate], 
[Extent1].[PostId] AS [PostId], 
[Extent1].[RelatedPostId] AS [RelatedPostId], 
[Extent1].[LinkTypeId] AS [LinkTypeId]
FROM [dbo].[PostLinks] AS [Extent1]
WHERE [Extent1].[PostId] = @EntityKeyValue1',N'@EntityKeyValue1 int',@EntityKeyValue1=23868934

This query is sent to SQL Server 449 times, and the only thing that’s changing is the EntityKeyValue value.

Ugh.

How can we fix it?

There is one fast way. It’s not optimal, but it will be better! Use an Include (also called eager loading) in the LINQ statement. Using an Include will add ALL of the data from the LinkedPosts entity, but it’s a simple fix without much retesting. Who likes testing code? No one. That’s why companies pay through the nose for software that does it automatically.

var posts = context.Posts
    .Where(t => t.PostTags.Any(pt => pt.Tag == "sqlbulkcopy"))
    .Include(p => p.LinkedPosts)
    .Select(p => p);

Now when the LinkedPosts entity is called, the Posts entity will have all of the data for the LinkedPosts entity. It will not make any additional calls to the database. That’s a good thing, right? Databases are cranky. That’s why DBAs are cranky.

Here’s the SQL that’s generated:

SELECT 
    [Project2].[Id] AS [Id], 
    /* All columns from the Post table are in the SELECT. Extra columns removed for brevity */
    [Project2].[LinkTypeId] AS [LinkTypeId]
    FROM ( SELECT 
        [Extent1].[Id] AS [Id], 
        /* All columns from the Post table are in the SELECT. Extra columns removed for brevity */
        [Extent1].[TagsVarchar] AS [TagsVarchar], 
        [Extent2].[Id] AS [Id1], 
        [Extent2].[CreationDate] AS [CreationDate1], 
        [Extent2].[PostId] AS [PostId], 
        [Extent2].[RelatedPostId] AS [RelatedPostId], 
        [Extent2].[LinkTypeId] AS [LinkTypeId], 
        CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1]
        FROM  [dbo].[Posts] AS [Extent1]
        LEFT OUTER JOIN [dbo].[PostLinks] AS [Extent2] ON [Extent1].[Id] = [Extent2].[PostId]
        WHERE  EXISTS (SELECT 
            1 AS [C1]
            FROM [dbo].[PostTags] AS [Extent3]
            WHERE ([Extent1].[Id] = [Extent3].[PostId]) AND (N'sqlbulkcopy' = [Extent3].[Tag])
        )
    )  AS [Project2]
    ORDER BY [Project2].[Id] ASC, [Project2].[C1] ASC

See what I mean by it not being optimal? We could rewrite the LINQ statement to have it generate a more optimal query, but that’s not the point of this post. If the performance of the query isn’t satisfactory, you can go down the rewriting the LINQ statement route.

How can we find N + 1 issues?

Not to toot the company horn (but I’m totally going to), one of my favorite ways to find N + 1 problems from the database is by using sp_BlitzCache. After running sp_BlitzCache @SortOrder=’executions’ I get this:

n-plus-1-find-the-issue

 

Look at those tasty executions!

Captain, I think we found the problem. Now, it doesn’t tell me what line of code is causing the issue, but it does give the SQL statement. I’m sure if you work with the devs, you can figure out where the problem is and fix it. Having the problem statement makes searching the code base a little easier, and there’s a good chance someone will recognize where it comes from.

Back to School Sale: save on online training classes this week.

New Classes: Dashboard in a Day, Database DevOps, tSQLt, SQL Server Internals, and Avoiding NOLOCK


We’ve got a few new goodies, and they’re 50% off for a limited time!

Dashboard in a Day – A hands-on workshop using Power BI to rapidly produce great looking interactive reports and dashboards. Your instructor, Microsoft MVP Steph Locke, has a decade of BI and data science experience. Learn more and register.

Database DevOps Featuring Microsoft SSDT – Managing database changes is hard. Learn how to do it properly with Microsoft SSDT in 2 days of hands-on labs. Taught by MVP Alex Yates who has been doing DevOps with databases since 2010. Learn more and register.

Faster Transactions Without NOLOCK – Your application has grown over time, and performance has started to degrade due to blocking and deadlocking. Taught by MVP and published author Kalen Delaney. Learn more and register.

SQL Server Internals 201 – You’re curious. You love learning about the internals of the tools you use. You’re comfortable writing queries, and you’re ready for the next level. Taught by MVP and published author Kalen Delaney. Learn more and register.

Test-Driven Database Development with tSQLt – Learn how to use tSQLt effectively to improve the quality of your database development work. Taught by MVP Alex Yates who has been doing DevOps with databases since 2010. Learn more and register.

And Brent’s next round of Mastering classes starts September 4th with the next Mastering Index Tuning. When you take the Mastering classes, we highly recommend that you take ’em in order – Mastering Index Tuning, then Mastering Query Tuning, then finally the hardest one, Mastering Server Tuning. If you want to get in on all 3, the one-year Live Class Season Pass is on sale for $2,000 off for the next 10 buyers. That lets you attend all of Brent’s classes for a year straight – enabling you to go again & again.

See you in class!

Back to School Sale: save on online training classes this week.

A Query That Should Be Contradicted


Innocent Enough

I was writing another query, and became enamored with the fact that HAVING will accept IS NULL or IS NOT NULL as a predicate.

What I ended up writing as an example was this query:

SELECT   v.PostId, SUM(v.UserId) AS whatever
FROM     dbo.Votes AS v
WHERE    v.UserId IS NULL
GROUP BY v.PostId
HAVING   SUM(v.UserId) IS NOT NULL;

Why this query?

I figured the optimizer would take one look at it and bail out with a Constant Scan.

After all, the WHERE clause filters to only NULL UserIds, and this column is NULLable in the Votes table.

The HAVING could only ever produce a NULL. And according to the laws of Logical Query Processing, WHERE is processed earlier than HAVING is.

But that’s not what happens.

Query At Work

And how.

Ach.

Smarter People

Just may kick my butt in the comments about why this doesn’t bail out. My soul and butt are prepared.

Thanks for reading!

Back to School Sale: save on online training classes this week.


How I Configure SQL Server Management Studio


Ever go into Tools-Options? SSMS has a stunning number of options these days. Here are some of my favorites:

Documents options

On the Documents options, I uncheck “Check for consistent line endings on load” because I constantly get scripts with all kinds of wacko line endings. That warning is a pain in the butt.

On the Fonts and Colors options, I used to get fancy. There are all kinds of “best programming fonts” articles out there with great-looking fonts. However, when I did screenshots for presentations or clients, people kept asking, “Why does your SSMS look so weird? Is it because you’re on a Mac?” These days, I leave those options at their defaults.

Query Shortcuts screen

On the Query Shortcuts screen, you should set up shortcuts for the scripts you run most often. I don’t – but it’s only because I have a wacko job as a consultant. I’m constantly jumping into an SSMS on someone else’s desktop, and they won’t have the shortcuts set up, so I don’t wanna develop muscle memory for something I won’t have access to. If I was you, though, dear reader, I’d set these up.

Startup options

On startup, SSMS defaults to just opening Object Explorer. I like to open a query window too, though – after all, I’m probably opening SSMS to run queries.

Tabs and windows setup

Under “Tabs and Windows,” check the box for “Show pinned tabs in a separate row.” This way, when you click the pushpin on a given tab, it pops up to the top like this:

Pinned tab

I love that for frequently-used tabs – I might have a dozen query windows open, but I keep coming back to, say, the window with sp_WhoIsActive open. I save that tab with a recognizable query file name, and then when I pin it, it pops up to the top in that separate row.

Speaking of which, those default tabs are hideous – go to Text Editor, Editor Tab and Status Bar:

Editor Tab and Status Bar

Scroll down to “Tab Text” and set everything to False except for “Include file name.” When you click OK, it doesn’t take effect on existing tabs, but after you close & reopen them – ahhh, much more legible. Check out how many more tabs you can fit on a screen:

Tabs, compacted

Next up, going back a little in Text Editor, go to All Languages, Scroll Bars:

Scroll Bars

The default behavior is bar mode, but if you change it to map mode, you get a text map down the right hand side scroll bar. I don’t find that all that useful, so I don’t enable it, but if you’re the kind of person who has long stored procs, you might. The really cool part is when you hover your mouse over the scroll bar map on the right, you get a little zoom popup so you can see a preview of the code at those lines:

Zooming on the scroll bar

I don’t set mine up that way, but I can see why people do, and if you’re reading this post, you’re probably interested in that option. Anyhoo, moving on to All Languages, Tabs:

Losing my religion

SSMS defaults to tabs, and so I switch it to “Insert spaces.” Insert religious flame war here. Moving on….

T-SQL, General

Under Transact-SQL, General, I check the box for “Line numbers.”

Query Execution, Advanced

I would just like to point out that no, I do not set my deadlock priority to high. As far as you know.

Results to Grid

Under Query Results, SQL Server, Results to Grid, I change my XML data size to unlimited so that it brings back giant query plans. (Man, does my job suck sometimes.)

A lot of presenters like to check the box for “Display results in a separate tab” and “Switch to results tab after the query executes” because this gives them more screen real estate for the query and results. I’m just really comfortable with Control-R to hide the results pane.

Designer jeans

Under Designers, I uncheck the box “Prevent saving changes that require table re-creation” because I never just hit save when I make changes in a designer anyway. I always click the generate-scripts button, but strangely, you can’t even generate scripts when a table re-creation would be required. Personally, I’m a big fan of recreation. Also, parks.

Object Explorer drag and drop settings

Under Object Explorer, Commands, I change “Surround object names with brackets when dragged” to False. I usually find that when I’m dragging an object from the OE pane over into a query, that I specifically need it without brackets for some reason.

After I’m done in Tools, Options, I go into View, Toolbars, Customize. Click on the Commands tab, then choose Toolbar, SQL Editor:

SQL Editor Toolbar

These are the buttons that get shown when you’re working with T-SQL. I click on the Debug control, take a deep breath, and while I’m clicking the Delete button on the right hand side, I scream at the top of my lungs, “WHO THE HELL THOUGHT IT WAS A GOOD IDEA TO PUT THIS BUTTON RIGHT NEXT TO EXECUTE?!?!?”

I have less passionate feelings about the rest of the buttons, but I still end up deleting most of them. I don’t really need a button for Query Options or IntelliSense, and I like a clean, minimal UI. After I’m done cleaning out the SQL Editor toolbar, I click the toolbar dropdown, choose the Standard toolbar, and clean that out too. No, I’m never starting an Analysis Services DMX Query. I certainly don’t need buttons for copy or paste. (The only reason I even leave “Execute” as a button is because sometimes I like showing training class attendees that the execution is about to start.)

Minimal toolbars

The end result is a much smaller set of buttons, and they all fit on a single row even when I’m editing queries.

Back to School Sale: save on online training classes this week.

Wait Stats Should Be Easy By Now


Why Is My Query…

We’ve all started a question with a close approximation of those words. No matter how you finish that sentence, there’s some basic information that you need to collect to figure it out, like:

  • Query plan
  • Wait stats
  • Other server activity

Those are a good place to start. It’s easy enough to get a query plan, either by running the query with actual plans on, getting an estimated plan, or retrieving it from the plan cache with sp_BlitzCache.

The last two can be tough, unless you’re observing the problem, or you have a monitoring tool in place.

No, I’m not trying to tell you to buy a monitoring tool.

SQL Server Should Do This

We’ve got Query Store. It tracks an insane amount of metrics about nearly every single query that runs.

Per database.

We’re talking aggregate metrics, the query plan, the text, set options, compile time and memory, and with SQL Server 2017, we get aggregate wait stats in there, too. That’s totally awesome information to have. You can go a long way with that information.
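
For what it’s worth, turning it on and reading back those aggregated wait categories on 2017 is only a couple of statements (the database name is an example, and the query runs from that database’s context):

ALTER DATABASE StackOverflow SET QUERY_STORE = ON;

SELECT TOP (10)
       qsws.wait_category_desc,
       SUM(qsws.total_query_wait_time_ms) AS total_wait_ms
FROM sys.query_store_wait_stats AS qsws
GROUP BY qsws.wait_category_desc
ORDER BY total_wait_ms DESC;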

The trouble is that people aren’t really adopting it quickly. There are a lot of questions about the overhead, about the kind of information it collects and if it will expose user data (which is yet another check box if GDPR is a concern to you), how much space it will take up, and more.

That’s why I’ve opened this Connect Feedback item:

Database Level Option For Storing Wait Stats

Vote Early, Vote Often

I’m hoping that a feature like this could solve some intermediate problems that Query Store doesn’t.

Namely, being lower overhead, not collecting any PII, and not taking up a lot of disk space — after all, we’re not storing any massive stored proc text or query plans, here, just snapshots of wait stats.

This will help even if you’re already logging wait stats on your own. You still don’t have a clear picture of which database the problem is coming from. If you’ve got a server with lots of databases on it, figuring that out can be tough.

Understanding what waits (and perhaps bottlenecks) a single database is experiencing can also help admins figure out what kind of instance size they’d need as part of a migration, too.

Especially going to the cloud, configuring instances can feel a lot like hitting the “Random” button on your character configuration screen. You just keep pressing it until something makes you laugh.

Thanks for reading!

Back to School Sale: save on online training classes this week.

How to Check Performance on a New SQL Server


So you just built a brand new SQL Server, and before you go live, you wanna check to see if it’s going to perform as well as your existing server.

The easy way: test your maintenance jobs

I’m just asking for the chance to test it, that’s all

It’s as easy as 1, 2, 3:

  1. Restore your production backups onto the new server
  2. Do a full backup, and time it
  3. Run DBCC CHECKDB, and time it
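
In rough T-SQL terms, that’s nothing fancier than this (names and paths are placeholders, and you may need MOVE clauses if the drive layout differs):

/* 1. Restore a recent production backup */
RESTORE DATABASE StackOverflow
    FROM DISK = N'\\BackupShare\StackOverflow.bak'
    WITH RECOVERY, STATS = 10;

/* 2. Take a full backup, and time it */
BACKUP DATABASE StackOverflow
    TO DISK = N'\\BackupShare\StackOverflow_test.bak'
    WITH COMPRESSION, STATS = 10;

/* 3. Run CHECKDB, and time it */
DBCC CHECKDB ('StackOverflow') WITH NO_INFOMSGS, ALL_ERRORMSGS;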

Oh sure – this is nowhere near as good as testing your application code, and you should do that too, but this is your very first test. It’s extremely easy, and often it surfaces trouble very quickly. If your backup and CHECKDB runtimes are slower than your current production server, ruh roh – you’re in trouble.

This isn’t perfect because:

  • Your job start times may be different – for example, if your production backup jobs run in the middle of the night at the same time as your ETL jobs, then your production backups could be artificially slower.
  • Your concurrent workload may be different – maybe all your production backup jobs point at the same shared file target at midnight, making it dead slow. When you test your new server at 1PM in the afternoon when nothing’s happening on the shared file target, they may be artificially fast.
  • Your backup target may be different – the production servers might be writing their backups to local storage, which is of course one hell of a bad idea.
  • Your new version of SQL Server may be different – if you’re migrating from 2014 to 2016, and you find that CHECKDB runs faster, it might be the CHECKDB improvements in 2016.

But again – all of that said – if you find that your new production server’s maintenance jobs run slower than your current production servers, time to set off the alarms.

The easy but wrong way: test a synthetic workload

You could download HammerDB, an open source load testing tool, and run the kinda-sorta TPC-C workload against your SQL Server. It’s not an official TPC benchmark, but it’s a repeatable workload that you can run against multiple servers to see whether one is faster than the other.

At that made-up workload.

Which is probably nothing like your real apps.

Using HammerDB to compare two of your production servers is like comparing a Porsche 911 and a Chevy Corvette by measuring how many live babies they can carry in the trunk. It doesn’t really matter who wins – the comparison is meaningless.

The harder way: test individual queries

Use sp_BlitzCache to build a list of your worst-performing queries. Then run those same queries against the new production server to compare:

  • Their logical reads
  • Their duration
  • Their CPU time
  • Their execution plans – to understand why the above numbers are different
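
One low-tech way to do that: use sp_BlitzCache to build the list, then capture reads, duration, and CPU per query with session statistics on each server:

/* Build the list of worst performers on the current box */
EXEC dbo.sp_BlitzCache @SortOrder = 'cpu', @Top = 10;

/* Then, for each query you want to compare: */
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
/* ...run the query under test here... */
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;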

Just running the same queries back to back isn’t all that hard, but the reason I call this method “harder” is that you have to have enough SQL Server performance tuning knowledge to understand why the numbers are different, and what control you have over them. Are they the result of version/edition changes, SQL Server setting differences, or hardware differences? This stuff takes time to analyze and correct.

The really hard way: test your real workload

“Just run the same queries we run in production – how hard can it be?” Well, at scale, very hard – because it involves:

  • The same starting point – after all, you can’t run the same delete statement twice in a row to achieve the same effect, and every insert statement affects subsequent test runs
  • The same data load distribution – it’s easy to run an identical query across 1,000 sessions, but if they’re all trying to lock the same row, then you’re just reproducing blocking issues that you don’t have in real life
  • The same workloads – and since your app and your database is constantly changing, the query traces you gathered 3 months ago are meaningless today, so you have to keep rebuilding this wheel every time you want to tackle a load testing project

Is it possible? Absolutely – but just start with the easy stuff first to get as much valuable data as you can in less time.

Because remember – even when you find differences between the servers, that’s still only the start of your journey. You have to figure out why things aren’t performing as well as you’d expected – and now you’re right back at the prior step, comparing query plans and doing root cause analysis. Before you try to do that across 100 queries, start with your worst-performing query by itself.

Back to School Sale: save on online training classes this week.

[Video] Office Hours 2018/8/1 (With Transcriptions)


This week, Erik and Richie discuss monitoring tools, finding all unused tables across databases, query tuning, deleting vs hanging on to indexes, sharding databases, query editors, aggressively-locked indexes, why a plan would not be in the plan cache, and Richie’s current housing situation.

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

If you prefer to listen to the audio:

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2018-08-01

 

Erik Darling: Let’s see – [Shoab] asks, “In your experience, what is the best third-party tool for monitoring query performance, tracking slow queries, and is also user-friendly?” Well, [Shoab], assuming you’re not talking about something for free – if you want something for free, our First Responder Kit has a Power BI dashboard where you can go in, it will log a bunch of stuff to tables, it will show you wait stats and stuff. Power BI, not necessarily the most user-friendly thing in the world – it’s probably not going to give you all the slices and dices and reports that you want, which is shocking for a tool that’s called Power BI.

If you’re comfortable with spending around $1000-$1500 per monitor instance, SentryOne Performance Advisor is a perfectly good third-party monitoring tool. Quest Spotlight is also another perfectly good monitoring tool. I would check those two out. They’re just about commensurate in, like, you know, dashboards and gidgets and gadgets. It all just comes down to which one you end up being more comfortable with, or whichever – you know, if you are comparing products, you can make the salespeople fight to the death, so whichever salesperson wins is usually the product you go with.

 

Erik Darling: Boy oh boy, there are some long questions in here – I’m trying to read… Rob asks, “I’m trying to upgrade from 2008 R2 to 2017; having problems with queries using a linked server.” Boy, howdy… Post that one on dba.stackexchange.com. There is no easy way to troubleshoot something like that from here. You’re going to want to include all sorts of linked server details and login details and stuff like that. I’m sorry, Rob, that is past the limit of what I can troubleshoot quickly via webcast not looking at your screen.

Richie Rump: I understand people who have to use linked server, but I’ve really never been a big fan of using linked server. I’ve always been more of a fan of, hey I’m just going to shove the data over to another server and work from it from there.

Erik Darling: But you know, normally I would be like, cool just use SSIS. But then I’m thinking, man, upgrading SSIS packages and all the other stuff, maybe that linked server doesn’t sound so bad.

Richie Rump: Yeah, well to an SSIS guy, they’d be all excited for that kind of stuff…

Erik Darling: Yeah, that’s a good source of consulting hours.

Richie Rump: Yeah, a DBA and a dev would be, like, SSIS? No, please no. It works. Don’t touch it.

Erik Darling: When I type in SSIS, nothing comes up in the thing; how do I do it?

Richie Rump: I’m just glad I don’t have to do that anymore.

 

Erik Darling: Let’s see, [Nika] asks, “We have 2008 R2. Is there a simple way to find all unused tables across databases?” Yeah, all those scripts that do that are kind of liars. So we have sp_BlitzIndex, and sp_BlitzIndex has a parameter called GetAllDatabases. And GetAllDatabases will go and look, just like it sounds like, at all of your databases and it will go and look at indexes and diagnose stuff. The trouble is that index metadata isn’t the most reliable thing in the world. So figuring out if a table is used or unused isn’t really as simple as, like, when was the last use or seek or scan, because that metadata could have gotten knocked out for some reason. It might not be the kind of long-term thing – it might not be long-term enough to have detected some use.

One thing that we’ve run across many times is, like, we’ll be looking at BlitzIndex and be, like, man, we have this completely unused index. This server has been up for three weeks. No one’s touched this thing ever. This thing must stink. But then, it’s attached to some quarterly report or some monthly report that only gets touched once in a while, but when we need it, we really need it. So if you think that you have unused tables across your databases, just start changing the names of them and if anyone complains or if queries start failing then change them back. Just kidding…

Richie Rump: Millions of people can’t access your website anymore…

Erik Darling: No problem, just sp_rename.

Richie Rump: Please call Erik…

Erik Darling: No, don’t call me. But there’s really no great way to tell that and there’s really no simple script to be able to tell that. The only way you’re going to be able to figure that out is really profiling the application and figuring out who uses what. So, unfortunately, no, there’s really not a good way to do that.

 

Erik Darling: Let’s see here. You can tell there are some humdingers in the list. “On SQL Server 2014 Standard, I have a query that is not utilizing seeks. I am getting an excessive memory grant warning. What is going on? Why can’t I seek on any of my clustered indexes? This is a simple query; no distinct, just joins left outer.” Eric, I know that you spell your name wrong because it’s with a C, but let me tell you, I have no magic crystal ball to peer into to tell you how you have done something ridiculous with your query that is disallowing seeks.

A lot of the times, if you’re just joining tables together, unless you get a nested loops join – like if you’re using a hash join or a merge join – and you don’t have a predicate on that table, a lot of the time, you will just see an index scan because it makes a whole lot more sense to go and read through the data, scan through it, get all the rows and then figure out which ones you’re keeping or getting rid of at the join operation. So don’t be all hung up on seeks versus scans. Scans are not your enemy. You want to be careful of the bad kinds of scans. And the bad kinds of scans come from awful things like non-SARGable predicates and functions and joins in where clauses; stuff like that. also, index key order will matter; so if you do have some predicates in there, the ability for an optimizer to seek into an index is determined by key order.

So there are a lot of potential issues; a lot of things that could potentially lead to the fact that you are not seeing seeks in your plan. But I don’t immediately look at an index scan and say, oh my god, the world is coming to an end. It’s a bit of a knee-jerk reaction.

Richie Rump: The world turned upside down.

Erik Darling: Things just seemingly spinning out of control…

 

Erik Darling: Sheila asks, “If an index only has a value less than 10 in user updates and zero seeks or scans, is it worth keeping?” Sheila, I don’t think you have enough information about that index to make a decision about if it should live or die. I think that you should hang onto that index for a little while longer and see what happens with it. A lot of people want to jump to get rid of indexes. I just don’t know how long your server’s been up for. I don’t know, like, what your application looks like. There could be a lot of outside factors that would make having that index around useful; like if it’s a unique index maybe, or if it’s a filtered index, or if it offers the optimizer something – like if it’s on a foreign key. There are a lot of reasons why we keep indexes around even if they’re not getting used a whole lot. So don’t just go and get rid of that index willy-nilly.

Richie Rump: Yeah, and you know, the other thing to think about is, if I get rid of the index, what is that going to buy me? Do I have disk issues? Do I have a shortage of memory? And in that case, maybe I should get more disk. Maybe I should get more memory. Having an index is not a bad thing. If it’s a duplicate, then yeah, get rid of one of them. But if it’s a small table or even a medium-sized table and there’s an index there, ask yourself, what am I getting if I actually delete it?

Erik Darling: You know, a lot of the times, people want to get rid of indexes because I have this gigantic index that never gets used and it’s just taking up a bunch of space when I back it up and I restore it, when I run CHECKDB. Or, you know, if a table is just, like, plain over-indexed and there’s just a bajillion indexes on there and it’s time to start consolidating, chopping some of them off – because having way too many indexes can obviously lead to bad locking problems. You have a whole bunch more copies of that data that you have to lock when you want to modify it, but it doesn’t sound like it’s being written to a whole lot.

And from the amount of time that you’ve been observing it, it doesn’t sound like the index has been used. So I would just keep a closer eye on it. I would probably want to trend that usage over, like, a month or three and just kind of see what happens. Also be careful because, you know, rebuilding indexes, adding and dropping indexes on the same table can also reset the stats for how much that index gets used. So don’t just jump to conclusions on that.

 

Erik Darling: Let’s see here, “Database sharding – is anything coming to SQL Server to shard databases with?” I don’t think so. I mean, you could use merge replication, if you’re an awful, awful person, but no.

Richie Rump: Yeah, I don’t – I haven’t heard anything.

Erik Darling: I would ask, why do you want to shard your database? What problem are you having that you think sharding might solve?

Richie Rump: Sharding your SQL Server database…

Erik Darling: Yes, your SQL Server database. Other databases might already be sharded. What’s the word for it? Shardified?

Richie Rump: I’m not even going to pretend because there’s a lot of jokes that could have come from that one. Like S3 – I mean, it is a database, so you put files in there and it’s up in the cloud and whatnot, and you can actually shard it by, I think, the first couple of characters, and that’s how it determines where it decides to store things. So if you have a lot of data in a bucket in S3, what you would need to do is vary those first few characters so that they all go to different places and you get uniformity in your data. So those types of things are built into more cloudy type stuff as opposed to stuff like SQL Server.

Erik Darling: Like document databases – you see that a lot there. [Oracle does – it pays too much money to shard so…]

 

Erik Darling: Steve asks, “I’m looking for a query editor that will only allow select statements, no data or object editing at all. I need to give it to a person that shouldn’t be able to edit data or objects directly, but because of their SQL permissions for a third-party application, SSMS and similar tools will allow them to write and run delete, update…” Wow-wee, that’s a humdinger.

Richie Rump: What about permissions? I mean…

Erik Darling: Well he’s saying that because of their SQL permissions for a third-party application, SSMS and similar tools will allow them to run delete, update, and insert queries, along with drop and create. Boy…

Richie Rump: Yeah, there’s – I mean, query editors, all they do is run SQL, so they don’t filter out what kind of SQL you can run and whatnot. They may do some nice things as far as syntax highlighting and other stuff, but they don’t – when they push stuff to the database, there’s usually not a filter for that kind of stuff. So I don’t know of one that would do anything like that. I would remove the third party permission account from this user or make him a DBA. And if something goes wrong, he gets a phone call in the middle of the night.

Erik Darling: It’s a little unfair to ask you to support someone’s ill-conceived permissions in SQL Server. So what I would say is, fine, this person has the ability to do this. If they do any of these things, I am not fixing it. It is up to them to fix it, because that’s messed up. I would just draw that line. I would make a moat around that.

Richie Rump: I would bet you, that user has a title of CEO or something, or C-level…

Erik Darling: Analyst – always the analyst; always.

Richie Rump: But yeah, they have the user ID and password for the third-party app; I’m not down with that. My developer bones are jiggling. No, that’s not for you; that’s for the app.

Erik Darling: I would really want to ask why we can’t give them a separate read-only login just to do this stuff and then have them, you know, have the app do its other stuff, like its login with the appropriate permissions. Someone would have a really hard time justifying that setup to me without me, like, starting to throw things.

Richie Rump: Yeah, in fact, this week I created a role for Brent and Erik that’s just read-only. I called it neutered admins. All they can do is read, and that’s it. Sorry, you’re neutered. He doesn’t play with other puppies anymore. Sorry, Erik.

Erik Darling: Ah, feels good. Actually, in a way, I’m grateful to have that kind of restraint. I’m happy that I can’t accidentally mess anything up. I’m thrilled. I can’t make that mistake. I can’t update or delete that thing. I can insert some things sometimes to very specific places where it’s not going to break anything, but it’s not on me. Richie, in his wonderful responsible foresight, has taken away my ability to do more damage than I should be able to do and I am grateful to have that shock-collar.

Richie Rump: I mean, when I was consulting fulltime, I absolutely told them, I do not want permissions to prod. You can give me read-only, but I will not take any account that has permissions for prod. That is not my job. My job is to write code, development. Give me full access to development. I will fight for that, but anything else, I will not accept that login. I will not open that email. I don’t want anything to do with it.

Erik Darling: Nope, I’m deleting that thing on sight. I have a special thing that checks email for the word SA, for that letter combination; just no. It destroys it.

 

Erik Darling: Let’s see here – Joe says, “BlitzIndex says aggressive index on PK. Row locks and waits are very high; lots of contention. Many key lookups, no missing indexes – should I tune indexes to eliminate key lookups?” Boy, oh boy. Yes and no. When BlitzIndex warns about aggressively locked indexes, it’s basically saying that queries are waiting a long time to get locks on that index. So other locks are being held on that index for a long period of time. There are a lot of reasons for that. Usually, it’s that your modification queries, anything that wants to, you know, update or delete or insert, don’t have a really good index to find their way to the data they need. This is particularly true of updates.

I see this a lot. Updates and deletes mostly – inserts, not as much, because with an insert, you’re just kind of putting stuff in. I’ll talk about that in a second. But usually, for updates and deletes, there’s a where clause, and that where clause needs to get supported, just like the where clause for any other query. So most of the time, when I see aggressively locked indexes, it’s for one of two reasons. It doesn’t sound like the first reason’s going to be for you because it doesn’t sound like you have a bunch of other non-clustered indexes on there that are also getting locked. It’s just a primary key, which seems to me like this table is under-indexed for other queries.

So rather than focus on key lookups, which are a totally okay thing to tune for if you find them being problems, mostly what I would want to do is start looking at the modification queries that my app or whatever issues and I would want to start looking at the where clauses for those and making sure that they have a good path to the data they need to get so that locks get in and get out faster.

For inserts, updates, and deletes, batch size is very important. So if you have quite large tables and your queries are looking to modify quite large chunks of data at a time, that’s usually when we start to see stuff like – we’ll see lots of lock escalation or attempted lock escalation or queries that run for a long time because they’re waiting to get locks across all the huge chunks of data. So a fellow named Michael J Swart has – I’ll find the link and I’ll paste it in there because I’m sure it’s up in my do-da bar. Anyway, one thing that you could do to attempt to reduce aggressive locking is to not try to lock as much stuff at once. So if you bring your batch size down to like 1000, 5000 rows, somewhere in there, you usually have less aggressive locking going on behind it.
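
Erik’s describing the batching pattern Michael J Swart writes about – roughly this shape (a sketch only; the table, filter, and batch size here are hypothetical):

DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (5000) FROM dbo.BigTable
    WHERE SomeDate < '20170101';
    SET @rows = @@ROWCOUNT;
END;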

 

Erik Darling: Let’s see here. We’ve got one more and we’ll finish strong on this. Teschal says, “My proc cache is almost one year old. If the last execution time of a proc is NULL, is it safe to say the proc has never been called?” No; it’s never safe to say that because someone might recompile something. A plan for that thing could be invalidated and that plan could just disappear from the cache. It is not safe to say that in the least. This is one of those – again, like with the unused tables and indexes thing, this is not something that you can hit F5 on once and make a call on. This is something you have to profile over time, especially for something as ephemeral as the plan cache, which is – wait a minute… Richie, are you thinking what I’m thinking?

Richie Rump: No, I never think what you’re thinking; it gets me in trouble with the wife.

Erik Darling: This server hasn’t been rebooted in a year. This server hasn’t been patched in a year. That sounds suspicious to me. So, Teschal, what I would say is patch your server, because it sounds like this thing hasn’t had any patching love in quite a while. But beyond that, no. Again, there are a lot of reasons why a plan might not be in the plan cache. If you hit F5 once and you want to drop a procedure or a table or an index, you have lost your mind. This is the kind of thing that you have to, you know, learn over time – profile your app, profile the queries that run in, try to make some determination on there. If you have folks who are spinning up things that are out of use or out of favor and don’t get used anymore, then they have to be responsible for the change management procedure to get rid of those things. It is not something that someone should be trying to ascertain from looking at DMV diagnostic data, because it can be terribly misleading and it can be terribly inaccurate sometimes too.

Richie Rump: Like me…

Erik Darling: Richie’s very misleading. He’s always wearing these short shorts to the company outings. I’m like, hey, how’s it going? And he’s like, don’t look at me.

Richie Rump: No, don’t.

Erik Darling: Alright, that’s all the time we have this week, folks. Thank you for joining us. We will see you next week. Brent will hopefully be back from his road trip to San Diego…

Richie Rump: Brent Ozar’s Big Adventure…

Erik Darling: Weekend At Brenty’s.

Richie Rump: He’s on a red bike right now traveling across the country.

Erik Darling: He’s got his peewee suit; it’s nice. He’s having a good time. Alright, take care, y’all, bye.


Two Important Differences Between SQL Server and PostgreSQL


SQL ConstantCare® uses PostgreSQL as a back end – specifically, AWS RDS Aurora – so I’ve spent a lot of time writing Postgres queries lately. Here are some of the things I’ve noticed that are different.

CTEs are optimization fences.

In SQL Server, if you write this query:

With AllPosts AS (SELECT * FROM StackOverflow.dbo.Posts)
SELECT *
  FROM AllPosts
  WHERE Id = 1;

SQL Server builds a query plan for the entire operation at once, and passes the WHERE clause filter into the CTE. The resulting query plan is efficient, doing just a single clustered index seek.

In Postgres, CTEs are processed separately first, and subsequent WHERE clauses aren’t applied until later. That means the above query works just fine – but performs horribly. You’ll get much better results if you include your filters inside each CTE, like this:

With AllPosts AS (SELECT * FROM StackOverflow.dbo.Posts WHERE Id = 1)
SELECT *
  FROM AllPosts;

That’s less than ideal.

You can’t just leap into an IF statement.

In SQL Server, you can just start typing conditional logic and execute it:

IF EXISTS (SELECT * FROM StackOverflow.dbo.Users)
    SELECT 'Yay'
ELSE
    SELECT 'Nay';

That’s useful if you want to do conditional processing, set variables up, populate them for different scenarios, etc.

In Postgres, you have to do a little setup to declare that you’re doing procedural code:

DO $$
BEGIN
IF EXISTS (SELECT * FROM rule) THEN
    SELECT 'Yay';
ELSE
    SELECT 'Nay';
END IF;
END $$;

But that doesn’t work either, because you can’t output data from a DO:

ERROR:  query has no destination for result data
HINT:  If you want to discard the results of a SELECT, use PERFORM instead.
CONTEXT:  PL/pgSQL function inline_code_block line 4 at SQL statement

<sigh> You really want to create a function. Which reminds me: Postgres functions are the equivalent of SQL Server stored procedures. Sure, SQL Server’s user-defined functions have a really bad reputation: most of ’em get bad row estimates, inhibit parallelism, and cause performance tuners to point and giggle. Postgres functions? Totally different. Just basically stored procs.
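
For completeness, here’s roughly what that looks like as a function – a minimal sketch, reusing the same table as the DO example above (the function name is made up):

CREATE OR REPLACE FUNCTION yay_or_nay()
RETURNS text AS $$
BEGIN
IF EXISTS (SELECT * FROM rule) THEN
    RETURN 'Yay';
ELSE
    RETURN 'Nay';
END IF;
END;
$$ LANGUAGE plpgsql;

SELECT yay_or_nay();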

And one less-important difference: GREATEST and LEAST.

Every now and then, I need to find the greater (or lesser) of two things in a row. Let’s say our dbo.Users table has two columns, LastUpvoteDate and LastDownvoteDate, and I’m trying to find the most recent date that they cast ANY kind of vote. Postgres has this really cool trick:

SELECT GREATEST(LastUpvoteDate, LastDownvoteDate) AS VoteDate
FROM dbo.Users;

GREATEST is like MAX, but across columns. GREATEST and LEAST are two conditional expressions that we don’t get in SQL Server. Nifty.
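
If you need the same thing in SQL Server, the closest workaround I know of is faking it with a VALUES construct – a sketch here, not from the original post, assuming the same two columns:

SELECT (SELECT MAX(v)
        FROM (VALUES (u.LastUpvoteDate), (u.LastDownvoteDate)) AS x(v)) AS VoteDate
FROM dbo.Users AS u;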


A Visual Guide to Choosing an Index Type


Warning: I’m about to overly simplify a whole lot of topics to make things easy. Armchair architects, warm up your flamethrowers.

Your table has rows and columns kinda like a spreadsheet:

In most applications, your users care about all of the rows, and all of the columns. However, they put certain columns in the where clause more often than others, so you design indexing strategies around those. You may also get fancy with indexing for group by, order by, windowing functions, etc.

Their queries are vaguely predictable, and they don’t change too often, so you can design indexes every now and then, and you’re good.

That’s how normal tables work.


In some apps, your queries only care about a very specific set of rows.

They constantly – and I mean CONSTANTLY, like 99% of your queries – filter for a very specific set of rows, like under 5% of the table, and these rows are easily identified by specific values in a given column:

This is a great candidate for a filtered index – an index with a where clause.

Filtered indexes make the most sense when they’re highly selective. In the above example, if 99% of our rows had matched the filter we were looking for, then a filtered index isn’t usually going to dramatically improve performance.
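
For illustration, a filtered index is just a nonclustered index with a where clause bolted on – the table and columns here are hypothetical:

CREATE INDEX IX_Orders_Unshipped
    ON dbo.Orders (CustomerId, OrderDate)
    WHERE IsShipped = 0;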


In some apps, data is loaded and deleted in big groups.

The classic example is a big (say, 1TB+) sales table in a data warehouse where every row has a SaleDate:

Partitioning candidate

At first glance, you’d say, “Ah, this data is clearly grouped together! I should partition this data by SaleDate, and it will make my queries faster!”

In some cases, it does – but partitioned tables and partitioned views can often make queries slower rather than faster. If your query doesn’t filter by that partitioning column, SQL Server has to reassemble the rows from the different partitions before moving on to the other parts of your query – and this can involve some painful re-sorting depending on how your joins work.

Where partitioned tables and partitioned views make the most sense is where you need to load an entire partition at a time, or drop an entire partition at a time, in the fastest time possible.
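
That load/drop pattern usually ends up as partition switching, which is a metadata-only operation – a rough sketch here with made-up names, and it assumes you’ve already built a matching partition function, scheme, and staging table:

-- Age out the oldest partition by switching it into an empty staging table, then truncating it
ALTER TABLE dbo.Sales SWITCH PARTITION 1 TO dbo.Sales_Staging;
TRUNCATE TABLE dbo.Sales_Staging;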


In narrow tables, clustering key design is really important.

If your table only has a couple/few columns:

And if you always filter for equalities on just one or two fields, then you might be able to get away with just a clustered index and nothing else.
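
Something like this hypothetical lookup table, for example, where the clustered primary key covers the whole thing:

CREATE TABLE dbo.UserSettings
(
    UserId       INT          NOT NULL,
    SettingName  VARCHAR(50)  NOT NULL,
    SettingValue VARCHAR(100) NULL,
    CONSTRAINT PK_UserSettings
        PRIMARY KEY CLUSTERED (UserId, SettingName)
);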


When your table is really wide, nonclustered index design becomes more important – and harder.

The more columns you decide to pack into a table:

The harder it is to design enough nonclustered indexes to support your queries – without simultaneously slowing down delete/update/insert operations to an unacceptable degree.

That’s where columnstore indexes can come in handy. If you have a table where you can’t possibly predict what people are going to query on, group by, and order by, and especially if they run a lot of running totals, then columnstore indexes can help.
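
If you go down that road, a nonclustered columnstore index on the wide table is one option – again, a sketch with hypothetical names:

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Sales
    ON dbo.Sales (SaleDate, ProductId, Quantity, Amount, Region);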


All of the index types I just covered have huge drawbacks and implementation gotchas. This is just meant as a starting point for your index design journey. Start with regular nonclustered indexes, and then when you hit one of these unusual designs, you can start looking at more niche features.


Why Does My Select Query Have An Assert?


You And Ert

This is a quick post because it came up with a client. I like having stuff to point people to — that’s sort of like automation, right?

Anyway! Lots of plans have Assert operators in them. But they’re usually performing modifications.

Assert operators are used to check Foreign Key and Check Constraint integrity when you modify tables. When you see them in DUI (delete/update/insert) plans, you don’t bat an eye.

But what about in a select query?

Lowered Expectations

You’re probably expecting some trick here, but there isn’t one. Take this query:

SELECT *
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 
(
	SELECT u.AccountId
	FROM dbo.Users AS u
	WHERE u.DisplayName = 'Eggs McLaren'
);

Because it’s a scalar subquery — if the subquery returns more than one AccountId, an error is thrown — the optimizer has to check, somehow, that only a single value comes back.

That somehow is an assert operator.

Passive Assertive

Once the Users table has been scanned and values aggregated via a Stream Aggregate, the Assert operator kicks in to validate that only one value is returned.

Is This Bad?

No, not at all. It’s just an explanation. If performance is a concern, you can try to replace the subquery with EXISTS or CROSS APPLY, but without indexes on the columns being matched on, you’re not likely to see much in the way of gains.
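
For example, a rewrite along these lines is possible – a sketch only, and note it isn’t strictly equivalent, because it won’t raise an error if two users share that DisplayName:

SELECT ps.*
FROM dbo.Users AS u
CROSS APPLY
(
	SELECT p.*
	FROM dbo.Posts AS p
	WHERE p.OwnerUserId = u.AccountId
) AS ps
WHERE u.DisplayName = 'Eggs McLaren';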

Like most other performance problems in SQL Server, queries and indexes tend to work together to solve them.

Thanks for reading!


How to Check for Non-Existence of Rows


You’re writing a query, and you wanna check to see if rows exist in a table.

I’m using the free Stack Overflow database, and I wanna find all of the users who have not left a comment. The tables involved are:

  • In dbo.Users, the Id field uniquely identifies a user.
  • In dbo.Comments, there’s a UserId field that links to who left the comment.

A quick way to write it is:

SELECT u.*
  FROM dbo.Users u
  WHERE NOT EXISTS (SELECT * FROM dbo.Comments c WHERE c.UserId = u.Id);

And this works fine. When you read the query, you might think SQL Server would run that SELECT * FROM dbo.Comments query for every single row of the Users table – but it’s way smarter than that, bucko. It scans the Comments index first because it’s much larger, and then joins that to the Users table. Here’s the plan:

Doing the scan-scan

But another way to write that same query is:

SELECT u.*
  FROM dbo.Users u
  LEFT OUTER JOIN dbo.Comments c ON c.UserId = u.Id
  WHERE c.Id IS NULL;

This can be a little tricky to wrap your head around the first time you see it – I’m joining to the Comments table, but it’s an optional (left outer) join, and I’m only pulling back rows where the Comments primary key (Id) is null. That means, only give me Users rows with no matching Comments rows.

This join plan is completely different: there’s no stream aggregate, and now there’s a filter (c.Id IS NULL) that occurs after the merge join:

Filters are like rose-colored glasses

It’s completely different:

  • The Users table is processed first
  • There’s a different kind of merge join (left outer)
  • There’s a filter after the join

To see which one performs better, let’s use the metrics I explain in Watch Brent Tune Queries: logical reads, CPU time, and duration. In the 10GB StackOverflow2010 database, both queries do identical logical reads and duration, but the join technique uses around 20-30% more CPU time on my VM.
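
If you want to grab those numbers yourself, one easy way – not specific to this post – is:

SET STATISTICS IO, TIME ON;
-- Run the NOT EXISTS version, then the LEFT OUTER JOIN version,
-- and compare logical reads and CPU time in the Messages tab.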

But don’t draw a conclusion from just one query.

I tell my developers, write your queries in whatever way feels the most intuitively readable to you and your coworkers. If you can understand what’s going on easily, then the engine is likely to, as well. Later, if there’s a performance problem, we can go back and try to nitpick our way through different tuning options. The slight pros and cons to the different approaches are less useful when you’re writing new queries from scratch, and more useful when you’re tuning queries to wring every last bit of speed out of ’em.

The First 3 Things I Look At on a SQL Server


1. Are backups and CHECKDB being done? Before I step out on the wire, I want to know if there’s a safety net. If there’s a recoverability risk, I don’t stop here – I keep looking because the rest of the answers affect the safety net we’re going to put in place.

“10TB of data on a Commodore 64, interesting”

2. How’s the hardware sizing compared to data size? How much data are we dealing with, measured in both database quantity and total database file size? Then, how does the server horsepower compare – physical or virtual, how many cores do we have, and how much memory? (In a perfect world I’d know the storage specs too, but that’s usually much harder to get.)

3. What’s the wait time ratio? In any given hour on the clock, how many hours of wait time do we have? If it’s 1 or less, the SQL Server just isn’t working that hard. You can get this from sp_BlitzFirst @SinceStartup = 1. (Again, in a perfect world, I’d have more granular charting like you get from the Power BI Dashboard for DBAs.)
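
That call is simply:

EXEC dbo.sp_BlitzFirst @SinceStartup = 1;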

Armed with those 3 things, I have a pretty good idea of whether the server is well taken care of – or not – and if not, whether it’ll be vaguely fast enough to start doing the right database maintenance. For example, if I see an overtaxed, never-backed-up VM with 4 cores, 16GB RAM, and 200 databases totaling 10TB, we’re probably gonna have to have a come-to-Jesus meeting.

First Responder Kit Release: A Year From Now These Will All Stop Working On 2008 And 2008R2


You think I’m kidding.

Time bomb.

Boom.

Get your upgrade underwear on.

You can download the updated FirstResponderKit.zip here.

sp_Blitz Improvements
#1664 – We’re officially smart enough to not warn people that we’re recompiling our own stored procedures.
#1669 – Reworded the stacked instances details. Servers may be installed and not running.
#1687 – @josh-simar has servers linked with AD accounts, and that’ll make for a rootin’ tootin’ bad time when you’re trying to display information about them.
#1695 – @josh-simar found another bugerino for servers that have a member_principal_id over 32,767. Personally I have no idea what that means.

sp_BlitzCache Improvements
#1666 – It took like 4 years, but BlitzCache finally got blocked on a busy server. We are officially reading uncommitted data. Hold onto your pantaloons.

sp_BlitzFirst Improvements
#1680 – You now have the power to skip checking server info, with @CheckServerInfo = 0. Thanks to @jeffchulg for the idea!
#1689 – @Adedba loves some memory analysis. We’ll now give you the skinny on what your RAM is doing.
#1679 – @jeffchulg coded up the magical ability to change the output type to none, if you don’t want any output.
#1676 – @ChrisTuckerNM ran into some XML funk. There were some string concatenation issues.

sp_BlitzIndex Improvements
#1685 – We split the warnings about Heaps into two sections. One for forwarded fetches, and one for deletes.
#1697 – If you wanna examine a single table, we shouldn’t be concerned with all them darn fangled partitions that might exist on another table.
#1679 – @jeffchulg coded up the magical ability to change the output type to none, if you don’t want any output.

sp_DatabaseRestore Improvements
#1681 – @ShawnCrocker fixed a bug where the backup path for diffs wasn’t getting set to null if we weren’t restoring any diffs.
#1673 – @reharmsen fixed things up so you fine people can use Standby mode as expected.
#1672 – @marcingminski added the ability to restore from striped backups! Very fancy!
#1671 – @lionicsql coded a feature that will let you restore full and differential backups in standby. Hooray.

sp_BlitzQueryStore Improvements
Nothing this time around – WON’T SOMEONE PLEASE USE THE QUERY STORE?

PowerBI
Nothing this time around

sp_BlitzLock
Nothing this time around

sp_BlitzInMemoryOLTP Improvements
Nothing this time around

sp_BlitzWho Improvements
Nothing this time around

sp_BlitzBackups Improvements
Nothing this time around

sp_AllNightLog and sp_AllNightLog_Setup Improvements
Nothing this time around

sp_foreachdb Improvements
Nothing this time around

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. If you need a free invite, hit SQLslack.com. Be patient – it’s staffed with volunteers who have day jobs, heh.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes us!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.

You can download the updated FirstResponderKit.zip here.

[Video] Office Hours 2018/8/8 (With Transcriptions)


This week, Brent, Tara, Erik, and Richie discuss troubleshooting port blocking, page life expectancy issues, problems with turning off CPU schedulers, coordinating two jobs across servers, adding additional log files to an almost-full partition, tips for getting a new SQL Server DBA job, using alias names for SQL Servers, database going into suspect mode during disaster recovery, SQL Constant Care “Too Much Memory” warning, index operational statistics, running newer versions of SQL Server with databases in older version compat mode, and more!

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

If you prefer to listen to the audio:

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS.
Leave us a review in iTunes

Office Hours Webcast – 2018-08-08

 

We can’t connect with Telnet. Now what?

Brent Ozar: First up is a mysterious VRP. VRP says, “Frequently, we have an issue of not being able to connect to port 1433. When we check with Telnet…” Oh, I love – Grandpa VRP, you’re with me in remembering to use Telnet… “Unable to connect to SQL Server, no ports being blocked from antivirus. After restarting the SQL Server services, we’re able to connect using 1433. What should we do to troubleshoot this next?”

Erik Darling: Turn on the remote DAC.

Brent Ozar: Elaborate.

Erik Darling: So usually, when you just suddenly can’t connect and then you restart SQL Server and you suddenly can connect, you’ve hit an issue called THREADPOOL. Tara’s blogged about it. I think everyone’s blogged about it at some point. Don’t feel bad though. You’ve just got to check your wait stats. If you see THREADPOOL creeping up in there, even if it’s like tiny increments, then it’s most likely the problem you’re hitting. It’s usually caused by blocking. It’s usually caused by parallel queries getting blocked because they just take a whole bunch of threads and hang on to them and they get blocked and they hang onto those threads and then, all of a sudden, you’re out of worker threads.

So that’s usually what it is and turning on the remote DAC, enabling that, will allow you to sneak in your little VIP entrance to SQL Server and start figuring out what exactly is causing your THREADPOOL waits. You can run, like, WhoIsActive or BlitzWho or something and off to the races.
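
For reference, enabling the remote DAC is a one-time sp_configure change:

EXEC sp_configure 'remote admin connections', 1;
RECONFIGURE;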

Brent Ozar: That’s good.

Tara Kizer: Have any of you guys ever tested Telnetting to the SQL Server port when THREADPOOL…

Erik Darling: How old do you think I am?

Brent Ozar: Not when THREADPOOL’s happening though; that’s a great question.

Tara Kizer: I mean, I use Telnet all the time when trying to figure out why I can’t connect to a box, but I just wonder if Telnet would fail when THREADPOOL is happening, because you’re still connecting, just that SQL Server is not allowing you in because the server is out of threads; worker threads.

Brent Ozar: That’s such a cool question. Now, I want to find out but not badly enough that I’m going to go recreate the THREADPOOL waits.

 

Why is Page Life Expectancy dropping?

Brent Ozar: Let’s see, Christian asks, “We have page life expectancy dropping to zero and there doesn’t appear to be a performance dip. I’ve looked for large queries scanning big portions of data along with queries with large memory grants.” Wow, you’re like ahead of – two for two, you’re doing good. He said, “What else should I zero in on?”

Tara Kizer: Look at your jobs. See if there’s anything that lines up with when it drops because there’s lots of things that can plummet the PLE, like index maintenance, update statistics; those two. You could also check the error log to see if the, whatever, the DBCC stuff is happening that has wiped it out.

Brent Ozar: Or, when you said error log too, the other thing you could see is maybe something’s forcing external memory pressure – like something else is driving SQL Server low on RAM. Some other process is doing something, like an SSIS package.

Erik Darling: CHECKDB will…

Brent Ozar: But if nobody’s complaining too, I would go on with your day. Go find the things people are complaining about, like everyone wearing black in the webcast.

Erik Darling: I’m like, what does PLE drop from? Starting from like 100 to zero, then…

Tara Kizer: Yeah, what number – and do the math on that because if it’s a number that’s not ever reaching past a day’s worth of PLEs and what minutes – see if that number even correlates to how often you’re running some of these jobs. Maybe you’re never getting up to a really high number.

Brent Ozar: Or maybe it’s dropping from 5000 to 4000. Who cares?

 

Should I leave 2 cores offline for Windows?

Brent Ozar: Dan says, “A client with offline CPU schedulers says that they did this on purpose. They want to keep two cores for the operating system. Help me explain why leaving it this way will cause performance problems.”

Erik Darling: How do they know the operating system is only going to use those two cores? What Windows magic do they have? I’ve never seen anyone be able to say, hey, Windows, you can only use these. But maybe they know something I don’t, which is possible; I’ve just never seen it.

Brent Ozar: Maybe they have some other app that they’ve hardcoded to only use specific cores, although I smell BS too there.

Erik Darling: Yeah, I’ve run into that a few times…

Richie Rump: No programmer’s going to do that of their own volition. It’s like, oh let me go ahead and do core programming, woo.

Erik Darling: I’ve run into that a few times. One person had a bunch of JRE executables that were, like, part of the app on their server and that’s why they left, like, two to four cores offline. Other people have claimed that it’s for SSRS or IS or whatever. I’m like, you can’t just be like, no you only get these; they use what they want. If they can provide some substantive proof that Windows is only using those cores then word-up.

 

How do I coordinate jobs across servers?

Brent Ozar: This is an interesting one which Richie might be involved with too. Mark asks, “What’s the best way to coordinate two jobs across servers? We’re trying to do backups on one server and restores on another. We’d ideally like, as soon as the backup job finishes, for the restore to kick off.”

Tara Kizer: Just add another job step to your backup job and have it connect to the other box. You could do a sqlcmd and do sp_start_job on that restore. So the backup job will kick off the restore, or you could do it in reverse; the restore can monitor the backup job, I guess, and just pull it until it’s done. But I would just add a step to the backups since that’s the sequence.
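
A sketch of what that extra job step could run against the restore server – the job name here is made up, and you’d point the step at the other box with sqlcmd or a linked server:

EXEC msdb.dbo.sp_start_job @job_name = N'Restore latest backups';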

 

I have 10,000 heaps. Why is sp_Blitz slow?

Brent Ozar: Next up, Michael says, “There’s a SQL Server 2017 server running in 2016 Hyper-V. I restore the database for testing. sp_Blitz takes eight seconds on the 2012 server but it seems to hang on 2017. sp_WhoIsActive shows that it’s checking for heaps. The database is all heaps and it’s got about 10,000 tables. Where should I be looking?”

Erik Darling: At your heaps, boy. Fix those.

Tara Kizer: Fix the problem. Don’t try to troubleshoot bullets.

Brent Ozar: The hell? I’ve got to read these questions before I ask them out loud. I think all of us would have a problem with 10,000 heaps in a database. On the bright side, you found the problem.

 

When I add data files, should I add log files too?

Brent Ozar: Mark says, “Afternoon, all. I have to add…” A word about time zones here, it’s 9:20 AM over in California. This time zone thing has me so flummoxed. I’ll be looking at like 2PM, in the afternoon, I’m like, where did all the emails go? Oh, that’s right, most of the country is done for the day.

Richie Rump: And what was the error that I got when I did the deployment today?

Tara Kizer: Time zone…

Richie Rump: Time zone…

Brent Ozar: I hate time zones so much. Mark says, “Good afternoon, all. I have to add additional files to my database as I’m reaching the max size of the partition…” Oh, goodness gracious. “Should I add additional log files as well?”

Erik Darling: No.

Brent Ozar: Why?

Erik Darling: Well, not like you just shouldn’t, at all, ever. Like if you have a very tiny drive and you have a log file that’s starting to get bigger and bigger and starting to outpace that drive, then yeah, you might need to add a second log file until you can get that drive situation remediated. But SQL Server writes to log files serially, so it only writes to one at a time and it will kind of like Ouroboros them. Like, if you had a single log file, it will do that from front to back anyway. It will just do that same thing. It will do that same locomotion serially across a whole bunch of log files. So no you don’t really need extra log files unless, you know, you’re running into some sort of apocalyptic situation with the one you’ve got.

Brent Ozar: I always love – every now and then, you’ll read some extreme edge-case of someone who actually needed like eight log files and when you go in, I’m like, I don’t even know how you found that problem. That’s amazing.

Erik Darling: The only person I’ve heard talk about that ever was Thomas Grosser and it was when he had, like, I want to say 128 1MB log files, each on a specific portion of a specific drive, and everything was used circularly in some way that increased throughput on his whatever crazy Superdome gambling box by like three billion percent. Like, listening to it, you’re like, wow, it’s amazing you came up with that. And then, it’s just like, man, why didn’t you just get some SSDs?

Brent Ozar: I hope I never have that problem.

 

How should I get a DBA job?

Brent Ozar: Sri has a tough question. Sri is looking for a new SQL Server DBA job. He says, “Any advice or tips, or what’s the best way to get one?”

Erik Darling: Interview… Apply… No, I don’t know.

Brent Ozar: Cross apply…

Erik Darling: I don’t know, are you, like – what are you doing now? Are you working anywhere near a database now? Are you just, like, tangentially interested in touching a database. Do you, like, program? Are you a JUnit, a sysadmin, helpdesk? Whatever you do now, however close you are to the database now, your next job should just get you a step closer to the database until someone finally allows you to put your arm around the database. Don’t air-hand it, like get in there and hug it.

Brent Ozar: And only use your arm. I would call everybody you’ve worked with in the past, or email, because you know us technology people; we don’t like phone calls. Email everyone you’ve ever worked with in the past and just be, like, hey, I’m doing more database work these days. I want to make the next step. Because the people you’ve already worked with, they know you don’t suck. They know you’re not incompetent. They know you’re easy to get along with. They’ve been out to lunch with you, et cetera.

And if that sentence makes you cringe – if you go, I can’t call anyone I’ve ever worked with before – then it’s a big clue to turn around and start doing things differently at your current job. The people you’re working around are going to be your network for the rest of your life. As I say these words, I am suicidal looking at the other people… Oh my god, I’m doomed. I’m never going to get a good job… But no, like, if any of us know people, that’s the fastest route to get into a new company. If you’re a faceless stranger, it’s really, really hard.

Erik Darling: I don’t know, like, just having put my resume somewhere, like years ago, I still get regular recruiter emails like, hey we have this technology position open, you might want to move six states away…

Tara Kizer: Those are always funny…

Erik Darling: Become an SSIS expert.

Richie Rump: It’s like, oh, so you did Visual Basic 6… 20 years ago.

Erik Darling: Like everyone else.

Richie Rump: Maybe go to a SQL Server user group meeting. There’s usually one or two folks popping up, hey I’m looking for this, I’m looking for that, and there’s usually a recruiter hanging around there, lurking around the back, you know. You can notice the recruiter because he’s the only one that’s talking to people. That’s the recruiter.

Brent Ozar: Usually overdressed.

 

I have a query that’s slow in the app, fast in SSMS…but it’s not that.

Brent Ozar: Pablo says, “My app takes one minute to execute an operation. I captured all kinds of metrics and they say that the T-SQL always finishes in two seconds max with no waits. Where should I go to seek the bottleneck?”

Tara Kizer: Seems like your application needs to be looked into. It sounds like the query is completing very fast and the bottleneck’s in the application.

Brent Ozar: I’d also look at the metrics on the app server, like how busy the CPU is, whether it’s swapping to disk too.

Erik Darling: So, like, async network I/O is a good wait stat to keep an eye on if SQL Server is just kind of fire-hosing data at your app and your app is not responding in a timely manner. I saw recently that balanced power mode on the CPUs on an app server was cutting app response time by, like, 30% to 50%. So you know, little things that you can check on.

Richie Rump: Run a profiler in your application because you may be getting the data, but you may be doing some processing on that data which is taking a long time. So the profiler will tell you how long each function in each line is taking to execute, so…

Erik Darling: What’s a good profiler to run for that kind of app code?

Richie Rump: It depends on your language.

Erik Darling: Assuming it’s probably c# or something, what would you use?

Richie Rump: I forget the name of it. It’s the one that…

Erik Darling: It wasn’t good then…

Richie Rump: Well I haven’t had a need to run c# profiling in a very long time…

Erik Darling: Richie has a stopwatch…

Richie Rump: Whatever the JetBrains one is called – hey, I write fast code, man, I don’t need profilers.

Brent Ozar: I thought he was totally going to go for, well when I fix Brent’s queries that come out of PowerBI, I get out the hourglass…

Richie Rump: Can I tell you, I went in yesterday to say, okay I want to see where the slowness is going on in this server, and the top ten queries slow is, like, Brent’s PowerBI queries, boom, boom, boom. And I’m like…

Brent Ozar: I don’t write lightweight queries. I don’t also write good queries. Then there’s, like, no where clause – give me everything. So I am like the preacher who stands up on the pulpit and goes, don’t order by in the database, order by in the database is a bad idea. And you know what I have to do in my queries? I have to use order by because PowerBI can’t manage to order stuff by default on multiple columns. If you want to sort on three columns, you have to come up with a synthetic column in the application or in the database server and then order by that on the way out and then PowerBI will get it. I had to ask Erik for help in order to – I’m like…

Erik Darling: Can you imagine the level of desperation that comes from asking me? That’s like – unless it’s like, I need help moving, then…

Richie Rump: Brent was so embarrassed, he didn’t ask me; he asked Erik.

Brent Ozar: How do I come up with that row number on multiple columns? I suck so bad at windowing functions, it’s legendary. I’m just – it’s not like I don’t like them, they’re awesome. I just don’t ever get to write new queries.

Richie Rump: I think I had a presentation on windowing functions I probably should throw your way there, maybe. No…

Brent Ozar: Unsubscribe.

Erik Darling: You know, we could also just hit, like, some torrent site and get Tableau Server or something.

Richie Rump: You wouldn’t say that if you’d used it.

Erik Darling: No, probably not. But I would say that if I used SSRS, which is almost PowerBI, so I’m probably going to want to get Tableau.

Brent Ozar: You would say it if you used PowerBI…

Erik Darling: Well I do. I hit refresh on PowerBI and I’m suicidal.

Richie Rump: That’s because it hits refresh on Brent’s queries. That’s why you’re suicidal.

Brent Ozar: The lights go dim at Amazon.

Erik Darling: I just love watching the little thing spin when it’s refreshing, waiting on other queries, refreshing – like 12, this number of rows loaded and it’s just spinning and I’m like, oh…

Richie Rump: We had a huge spike in read IOPS yesterday and I’m like, what is this? What is going on here? Brent, was that you? He was like, no, that was not me, I ran at this time, and I’m like, you do realize this was like two o’clock Eastern? He was like, oh wait, yeah that was me. I’m on the West Coast now.

Brent Ozar: I’m like, oh I still have time before the afternoon rush to go run a bunch of PowerBI queries. Oh no, it’s already afternoon in Miami. It’s probably tomorrow in Miami.

Richie Rump: Yeah, it is. It is. Welcome to the future.

 

Should I use a DNS alias?

Brent Ozar: Hannah says, “Do you use alias names for SQL Servers? What are the pros and cons of using a CNAME for access to your SQL Server?”

Tara Kizer: No real drawbacks on it. The only thing I could think of is needing to have a relationship with the DNS team so that if you switch servers, you can get that switched over and remembering, on upgrade night, you need to get the DNS team to be ready to make that change, otherwise you’re going to be waking someone up.

 

Why does my execution plan show a key lookup?

Brent Ozar: Marcy asks, “I was stumped by something that feels like it must have an obvious answer. Why would an execution plan have an index seek on a non-clustered index? Everything that it needs is in the non-clustered index, but it still does a key lookup to the clustered index.”

Tara Kizer: It certainly needs something. Look at the output list of the…

Brent Ozar: Predicate…

Tara Kizer: Yeah, hover over it and see what it’s missing. It’s grabbing something…

Erik Darling: Something’s in there.

Brent Ozar: She says the predicate doesn’t show any columns and the output doesn’t show – she didn’t say the output, but I’m guessing, knowing Marcy, the output’s not in there.

Erik Darling: I was going to say, if you’re able to share the execution plan, stick it on PasteThePlan and I would be happy to take a look at it.

Brent Ozar: The other thing is, if it’s a modification query, if it’s doing an update then that can also get locks on the – I’ve seen that grab the key lookup on the clustered index, but…

 

I got this strange interview question…

Brent Ozar: Niraj says, “I was asked in an interview, our database went into suspect mode during a restore recovery, how would you fix it?”

Tara Kizer: I would fumble in an interview for something like this because how often does this happen? I mean, it happens so infrequently. I mean, sometimes things go into suspect because you’ve done something horribly bad. But it’s rare that you encounter, especially a production database where you’re having to do recovery. I mean, certainly a test environment, this type of thing might happen, but production, rare.

Brent Ozar: And suspect is – it’s not like it’s restoring…

Tara Kizer: Suspect is – probably you’ve lost the disk behind the database. There’s something really bad happening there.

Erik Darling: That happened to me my first day on my last job. I was sitting there looking – I was like just sitting down. I had just gotten my laptop and I was going…

Tara Kizer: They had just given you access too.

Erik Darling: Exactly, and I had like a week or two worth of alert emails that I had to delete from before I could get on my email account. And so I’m going through those and new ones start coming in, this database is in suspect mode, and I’m like, that’s it. I’m going to get fired on the first day.

Richie Rump: Wow, day one hazing rituals. That is amazing.

Brent Ozar: That would be good.

Erik Darling: It turned out that the SAN guy was, like, moving a LUN somewhere and it was expected, but no one told me. I’m sitting there, like, I’m done.

Richie Rump: Why is it always the SAN guy?

Erik Darling: Because they have the most power. They control, like, everything. No matter what you use…

Brent Ozar: [crosstalk] it’s transparent, usually. But then when things break…

Richie Rump: It’s really payroll that has the most power, but that’s okay.

Brent Ozar: Human resources – especially our human resources. The only thing I’d look at – and this is terrible. I know I’m going to get flamed for it by somebody because you can never say anything perfect with suspect – but one thing I would check is antivirus. Often I’ve seen antivirus grab a lock on a file when SQL Server restarts and then not let go while SQL Server is trying to start up. Then it’s just a matter of getting access to the file again. That can help. The first thing I’d say, too, is if the thing goes into suspect mode so often that you’re going to ask me that question during an interview, let’s talk about your storage and your hardware. Is that something you’re going to have me do every day? Because I’m not sure I really want this job.

Erik Darling: Fire drill.

 

How do you manage large amounts of VLFs?

Brent Ozar: Anna asks, “How do y’all manage large amounts of VLFs?”

Tara Kizer: Fix it. You need to fix it. Once you fix it, it shouldn’t happen again. So fix your auto-growths. Change them so it’s not 1MB or some low number. You want it to be a little bigger, but not too big. So don’t set it to some really large number. But if you fix it, it should not happen again on that database. It’s the size of the auto-growth of the log file that’s really important. But to fix the issue, you need to shrink it down to a really small size and grow it back out. Changing the auto-growth needs to happen so this doesn’t continue happening.
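
A rough sketch of that fix – the database name, log file name, and sizes here are all hypothetical, so pick numbers that fit your log’s real usage:

USE MyDatabase;
DBCC SHRINKFILE (N'MyDatabase_log', 1024);   -- shrink the log way down (target size in MB)
ALTER DATABASE MyDatabase
    MODIFY FILE (NAME = N'MyDatabase_log', SIZE = 16384MB, FILEGROWTH = 1024MB);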

 

Why is SQL ConstantCare warning about too much memory?

Brent Ozar: Daryl says, “SQL ConstantCare is warning me about too much memory. I thought I was doing them a favor. Can I just push the memory back down? These folks build cubes and I thought more memory would help.” Well, the way that that check works internally is it’s looking to see if your buffer pool had stuff in it and is now empty. Typically, what this is driven by is someone running a query with a large memory grant – and you mentioned building a cube, which can totally do it; select star from table with no where clause, giant order by.

That query may need a giant memory grant in order to run, and SQL Server frees up all that buffer pool memory to go run it. If you run sp_BlitzCache with the sort order of memory grant, you’ll see the queries that have been getting large grants. That’s what I would go through and look at troubleshooting. Say you’ve got a box with 256GB of RAM, and queries are getting 60GB of RAM every time they run; that’s when you start going to tune that thing.
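
That call is:

EXEC dbo.sp_BlitzCache @SortOrder = 'memory grant';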

 

Anything to look out for with 2016 upgrades?

Brent Ozar: Steven asks, “I would like to know your point of view on upgrading SQL Server from 2012 to 2016. Are there any considerations I should look out for or risks?”

Erik Darling: Yeah, 2018.

Tara Kizer: I don’t think I would bother with 2016. I would go with 2017 if 18 wasn’t out.

Erik Darling: Yeah, it’s not like getting a deal on a used car. It’s not like, oh, I’m going to get the 2016 model because it’s cheaper. Go to 2017.

Tara Kizer: You know what – the county of San Diego has – that’s where I started my IT career – they have a policy – and they’re not the ones that run the IT, they outsource that – but the county of San Diego, the government, has a policy that you can never go beyond the current version. They always have to be one version back. And it’s because of running Microsoft products all these years and running into major operating system issues, and so they have this policy. And now the outsource company, the IT people, they can never be on current technologies. So it’s a bad policy.

Richie Rump: What about an in-place upgrade? Should they do that?

Tara Kizer: Yeah, sure, why not…

Erik Darling: While they’re doing dumb crap, they might as well just do it. Just make the most of this. Explore the space.

Brent Ozar: Keep an eye out for the cardinality estimator too – there’s this cardinality estimator that’s impacted when you change the compatibility level on a database. But I’m a huge fan of the Microsoft SQL Server upgrade guides. They’ve published huge upgrade guides. You don’t read the whole thing. No one has time for that. What you do is look at the table of contents, and you’ll learn a ton of stuff just by looking at the table of contents, the stuff they warn you about.

Erik Darling: Pay special attention to, like, breaking…

Brent Ozar: Yeah, small note.

 

Where can I learn more about index usage statistics?

Brent Ozar: Anika asks, “Is there a good explanation online…” No… “About the metrics that Management Studio shows under index usage statistics; for example, range scans and singleton lookups. I’m trying to figure out what’s going on, specifically index operational statistics.”

Tara Kizer: Well we have to wonder why you’re using Management Studio’s index stuff. Use our sp_BlitzIndex. It will become more clear.

Richie Rump: Yeah, big time.

Tara Kizer: I don’t ever look at that stuff in Management Studio. Index usage – I don’t even know how to get to that screen.

Erik Darling: Things you forget. Reports, maybe?

Brent Ozar: Yeah, and the DMVs are actually really good in terms of – the Books Online documentation on DMVs is really good. I just don’t think any of us pay attention to the specifics inside those range scans and singleton lookups because a range scan can be good or bad. Singleton lookups can be good or bad; that’s okay. I just want to know that the index is getting used.
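
If you do want to poke at the raw numbers, the DMV behind that screen is sys.dm_db_index_operational_stats – a quick, database-scoped look:

SELECT OBJECT_NAME(ios.object_id) AS table_name,
       ios.index_id,
       ios.range_scan_count,
       ios.singleton_lookup_count
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS ios
ORDER BY ios.range_scan_count DESC;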

Richie Rump: A lot of times, I want to target that range scan. I mean, I want to hit that because that’s where I want to go. I write better queries than you, Brent. Just remember that.

Brent Ozar: Look, I need select star for all the singleton lookups. I need to do range scans.

Richie Rump: Predicates? I’ve never heard of her.

Brent Ozar: Predadate, what? Anika follows up with, “Can I use sp_BlitzIndex in production? We don’t really have dev; don’t ask.” So if I had to pick the level of risk between running sp_BlitzIndex in production versus the level of risk of not having a development environment, guess which one I’m more concerned about. Take a wild hairy guess.

Erik Darling: But you know, to answer the question a little bit, we run sp_BlitzIndex in other people’s prod all the time, so we’re pretty cool with it. If it does anything weird on your server, let us know. We have GitHub for that. That’s our insurance policy. You can let us know.

 

Have the police got Tara surrounded?

Brent Ozar: Ron asks, “Is there a police department copter over you Tara? I hear them broadcasting over a speaker.”

Tara Kizer: What?

Erik Darling: Creep.

Tara Kizer: No…

Erik Darling: Stop triangulating Tara, monster.

Tara Kizer: Ron’s in East County too. I did hear helicopters outside on Monday, and then I’ve just seen, in one of my Chrome windows, a notification of a fire that broke out in San Diego. So I’m on the call with my client and it’s like, you know, I’m in California and it is very, very active in fire season. I’m going to look out the window real quick just to make sure…

Richie Rump: Make sure the fire isn’t coming towards us; we’re good.

Tara Kizer: It was the one that started in Ramona. I think it was on Monday, but I think that one’s under control.

Brent Ozar: There have been a lot of fires this year; a lot.

 

Is compat level 2008 a problem?

Brent Ozar: John asks, “I have SQL Server 2016 with databases in compat level 2008. Is that limiting SQL Server?”

Tara Kizer: You’re not getting some of the T-SQL features. I mean, it just depends what you need.

Brent Ozar: Not getting some of the new cardinality estimator stuff.

Tara Kizer: A lot of people don’t want that new guy though.

Brent Ozar: Backfires…

Erik Darling: Will that mess with the windowing function and the over-clauses that were added in 2012?

Tara Kizer: Yeah.

Erik Darling: So as far as I’m concerned, you can pretty safely bump up from, like, 2008 or 2008 R2 to 2012 without sweating too much about what’s going on there. Obviously the later bump ups can cause some bumps in the night, but 2012 would probably be my bare minimum right now, just because, in case anyone doesn’t know, 2008 and 2008 R2 will no longer be supported in, like, a year.

Brent Ozar: It’s coming fast. It’s going to come really fast because, you know, of course, by the time you start having that discussion with management about how this isn’t supported next year, it’s not like they’re going to go, oh well, go ahead and install 2017. Go ahead, you can do that this weekend…

Erik Darling: We just got this new server in; crazy you mention it. We got all the paperwork ready and the budget was there. It was amazing. The stars aligned.

Brent Ozar: I had a sales call a while back and somebody has had the hardware sitting in the data center for over a year. They’re like, oh, it’s ready to go for the new SQL Server. I’m like, man, by the time you put it in at this point – this isn’t normal. Why would you leave it on the pallet?

Erik Darling: We had a client-client, not just a sales call, we had a client-client who had a brand new 2016 box sitting around forever and ever and they didn’t do anything with it. They didn’t move anything over to it until they found corruption on their 2008 box. That was the impetus to start moving stuff. Like, oh wait, this one’s screwed. They were like pushed out, forced out, in order to get off that hardware.

Brent Ozar: We should get all our listeners together and do like a potluck hardware kind of thing so that the people with the extra hardware could give Anika their development server.

Richie Rump: See, I tried to do that but then Brent made me turn down the hardware in RDS, so…

Brent Ozar: Yeah, it’s true. Richie went in armed for bear.

Erik Darling: If anyone wants to buy my old, or buy my current desktop, I will sell it to you autographed for like twice as much as I paid for it.

Richie Rump: Does that come with a warranty on it, since you built it yourself?

Erik Darling: Yeah, the warranty is whatever the postal service will cover for insurance.

Brent Ozar: It comes with tire treads on it.

Erik Darling: If I can send this thing media mail, so…

Brent Ozar: Well that does it for this week’s Office Hours. Thanks, everybody, for hanging out with us and we will see y’all next week. Later, everybody.

Setting Up SQL Server: People Still Need Help


I Like What’s Happening

I wanna start off by saying that I like what Microsoft has been doing with the setup process — it made a lot of sense to add the tempdb configuration screen, and having a checkbox to turn on Instant File Initialization was amazingly helpful.

Even in the cloud, people still need to install SQL Server, and even in the cloud, not everyone installing SQL Server is a DBA.

It helps to have a setup checklist like the one we put in the First Responder Kit if you fall into that category.

DBAs who have to install SQL Server a lot may have a post-install script they run. In the age of, well, pick any from a long list of buzz words: DevOps, containers, Docker, Kubern-whatever, it sucks to have another moving part that might fail or break.

Modest Proposal

Do for basic sanity settings what’s already happened for tempdb and IFI.

What’s a basic sanity setting?

  • Cost Threshold for Parallelism
  • MAXDOP
  • Max Server Memory
  • Enable the DAC

At the very least, these are settings that should be in front of people when they’re setting up a server.
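As a rough sketch, the post-install script for that list doesn’t have to be long; every number below is a placeholder you’d tune to your hardware:

EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure N'cost threshold for parallelism', 50;   -- placeholder starting point
EXEC sys.sp_configure N'max degree of parallelism', 8;         -- placeholder: match to cores/NUMA
EXEC sys.sp_configure N'max server memory (MB)', 112640;       -- placeholder: leave headroom for the OS
EXEC sys.sp_configure N'remote admin connections', 1;          -- lets you use the DAC remotely
RECONFIGURE;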

If you wanna get extra fancy, you could even let people tweak settings on the model database, like autogrowth and recovery model, and set up database mail and alerts.
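Sticking with the sketch theme, something like this would handle the model tweaks; the logical file names are model’s defaults, and the sizes are placeholders:

ALTER DATABASE model SET RECOVERY SIMPLE;  -- or FULL, if that's your shop's standard
ALTER DATABASE model MODIFY FILE (NAME = modeldev, SIZE = 1024MB, FILEGROWTH = 256MB);
ALTER DATABASE model MODIFY FILE (NAME = modellog, SIZE = 512MB, FILEGROWTH = 128MB);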

Death Of The Boring DBA

The cloud is great, and the automation that Microsoft is building sure is nifty, but people still struggle with very basic setup items.

This post might look like dinosaur bones in a few years, but quite often a lot of problems stem from not taking care of the broom and dustpan stuff up front, and not going back to check on things later.

During consulting engagements, it’s really common to hear stuff like “I thought we did that” or “that’s the default so we left it” in really important places.

Thanks for reading!

Brent says: I’d really, really love to see a step in the setup wizard that offers to set up backups and corruption checking. These are table stakes for building a reliable server. I’m stunned by how often SQL ConstantCare® customers are struggling with these basics.
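[Editor’s note: that step wouldn’t need anything exotic under the covers, either. A bare-bones version, with placeholder paths and database names, is just:]

BACKUP DATABASE [YourDatabase]
  TO DISK = N'D:\Backups\YourDatabase_FULL.bak'
  WITH CHECKSUM, COMPRESSION;

DBCC CHECKDB (N'YourDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;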
