People may look at you funny if you lean over and ask your SQL Server, “What are you waiting on when you run queries?”
You don’t really have to lean over, I suppose. But if you’re serious about maximizing SQL Server performance, asking your server what it’s waiting on is the best way to approach bottlenecks.
In my last post, I described DBA Days: The Brent Ozar Unlimited Show Greatest Hits, a webcast Brent recorded to summarize the work his team did in our test lab. Besides a series of webcasts on improving SQL Server performance in different areas, Brent pulled together a greatest-hits backgrounder suitable for almost all DBAs. In this post, I’ll outline more of that info.
Plus, I’ll point you to that free tool I mentioned last time.
So, how do you figure out what your SQL Server has been waiting on and for how long?
For starters, you can run sp_AskBrent, a free stored procedure Brent offers. During peak loads, let it take a five-second sample, then wait a few seconds and take another five-second sample. sp_AskBrent looks for blocking queries, checks for backup or index maintenance, queries wait statistics and examines Perfmon counters for anything that could contribute to poor performance.
It lets you compare snapshots a few seconds apart. You can then tune based on the resource in which the bottleneck occurs and the wait type:
Storage waits happen all the time in SQL Server. Think of those 15-second I/O warnings in your error logs.
PAGEIOLATCH waits mean you’re waiting to read data from the data file. Brent’s first choice is to tune the queries, removing implicit conversions, making sure to fix sargability issues or functions in the WHERE clause, and tuning indexes so that SQL Server has to read as little data as possible off disk. If tuning doesn’t help, then you may have to spend money adding memory, making your storage faster or both. (Hint: Add memory first. It’s cheaper than solid-state drives.)
WRITELOG waits occur when you’re writing to the transaction log, so Brent recommends writing less data. Don't put processing tables, queue tables and files in the database because SQL Server writes files into the transaction log. Put the log file on solid-state drives. Or, try Delayed Durability (starting with SQL Server 2014), but only if WRITELOG is your biggest bottleneck, you've already tried solid-state storage, you’ve tuned your queries for larger batch operations and the data is “expendable,” like session state.
ASYNC_IO_COMPLETION waits occur when you’re writing data to the data files. Frankly, though, these don’t affect users much. You may see a slowdown in writing the data, but your users aren't running queries directly against the same waits. Make improvements based on ASYNC_IO_COMPLETION waits if you want, but you’ll probably be the only person to notice.
Sure, memory is fast, but it can slow you down if you’re querying a table without indexes hundreds or thousands of times a second, looking through memory for specific rows in that table.
PAGELATCH waits happen when SQL Server takes a quick glance at a page in memory before reading it, to see whether anyone has a lock on it. That can show up as one of your largest waits if you’re hammering the table with queries. Use Dynamic Management Views (DMV) to see which queries are reading the most data on your SQL Server, then create the right index for the table. In fact, you’ll almost certainly find missing index recommendations in the query plan for the queries executing the most reads.
In any event, don’t bother trying to buy faster memory. If your server is new, or even close to new, you probably won’t find faster memory.
ASYNC_NETWORK_IO waits occur when SQL Server has already finished your query results and is trying to send them back to the user. Unfortunately, they can come from any one of several sources. Maybe users are directly querying the database from the other side of the world. Maybe it's a slow network. Maybe the applications are running on VMs all competing for oxygen in a single GigE cable. Maybe an application is processing data by rows, then running some calculation on each row.
If ASYNC_NETWORK_IO is slowing you down, map the waiting queries to the servers they’re waiting on. With SQL Server's DMVs, try to track back to specific VMs, hosts, queries and application names and check workload on the client side.
In The Greatest Hits of DBA Days, Vol. 1: Anatomy of a SQL Server Query I covered Brent’s discussion of running, runnable and waiting operations in SQL Server. When you’re waiting to get more CPU time, SIGNAL waits occur as SQL Server stacks up runnable queries. You can diagnose this by looking at Processor/ % used in Perfmon. You’re looking for a maxed-out processor core, either because the query hits just one core or because you didn't set up SQL Server correctly (incorrect license key, restriction to certain cores) or because you’ve tinkered with affinity masking. Your goal is to get queries to use as many cores as possible.
That’s a quick recap of The Brent Ozar Unlimited Show Greatest Hits, covering the life of the query, the use of wait stats to find performance bottlenecks and some of the most common wait types you’ll see.
Still, a good greatest hits collection makes you want to listen to all the albums. Watch the detailed DBA Days webcasts from Brent and his team.
You’ll still look funny when you ask your SQL Server, “What are you waiting on when you run queries?” But at least you’ll know why you’re doing it.