It’s easy to start tinkering with the gears of SQL query optimization. You open up SQL Server Management Studio (SSMS), monitor wait time, review the execution plan, gather object information and begin optimizing SQL until you’re running a finely tuned machine.
If you’re good enough at it, you score a quick victory and get back to your regularly scheduled chaos. But if you adjust the wrong thing, or adjust the right thing in the wrong direction, well, there went your Wednesday.
SQL query optimization? What makes you think you need it?
Most of the time, it’s a spike in trouble tickets or user complaints. “Why is the system so slow?” your users complain. “It’s taking us forever to run our usual reports this week.”
That’s a pretty vague description, of course. It would be nice if they could tell you, “Things are slow because you’ve got an implicit conversion in line 62 of CurrentOrderQuery5.sql. The column is varchar and you’re passing in an integer.” But it’s not likely that your users can see that level of detail.
At least trouble tickets and phone calls make for an active metric: easy to spot, easy to measure. When they start rolling in, you can be reasonably sure that it’s time for SQL tuning.
But there are other, passive metrics that make the need less clear. Things like slumping sales, which could be due to any number of factors. Is it because painfully slow queries in your online store are making your customers abandon their shopping carts? Is it because the economy is in bad shape?
Or it could be things like sluggish SQL Server performance. Is it because a poorly written query is sending logical reads through the roof? Is it because the server is low on physical resources like memory and storage?
In both scenarios, SQL query optimization can help with the first option, but not the second.
Why apply the right solution to the wrong problem?
Before you go down the path of optimization, make sure that tuning is the right solution to the right problem.
Tuning SQL is a technical process, but every technical step has roots in good business sense. You could spend days trying to shorten execution time by a few milliseconds or reduce the number of logical reads by five percent, but is the reduction worth your time? It’s true that it’s important to meet the requirements of users, but every effort reaches the point of diminishing returns eventually.
Consider these SQL query performance problems and the business context around them:
- Acceptable performance — A query takes 10 minutes to run and the user wants it to run in one minute; that seems like a reasonable disparity and an achievable goal for optimization. However, if the query takes overnight and the user thinks it should run in one minute, then it may be more than a tuning problem. For one thing, you may have to educate the user about the amount of work the query is actually performing. For another, it may be a problem with the way the database was designed or the way the client application was written.
- Utility — Suppose you’re responsible for administering the financial database in a manufacturing company. At the end of every month, users complain about poor performance. You trace the problem to a series of month-end reports run by Accounting that take hours each and go straight into a filing cabinet unexamined by anybody. Instead of tuning, you explain the problem to the business managers and obtain permission to delete the reports.
- Time shifting — Or, suppose those same reports are important for governance but not urgent for the business. If they are run once per week or per month, they can be scheduled for off-peak hours by pre-caching the data set and sending the results to a file. That removes the bottleneck on the other business users and frees the Accounting user from having to wait for the reports.
When you factor business context into your decision to optimize, you can set priorities and buy yourself time.
When you do optimize SQL queries, try SQL diagramming
SSMS and the tools built into SQL Server offer most of what you need for effective SQL query optimization. Combine the tools with a methodical approach around the following steps, as described in the e-book The Fundamental Guide to SQL Query Optimization:
- Monitor Wait Time
- Review the Execution Plan
- Gather Object Information
- Find the Driving Table
- Identify Performance Inhibitors
In step 4, your goal is to drive the query with the table that returns the least data. When you study joins and predicates, and filter earlier in the query rather than later, you reduce the number of logical reads. That’s a big step in SQL query optimization.
SQL diagramming is a graphical technique for mapping the amount of data in the tables and finding which filter will return the fewest records. First, you determine which tables contain the detailed information and which tables are the master or lookup tables. Consider the simple example of this query against a university registration database:
The detail table is registration. It has two lookup tables, student and class. To diagram these tables, draw an upside-down tree connecting the detail table (at the top) with arrows (or links) to the lookup tables, like this:
Now, calculate the relative number of records required for the join criteria (that is, the average ratio of rows related between the detail table and lookup tables). Write the numbers at each end of the arrow. In this example, for every student there are about 5 records in the registration table, and for every class there are about 30 records in registration. That means it should never be necessary to JOIN more than 150 (5×30) records to get a result for any single student or any single class.
That exercise is useful if your join columns are not indexed, or if you’re not sure that they’re indexed.
Next, look at the filtering predicates to find which table to drive the query with. This query had two filters: one on registration cancelled = 'N' and the other on signup_date between two dates. To see how selective the filter is, run this query on registration:
select count(1) from registration where cancelled = 'N'
AND r.signup_date BETWEEN :beg_date AND :beg_date +1
It returns 4,344 records out of the 79,800 total records in registration. That is, 5.43 percent of the records will be read with that filter.
The other filter is on class:
select count(1) from class where name = 'ENGLISH 101'
It returns two records out of 1,000, or 0.2 percent, which represents a much more selective filter. Thus, class is the driving table, and the one on which to focus your SQL tuning first.
The voice of the user
If you’re sure you need SQL tuning, The Fundamental Guide to SQL Query Optimization offers further insight. It walks you through five performance tuning tips with copy-and-paste queries and case studies, including the one described above.
You’ll probably find that the single most important SQL query optimization tool is the voice of the user. Why? Because that voice lets you know when to start optimizing and it tells you when you have optimized enough. It can ensure that you start tinkering with the gears when you need to and stop while you’re still ahead.