How To Tell If Your Query Has Been Auto-Parameterized (and why wasn’t it?)

SQL Server has the ability to auto-parameterize certain queries.  What this means is that constant literal values are replaced with variables before the query is compiled.  This allows SQL Server to reuse query plans and helps speed up query execution by skipping the compilation step, which can be a quite expensive relative to the time it actually takes to get the data from the storage engine.  Future queries that differ in the value of the literal constants can reuse a query plan that is already compiled reducing the round trip time between  a client requesting data and SQL Server returning data to the client.  Aside from reducing the total execution time it has the added benefit of helping to reduce plan cache bloat and processor usage.

Plan cache bloat occurs when the plan cache becomes filled with query plans that are unlikely to be reused.  The plan cache is a limited resource, especially on 32-bit editions, and having the plan cache filled with single-use plans has an adverse effect on SQL Server.  Plan cache pages are stolen from the buffer pool so a swollen plan cache can reduce the amount of memory available to SQL Server to cache data pages.  Less buffer pool means more disk I/O. More disk I/O means a slower system.

As new plans are being generated, plans that haven’t been used in a while are discarded to make room for new plans.  In the event one of the single use plans may have actually been able to be used again it may no longer be available in the plan cache and require compilation again.  Plan compilation takes CPU and a constantly churning plan cache uses more CPU than a stable plan cache with highly reusable plans.

Auto-parameterization takes a query that looks like this:
SELECT Col1 FROM Test WHERE Col1 = 1

and compiles it as this:
(@1 tinyint)SELECT [Col2] FROM [Test] WHERE [Col1] = @1

The @1 is the parameter.  A future query of
SELECT Col1 FROM Test WHERE Col1 = 2
could possibly reuse the query plan generated for the first query if the first query has be auto-parameterized.  It can be difficult to tell when a query has been auto-parameterized. Hopefully this post will help someone one day.

The process of auto-parameterization happens in two stages (conceptually for the purposes of this post anyway).  First when the SQL statement is parsed by the Language Processing and Execution (LPE) component of the storage engine it can be auto-parameterized. The LPE passes the query to the Query Processor for optimization.  While the Query Processor is optimizing the query it can deem a query as “safe”.  If a query has been deemed safe by the Query Processor the query runs and the query plan for the parameterized query is cached.  If the Query Processor deems the query as “unsafe” the query still executes but the plan that is cached is for that specific query only.  A similar “unsafe” query that differs only by the constant literals will have its own plan generated while a similar query for a “safe” plan will use the parameterized plan that already exists in the plan cache skipping the compilation step.

The key aspects to take away from this are the two stages of auto-parameterization.  Just because a query is auto-parameterizable doesn’t mean that it will be auto-parameterized.  The query processor can “veto” auto-parameterization.  Generally the query processor will veto auto-parameterization if it detects that the choice of constant literals could affect the plan choice.  In the follow test I’m going to show how you can look at what happened in the two stages to discover if a query is auto-parameterizable and if the query processor has allowed auto-parameterization by deeming the query safe or has vetoed auto-parameterizing by deeming the query as unsafe.  I’m going to do this with two examples. In both examples cases the queries can be auto-parameterized but in one example the query and the indexes could lead to different plan choices depending on the values of the constant literals while in the second it is unlikely that SQL Server will choose a different plan regardless of the constant literals used.

First I will create a test database and test table.
USE MASTER
GO

–DROP DATABASE AutoParam
–GO
CREATE DATABASE AutoParam
GO
USE AutoParam
GO

IF object_id(‘dbo.Test’) IS NOT NULL
DROP TABLE dbo.Test
CREATE TABLE dbo.Test
(
Col1 int,
Col2 char(2)
)
GO

Now let’s fill the test table with some data and create an index.

DECLARE @c int
SET @c = 0
WHILE @c < 10000
BEGIN
INSERT INTO Test (Col1,Col2) VALUES (@c,’aa’)
SET @c = @c + 1
END
GO
CREATE INDEX TestIndex ON dbo.Test (Col1)
GO

The test data is 10000 rows with the values of Col1 being even distributed from 0 to 9999 and the values of Col2 being constant ‘aa’ throughout.
Before actually running the query we want to take a look at the Estimated Execution Plan to see if the query itself is auto-parameterizable.
SET SHOWPLAN_XML ON
GO
SELECT Col1 FROM Test WHERE Col1 = 1
GO

Running this query should produce a single row, single column of data that contains the XML of estimated execution plan.  Click on the XML link and the execution plan should appear.  Now right click on the window and select Show Execution Plan as XML.  This should cause the XML of the plan to show.  In the StmtSimple element of the XML there is an attribute called ParameterizedText. This should contain the parameterized text of the query.  It should look like this:
ParameterizedText=”(@1 tinyint)SELECT [Col1] FROM [Test] WHERE [Col1]=@1″

Now let’s run the query for real.  Close the XML window and the Execution Plan window.  Now in query windows run this:
SET SHOWPLAN_XML OFF
GO
DBCC FREEPROCCACHE
GO
SELECT Col1 FROM Test WHERE Col1 = 1
GO
First we are turning off the feature which returns the estimated execution plan so the query will run live.  Second we are clearing the plan cache so that when we investigate it later it will be very easy to see.
After running this query we can investigate the plan cache by joining a couple of DMVs together to see what is there.

SELECT t.text, p.objtype, p.cacheobjtype, p.plan_handle, qp.query_plan, s.query_hash, s.query_plan_hash
FROM sys.dm_exec_cached_plans p
CROSS APPLY sys.dm_exec_query_plan(p.plan_handle) qp
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) t
LEFT JOIN sys.dm_exec_query_stats s
ON s.plan_handle = p.plan_handle
ORDER BY s.query_plan_hash

Explaining these DMVs is out of the scope of this post but I put in enough info to see what is going on.  The first column has the text of the query.  The second, the type of plan, the third is the identifier of the in-memory plan (it goes away when the plan goes out of cache), the forth is the XML of the query plan itself.  The query_hash column and query_plan_hash column are a dash of information from sys.dm_exec_query_stats.  I included this because there will be no query stats available for the literal text of the query that has been parameterized and it makes it easier in this example to pick out the parameterized query.  Query_hash and query_plan_has are only available on SQL 2008 and higher so if you run this on SQL 2005 use any other columns from the sys.dm_exec_query_stats DMV (by the way this example is for SQL Server 2008 and possibly 2005, not for 2000).

(Compressing the pictures for the page makes them hard to read. Click on any picture to see a larger, readable version of it.)

 

The first line of these results is the query to show the plan cache and can be ignored.  The second line is a shell query that contains the literal text of the query that was run.  If you look at the plan for the shell query you will see a pointer to the parameterized query on line 3.   Click on the query plan in line two and look at the plans XML.

Notice that in the XML element StmtSimiple the value for the attribute ParameterizedPlanHandle contains the plan handle for the parameterized query on line 3.
So this query has not only been auto-parameterized, it has been deemed safe.
Next lets run several queries that can use this cached plan.

DBCC FREEPROCCACHE
GO
SELECT Col1 FROM Test WHERE Col1 = 1
GO
SELECT Col1 FROM Test WHERE Col1 = 7
GO
SELECT Col1 FROM Test WHERE Col1 = 1000
GO
SELECT Col1 FROM Test WHERE Col1 = 7745
GO
SELECT Col1 FROM Test WHERE Col1 = 10000
GO
After running these queries run the DMV query from earlier and you should get this:

As you can see these five queries point to two parameterized queries. You can use the ParameterizedPlanHandle attribute in each query plan to see which parameterized query each points to.

Now let’s look at a query that isn’t deemed safe.   First we want to look at the Estimated Execution Plan
SET SHOWPLAN_XML ON
GO
SELECT Col2 FROM Test WHERE Col1 = 1
GO

Unlike the previous query this query isn’t covered by the index we created.  So in this case the query processor has two choices to return Col2.  It can scan the table or it can do an index seek on Col1 and then a bookmark lookup on the table for Col2.
Looking at the XML of the plan we see that this query has been auto-parameterized.  But is it safe.
ParameterizedText=”(@1 tinyint)SELECT [Col2] FROM [Test] WHERE [Col1]=@1″
Running this:
SET SHOWPLAN_XML OFF
GO
DBCC FREEPROCCACHE
GO
SELECT Col2 FROM Test WHERE Col1 = 1
GO

 

Taking a glance at the query plan also will show that there is no longer a shell plan, thus no auto-parameterized query plan.
Finally by running five queries that differ only by their literal constants shows that all five queries have not been deemed safe.
DBCC FREEPROCCACHE
GO
SELECT Col2 FROM Test WHERE Col1 = 1
GO
SELECT Col2 FROM Test WHERE Col1 = 7
GO
SELECT Col2 FROM Test WHERE Col1 = 1000
GO
SELECT Col2 FROM Test WHERE Col1 = 7745
GO
SELECT Col2 FROM Test WHERE Col1 = 10000
GO

In this case the first stage investigation shows that the query can be auto-parameterized but the second stage investigation shows that it was deemed unsafe to be auto-parameterized, and thus wasn’t.

Why was the first example successfuly auto-parameterized while the second example was not? First there are the conditions necessary for any query to be auto-parameterized.  These are available in Appendix A of the Plan Caching in SQL Server 2008 white paper.  That is not what is at play here though as I intentionlly used queries that met those conditions. These conditions are enforced at stage one. However another condition of auto-parameterization is that “only those SQL statements for which parameter values do not affect query plan selection are auto-parameterized.”  This condition is enforced by the query processor, or in what I am calling stage two.  It is this stage that the query is deemed safe to parameterize or not.

Hopefully someone out there will find this information useful. The next post delves into this topic a bit further

Other resources:
SQL Programmability & API Development Team Blog
Plan Caching in SQL Server 2008 White Paper