Optimize for Ad hoc Workloads (the not so good news)

This new feature in SQL 2008 has received a lot of attention so I’m only going to note one of the peculiarities I’ve discovered working with it. Mainly, having this option enabled, along with Forced Parameterization, could in fact accentuate the problem it is meant to solve and cause greater plan cache bloat.

First, here is a quick introduction to Optimize for Ad hoc Workloads for the uninitiated.
In short, this configuration option helps fight plan bloat by not caching single use plans.  With this turned on the plan does not get cached the first time a query is run, only the second.  This is pretty easy to see by running a simple test.  This test uses the database and tables I set up in a previous post on auto-parameterization.  I cleared the procedure cache (DBCC FREEPROCCACHE) and ran this query against that database:

SELECT Col2 FROM Test WHERE Col1 = 7

Looking at the plan cache using this query I see some interesting things:

SELECT p.size_in_bytes,p.cacheobjtype,p.objtype,qp.query_plan,t.text
FROM sys.dm_exec_cached_plans p
CROSS APPLY sys.dm_exec_query_plan(p.plan_handle) qp
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) t

The cacheobjtype is Compiled Plan Stub.  Notice that the query_plan is null but that the text is not.  We can see the text of the query that was run but no query plan was cached.  Also note that the size_in_bytes of the query is a measly 336.  Running the SELECT again and looking at the plan cache shows that the Compiled Plan Stub is gone replaced with a Compiled Plan. Now the size is 24576 bytes and the query plan is now available to view.

Now on to the interesting part. If your query is auto-parameterized then the auto-parameterized query plan will still get cached, even on the very first run of the query! Again, using the database and queries from the auto-parameterization blog post, I run a query I know will be auto-parameterized.

SELECT Col1 FROM Test WHERE Col1 = 1

A glance at the plan cache shows that not only has the Compiled Plan Stub been created but that a full parameterized plan has also been created which is 40960 bytes.

What happens if Forced Parameterization is enabled along with Optimize for Ad hoc Workloads?

With Optimize for Ad Hoc Workloads and Forced Parameterization turned on I am running a query from my Forced Parameterization blog post which I know will not be auto-parameterized but will be force-parameterized.

SELECT Col2 FROM Test WHERE Col1 = 1
Here is the cached plan:

The plan cache shows that this produced a compiled plan stub along with a compiled plan of 232 bytes and 57344 bytes respectively.  In other words optimizing for ad hoc workloads would normally not cache the query plan on the first run but only create a compiled plan stub.  It did create a compiled plan stub but it created one for what would have been the shell execution plan or the pointer to the parameterized plan.  But on top of that it also created a fully cached parameterized plan.

If this is a single use plan, then we have essentially blunted the blade of the Optimize for Ad hoc Workload knife.  The purpose of using Optimize for Adhoc workloads is to keep these single use plans from being cached,  yet under forced parameterization (and auto-parameterization too), plan caching is exactly what is happening.

Using Optimize for Ad hoc Workloads under the simple parameterization model this query would have produced a compiled plan stub of only 232 bytes and never created the compiled plan that is 57344 bytes.  A fantastic memory savings for a plan that will never be used again.

So, what would happen if there were thousands of single use plans created because forced parameterization is enabled?  I’m going to scale this down to two queries for a simple example and look at the results under six different circumstances.

Simple Parameterization with Optimize for Ad hoc disabled
Forced Parameterization with Optimize for Ad hoc disabled
Simple Parameterization with Optimize for Ad hoc enabled (1st run)
Simple Parameterization with Optimize for Ad hoc enabled (2nd run)
Forced Parameterization with Optimize for Ad hoc enabled (1st run)
Forced Parameterization with Optimize for Ad hoc enabled (2nd run)

Simple Parameterization with Optimize for Ad hoc disabled:

This is a straight forward out of the box example.  These two plans use 106496 bytes in the plan cache.  For a query to reuse these query plans the exact text of the query must be run again, otherwise a new plan will be created.

Forced Parameterization with Optimize for Ad hoc disabled:

In this case forced parameterization puts two plans for each query in the plan cache, the parameterized plan and the shell plan that points to the parameterized plan for a total of 188416 bytes.

Simple Parameterization with Optimize for Ad hoc enabled (1st run):

Only 672 bytes.  This is fantastic. If you never run the queries again then you’ve saved a lot of space in the plan cache.

Simple Parameterization with Optimize for Ad hoc enabled (2nd run):

106496 bytes.  Exactly the same as the first example.  So if you need to run the query again then you have to burn a little bit of CPU to compile the plan and now it is cached in case you need it a third time or more.

Forced Parameterization with Optimize for Ad hoc enabled (1st run):

139728 bytes.  This is expensive for a plan that may never be used again.  Especially compared to the 672 bytes from example 1.
Forced Parameterization with Optimize for Ad hoc enabled (2nd run):

188416 bytes.  This uses the most memory of all and if you never use these plans again you have just wasted this space in your plan cache.  The gamble, of course, is that you may have a lot of different shell plans but that these will all point to a core group of parameterized plans and ultimately save memory and compile time.

This example highlights the importance of understanding what it going on under the hood. If you have mostly single use plans and turn on Optimize for Ad hoc Workloads you could save yourself a lot of plan cache memory.  If you think turning on Forced Parameterization on top of that may help save plan cache you could be shooting yourself in the foot and not even realize it.

On the other hand if your database is characterized by mostly multi-use plans then having Forced Parameterization turned on may help you.  If you turn on optimize for Ad-Hoc workloads also then you are likely to burn more CPU because you have to recompile plans on the second time they are run.

The problem is that the query plan cache is very unlikely to be characterized by single use plans only or by multi use plans only.   This is why it is important to understand your data and how you access your data. Often we hear or read about a feature in SQL Server that helped another DBA and are too quick to enable that feature for ourselves without understanding what it really does. This is an example of why it is bad to do that.

UPDATE 6/27/2013:
Based on some of what I’ve been reading online and communication with fellow SQL Server guys I thought it important to come back here and state that this is an edge case situation. I think this may have been misconstrued as a recommendation to not have both Optimize for Ad hoc Workloads and Forced Parameterization on at the same time. That is not my recommendation. My recommendation is that if you do happen to have both on check to make sure you aren’t encountering the characteristics described in this blog post.