Monthly Archives: May 2012

No More MDI in SSMS?

After installing SQL Server 2012 I was disappointed to discover that the MDI option has been removed, leaving me to use the tabbed interface. I hate the tabs. Technically the tabbed interface is MDI ’cause you can undock a tab and separate it from the application. Once you do that, though, you can’t minimize it (unless you minimize the entire application), and if you have a lot of windows undocked you have a window management nightmare. The undock option is useful but doesn’t replace what I’ve been forced to give up.

My main problem with the tabs is that they interrupt how I work. My screen is wide enough to show only 4 or 5 tabs at a time. The rest of the tabs have to be accessed via a pull down. When a tab is activated it becomes one of the shown tabs and a previously shown tab is relegated to the pull down menu. If you go find the tab and surface it again, it gets put back on the main header but in a different place than before. If I had become used to the order they were in, I now find myself clicking a bunch of tabs looking for the right one. And if I have to look in the pulldown for the right tab then I end up doing a scrambled search which reorganizes all of my tabs and compounds the problem when looking for future tabs (which are no longer where I left them).

In the 2008R2 version when a query window is minimized it is put onto the main desktop of SSMS (assuming you have the MDI option set). I could then move the minimized windows anywhere on the screen I wanted. I could group these icons by the project I was working on. When I minimized a query window it always got put back exactly where it was before. I could easily switch between projects without getting lost. If I had to do a mad scramble for the right window, maximizing and minimizing them wouldn’t completely rearrange the working order I had set, and since I had the icons grouped by project I usually had only two or three windows to search. I was very efficient with the windows working like this.

After I installed SQL Server 2012 I found out that option was gone. Back to the mad scramble I suppose. I’m not digging the new SSMS interface at all.

Parameters vs. Variables or One of These Things Is Not Like the Other

Take a look at @test in each of these code snippets.

@test in this procedure

CREATE PROCEDURE GetTestData
(@test char(1))
AS
SELECT * FROM ParamTest WHERE TestData = @test

Looks strikingly similar to @test in this code snippet

DECLARE @test char(1)
SET @test = 'A'
SELECT * FROM ParamTest WHERE TestData = @test

They look the same. They both use the same data types. The select statements use them the exact same way. So how can they be different?

SQL Server treats them differently because of how they are used. During optimization the SELECT statement using the parameter in the procedure will take advantage (disadvantage?) of a SQL Server feature known as parameter sniffing and can optimize the batch in ways that it cannot when using the variable.

To show this I need some test data so I will create a simple table:

CREATE TABLE ParamTest
(
    ParamTestID int identity(1,1)
    ,TestData char(1)
)

In this table I will load 10,000 rows – 9999 of them with TestData set to A and 1 with TestData set to Z.

DECLARE @i int
SET @i = 1
WHILE @i < 10000
BEGIN
    INSERT INTO ParamTest (TestData) SELECT 'A'
    SET @i = @i + 1
END

INSERT INTO ParamTest (TestData) SELECT 'Z'

I then will put a non-clustered index on TestData so the optimizer has multiple options for fulfilling a query which is filtered on TestData.

CREATE INDEX IX_ParamTest ON ParamTest(TestData)
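
A quick sanity check on the distribution before testing (optional, just to confirm the skew we loaded):

```sql
-- Confirm the data distribution: expect 9999 rows of 'A' and 1 row of 'Z'
SELECT TestData, COUNT(*) AS RowCnt
FROM ParamTest
GROUP BY TestData
ORDER BY TestData
```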

Depending on the WHERE clause in your search the optimizer will choose either a table scan or an index seek/bookmark lookup to fulfill the query.
First, let’s constrain our query by looking for TestData that is set to A.

SELECT * FROM ParamTest WHERE TestData = 'A'

We can see that with this criterion the optimizer chose a table scan.

And by looking at the properties on the select we see that the optimizer estimated 9999 rows would be returned.

Now let’s constrain our query by looking for TestData that is set to Z.

SELECT * FROM ParamTest WHERE TestData = 'Z'

In this case the optimizer chose an index seek/bookmark lookup to fulfill the query.

And the properties show an estimate that 1 row would be returned.

So depending on the data distribution we can get a different plan. In the first case a table scan was deemed to be the quicker way to fulfill the query and in the second case an index seek/bookmark lookup.
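
If you would rather see the estimates as a result set than dig through the graphical plan properties, SET STATISTICS PROFILE returns the plan operators, including an EstimateRows column, alongside each statement (a sketch, using the same two queries):

```sql
-- Return the plan as rows; check the EstimateRows column per operator
SET STATISTICS PROFILE ON

SELECT * FROM ParamTest WHERE TestData = 'A'  -- estimate ~9999 rows, table scan
SELECT * FROM ParamTest WHERE TestData = 'Z'  -- estimate ~1 row, seek + lookup

SET STATISTICS PROFILE OFF
```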

What happens if, instead of filtering by A or Z directly, we use a variable to filter?

DECLARE @test char(1)
SET @test = 'Z'
SELECT * FROM ParamTest WHERE TestData = @test

The execution plan with properties:

Notice that even though @test is set to Z, and previously Z gave us an index seek with a bookmark lookup, this plan uses a table scan. Also notice that the Estimated Number of Rows is now 5000 instead of 1. This is because the value the variable is set to isn’t used by the optimizer when optimizing the query. Without knowing specifically what to optimize for, the optimizer falls back on the total density of the column. Since there are only two values, A and Z, in TestData, the total density is 0.5. Half of 10,000 rows is 5000. If the TestData column had 3 distinct values the total density would be 0.3333333 and it would estimate about 3333.33 rows.
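
You can see the density the optimizer falls back on by inspecting the statistics behind the index we created (the “All density” value in the density vector):

```sql
-- Display the density vector for IX_ParamTest.
-- With two distinct values (A and Z) in TestData, 'All density' is 0.5.
DBCC SHOW_STATISTICS ('ParamTest', 'IX_ParamTest') WITH DENSITY_VECTOR
```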

This is fine if you are looking for the value A, because a table scan would be used anyway. If you are looking for the value Z, though, the table scan is inefficient. The optimizer has no way of knowing what you are looking for, so it must make some assumptions. The assumption is that the data is evenly distributed.

Things are different when inside of a stored procedure.

CREATE PROCEDURE GetTestData
(@test char(1))
AS
SELECT * FROM ParamTest WHERE TestData = @test

If I execute this stored procedure with the value of Z:
EXEC GetTestData 'Z'

*Note: stored procs like this can open you up to SQL Injection. Use caution.

Notice that even though the WHERE clause uses @test and @test is set to Z, unlike the previous example where @test was a variable, the optimizer now uses the value of @test in the optimization process. This is parameter sniffing: SQL Server can use the parameters of a stored procedure when optimizing statements that reference them.

This is both good and bad. The bad part is that once the plan for the stored procedure is compiled it will continue to be reused. As long as this execution plan is in the plan cache, executing GetTestData with a parameter of A will use an index seek and bookmark lookup instead of the more optimal method of fulfilling that query, a table scan.
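
If a procedure is stuck with a plan sniffed for the “wrong” value, you can mark it for recompilation so the next execution builds a fresh plan (sp_recompile is a built-in system procedure):

```sql
-- Discard the cached plan; the next execution compiles a new one
EXEC sp_recompile N'GetTestData'

-- A fresh plan is now compiled with 'A' as the sniffed value: a table scan
EXEC GetTestData 'A'
```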

This can be a problem if the data in your table is widely skewed so that an execution plan for one data point isn’t optimal for another and vice versa.

There are several ways around this.
1. You can use the RECOMPILE option on the stored procedure definition. This causes the query plan of the stored procedure to never be cached, which means every time it runs it is compiled from scratch. This has performance ramifications: compilation requires CPU, and if compiling the plan uses a lot of memory you can also get bottlenecks on the compile memory semaphore, which limits how often SQL Server can execute the procedure. This is not recommended.
2. You can add the RECOMPILE option to individual statements in the procedure. This means that instead of the entire procedure being compiled on each execution, only the statement with the RECOMPILE option gets compiled each time. This may not have the same level of performance ramifications that using RECOMPILE on the procedure definition does.
3. If you have skewed data and know that a query plan built for Z will perform well with Z and horribly for A, but a plan built for A will perform great with A and only so-so for Z, then it might be worth always optimizing for the value that performs the best overall. You may not get the best performance on every execution of the procedure, but you get the best overall performance. You can do this by putting the OPTIMIZE FOR option at the statement level.
4. With the OPTIMIZE FOR option you get another option called OPTIMIZE FOR UNKNOWN. This option doesn’t try to optimize for any specific value but creates a plan utilizing the overall density. If I modify the GetTestData procedure to use the OPTIMIZE FOR UNKNOWN option I get a plan that looks exactly like the plan created when using the variable in the ad hoc query. In fact, before the OPTIMIZE FOR UNKNOWN option was added to the product, the way to get this behavior from the optimizer was to assign the value of the parameter to a variable and use the variable in the SQL statement rather than the parameter. This was often used to circumvent parameter sniffing.
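
As a sketch, here is how options 2 through 4 look when applied to GetTestData (pick one; the three procedure names are made up for illustration):

```sql
-- Option 2: recompile just this statement on every execution
CREATE PROCEDURE GetTestData_Recompile (@test char(1))
AS
SELECT * FROM ParamTest WHERE TestData = @test
OPTION (RECOMPILE)
GO

-- Option 3: always build the plan as if searching for 'A'
CREATE PROCEDURE GetTestData_OptForA (@test char(1))
AS
SELECT * FROM ParamTest WHERE TestData = @test
OPTION (OPTIMIZE FOR (@test = 'A'))
GO

-- Option 4: ignore the sniffed value and use the overall density (0.5 here)
CREATE PROCEDURE GetTestData_Unknown (@test char(1))
AS
SELECT * FROM ParamTest WHERE TestData = @test
OPTION (OPTIMIZE FOR UNKNOWN)
GO
```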

Sometimes when I’m troubleshooting a stored procedure I will paste its contents into another window and run it ad hoc. I will DECLARE some variables at the start of the ad hoc batch in order to simulate the stored procedure’s parameters and because they are required by statements in the batch. This can lead to the execution plans of statements in the ad hoc batch being different from the execution plans of the same statements in the stored procedure. Depending on where the bottleneck is, this difference in optimization and the resultant query plans can cause some problems for the troubleshooting.

The moral of this story is that things aren’t always how they look. Just because a statement uses an @value doesn’t mean SQL Server will always treat it the same.

SQL Injection Info

A list of resources to look at to help understand SQL Injection.

I’ve not seen anything in blog land that describes SQL injection the way I have done so in my original post on the topic. There may be something out there but I don’t know of it.

SQL Injection falls under the rubric of security. Il-Sung Lee, a Program Manager on the SQL Server Engine team, gives a talk about developing secure applications. He discusses SQL injection and other interesting things.

Understanding SQL injection requires understanding how the query processor parses a batch. Conor Cunningham gave a talk at SQLbits X about how the query processor works. It’s not specifically about SQL injection, but he does talk about the parser and the query tree.

Paul White talks specifically about the parse tree and how you can see what the initial tree is. Great for inspecting how a query was parsed.