
Anatomy of a hack (SQL Injection woes)

When I first heard about SQL injection attacks and learned how to prevent them I was confused. I didn't understand how introducing a potentially destructive string into a batch could produce different results depending on how it was introduced. Include ';DROP TABLE ToDrop-- as a search condition in a WHERE clause and usually your query returns no data. Include it in the construction of a dynamic SQL string and you've lost not only your favorite spaghetti rum cookie recipe but everything else that was in the table that just got dropped.

Eventually I learned. The reason the string fragment is sometimes dangerous and other times benign is that the string gets introduced to the query processor at different times and, depending on how and when it is introduced, it is treated differently.

The compilation process follows a basic pattern of parsing the T-SQL batch, binding your statements to real objects, optimizing the statements to produce a good-enough execution plan and, finally, executing the query.  Understanding the entire compilation pipeline isn’t necessary to understand SQL injection attacks though. The important thing to know is that during the compilation stage the T-SQL gets parsed by the Query Processor. If someone can introduce an errant string into your T-SQL code text before it is parsed they can modify how your T-SQL code is interpreted.  The parsing of the batch is the danger zone of SQL injection attacks.

A batch can contain multiple independent SQL statements. SQL injection attacks take advantage of this by inserting another SQL statement into the batch as it is getting compiled so that it will be executed as if it were supposed to be there. Executing a batch that contains two simple SQL statements can look something like this:

SELECT * FROM Test WHERE Col1 = 1;SELECT * FROM Test WHERE Col2 = 'aa'
GO

This produces an execution plan that looks like this:

What this shows us is that two independent SQL statements were executed as part of a single batch. It also shows us that a carriage return isn't necessary to separate the two statements; in T-SQL the semi-colon functions as a statement separator.

Now, imagine giving an external user access to SSMS and the ability to write the commands directly into a query window that has access to your database.  When we use EXEC and dynamic SQL we risk that exact scenario. The EXEC command creates a batch that is independent of the batch that launches it.  This batch follows the same basic compilation process of every other batch with parsing being the first step.  This query, plus a peek at the plan cache, demonstrates that this is the case.

DECLARE @stmt varchar(100)
SET @stmt = 'SELECT * FROM dbo.Test WHERE Col1 = 1'
SELECT * FROM dbo.Test WHERE Col1 = 1
EXEC(@stmt)
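
That peek at the plan cache can be sketched like this, using the sys.dm_exec_cached_plans and sys.dm_exec_sql_text DMVs; both the outer batch and the batch created by EXEC show up as separate entries.

-- List cached batches that touch dbo.Test; the outer batch and the
-- batch EXEC created appear as separate rows. The NOT LIKE clause
-- keeps this query from matching its own cached text.
SELECT cp.cacheobjtype, cp.objtype, st.text
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE st.text LIKE '%dbo.Test%'
  AND st.text NOT LIKE '%dm_exec_cached_plans%'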

In the results there are two entries: one for the batch created by the T-SQL shown above and a second for the independent batch created by the EXEC command in that T-SQL.

Knowing that when parsing a batch SQL Server treats the semi-colon as a statement separator, and that the EXEC command results in a new batch being created, is the key to understanding how SQL injection attacks work. If we allow an outside individual to affect the string that gets fed to the EXEC command, that individual can inject syntax that modifies how the batch is parsed and, ultimately, what gets executed. They can do this if we create a dynamic SQL statement that directly incorporates a string passed to us from some web form or win32 application. This sample simulates that scenario.

DECLARE @inject varchar(100)
DECLARE @var char(1)
DECLARE @stmt varchar(100)
SET @var = '1'
SET @inject = ''''';DROP TABLE ToDrop--'
SET @stmt = 'SELECT * FROM dbo.Test WHERE Col2 = ' + @inject + ' AND Col1 = ' + @var
EXEC(@stmt)

In this sample I am declaring @inject and @var as variables but these could easily be input parameters of an encapsulating stored procedure.  When @inject and @var get substituted into @stmt the resulting string becomes:

SELECT * FROM dbo.Test WHERE Col2 = '';DROP TABLE ToDrop-- AND Col1 = 1

This creates a T-SQL batch that, when passed into EXEC, is parsed like every other batch submitted to SQL Server. The semi-colon causes what was intended to be a single command to be separated into two separate commands. The -- at the end of the injection causes the remainder of the command to be treated as a comment. This keeps the injected string from causing a syntax error, since there could easily be more T-SQL in the original string whose syntax is no longer compatible with the newly formed batch. I simulate this by adding another search condition. If this second statement is syntactically correct and binds to objects that really exist in the database, it will execute. The query plan produced by the sample above looks like this:

Notice -- AND Col1 = 1 at the end of Query 2.  If AND Col1 = 1 had been added to the end of DROP TABLE ToDrop there would have been a syntax error.

So how do we protect ourselves from SQL injection attacks?  If we must use dynamic SQL then using sp_executesql is the key. Rewriting the previous SQL statement using sp_executesql looks something like this:

DECLARE @inject nvarchar(100)
DECLARE @var int
DECLARE @stmt nvarchar(100)
DECLARE @paramdef nvarchar(500)
SET @var = 1
SET @inject = N''''';DROP TABLE ToDrop--'
SET @paramdef = N'@param1 nvarchar(100),@param2 int'
SET @stmt = N'SELECT * FROM dbo.Test WHERE Col2 = @param1 AND Col1 = @param2'
EXEC sp_executesql @stmt,@paramdef,@param1 = @inject,@param2 = @var

When the statement in @stmt is parsed during compilation, @param1 and @param2 are treated as placeholders for values that will be supplied later, and SQL Server produces a parameterized query.

Notice there is no semi-colon separating two statements in the batch.  There is no drop table statement. And notice the execution plan is safe.

When this query is finally executed the values in @inject and @var are substituted for @param1 and @param2. The previously dangerous string is now introduced as a simple search condition that returns no results, because there is no record where Col2 = '';DROP TABLE ToDrop--

In the sp_executesql example there is no direct path from @inject to the batch. It is never included during the parsing of the SELECT statement and thus cannot be injected into the batch. This doesn't mean that sp_executesql by itself is the panacea for all injection woes. It must be used correctly. It is possible to execute the unsafe statement from the first example with sp_executesql as well. To make it safe you must use parameters properly.
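
For instance, here is a sketch of that misuse: concatenating the untrusted string into @stmt before calling sp_executesql leaves the injection in the text that gets parsed, exactly as with EXEC.

-- Unsafe: @inject is concatenated into the statement text, so the
-- semi-colon and DROP TABLE are parsed as part of the batch.
DECLARE @inject nvarchar(100)
DECLARE @stmt nvarchar(200)
SET @inject = N''''';DROP TABLE ToDrop--'
SET @stmt = N'SELECT * FROM dbo.Test WHERE Col2 = ' + @inject
EXEC sp_executesql @stmt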

If you don't want to use parameterized queries but still want to be protected from SQL injection you can use QUOTENAME and REPLACE to sanitize your input. Examples of that are here.
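
As a rough illustration of the QUOTENAME approach (the table name here is made up for the example), wrapping an identifier that comes from user input keeps an injected string from breaking out of the delimiters:

-- QUOTENAME wraps the whole input in [brackets], escaping any ] inside,
-- so the injected text becomes one (nonexistent) identifier instead of
-- executable syntax. This fails with an invalid object name error
-- rather than dropping a table.
DECLARE @table sysname
DECLARE @stmt nvarchar(300)
SET @table = N'Test];DROP TABLE ToDrop--'
SET @stmt = N'SELECT * FROM dbo.' + QUOTENAME(@table)
EXEC(@stmt)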

Dude, where’s my plan (part 2).

In part 1 I looked at the general makeup of the plan cache and how SQL Server decides where in the plan cache to save a plan. Now I will look at how SQL Server finds a plan in the plan cache so it can be reused instead of precious resources being wasted on plan recompilation. When SQL Server needs to look up a plan it uses the same hashing algorithm it used to decide where to save the plan. This algorithm relies on the object id, the database id and the size of the cache in question. It produces a bucket number, and it is in this bucket of the specific cache store that SQL Server looks for a saved plan. This sounds vague, but part 1 has a lot more detail; I'm trying not to rehash that here.

For objects (like views, triggers, etc.) the object id used in the hash is simply the object id of the object in question. These object ids can be seen by querying sys.objects. Ad hoc queries, though, don't have an object id. For these, the batch text is hashed to produce an object id. Once you know the algorithm SQL Server uses to cache a plan, it makes sense why the plan for an ad hoc query cannot be reused by a logically similar but textually different query: the object id produced by each query will be different, and if that object id is different the bucket used to store the plan will be different.

For example, these two SELECT statements, though logically similar and producing practically the same plan, get stored in different buckets. The only difference between them is that one has an extra space between the equal sign and the number one.
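
Reconstructed from that description, the pair would look like this:

SELECT * FROM dbo.Test WHERE Col1 = 1
GO
-- identical except for an extra space before the 1; the batch text
-- hashes differently, so the plan lands in a different bucket
SELECT * FROM dbo.Test WHERE Col1 =  1
GO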

Once the proper bucket is discovered the plans stored in that bucket have to be examined to see if one matches the query being executed.  The question then becomes, what is SQL Server looking at to determine if a query being executed matches an execution plan that is saved in the plan cache?

When a query is being run there are attributes that can affect the execution of the query.  For example, a session with one specific SET option may return different results than a session with different SET options.  Collectively these attributes make up what is known as the cache key.  You can see which attributes comprise the cache key for a plan by querying sys.dm_exec_plan_attributes and looking at the is_cache_key column.

SELECT st.text, pa.*
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) pa
WHERE st.text = 'SELECT * FROM dbo.Test WHERE Col1 = 1'

If any of the attribute values for the query being executed don't match a plan already stored in the bucket, the cached plan will not be reused, even if the text of the query matches and the results of the query would be the same.

To show this I have two examples.

Example 1: I created a small test database and opened two sessions in SSMS.  In one of the sessions I set the DATEFORMAT to dmy. In the second session I left the DATEFORMAT as the default, which is mdy on my instance.
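
The session setup is just this, run in two separate query windows:

-- Session 1: override the date format for this session
SET DATEFORMAT dmy

-- Session 2: leave DATEFORMAT at the instance default (mdy here)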

Then I ran this query in each one:

SELECT * FROM dbo.Test WHERE Col1 = 1

Looking at the plan cache I see two cached plans with the same bucketid. I know they are different plans because they have different plan handles.
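
The check against the plan cache can be sketched like this, joining the cached plans to their text:

-- Two rows come back: same bucketid, different plan_handle values.
SELECT cp.bucketid, cp.plan_handle
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE st.text = 'SELECT * FROM dbo.Test WHERE Col1 = 1'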

To see which cache key values don’t match I compare the cache key attributes side by side.
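
One way to line them up is a sketch that restricts the earlier attribute query to cache-key attributes so mismatches stand out:

-- List only the attributes that participate in the cache key;
-- comparing the value column across the two plan handles reveals
-- which attribute differs (date_format in this test).
SELECT cp.plan_handle, pa.attribute, pa.value
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) pa
WHERE st.text = 'SELECT * FROM dbo.Test WHERE Col1 = 1'
  AND pa.is_cache_key = 1
ORDER BY pa.attribute, cp.plan_handle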

The values for the date_format attribute are not the same, so even though the plans hash to the same bucket, two separate plans are stored.

Example 2: In the above plan attribute comparison, notice that one of the attributes is user_id. For this attribute Books Online says:

Value of -2 indicates that the batch submitted does not depend on implicit name resolution and can be shared among different users. This is the preferred method. Any other value represents the user ID of the user submitting the query in the database.

If the object names in the query aren't qualified by a schema name, the search for the object starts in the user's default schema. If each user happened to have a table or view named Test in their own schema, that is the object that would be found, and the plans generated would actually be querying completely different tables.

For this second test I still use two sessions, but I am executing the query as a different user in each session. Each of these users has a different default schema. I am also removing the schema qualification from the query: SELECT * FROM Test WHERE Col1 = 1
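
One way to reproduce this, sketched with EXECUTE AS and made-up user and schema names (the schemas must exist, and permissions on dbo.Test are assumed):

-- Hypothetical users, each defaulting to a different schema
CREATE USER TestUser1 WITHOUT LOGIN WITH DEFAULT_SCHEMA = Schema1
CREATE USER TestUser2 WITHOUT LOGIN WITH DEFAULT_SCHEMA = Schema2
GO
-- Session 1
EXECUTE AS USER = 'TestUser1'
SELECT * FROM Test WHERE Col1 = 1
REVERT
GO
-- Session 2
EXECUTE AS USER = 'TestUser2'
SELECT * FROM Test WHERE Col1 = 1
REVERT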

A comparison of the plan attributes for the different plans produced shows that now the user_id attribute is different.

In this particular database “Test” resolves to the Test table in the dbo schema.  But that can’t be guaranteed without qualifying the table name with the schema that owns it. This is one of the reasons it is a good idea to qualify objects with the schema the object lives in.

I can envision a poorly written application that uses many ad hoc queries and has database users all defaulting to their own schema. A common query executed by each client application (such as querying some lookup table) would be cached in the same bucket for every client running it. Since their cache keys won't match, each instance of the application will add another plan to the same bucket, potentially producing a very long hash chain that must be traversed every time that query is run. If the lookup tables are used often this could result in reduced application performance.

Is this scenario likely? No, but something similar could happen with any of the attributes designated as cache keys.

DBCC CHECKPRIMARYFILE

I discovered this undocumented command one day when I was looking for a way to loop through a bunch (as in hundreds) of unattached SQL Server databases and collect information about each one to put into a table. This was the perfect command for that. It takes two parameters. The first is the path to a SQL Server data file. The second is an ordinal, 0 to 3.

Option 0 returns one column, IsMDF. IsMDF is 1 if the file is the primary data file for the database, 0 if not. Despite the column name, if you run it against a file with an .mdf extension that isn't the primary file it will still return 0. I assume the name comes from the suggested SQL Server naming convention that all data files but the primary use an .ndf extension.

Once you know that the data file is the primary data file, option 1 will return info about the other files and filegroups that make up the database. This information can be used to build dynamic SQL to attach the database. Being able to determine it dynamically is useful when the databases to be attached vary in the number, type and location of their secondary files. Running this against a data file that isn't the primary file will produce an error. Option 3 produces an abbreviated version of the option 1 output.

Option 2 returns information about the database itself, including the database name (which isn't always the same as the file name), the database version (no need to try to attach a database that is a higher version, or accidentally upgrade a database by attaching it to a higher-version instance), and the database collation.
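
Since the command is undocumented, the syntax below is pieced together from use rather than from Books Online, and the file path is a placeholder:

-- Option 0: is this file the primary data file?
DBCC CHECKPRIMARYFILE ('C:\Data\SomeDb.mdf', 0)
-- Option 1: files and filegroups belonging to the database
DBCC CHECKPRIMARYFILE ('C:\Data\SomeDb.mdf', 1)
-- Option 2: database name, version and collation
DBCC CHECKPRIMARYFILE ('C:\Data\SomeDb.mdf', 2)
-- Option 3: abbreviated version of option 1
DBCC CHECKPRIMARYFILE ('C:\Data\SomeDb.mdf', 3)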

The usual caveats apply when talking about undocumented commands. They generally shouldn't be used in production.

Dude, where’s my plan (part 1).

When a plan is cached it is stored in one of four cache stores. Query sys.dm_os_memory_cache_counters to see the cache stores in SQL Server. There are four cache stores that handle plans and plan-related objects: Object Plans, SQL Plans, Bound Trees, and Extended Stored Procedures.

SELECT *
FROM sys.dm_os_memory_cache_counters
WHERE type in
    ('CACHESTORE_OBJCP','CACHESTORE_SQLCP',
     'CACHESTORE_PHDR','CACHESTORE_XPROC')

Each of these cache stores holds different types of plans. In general I'm only going to look at the cache stores for Object Plans and SQL Plans. The Object Plans cache store holds the plans for stored procedures, triggers and functions. The SQL Plans cache store holds the plans for prepared queries, ad hoc queries and parameterized queries.

Each cache store contains a hash table. I think the wikipedia entry about hash tables is a pretty good intro for those who don't really know much about them (as I didn't before I started studying the plan cache). Each hash table contains multiple buckets and it is in these buckets that the plans are stored. Typically in a hash table a key identifies a single index value (though sometimes multiple keys can hash to the same index, producing what is known as a hash collision; more on this later). This index value is the bucket (or slot) that contains the item being stored. The number of buckets in each cache store can be seen by querying sys.dm_os_memory_cache_hash_tables.
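
For the plan-related stores that query looks like this:

SELECT type, buckets_count
FROM sys.dm_os_memory_cache_hash_tables
WHERE type IN
    ('CACHESTORE_OBJCP','CACHESTORE_SQLCP',
     'CACHESTORE_PHDR','CACHESTORE_XPROC')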

The question is, how does SQL Server use the plan cache and why do I care? The reason I care is that understanding how plans are hashed and stored can help me identify when I'm having problems with plans not being reused, or when I'm having performance problems because of long hash chains. Also, it is nifty to see how SQL Server works behind the scenes.

According to Kalen Delaney the hash key used is determined by this algorithm: (object_id * database_id) mod (hash table size).

What this should mean is that if I take the object id, multiply it by the database id and then mod it by the hash table size I should get the bucket that the plan is going to be stored in.  For my first test this worked great but I noticed that this didn’t work consistently for all my plans.  So after a little research I learned that the formula above isn’t exactly correct.  Well, I’d better say, it doesn’t exactly work like I thought it would.

I was still thinking in terms of unlimited storage space for a number. Unlike people who've been programming in C for years, I wasn't thinking that the result of a 32-bit unsigned integer multiplied by another 32-bit unsigned integer has to fit in the space allowed for a 32-bit unsigned integer. The chance of overflow is great. My first example appeared to work because it didn't overflow the space allowed, so I didn't see any problems with it. As I began to look at other examples the results weren't correct and I had to figure out why.

One of the ways of solving the overflow problem is to wrap back around when you run out of space.  A 32-bit unsigned integer can hold a range of values from 0 to 4294967295, or 4294967296 values total.  If you are at 4294967295 and add 1 to that the result is zero.  Since multiplication is simply adding the same number over and over again it should be relatively easy to figure out what the result would be after wrapping around again and again.

Rather than figure that out I looked for a shortcut. If the product of the two numbers I multiply together is greater than 4294967295, I can mod it by 4294967296 to get the value the multiplication would eventually have wrapped around to.
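
In T-SQL terms the shortcut looks like this; the ids are made-up values for illustration:

-- Simulate 32-bit unsigned multiplication: compute the product as a
-- bigint, then wrap it into the 0..4294967295 range with a modulo.
DECLARE @object_id bigint = 1093578934
DECLARE @database_id bigint = 6
SELECT (@object_id * @database_id) % 4294967296 AS wrapped_product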

So now the trick is to test the theory out on the compiled plans in the plan cache.

For objects like stored procedures and triggers this is relatively easy because the object id and the database id are presented by sys.dm_exec_query_plan. For ad hoc plans it is a bit harder. The object id is created by hashing the batch text and it isn't shown in sys.dm_exec_query_plan. It is, however, shown in sys.dm_exec_plan_attributes. The problem with sys.dm_exec_plan_attributes is how it presents each attribute: each attribute is a row, not a column. This means that to get each plan's attributes onto a single row I have to pivot them.

;WITH PlanAttributePivot (plan_handle, objectid, dbid)
AS
(
    SELECT plan_handle, pvt.objectid, pvt.dbid
    FROM
    (
        SELECT plan_handle, epa.attribute, epa.value
        FROM sys.dm_exec_cached_plans
        OUTER APPLY sys.dm_exec_plan_attributes(plan_handle) AS epa
        WHERE cacheobjtype = 'Compiled Plan'
    ) AS ecpa
    PIVOT (MAX(ecpa.value) FOR ecpa.attribute IN
        ("objectid", "dbid")) AS pvt
)
SELECT *
FROM PlanAttributePivot PA
INNER JOIN sys.dm_exec_cached_plans cp
    ON cp.plan_handle = PA.plan_handle

Now that I have the object ids and database ids I need to put them through the substitute hashing function I created: ((object_id * database_id) MOD 4294967296) MOD buckets_count. Since the plans I'm interested in are stored in different cache stores there can be a different bucket count for each store. In my query I have to separate the plans by the store each will be stored in and use the bucket count for that store. This query worked for all the plans in my plan cache with the exception of CLR compiled functions, which appear to be stored in a different store altogether.

DECLARE @obj_buckets_count int
DECLARE @sql_buckets_count int
DECLARE @boundtree_buckets_count int
DECLARE @xproc_buckets_count int
SELECT @obj_buckets_count = buckets_count
FROM sys.dm_os_memory_cache_hash_tables
WHERE type = 'CACHESTORE_OBJCP'
SELECT @sql_buckets_count = buckets_count
FROM sys.dm_os_memory_cache_hash_tables
WHERE type = 'CACHESTORE_SQLCP'
SELECT @boundtree_buckets_count = buckets_count
FROM sys.dm_os_memory_cache_hash_tables
WHERE type = 'CACHESTORE_PHDR'
SELECT @xproc_buckets_count = buckets_count
FROM sys.dm_os_memory_cache_hash_tables
WHERE type = 'CACHESTORE_XPROC'
;WITH PlanAttributePivot (plan_handle, objectid, dbid)
AS
(
    SELECT plan_handle, pvt.objectid, pvt.dbid
    FROM
    (
        SELECT plan_handle, epa.attribute, epa.value
        FROM sys.dm_exec_cached_plans
        OUTER APPLY sys.dm_exec_plan_attributes(plan_handle) AS epa
    ) AS ecpa
    PIVOT (MAX(ecpa.value) FOR ecpa.attribute IN
    ("objectid", "dbid")) AS pvt
),
CalculatedBuckets (plan_handle,bucketid)
AS
(
    SELECT
    PA.plan_handle
    ,(cast(PA.objectid as bigint) * cast(PA.dbid as bigint))%4294967296%@obj_buckets_count [bucketid]
    FROM PlanAttributePivot PA
    INNER JOIN sys.dm_exec_cached_plans cp
        ON cp.plan_handle = PA.plan_handle
    WHERE cp.objtype in ('Proc','Trigger') and cacheobjtype <> 'Extended Proc'
    UNION ALL
    SELECT
    PA.plan_handle
    ,(cast(PA.objectid as bigint) * cast(PA.dbid as bigint))%4294967296%@sql_buckets_count [bucketid]
    FROM PlanAttributePivot PA
    INNER JOIN sys.dm_exec_cached_plans cp
        ON cp.plan_handle = PA.plan_handle
    WHERE cp.objtype in ('Adhoc','Prepared')
    UNION ALL
    SELECT
    PA.plan_handle
    ,(cast(PA.objectid as bigint) * cast(PA.dbid as bigint))%4294967296%@boundtree_buckets_count [bucketid]
    FROM PlanAttributePivot PA
    INNER JOIN sys.dm_exec_cached_plans cp
        ON cp.plan_handle = PA.plan_handle
    WHERE cp.cacheobjtype in ('Parse Tree')
    UNION ALL
    SELECT
    PA.plan_handle
    ,(cast(PA.objectid as bigint) * cast(PA.dbid as bigint))%4294967296%@xproc_buckets_count [bucketid]
    FROM PlanAttributePivot PA
    INNER JOIN sys.dm_exec_cached_plans cp
        ON cp.plan_handle = PA.plan_handle
    WHERE cp.cacheobjtype in ('Extended Proc')
)
SELECT CB.bucketid,cp.bucketid,*
FROM CalculatedBuckets CB
INNER JOIN sys.dm_exec_cached_plans cp
    ON cp.plan_handle = CB.plan_handle
--WHERE cp.bucketid <> CB.bucketid

With this I can easily compare the bucket ids SQL Server produced with the pseudo-hash I created. Uncommenting the WHERE clause at the end shows where they don't match. I'm pleased with the results. The database I tested this on had over 14,000 plans in the cache stores and the only ones that didn't match were the CLR compiled functions. A little research showed that these probably live in a different store so I didn't worry about it.

One of the problems with a hash table is the chance of hash collisions. These happen when hashing different keys produces the same hash value, or bucket, which leads to multiple plans being stored in the same bucket. When SQL Server is looking for a plan to reuse it uses the same hashing algorithm to determine the bucket to look in. It then must search each plan in the bucket to see if it matches certain criteria (more on this in the next post). If the number of plans in the bucket is excessive SQL Server will have to search through a large number of plans to find one that matches (or to determine that none does). Looking at the columns in sys.dm_os_memory_cache_hash_tables shows some general information about how the buckets have been used. The column buckets_max_length tells you how long the longest hash chain is.
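
A quick look at those columns for the two plan stores:

-- buckets_in_use_count and buckets_avg_length round out the picture
-- of how evenly the plans are spread across the buckets.
SELECT type, buckets_count, buckets_in_use_count,
       buckets_avg_length, buckets_max_length
FROM sys.dm_os_memory_cache_hash_tables
WHERE type IN ('CACHESTORE_OBJCP','CACHESTORE_SQLCP')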

In my next post I plan on looking more at how plans are reused and talk about how SQL Server finds the right plan after it has found the right bucket.