Monthly Archives: October 2011

The danger of NOT IN (Or, you keep using that word. I do not think it means what you think it means.)

I hinted about this in my post on anti-joins but I thought it deserved a little more attention, especially since this SQL construct is one I’ve seen used many times and, in the past, I used it quite liberally. The danger is that when doing an anti-join with NOT IN there is a risk that SQL will give you unexpected results.

First it helps to understand what an anti-join is. Basically you are telling SQL server to give you all of the rows from one result set that don’t have a match in another result set. My previous post has more details.

However I’m not going to start with an anti-join example but with an example of IN and NOT IN because this makes the anti-join example easier to understand.

First I need to create some data to work with.

Name varchar(20),
AddressID int

CREATE TABLE dbo.Address
AddressID int,
Address varchar(20)

INSERT INTO dbo.Person (Name,AddressID) VALUES ('Wesley',1)
INSERT INTO dbo.Person (Name,AddressID) VALUES ('Buttercup',1)
INSERT INTO dbo.Person (Name,AddressID) VALUES ('Inigo',2)
INSERT INTO dbo.Person (Name,AddressID) VALUES ('Fezzik',2)
INSERT INTO dbo.Person (Name,AddressID) VALUES ('Rugen',3)
INSERT INTO dbo.Person (Name,AddressID) VALUES ('Humperdinck',3)
INSERT INTO dbo.Person (Name,AddressID) VALUES ('Vizzini',4)

INSERT INTO dbo.Address (AddressID,Address) VALUES (1,'Farm')
INSERT INTO dbo.Address (AddressID,Address) VALUES (2,'Theives'' Forest ')
INSERT INTO dbo.Address (AddressID,Address) VALUES (3,'Castle')
INSERT INTO dbo.Address (AddressID,Address) VALUES (null,'Dungeon')

The two tables look like this:


The Person table has 7 rows.  AddressID contains the values 1,2,3,4.
The Address table has 4 rows. AddressID contains the values 1,2,3, and for one of the rows this value is null.
Continue reading The danger of NOT IN (Or, you keep using that word. I do not think it means what you think it means.)

Parsing the SET options in sys.dm_exec_plan_attributes

When doing my previous post about plan caching and plan reuse I spent some time looking for how to decipher the set_options attribute of sys.dm_exec_plan_attributes. I couldn’t find anything. Maybe I didn’t look hard enough.

Thankfully, Benjamin Navarez recently authored a post on the effects of SET options on plan reuse and published the very information I had been looking for.

declare @set_options int = 251
if ((1 & @set_options) = 1) print 'ANSI_PADDING'
if ((4 & @set_options) = 4) print 'FORCEPLAN'
if ((8 & @set_options) = 8) print 'CONCAT_NULL_YIELDS_NULL'
if ((16 & @set_options) = 16) print 'ANSI_WARNINGS'
if ((32 & @set_options) = 32) print 'ANSI_NULLS'
if ((64 & @set_options) = 64) print 'QUOTED_IDENTIFIER'
if ((128 & @set_options) = 128) print 'ANSI_NULL_DFLT_ON'
if ((256 & @set_options) = 256) print 'ANSI_NULL_DFLT_OFF'
if ((512 & @set_options) = 512) print 'NoBrowseTable'
if ((4096 & @set_options) = 4096) print 'ARITH_ABORT'
if ((8192 & @set_options) = 8192) print 'NUMERIC_ROUNDABORT'
if ((16384 & @set_options) = 16384) print 'DATEFIRST'
if ((32768 & @set_options) = 32768) print 'DATEFORMAT'
if ((65536 & @set_options) = 65536) print 'LanguageID'

I suggest you go read the whole thing.

Thank you, Benjamin.

Semi-joins and Anti-joins

I orignally started this post for T-SQL Tuesday but wasn’t able to complete it in time. Being late I can see what other wrote about and it appears that semi-joins and anti-joins were a popular topic.  A while back I did a post on joins but didn’t talk about these. Since I’m teaching a class about joins I am finishing it up anyway so I can refer people to it.

A couple of join types that are commonly used but rarely explicitly labeled are semi-joins and anti-joins.  In a regular join you can get multiple results back for a single row because a row in Table A may match more than one row in Table B.  However with a semi-join you are only getting one row back from Table A regardless of how many matches there are for that row in Table B.

Just like there are right and left outer joins there are right and left semi-joins.
I have two tables, People and Employers.  People has 5 records.

Employer has 10000+ records.  The top 5:

When I originally wrote this post my Employer table only had 5 records as well but I couldn’t produce all the examples I wanted with that so I add 10,000 records or so to the table.  I did this by cross joining AdventureWorks table Purchasing.Vendor with itself.
Continue reading Semi-joins and Anti-joins

The Value of Expertise (or Are You Sure About That?)

One of the things I love to do is sail.  The first time I was on a sailboat, other than a Sunfish, was a 30ft yacht pounding to weather into 25 knot winds and heeled about 20 degrees.  It hooked me. Having dreamed about owning a boat since then I heard many cautions from my friends and family.  “You know what boat stands for…Break Out Another Thousand.”  Or, “A boat is a hole in the water you throw money into.”  Not one to be deterred by sanity I decided it is time to fulfill that dream and have started the hunt for the right sailboat for me.

I’ve been reading about boats for 10 years now and thought I would be able to make a wise decision. What I discovered is that despite all those years of study I still don’t know enough.  I’ve heard most of the conventional wisdoms.  Full keel boats are safer.  Ketches are slow.  Flat bottom boats will pound you to death. Sailboats with long overhangs are tender.  There is probably a kernel of truth to most of these statements but I know from studying that they aren’t always true.  The problem is that I don’t know when they aren’t always true.

The realization was that I can speak the words of boats but I don’t know the language.  Tillers, shrouds, booms, boom vang, wind vane, gooseneck, bow pulpit, spars, topsides, shear, stanchions, cutter rigs, ketch rigs, Bermuda rigs, yawls, weather helm, sheets, head up, head down, soft chine, hard chine, full keel, skeg, prop, spade rudder, tack, gybe, lift, slip, windward, leeward…it goes on and on.  I understand the parts but I don’t understand the whole.  I understand that if you put a long fin keel on a sail boat it will generally point better than a boat full keel. But I don’t know what compromises are made when you do that.  What do I lose? Safety? Stability?  I have an idea but I don’t know for certain.

Because of this I have decided to seek the advice of an expert.  I contacted a renowned marine architect of small pleasure sailboats to help me navigate this maze.  He has literally (and I don’t literally mean figuratively, I literally mean literally) designed hundreds of boats. He’s had to decide what compromises to make.  When designing a boat he has to know the potential downsides of every design decision and when other parts of the design can exacerbate those downsides or counter them. When buying a used boat you have to make decisions about what features you want.  Deciding on one feature necessarily rules out others.  He can advise me of what I’m giving up with every decision I make and every feature I choose. That’s the kind of guy I need in my corner. I would hate to make a decision based on ignorance and it cost me tens of thousands of dollars or years of loss of enjoyment because I bought the wrong boat. By working with him I don’t guarantee that I’ll make the right decision but I increase my odds significantly.

I can’t help but think about how this translates to the technology field, specifically, SQL Server.  There are thousands of people out there whose knowledge of SQL Server is like my knowledge of sailing.  They know the words but they don’t know the language.   Because of the nature of their jobs they can’t spend hours every day studying and working with SQL Server. They can’t have developed hundreds of databases and have had to make the compromises that one must make. They are good developers and good project managers. They can talk very intelligently about SQL Server and can make good decisions if they have the information.  But they don’t. Just like my purchase of a boat they run the risk of making a decision born from ignorance  that can cost them tens of thousands (and in some cases, I’m sure, hundreds of thousand) of dollars.

This isn’t some sort of shill to promote myself.  I’m not a consultant. This message has personal impact too. Every day I work with SQL Server I run into something I don’t know. Some things can be solved with a Google search.  Others require me to seek the advice of those who know more than I do.  It is a humbling thing to admit that I don’t know something but the downsides of making a bad decision based on ignorance can be great.  I’ve had to learn how to swallow my ego. There is a long game involved too.  It’s much easier to swallow my ego now than it is when a decision I made seriously impacts delivery or production or costs my company lots of money. I know your company is different. They don’t care about wasted money.

When I learned how much it cost to hire the marine architect to help me I initially balked.  No one wants to spend money they don’t have to.  On the other hand the potential costs of not hiring him are huge.  In the grand scheme it isn’t that much money and in a month or two I won’t even miss it.  I’ve spent a lot more money on a lot stupider things.

Companies should make these same considerations.  Hiring a SQL Server expert to consult on a design doesn’t have to cost a fortune. The cost of some expert guidance can be tiny compared to the total development cost of a single project.  If you have good people who know the words, the expert can help them with the language. The consultant doesn’t have to do the work. The consult is there because he knows stuff you don’t know, and given your time constraints likely can’t know. A few days consult here and there can help guide the project and avoid the pitfalls that could cost tens of thousands of dollars in the long run.