Contact  |  Site Map  |  Privacy      Call for more info
 
Latest articles

Common Bad Practices in SQL Development that Lead to
SQL Server Errors and Performance Problems


There is one common mistake, which is repeated again and again, that it is definitely in the beginning of the ìtop tenî list of common causes of the performance problems.

We have a procedure that selects data from some table, say, Customers:

create table Customers (
  SSN char(9) primary key,
  ContractNumber int not null,
  Name varchar(64),
  Comment varchar(255))
GO
create index IX_ContractNumber on Customers (ContractNumber)
GO

This procedure selects data by the given SSN, or ContractNumber, or by Name:

create procedure SelectCustomers
  @SSN char(9) = null,
  @ContractNumber int = null,
  @Name varchar(64) = null
as
  if @SSN is not null 
    select * from Customers where SSN=@SSN
  else if @ContractNumber is not null
    select * from Customers where ContractNumber=@ContractNumber
  else 
    select * from Customers where Name=@Name 
GO

This procedure works fine, but many developers are uncomfortable about this procedure, because it is considered ëcopy/paste programmingí. We have the same thing (select * from Customers, in a real world this fragment is much longer) repeated 3 times (in the real world we can have 10 or even more conditions).

So a developer rewrites the code, making a brilliant trick, collapsing all 3 cases into one:

create procedure SelectCustomers
  @SSN char(9) = null,
  @ContractNumber int = null,
  @Name varchar(64) = null
as
  select * from Customers 
    where (SSN=@SSN or @SSN is null)
    and (ContractNumber=@ContractNumber or @ContractNumber is null)
    and (Name=@Name or @Name is null)
GO

It works much slower then expected. To understand why, lets examine the execution plan of the first stored procedure:

For each case, a different execution plan is generated: for the first one SQL server uses Clustered Indexed Seek (on SSN), for the second Index Seek (on ContractNumber), and for the third one, on Name, there is no index, so SQL server uses a Clustered Index Scan, it has to read all records to verity the WHERE condition.

In the second version of a procedure, the plan is quite simple:

SQL server always has to read and examine all records one by one. We have 1 case instead of 3, so the execution plan is prepared must work in all cases, so, SQL server can not use any indexes.

Note that it is not the problem caused by an absence of an index over column Name, Full Table Scan is used even it a much simpler case:

create procedure SelectCustomers2
  @SSN char(9) = null,
  @ContractNumber int = null
as
  select * from Customers 
    where (SSN=@SSN or @SSN is null)
    and (ContractNumber=@ContractNumber or @ContractNumber is null)
GO

In this case SQL server must use different indexes for different cases, and as a result, can not use any index at all.

However, this solution with ëORí is ok for the non-indexed columns. For example, if we have not only Name, but also FirstName, location, and Status, all not indexed, it is ok to write:

create procedure SelectCustomers
  @SSN char(9) = null,
  @ContractNumber int = null,
  @Name varchar(64) = null,
  @FirstName varchar(64) = null,
  @Location varchar(64) = null,
  @Status int = null
as
  if @SSN is not null 
    select * from Customers where SSN=@SSN
  else if @ContractNumber is not null
    select * from Customers where ContractNumber=@ContractNumber
  else 
    select * from Customers where 
      (Name=@Name or @Name is null) or
      (FirstName=@FirstName or @FirstName is null) or
      (Location=@Location or @Location is null) or
      (Status=@Status or @Status is null)
GO

In other words, we write explicitly cases for indexed columns, but merge together cases for all non-indexed ones. This is because for non-indexed columns, SQL server will need to do a Clustered Index Scan (on SSN) and satisfy the conditions for each non-indexed column.

Disclaimer:

In many articles you will find the execution times of different queries. Please do not interpret these results scientifically or as official benchmarks. You will not even find the configuration of the computer used for these experiments because it is rather unimportant.

What is important is that query A is running 2 times slower then query B. On your computer, it could be different: 1.5 or even 3 times slower. Read the full disclaimer