Contact  |  Site Map  |  Privacy      Call for more info
 
Latest articles

Encrypted Columns and SQL Server Performance

For legal reasons, it is very important to encrypt table column with sensitive data, like SSNs. SQL server 2005 allows you to encrypt data using different algorithms using symmetric and asymmetric keys. Alternatively, you can also use password-based encryption (that password must be supplied by the client to encrypt/decrypt data).

But everything has a price, so we want to know how expensive it is. As usual, we prepare some test data, 100K ëcustomersí:

SET NOCOUNT ON
go
create table Customers_Data (
  name varchar(128) not null, 
  SSN varchar(10) not null)
go
declare @cnt int set @cnt=100000
while @cnt>0 
  begin
  set @cnt=@cnt-1
  insert into Customers_Data (name,SSN) 
    select 'Cust_'+convert(varchar,@cnt),
    '010'+convert(varchar,@cnt)
  end
go

Our artificially-generated fake SSNs are all unique, they have a format ë010xxxxxxí (but they are shorter in some cases).

So, at first, we perform our experiments without any encryption:

-- Not encrypted
create table Customers (
  id int identity primary key, 
  name varchar(128) not null, 
  SSN varchar(10) not null)
go

In our first experiment we copy raw Customers_Data into the target table and measure the elapsed time for that:

-- inserts
declare @t1 datetime, @cnt int
truncate table Customers
set @t1=getdate()
insert into Customers (name,SSN) 
  select * from Customers_Data
select datediff(ms,@t1,getdate())
go

In the second experiment, we find one record by given SSN, using a table scan (comparing all SSNs). This test is repeated multiple times to make it more precise.

-- table scan
declare @t1 datetime, @name varchar(128), @cnt int
set @t1=getdate()
set @cnt=30
while @cnt>0 begin
  set @cnt=@cnt-1
  select @name=max(name) from Customers 
    with(index(0)) ñ forced table scan
    where SSN='010'+convert(varchar,@cnt)
  end
select datediff(ms,@t1,getdate())/30.
go

To encrypt data, we must create encryption keys (this is an example):

create master key encryption by password = 'p@sswOrd'
go
create asymmetric key AsymKey With Algorithm = RSA_1024
go

Now if a database is stolen, encrypted data can not be recovered, because master key is not backed up with a user database.

When data is encrypted, the result is stored as varbinary(128) (and encrypted data can not be too long).

create table encr_Customers (
  id int identity primary key, 
  name varchar(128) not null, 
  SSN varbinary(128))
go

We want to make our encryption transparent to other SQL code and the front end. Our view Customers must behave exactly like an old table Customers (except for you can not use ëtruncate tableí).

create view Customers 
as
  select 
     id, name, 
     convert(varchar(10),
       DecryptByAsymKey(AsymKey_ID('AsymKey'), SSN)) as SSN
    from encr_Customers
go

So, when we read data, we decrypt column SSN on the fly. How can we encrypt data when it is inserted into this view? We can use an INSTEAD OF trigger:

create trigger IT1 on Customers
instead of insert
as
begin
  insert into encr_Customers (name,SSN) 
    select name,
       EncryptByAsymKey(AsymKey_ID('AsymKey'),SSN)
       from inserted
end
GO

We should create similar trigger for the updates, if encrypted column is involved in the update.

Our view works exactly like a table before (except you can not say truncate table Customers, you should use truncate table encr_Customers instead). We can not make a password-based encryption transparent: we would have to hardcode a password in a view, but it ruins the very idea of encryption)

Now we can compare different keys and different algorithms of encryption:

Note that asymmetric keys are much longer to encrypt, and especially to decrypt data.

But now the funny part. Say we want to find a Customer by SSN:

select * from Customers where SSN=@value

Usually, SSN is indexed and record is found immediately. But if you try it on our encrypted view, you would see that SQL server uses a table scan, even if an index is created on encr_Customers.SSN. Why? It appears that a statement above is equivalent to (simplified code):

select * from encr_Customers where decrypt(SSN)=@value

If we rewrite it as

select * from encr_Customers where SSN=encrypt(@value)

it would work much faster and it would use an indexÖ but it would not find our record! This happens because all encryption functions are non-deterministic, every time they generate another value: (values truncated)

select EncryptByAsymKey(AsymKey_ID('AsymKey'), 'This is my example')
select EncryptByAsymKey(AsymKey_ID('AsymKey'), 'This is my example')
select EncryptByAsymKey(AsymKey_ID('AsymKey'), 'This is my example')

(values truncated)

0x9FD1DDA8A6ACED67C0C24CDD566CAD655E7B7E3D76197896Ö
0xE18FFFB9EC75CD2093089A5DDB83220A244346F77AA548BFÖ
0x98D563BA855573A442A278B2565D9216192AD5BC7B664637Ö

For that reason, unfortunately, all indexes on encrypted data are absolutely useless.

Disclaimer:

In many articles you will find the execution times of different queries. Please do not interpret these results scientifically or as official benchmarks. You will not even find the configuration of the computer used for these experiments because it is rather unimportant.

What is important is that query A is running 2 times slower then query B. On your computer, it could be different: 1.5 or even 3 times slower. Read the full disclaimer

 
Close

Are you looking for help with SQL Server performance?

A SQL Server expert is currently available to help you.