|
SQL Server and VMware: A Potentially
Fatal Combination Update 01/29/2009
Virtual machines are becoming increasingly
popular. They are often used to create
‘safe sandboxes’ on a physical box.
Administrators are likely to put programs inside of these virtual
machines and occasionally SQL servers.
Unfortunately, putting SQL Server inside of a virtual machine is potentially
a scenario for disaster! Let’s think back for a moment to the state of
databases 30 years ago. What was a critical property of all databases before
they had stored procedures, partitioned views, XML and all of the other bells
and whistles of today? From the
beginning, all databases (excluding “toy” databases like MySQL) included some
type of write-ahead logging mechanism to ensure data integrity. LDF log in MS SQL, Undo and redo logs in
Oracle… they insured that when a COMMIT was executed, the data was guaranteed
to be saved. To implement it on Windows, SQL Server marks
its IO operations with the flags WRITETHRU and NOBUFFERING. When SQL Server reads, it utilizes its own
cache and reads the data without the use of the OS cache. When SQL Server writes, it asks the OS to
wait until the operation is complete.
Notice that SQL Server has everything under control and does not use the
OS cache at all. This is a critical
difference between SQL Server and other applications (Fig. 1).
Fig. 1 From the perspective of the Host system, what
happens when SQL server is running inside of the virtual machine? The Host system does not know the contents of
the ‘safe sandbox’. It has no knowledge
as to whether Windows, Linux, or another operating system is inside. This renders it completely unaware of the OS
flags inside of the virtual machine.
Hence, when the virtual machine “asks” the Host system to write data, it
is served in the same manner as are ordinary, non-SQL Server applications. The write is done asynchronously, using a
cache (again, Fig. 1). This difference can have multiple effects, both
positive and negative. On the positive
side, Perfmon data reveals that SQL Server under VMWare often works even faster
(Fig. 2)!
Fig. 2 Given that Perfmon only updates
charts once per second, the IO system must be artificially slowed down to
observe the effect. As evidenced in the
chart, when writes were finished on the virtual machine (the point is
demarcated by the vertical green line), there were still writes remaining on
the physical box. This leads to the negative effect. Notice that when COMMIT is executed on the
virtual machine, the data is not sent to the IO system. The data is simply flashed to the real box. Unfortunately, when SQL Server is run with
VMWare, an interruption such as a power failure can cause you to lose
transaction data reported as “committed”.
Much worse, in the event of such an interruption, the cache can change
the order of writes, leading to a corrupted database! We set up an experiment to illustrate the
effect of a power failure on data integrity when SQL Server is set up with
VMWare. We created two instances of SQL
Server 2005 on a physical and virtual PC.
We called them ZOOSTATION and VZOO, respectively. Each server instance had VMTest database specially created for this test. The virtual server was linked to the physical
one using Linked servers. Both VMTest
databases contained the same data table: demotable
(81832 rows each) with identical data. To test virtual machine database integrity
after a power failure, we did the following:
The same UPDATE statement was executed over the physical and virtual
database in the same distributed transaction.
This update appends “2_” to the beginning of every row in the demotable
table.
After
COMMIT,
The
external drive where data/logs were located was removed immediately to emulate
an OS crash. After the OS crash had
been emulated, we restarted both SQL servers. On the physical box everything
was fine and the data was updated as expected. Unfortunately, on the virtual machine we saw a
number of errors.
We set state of the VMTest database to “Emergency”
to read the data inside.
In
conclusion, it is important to be aware that SQL Server does not work under
VMWare. While it seems to work, it leads
to to database instability. We repeted
the entire experiment numerous times and every time, the database on the
physical machine was unharmed, yet the database on the virtual machine became
corrupted. There are no warnings, not
red flags or alerts. At some point, you
may just be left with a corrupted database. |
|
Disclaimer:
In many articles you will find the execution times of different queries. Please do not interpret these results scientifically or as official benchmarks. You will not even find the configuration of the computer used for these experiments because it is rather unimportant. What is important is that query A is running 2 times slower then query B. On your computer, it could be different: 1.5 or even 3 times slower. Read the full disclaimer » | |
|
|