Contact  |  Site Map  |  Privacy      Call for more info
 
Latest articles

SQL Server and VMware: A Potentially Fatal Combination

In this experiment we used WMPlayer 2.0.2. Guest system and Host system: Vista HP. Very likely this problem does not affect the ESX server, but we are still waiting a confirmation from VMware that ESX server fully respects Microsoft SQL Server Core I/O Requirements
http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/sqliobasics.mspx

Virtual machines are becoming increasingly popular.  They are often used to create ‘safe sandboxes’ on a physical box.  Administrators are likely to put programs inside of these virtual machines and occasionally SQL servers.  Unfortunately, putting SQL Server inside of a virtual machine is potentially a scenario for disaster!  

Let’s think back for a moment to the state of databases 30 years ago. What was a critical property of all databases before they had stored procedures, partitioned views, XML and all of the other bells and whistles of today?  From the beginning, all databases (excluding “toy” databases like MySQL) included some type of write-ahead logging mechanism to ensure data integrity.  LDF log in MS SQL, Undo and redo logs in Oracle… they insured that when a COMMIT was executed, the data was guaranteed to be saved.

To implement it on Windows, SQL Server marks its IO operations with the flags WRITETHRU and NOBUFFERING.  When SQL Server reads, it utilizes its own cache and reads the data without the use of the OS cache.  When SQL Server writes, it asks the OS to wait until the operation is complete.  Notice that SQL Server has everything under control and does not use the OS cache at all.  This is a critical difference between SQL Server and other applications (Fig. 1).

    Fig. 1

From the perspective of the Host system, what happens when SQL server is running inside of the virtual machine?  The Host system does not know the contents of the ‘safe sandbox’.  It has no knowledge as to whether Windows, Linux, or another operating system is inside.  This renders it completely unaware of the OS flags inside of the virtual machine.  Hence, when the virtual machine “asks” the Host system to write data, it is served in the same manner as are ordinary, non-SQL Server applications.  The write is done asynchronously, using a cache (again, Fig. 1).

This difference can have multiple effects, both positive and negative.  On the positive side, Perfmon data reveals that SQL Server under VMWare often works even faster (Fig. 2)! 

Fig. 2

Given that Perfmon only updates charts once per second, the IO system must be artificially slowed down to observe the effect.  As evidenced in the chart, when writes were finished on the virtual machine (the point is demarcated by the vertical green line), there were still writes remaining on the physical box.

This leads to the negative effect.  Notice that when COMMIT is executed on the virtual machine, the data is not sent to the IO system.  The data is simply flashed to the real box.  Unfortunately, when SQL Server is run with VMWare, an interruption such as a power failure can cause you to lose transaction data reported as “committed”.  Much worse, in the event of such an interruption, the cache can change the order of writes, leading to a corrupted database!

We set up an experiment to illustrate the effect of a power failure on data integrity when SQL Server is set up with VMWare.  We created two instances of SQL Server 2005 on a physical and virtual PC.  We called them ZOOSTATION and VZOO, respectively.  Each server instance had VMTest database specially created for this test.     The virtual server was linked to the physical one using Linked servers.     Both VMTest databases contained the same data table: demotable (81832 rows each) with identical data.

To test virtual machine database integrity after a power failure, we did the following:  The same UPDATE statement was executed over the physical and virtual database in the same distributed transaction.  This update appends “2_” to the beginning of every row in the demotable table.

vm6_update.jpg

After COMMIT,

The external drive where data/logs were located was removed immediately to emulate an OS crash.           After the OS crash had been emulated, we restarted both SQL servers. On the physical box everything was fine and the data was updated as expected.

Disclaimer:

In many articles you will find the execution times of different queries. Please do not interpret these results scientifically or as official benchmarks. You will not even find the configuration of the computer used for these experiments because it is rather unimportant.

What is important is that query A is running 2 times slower then query B. On your computer, it could be different: 1.5 or even 3 times slower. Read the full disclaimer »

 
Close

Are you looking for help with SQL Server performance?

A SQL Server expert is currently available to help you.