Infrastructure at your Service

David Barbarin

SQL Server 2014: classic commit vs commit with delayed durability & I/Os

When you learn about SQL Server, you will often hear that a commit is a synchronous operation and that you can trust it. In this blog post, I will provide some details about what we mean by synchronous behavior, because when I talk about the new delayed durability feature provided by SQL Server 2014, there is sometimes some confusion. If you want more details on this new feature, please read the blog post of my colleague Stéphane Haby here. The shortcut often taken is the following: writing to the transaction log is synchronous, while writing with the new delayed durability feature is asynchronous.

First of all, you probably know that the buffer manager guarantees that the transaction log is written before the changes to the database are written. This is the famous protocol called write-ahead logging (aka WAL). Log records are not written directly to disk, but first into the buffer cache and then flushed to disk in a purely asynchronous manner. However, at commit time, the related thread must wait for the writes to complete up to the point of the commit log record in the transaction log. This is the synchronous part of the commit operation, required to meet the WAL protocol.
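To make the WAL ordering concrete, here is a minimal toy model (in Python, since the idea is language-agnostic); the class and method names are purely illustrative and have nothing to do with SQL Server internals:

```python
# Toy sketch of the Write-Ahead Logging (WAL) protocol: the log record is
# buffered BEFORE the data page is changed, and a classic commit must wait
# until the log is hardened up to the commit point before returning.

class ToyWal:
    def __init__(self):
        self.log_buffer = []   # log records still in memory
        self.log_on_disk = []  # flushed (durable) log records
        self.data_pages = {}   # the "database" itself

    def modify(self, key, value):
        # 1. The log record goes to the buffer cache first...
        self.log_buffer.append((key, value))
        # 2. ...and only then is the page changed in memory.
        self.data_pages[key] = value

    def commit(self):
        # Synchronous part of a classic commit: the calling thread waits
        # until every log record up to the commit point is hardened.
        self.log_on_disk.extend(self.log_buffer)
        self.log_buffer.clear()

db = ToyWal()
db.modify("a", 1)
assert db.log_buffer and not db.log_on_disk  # logged, but not yet durable
db.commit()
assert not db.log_buffer and db.log_on_disk  # commit waited for the flush
```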

On the other hand, the new delayed durability feature allows the commit operation to be asynchronous (like writing to the transaction log), but the big difference is that the related thread doesn’t have to wait until the commit log record is written to the transaction log. This new feature introduces some performance improvements, but as a caveat, there is a risk of data loss.
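The contrast between the two commit paths can be sketched with another hedged toy model (again, names are illustrative, not SQL Server internals): a classic commit flushes the pending log records before returning, while a delayed durable commit just queues its record and returns immediately, so a crash before the next flush loses the transaction.

```python
# Toy contrast between a fully durable commit and a delayed durable commit.

class ToyLogManager:
    def __init__(self):
        self.pending = []   # log records waiting in memory
        self.durable = []   # log records hardened to "disk"

    def commit_durable(self, record):
        # Classic commit: flush before returning control to the session.
        self.pending.append(record)
        self.flush()

    def commit_delayed(self, record):
        # Delayed durability: queue the record and return at once.
        self.pending.append(record)

    def flush(self):
        # Background flush of the accumulated log records.
        self.durable.extend(self.pending)
        self.pending.clear()

log = ToyLogManager()
log.commit_durable("tran_1")
assert "tran_1" in log.durable       # guaranteed on disk at commit time
log.commit_delayed("tran_2")
assert "tran_2" not in log.durable   # a crash here would lose tran_2
log.flush()                          # the deferred flush catches up
assert "tran_2" in log.durable
```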

We can prove that both commit operations write asynchronously by using either the Process Monitor tool or a debugger to catch the part of the code responsible for writing to the transaction log file.

I will use the following T-SQL script for this demonstration:

--> Commit transaction (without delayed durability option)

USE AdventureWorks2012;
GO
 
-- Ensure DELAYED_DURABILITY is OFF for this test
ALTER DATABASE adventureworks2012 SET DELAYED_DURABILITY = DISABLED;
GO
 
-- Create table t_tran_delayed_durability
IF OBJECT_ID(N't_tran_delayed_durability', 'U') IS NOT NULL
       DROP TABLE t_tran_delayed_durability;
GO
 
 
create table t_tran_delayed_durability
(
       id int identity
);
GO
 
-- Insert 1000 small transactions
declare @i int = 1000;
 
while @i > 0
begin
       insert t_tran_delayed_durability default values;
 
       set @i = @i - 1;
end;

 

--> Commit transaction (with delayed durability enabled)

-- Ensure DELAYED_DURABILITY is ON for this test
ALTER DATABASE adventureworks2012 SET DELAYED_DURABILITY = ALLOWED;
GO
 
-- Create table t_tran_delayed_durability
IF OBJECT_ID(N't_tran_delayed_durability', 'U') IS NOT NULL
       DROP TABLE t_tran_delayed_durability;
GO
 
create table t_tran_delayed_durability
(
       id int identity
);
GO
 
-- Insert 1000 small transactions
declare @i int = 1000;
 
while @i > 0
begin
       begin tran tran_1;
       insert t_tran_delayed_durability default values;
       commit tran tran_1 with (DELAYED_DURABILITY = ON);
 
       set @i = @i - 1;
end;

 

Below, you will find the Process Monitor trace output that shows the SQL Server file system activity against the transaction log file in both cases.

--> Commit transaction (without delayed durability option)

blog_17_1_procmon_normal_transaction

You will notice that SQL Server uses the WriteFile() function to write to the transaction log for each commit operation (4096 bytes each). I will only show you a sample of the output, but you can imagine the final number of records you can get here. If we take a look at the process monitor stack, you will notice that the WriteFile() Windows function, exposed by the Kernel32.lib library, is called with an OVERLAPPED structure (in other words, asynchronous I/O).

blog_17_3_procmon_stack

This test confirms what Bob Dorr explains in the Microsoft article about SQL Server I/Os and transaction log I/O.

--> Commit transaction (with delayed durability enabled)

blog_17_1_procmon_delayed_transaction

In this case, the same function is used by SQL Server, with one big difference: SQL Server groups some of the I/Os into chunks (in my case 16 KB, 48 KB, and 60 KB) before writing to disk. Clearly, there is less activity here (in my case 18 lines against approximately 1,000 lines in the first test).
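The grouping effect itself is easy to model. Packing many small log records into chunks capped at the largest write observed in my trace (60 KB, an observation rather than a documented constant) collapses 1,000 small commits into a handful of writes. A quick Python illustration:

```python
# Illustrative model of why delayed durability produces far fewer
# WriteFile() calls: log records accumulate and are flushed in larger
# chunks instead of one small write per commit. CHUNK_LIMIT mirrors the
# 60 KB writes seen in the trace above; it is not an API constant.

CHUNK_LIMIT = 60 * 1024  # bytes, as observed in the Process Monitor trace

def group_log_writes(record_sizes, chunk_limit=CHUNK_LIMIT):
    """Pack log record sizes into chunks no larger than chunk_limit."""
    chunks, current = [], 0
    for size in record_sizes:
        if current and current + size > chunk_limit:
            chunks.append(current)  # flush the full chunk
            current = 0
        current += size
    if current:
        chunks.append(current)      # flush the final partial chunk
    return chunks

# 1000 small commits of ~512 bytes each collapse into a handful of writes:
writes = group_log_writes([512] * 1000)
print(len(writes))  # -> 9 chunks instead of 1000 separate writes
```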

We can also attach a debugger (for instance WinDbg) to the SQL Server process and set a breakpoint on the kernel32!WriteFile() function for the calling thread in order to get more details about the execution stack. Note that the process monitor stack showed the module KERNELBASE.dll for the WriteFile() function, but as mentioned in this Microsoft article, kernelbase.dll gets its functionality from kernel32.dll and advapi32.dll.

blog_17_1_windbg_stack_writefile

Both commit operations show the same stack, except of course for the number of executions.

To summarize, I wanted to show you that both commit operations (with and without delayed durability) use asynchronous I/O to write to the transaction log file. The big difference is that with the delayed durability option, SQL Server improves log write performance by deferring and grouping the I/O into chunks of up to 60 KB before writing to disk. I hope this helps you understand more about SQL Server commit operations.

 

4 Comments

  • Olivier Berthommé says:

    Hi David,

    Thanks for this demo with the write functions. As far as I know, SQL Server always accesses files (data & log files) with the FILE_FLAG_WRITETHROUGH flag; does DELAYED_DURABILITY change this behavior?
    And a more open question: why do data files need such a flag? With the WAL protocol, even after a cache loss (power issue), data can be recovered from the transaction logs.

    Thanks,
    Olivier

     
  • Hi Olivier,

    DELAYED_DURABILITY doesn’t change the SQL Server behavior of opening the data and log files with the FILE_FLAG_WRITETHROUGH option. In fact, this feature only changes the way SQL Server flushes the log blocks to the transaction log file (grouped and deferred).

    SQL Server will always use FILE_FLAG_WRITETHROUGH to meet FUA (Force Unit Access) requirements. As you know, FILE_FLAG_WRITETHROUGH tells the underlying OS not to use write caching that isn’t considered stable media. So, the local disk cache is “normally” bypassed, but this is not always the case: some storage systems simply ignore this command. FILE_FLAG_NO_BUFFERING is used by SQL Server as well to avoid the system caches, which can be synchronous by nature (remember that SQL Server uses asynchronous I/O in most cases).

    As you also know, WAL is how ACID durability is achieved. Basically, the log record must be flushed to stable media before the data file is modified. A controller with a battery-backed cache is also considered stable, and you can easily imagine what happens when a battery failure occurs … The idea of using FILE_FLAG_WRITETHROUGH is to bypass the intermediate caches to ensure that write operations are stable, but as I said earlier, this command is ignored in some cases.
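As an illustrative aside (Python on POSIX, not SQL Server code): the closest POSIX analogue of FILE_FLAG_WRITETHROUGH is opening a file with O_SYNC, which asks the OS to push each write through to stable media before the call returns:

```python
import os
import tempfile

# O_SYNC plays the role of FILE_FLAG_WRITETHROUGH here: each os.write()
# should return only once the data has reached stable media (modulo
# storage hardware that ignores the request, as discussed above).
path = os.path.join(tempfile.mkdtemp(), "toy.log")
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
os.write(fd, b"commit record")
os.close(fd)

with open(path, "rb") as f:
    print(f.read())  # b'commit record'
```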

    Hope this answer helps you! ;-)

     
  • Olivier D says:

    Hello,
    We had to change the durability option for one of our servers because the wait on the log was too high. Now we have a question about AGs.
    What about a synchronous AG vs DELAYED_DURABILITY = ON?
    I read many articles, but none mention the combination of both. If I understand correctly, a synchronous AG will wait for the log to be hardened on the secondary to be sure everything is OK and the system is healthy…
    But with DELAYED_DURABILITY, we tell SQL Server that we don’t want to wait for the log and that we can continue. So what happens if we apply DELAYED_DURABILITY to a synchronous-commit database?
    Does the primary wait for the log to be hardened on the secondary and so bypass the option, or is it the log flush that is passed on to the secondary?
    Or does the system automatically change to asynchronous mode?
    Thanks

     
    • David Barbarin says:

      Hi Olivier,

      Sorry for the delay. I missed the comment :-(

      The behavior is well explained in the Microsoft documentation (cf. section Always On Availability Groups and Mirroring)

      Delayed durable transactions do not guarantee any durability on either the primary or any of the secondaries … In addition, they do not guarantee any knowledge about the transaction at the secondary … After commit, control is returned to the client before any acknowledgement is received from any synchronous secondary

      To cut a long story short, you get the same guarantee as with an asynchronous replica.

      Hope it helps (even if I’m late)

       


David Barbarin

Senior Consultant & Microsoft Technology Leader