“Oh, if we deadlock we’ll retry a few times until it goes through.”:
Ive had a few variations of this conversation of the years and on the face of it, what does it really matter ? You get a deadlock, you retry, retry, retry until eventually it does complete ok. You get the right data eventually and job done.
The problem here is that although SQLServer does a fine job of detecting and killing out one process or other that is involved in a deadlock, this is NOT instantaneous. There is a background process called the Lock Monitor that scans the lock chains for deadlocked processes. This job fires every 5 seconds. http://msdn.microsoft.com/en-us/library/ms178104(v=sql.105).aspx . The point being that in a deadlock scenario, it can be upto 5 seconds before a process is rolledback. 5 seconds is an eternity on a busy OLTP system.
Lets dummy up some code to demonstrate how with scale the situation gets worse and worse.
Create table DeadLockTab
ID integer identity Primary key,
insert into DeadLockTab
Select top(10) newid()
Now using Adam Machanics SQLQueryStress, lets run some code that will deadlock. All I wish to do I simply update 3 random rows in the 10 row table.
set nocount on
Declare @c integer =0
Declare @id integer = cast(rand()*10 as integer)+1
set SomeData = newid()
where ID = @id
Not the prettiest of code but it will do the job and if we run 100 iterations on 1 thread…
So, how about 100 iteration on 3 threads ?
A massive increase in the Seconds/Iteration from 0.0005 to 0.0900 ( well over a hundred times slower) and only with 6 deadlocks (Total Exceptions) over the 300 iterations.
Lets go crazy, and simulate a webscale 7 concurrent sessions J
That is the cost of a deadlock right there, an average of .1 second per call.
There are many solutions to this, but I want to say: don’t be afraid to proactively block. If we drop concurrency to 1 by using “Select count(*) from DeadlockTab with (TABLOCKX,HOLDLOCK)”
Then the average execution speed AT SCALE (and that’s what really matters) is dramatically improved.
So consider the time taken to resolve a deadlock next time you are told “If that happens we’ll simply resubmit the query”. 5 seconds is an eon and during this time further locks and blocks will accumulate behind the deadlocked processes further exacerbating the situation.