What Price A TimeOut ?

If you have been interested in query processing and looking at execution plans, you may may well have noticed that a query states a “Reason For Early Termination” as a property.

clip_image001

There are 4 return values here, <BLANK>, “Good Enough Plan Found”, “MemoryLimitExceeded” and”Timeout”.

Let’s ignore “MemoryLimitExceeded”, I’ve never experienced it in the wild but the intent behind this is (relatively) clear.  “Good Enough Plan Found” states that the optimizer has found a decent plan to execute, remember that the optimizers’ job is to find a “Good enough plan in a short enough time”, not a perfect plan.  It’s not a good thing to happen if the optimizer takes 60seconds to return that perfect plan, if an OK plan can be found and executed is 5 seconds.  <BLANK> is tricky one to summarise succinctly, but this states that optimization has come to a natural end, the optimizer completed a parsing stage but did not meet the entry requirements of the next. That leaves “TimeOut” and in this blog I wish to focus on that.

Conventional wisdom dictates that TimeOut states that the optimizer has taken too long whilst optimizing a query and the best of the plans so far found is returned.  This would imply that if the optimizer could work harder a ‘better’ plan would be returned.  First off, let’s find a query that does “Timeout”.

 

 
declare @SalesPersonID integer
declare @FullName nvarchar(255)
declare @JobTitle nvarchar(30)
declare @SalesTerritory nvarchar(30)
declare @Sales2002 money
declare @Sales2003 money
declare @Sales2004 money
select  @SalesPersonID = SalesPersonID,
        @FullName = FullName,
        @JobTitle = JobTitle,
        @SalesTerritory = SalesTerritory,
        @Sales2002 = [2002],
        @Sales2003 = [2003],
        @Sales2004 = [2004]
   From Sales.vSalesPersonSalesByFiscalYears
  where SalesTerritory ='Canada'

If you execute this against AdventureWorks2012, you will see that the reason for early termination is “Timeout”

clip_image002

Plan explorer has highlighted the issue, and also neatly add a yellow warning triangle to the plan itself.  So why did this query timeout?  Well the number of potential combinations was massive and the optimizer only tried so many before returning, how many? Interesting question, let’s use the undocumented flag 8675 to find out the stages entered and the number of tasks (not potential plans as such) considered by the optimizer.

 
declare @SalesPersonID integer
declare @FullName nvarchar(255)
declare @JobTitle nvarchar(30)
declare @SalesTerritory nvarchar(30)
declare @Sales2002 money
declare @Sales2003 money
declare @Sales2004 money
select  @SalesPersonID = SalesPersonID,
        @FullName = FullName,
        @JobTitle = JobTitle,
        @SalesTerritory = SalesTerritory,
        @Sales2002 = [2002],
        @Sales2003 = [2003],
        @Sales2004 = [2004]
   From Sales.vSalesPersonSalesByFiscalYears
  where SalesTerritory ='Canada'
option(QueryTraceOn 8675,QueryTraceOn 3604)

That gives us the output

clip_image003

To prevent the optimizer carrying on for longer, search(1) has been aborted at task 1100 and a plan has been returned.

So this is potentially a bad plan, right?  And if we made the optimizer work harder a better plan would be found.  This can actually be achieved by another undocumented flag 8780, and that disables the optimizer timeout(s).  Let’s see that in action:

 

 
declare @SalesPersonID integer
declare @FullName nvarchar(255) 
declare @JobTitle nvarchar(30) 
declare @SalesTerritory nvarchar(30) 
declare @Sales2002 money 
declare @Sales2003 money 
declare @Sales2004 money 
select  @SalesPersonID = SalesPersonID, 
        @FullName = FullName, 
        @JobTitle = JobTitle, 
        @SalesTerritory = SalesTerritory, 
        @Sales2002 = [2002], 
        @Sales2003 = [2003], 
        @Sales2004 = [2004] 
   From Sales.vSalesPersonSalesByFiscalYears 
  where SalesTerritory ='Canada'
option (querytraceon 8780,querytraceon 8675,querytraceon 3604)

clip_image004

As you can see no more “TimeOut” and the returned plan, which is now an entirely different plan to before, has a reason for early termination of “Good enough Plan Found”, but we did take a total time of 0.132 secs as opposed to 0.053 secs of the “TimeOut” version.  Is that sacrifice worthwhile? How does our “Good Enough” query runtime duration compare to the Timeout query duration?

clip_image005

As it turns out not to well, the “Good Enough” plan is slower by quite a significant margin and rejecting the notion that “TimeOut” = Bad.

But, this is just my laptop (i3 with SSD if you must know) hardly what you would call enterprise standard hardware. Maybe, the extra cost of cpu is offset by the saving in IO? Not a bad assumption. Lets throw some ‘big iron’ at the problem, to the Cloud.

A quick hattip must go to Jamie Thomson and his AdventureWorks on Azure community project, if you wish to play on Azure this is the best way to do it.  http://sqlblog.com/blogs/jamie_thomson/archive/2012/03/27/adventureworks2012-now-available-to-all-on-sql-azure.aspx

 

SQLAzure does not support the use of traceflags, but you can use the USE PLAN hint and provide a plan. This I have done and also created stored procedures for the queries. What are the timings now?

With timeout

clip_image006

and “Good enough”

clip_image007

These timings are after several executions so no compile times are included here.

That’s another assumption put to bed, throwing hardware at the problem hasn’t helped (as usual J ).

Returning back to on premise (or at least my laptop), why is the “Good Enough” query slower, specifically why is the CPU usage so high? There is nothing in particular in the plan that would point to a high cpu usage.  One tool that is gaining more exposure in the SQLServer world is XPERF and that is an excellent tool to reveal what a process has spent its time doing.  Examining the traces whilst running the “Good Enough” plan shows something quite interesting:

clip_image008

The majority of the processing time has been in Dateadd and DatePart, this is coming from the line in the view “YEAR(DATEADD(m, 6, soh.[OrderDate])) AS [FiscalYear] “.

The plan itself exposes this activity in a rather innocuous “compute scalar” operator that is executed 31,465 times.

clip_image009

A quick reworking of the schema and view will enable us to entirely remove this calculation and therefore this cpu cost.

 
alter table Sales.SalesOrderHeader add FiscalYearCalc smallint
go
update Sales.SalesOrderHeader set FiscalYearCalc = YEAR(DATEADD(m, 6, [OrderDate]))
go
CREATE VIEW [Sales].[vSalesPersonSalesByFiscalYearsCalc]
AS
SELECT pvt.[SalesPersonID]
      ,pvt.[FullName]
      ,pvt.[JobTitle]
      ,pvt.[SalesTerritory]
      ,pvt.[2002]
      ,pvt.[2003]
      ,pvt.[2004]
FROM (SELECT soh.[SalesPersonID]
            ,p.[FirstName] + ' ' + COALESCE(p.[MiddleName], '') + ' ' + p.[LastName] AS [FullName]
            ,e.[JobTitle]
            ,st.[Name] AS [SalesTerritory]
           ,soh.[SubTotal]
           ,[FiscalYearCalc] AS [FiscalYear]
      FROM [Sales].[SalesPerson] sp
INNER JOIN [Sales].[SalesOrderHeader] soh
   ON sp.[BusinessEntityID] = soh.[SalesPersonID]
INNER JOIN [Sales].[SalesTerritory] st
   ON sp.[TerritoryID] = st.[TerritoryID]
INNER JOIN [HumanResources].[Employee] e
   ON soh.[SalesPersonID] = e.[BusinessEntityID]
INNER JOIN [Person].[Person] p
   ON p.[BusinessEntityID] = sp.[BusinessEntityID]
) AS soh
PIVOT
(
SUM([SubTotal])
FOR [FiscalYear]
IN ([2002], [2003], [2004])
) AS pvt;

Now upon retrying the equivalent queries something quite interesting happens:

 
declare @SalesPersonID integer
declare @FullName nvarchar(255)
declare @JobTitle nvarchar(30)
declare @SalesTerritory nvarchar(30)
declare @Sales2002 money
declare @Sales2003 money
declare @Sales2004 money
select @SalesPersonID = SalesPersonID,
       @FullName = FullName,
       @JobTitle = JobTitle,
       @SalesTerritory = SalesTerritory,
       @Sales2002 = [2002],
       @Sales2003 = [2003],
       @Sales2004 = [2004]
  From Sales.vSalesPersonSalesByFiscalYearsCalc
 where SalesTerritory ='Canada'

clip_image010

 
declare @SalesPersonID integer
declare @FullName nvarchar(255)
declare @JobTitle nvarchar(30)
declare @SalesTerritory nvarchar(30)
declare @Sales2002 money
declare @Sales2003 money
declare @Sales2004 money
select @SalesPersonID = SalesPersonID,
       @FullName = FullName,
       @JobTitle = JobTitle,
       @SalesTerritory = SalesTerritory,
       @Sales2002 = [2002], 
       @Sales2003 = [2003],
       @Sales2004 = [2004]
  From Sales.vSalesPersonSalesByFiscalYearsCalc
 where SalesTerritory ='Canada'
option(querytraceon 8780)

clip_image011

Allowing the optimizer to run to its conclusion and not timeout has resulted in the same plan being returned. So, in this case, we can state that “Timeout” is certainly “Good Enough”.

Hopefully this blog will remove some of the vilification of “Timeout” and to some extent shows how it protects the optimizer from getting carried away with itself to much and also demonstrates how a change in schema results in a change of plan as the expected workload changes.

Naturally this ‘all depends’, your mileage will vary, and there will be cases where maybe removing the timeout will allow for a better plan to be built.  In the grand scheme of things should you overly worry about timeouts ? Ill leave you with a quote from Conor Cunningham “In practice, what I suggest you do is to examine whether you are happy with any given plan and stop worrying about the nastiness of the generalities of the search framework”. To paraphrase, If the plan you have does the job you need it to in an acceptable time frame, then that is “Good Enough”.

Leave a Reply

Your email address will not be published. Required fields are marked *