Jan 28

TSQL Smells in SSDT – The monster lives

What seems like many, many moons ago I released a PowerShell script that could detect many different “TSQL Smells”. Whilst it worked, I was never very happy with the interface: it was clunky and didn't feel quite 'right'. One of my longer-term aims was to do something about that and make TSQLSmells more accessible.
Static code analysis would have been my prime choice, and that would make perfect sense. However, after much blood, sweat and tears (and time wasted) I found that extending code analysis in SSDT is not supported, so I gave up disheartened.

However, poking around the other day I stumbled over a solution to my problem, which rekindled my interest: how about creating a Visual Studio add-in and dropping it straight into SSDT?

So, without any further ado, here it is: “TSQL Smells in SSDT”. To run it, drop the two files from here into your Visual Studio Addins directory (something like Documents\Visual Studio 2010\Addins) and start SSDT. You may see an error saying the add-in has failed to load (error number 80131515), in which case use the “Unblock” option in the file properties of the DLL.

unblock

Now, load up your project and in the R-Click menu you should see “SQL Smells”.

SQLSmells

Click that and your error list will be populated with the Smells that you know and love.

smells

If that is too noisy for you, you can reduce the noise by deselecting smells in Tools->Options->TSQL Smells.

Have fun and let me know how you get on 🙂

Massive Hat Tip to Jamie Thomson for being a guinea pig for this process

If you are running VS2013, then the .addin needs a small tweak.
Please see http://bit.ly/1bRB1SN for details.

Jan 03

Reading a snapshot – Are there performance implications?

My present role is to look at the performance of a large data warehouse load. This is right up my alley, and I have been playing with all manner of performance tuning fun: indexing, execution plans, filtered statistics etc. All good stuff.

However, during QA testing some of my suggestions actually made things worse. Not a little worse, but a whole lotta worse: big surprise. Going back to my dev box I confirmed that the TSQL changes should have made things better, but there was one significant difference (other than the data load). To ensure transactional integrity a database snapshot is used; this means that the load process works on a static view of the data whilst the OLTP system can carry on doing its work. In the QA testing we, by definition, ran the whole process, whereas when I was dev'ing I read from the database, not the snapshot. Interesting.

When I switched to using a snapshot things did indeed go slower. Let's try this against AdventureWorks:

 

CREATE DATABASE AdventureWorks2012ss ON
( NAME = AdventureWorks2012_Data, FILENAME = 
'D:\DATA\AdventureWorks2012_data.ss' )
AS SNAPSHOT OF AdventureWorks2012;
GO
set statistics io on
GO
DBCC DROPCLEANBUFFERS
GO
use AdventureWorks2012
GO
select count(SOH.Comment),count(SOD.ProductID) from sales.SalesOrderHeader SOH
inner loop join sales.SalesOrderDetail SOD
on SOH.SalesOrderID = SOD.SalesOrderID
GO
use AdventureWorks2012ss
GO
set statistics io on
go
DBCC DROPCLEANBUFFERS
GO
select count(SOH.Comment),count(SOD.ProductID) from sales.SalesOrderHeader SOH
inner loop join sales.SalesOrderDetail SOD
on SOH.SalesOrderID = SOD.SalesOrderID
GO

OK, so that is a fairly nonsensical query, and forcing the loop join is there to demonstrate the issue, but the plans are exactly the same in both databases. Running those and tracing in Profiler shows the performance difference:
SnapshotRead

Reading the database takes ~1.0-1.1 seconds, reading the snapshot ~1.5-1.6 seconds: a sizeable difference. Other noteworthy things here are that the snapshot file and the database file exist on the same physical drive, and that drive is a slow USB drive. Testing against my laptop's SSD, try as I might, demonstrated no measurable difference.

There is, however, one difference in the output of the above scripts: physical reads.
SnapshotPhysicalRead

150 against the database, 1150 against the snapshot. This sounds to me like a likely cause of the slowdown.

When reading a snapshot the sparse file has to be read first, and then the database is read if the page has not yet been changed: a double whammy, if you will. Also, this happens on a page-by-page basis, which is why I forced the nested loop join in the query. Using Process Monitor from the Sysinternals toolset we can see this activity in action…

Firstly reading the database:

ReadDB

Nice large reads of 65,536 bytes each. Now the snapshot…

ReadSnapshot

Lots of reads of 8,192 bytes; this is the activity that I believe is slowing my queries down. I asked Microsoft for an opinion on what is actually happening here, and the response was:

When we do a large read against a COW sparse file, pages that have not been updated in the primary will not be in the sparse file and we will have to read from the primary’s file. This is done on a page by page basis so can result in many reads against the primary for a single read in the snapshot DB.

If the pages have been pushed over to the snapshot’s COW file then they can be read in a large read.

Essentially, we’ll only do large reads against pages that have been pushed into the snapshot.

The broken up reads against the primary are issued in parallel, so while it is more IOs it is not necessarily a longer time to do those IOs.

Sparse file reads themselves are not likely very optimal since they are by definition fragmented, so OS will likely end up breaking up those reads anyway.

So the activity that I am seeing is natural and normal, and that explanation does highlight the physical differences that are necessitated when reading a snapshot.
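If you want to see the same imbalance from inside SQL Server rather than Process Monitor, the cumulative file-level IO counters can be compared for the database and its snapshot. This is a rough sketch only (it was not part of the original test), and remember the numbers are cumulative since the files were opened:

-- Rough cross-check: cumulative reads against the database and its snapshot
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_reads,
       vfs.num_of_bytes_read
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id
WHERE DB_NAME(vfs.database_id) IN ('AdventureWorks2012', 'AdventureWorks2012ss');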

So, back to the problem in hand. To fix the queries that were running slow in the real examples I used the FORCESCAN hint, which prevents the nested loop operation and enables SQL Server to do larger reads against the database file for those unchanged pages in the snapshot; this did have a dramatic effect.
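The real workload queries are not shown here, but applied to the demo query above the hint would look something like this (a sketch only; without the forced loop join the optimizer may well pick a scanning plan anyway):

-- Sketch: FORCESCAN so unchanged pages behind the snapshot can be read in larger chunks
USE AdventureWorks2012ss
GO
SELECT COUNT(SOH.Comment), COUNT(SOD.ProductID)
FROM sales.SalesOrderHeader SOH WITH (FORCESCAN)
INNER JOIN sales.SalesOrderDetail SOD WITH (FORCESCAN)
    ON SOH.SalesOrderID = SOD.SalesOrderID
GO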

Have you seen similar activity before? Is this only an issue for slow or overloaded IO systems? I'd love to know what you have witnessed.

Oct 20

Manchester and Leeds user groups – 16/17 Oct

Thanks to all those that turned out on Wednesday and Thursday night in Leeds and Manchester to hear my presentation on Estimation and Statistics. The 'lucky' Manchester crowd also got to hear 'Trace Flag Fun', a deep dive into optimizer internals.

As promised here are the slides and scripts:

 

Estimation and Statistics

TraceFlag Fun

If you have any follow up questions then please mail me : dave@clearskysql.co.uk

Sep 30

A quickie post on debugging

I'm not a great fan of the debugging feature in Management Studio. It always strikes me that to get any use from it you must be iterating (looping) too much, i.e. doing TSQL wrong. TSQL is not a procedural programming language, and the debugger would seem to be most useful if you did develop TSQL in a procedural manner. If you could use it to peek inside a temporary table or execute a single CTE in a complex statement, then we would be talking 🙂

What I do like to do, though, is use the little-known (at least to the people I speak to) “Trace Query in SQL Server Profiler” option.

Profile

This is accessed by R-Clicking in a query window and will run Profiler with a filter on the current window's SPID. Quite neat.
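The SPID it filters on is simply the session id of the current query window, which you can check for yourself:

-- The value that the automatically added Profiler filter is built on
SELECT @@SPID AS current_session_id;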

This feature can be further enhanced to be even more useful. When Profiler is run it will default to using the template “TSQL_SPs”. The default configuration of this template is, IMO, not very useful, so I have created my own “TSQL_SPs” template which contains the events and columns that I am interested in: all errors and warnings, SP:StmtCompleted, SQL:StmtRecompile, SQL:BatchCompleted and a few others, with the Duration, CPU, Reads and Writes columns. This is simple to do: create a Profiler session with the events and columns that you are concerned about, then File->Save As->Template and select “TSQL_SPs” from the drop down. Hopefully next time you “Trace Query in SQL Server Profiler” your template will be loaded and the SPID filter automatically added.

To my mind, that is real TSQL debugging: how hard is this statement going to hit the metal?

 

Jul 24

Nearing Nirvana with Hekaton

Rarely does a piece of technology make a difference to a system without someone (developer, DBA, architect etc) sitting down and thinking about how to apply the technology to their problems.

For me, one of the big pain points in TSQL development is that arrays/in-memory storage are missing. If I wish to hold some data temporarily for processing, I have to use either a temporary table or a table variable. Reading and writing to these tables involves a massive load of logging/locking/latching; hmm, isn't that a use case for Hekaton? Can we use a Hekaton table in place of a temp table?

Well, there are some restrictions in CTP1, but yes, we can have a temporary Hekaton table 🙂

Firstly we need to define a table type:

CREATE TYPE HekatonTest AS TABLE
(
    [IdCol] [int] NOT NULL,
    [guid] [uniqueidentifier] NOT NULL,
    PRIMARY KEY NONCLUSTERED HASH ([IdCol]) WITH (BUCKET_COUNT = 131072)
) WITH (MEMORY_OPTIMIZED = ON)
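As a quick sanity check (not in the original post), the new is_memory_optimized column on sys.table_types should confirm what we have just created:

-- Confirm the table type was created as memory-optimized
SELECT name, is_memory_optimized
FROM sys.table_types
WHERE name = 'HekatonTest';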

You may be thinking: can the DURABILITY option of SCHEMA_ONLY be used here? Afraid not (at least not yet); the durability option is explicitly not supported when declaring a user-defined type, and it throws an error.

typedur

Let's attempt to use this type:

declare @Table HekatonTest
select tbl.IdCol,tbl.guid from @Table tbl

This errors with:

That's a shame 🙁 We need to have a natively compiled sproc to use this type, oh well!

UPDATE 2013-10-28: In CTP2 you can now do this. Though, as with regular table variables, the row estimate is 1, so be careful of those spills.

CREATE PROCEDURE HekatonTableVar
WITH NATIVE_COMPILATION, EXECUTE AS OWNER, SCHEMABINDING
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
declare @Table dbo.HekatonTest
insert into @Table values(1,'267E1137-4EE7-4DA0-A63F-FC68F718403B')
insert into @Table values(2,'FCF669A3-2E39-4609-B64A-A5A200A62B80')
insert into @Table values(3,'CB06D311-B92C-4C7B-9C82-407C9276627C')
insert into @Table values(4,'DF468BE9-5B28-4874-9935-59057C4E29BF')
select tbl.IdCol,tbl.guid from @Table tbl
end
go

Now, this is obviously a pretty trivial (and stupid) example, but it does show that we can at least use an in-memory table as temporary storage: a massive win.

Does this make a big difference? Let's compare against a table variable:

CREATE PROCEDURE TableVar
AS
declare @Table table
(
    [IdCol] [int] NOT NULL,
    [guid] [uniqueidentifier] NOT NULL
)
 
insert into @Table values(1,'267E1137-4EE7-4DA0-A63F-FC68F718403B')
insert into @Table values(2,'FCF669A3-2E39-4609-B64A-A5A200A62B80')
insert into @Table values(3,'CB06D311-B92C-4C7B-9C82-407C9276627C')
insert into @Table values(4,'DF468BE9-5B28-4874-9935-59057C4E29BF')
select tbl.IdCol,tbl.guid from @Table tbl
go

Running that over 100 iterations in 100 threads (10,000 individual calls) we get an average execution time of 0.38 seconds. With the Hekaton table: 0.0014 seconds, over 200 times faster!! All very unscientific, but this alone would be enough to justify a jump to 2014 on release, IMO.
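The 100-thread fan-out was driven from outside T-SQL, but a crude single-threaded approximation of the timing is easy enough to sketch if you want to repeat the comparison yourself:

-- Crude single-threaded timing sketch; run once for each procedure
DECLARE @i int = 0, @start datetime2 = SYSDATETIME();
WHILE @i < 100
BEGIN
    EXEC dbo.TableVar;   -- swap for dbo.HekatonTableVar on the second run
    SET @i += 1;
END;
SELECT DATEDIFF(millisecond, @start, SYSDATETIME()) / 100.0 AS avg_ms_per_call;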

UPDATE 2013-10-28: With CTP2 allowing the use of Hekaton table-typed table variables outside of natively compiled stored procedures, this is a very compelling reason to consider upgrading.

 

Jul 17

How full are your Hekaton Buckets?

Hekaton is a really interesting technology, but it is a world away from the functionality that we know and love. The SQL team have done a great job of disguising this departure from us by integrating it inside the SQL Server engine, but it is nonetheless a different beast entirely. Although the ultimate aim, I would imagine, is a seamless integration where the user (and developer) is not really concerned with the underlying storage technology, there will be many real-world issues if the differences are not fully understood.

The way that Hekaton stores data is in hash buckets; this is a fundamental tenet. A hash is simply a function applied to some key data, and the bucket is where the related row is stored.

For example: if our hash function were X%5, then our buckets for the values 1 through 10 would be populated thusly:

Bucket     Values

  0        5, 10
  1        1, 6
  2        2, 7
  3        3, 8
  4        4, 9

As % is the operator for modulo (divide and return the remainder), 9%5 = 4, so the value 9 lands in bucket 4. The hash function SQL Server uses would be much more complicated than this and, like the hash function used in a hash join, will differ depending on the data being hashed.

So this is quite interesting, since when we define a Hekaton table we need to specify the number of buckets that we think (perhaps even assume) we need upfront:

CREATE TABLE BucketDemo
(
col1 INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1025),
col2 INT NOT NULL,
col3 INT NOT NULL,
INDEX [idx] NONCLUSTERED HASH ([col2]) WITH (BUCKET_COUNT = 1025),
INDEX [idx2] NONCLUSTERED HASH ([col3]) WITH (BUCKET_COUNT = 1025)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)

Now let's populate that with some data:

INSERT INTO BucketDemo
select top(8000) row_number() over(order by (select null)),
cast((cast (newid() as varbinary)) as integer),
row_number() over(order by (select null))%10
from sys.columns a cross join sys.columns b

So what is the bucket usage for the BucketDemo table now?

Well, as you might expect, a number of new DMVs have been added to SQL Server 2014; the relevant one that we want here is sys.dm_db_xtp_hash_index_stats.

select * from sys.dm_db_xtp_hash_index_stats where object_id=object_id('BucketDemo')

First thing to note is the obvious buglette: the stats are cartesian joined against the table's index names, so the names do not line up with the rows. This will be fixed from CTP2 (if such a thing will exist) onwards. For your reference, here is sys.indexes.
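Pulling that index_id to name mapping back yourself is just a simple metadata query:

-- index_id to name mapping for BucketDemo, to line up against the DMV output
SELECT index_id, name, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID('BucketDemo');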

So, looking at sys.dm_db_xtp_hash_index_stats what information can we glean?

Let's start with the primary key index “PK__BucketDe__35…”, index_id 4.

Total_bucket_count is 2048; that's interesting in itself, as we asked for a bucket count of 1025. By experimentation it is easy to see that the requested bucket count is rounded up to the next power of 2, rather than the exact number requested being used. Empty_bucket_count is, quite obviously, the number of buckets that have no data contained within them; for the primary key we have 0, so there are no empty buckets. Next up, avg_chain_len is the average chain length (how many rows are contained within a bucket) across all buckets, and max_chain_len is the maximum chain length.

For index_id 3 we can see 49 unused buckets and a fairly short max_chain_len, but for index_id 2, 2038 buckets are empty, i.e. 10 buckets are used, with avg and max chain lengths both 800. This isn't too surprising: the data contained within will always be in the range 0-9, as the formula used to calculate it is x%10. So I have vastly overestimated the number of buckets required to support this index; I specified 1025, which was rounded up to 2048, but given the data I have there is no chance of any of the other buckets ever being used.
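One way to turn the raw DMV numbers into something more digestible is a fill-percentage query along these lines (a sketch; bear in mind the CTP1 name-mapping bug above when reading the join):

-- Sketch: bucket usage per hash index on BucketDemo
SELECT i.name AS index_name,
       s.total_bucket_count,
       s.empty_bucket_count,
       CAST(100.0 * (s.total_bucket_count - s.empty_bucket_count)
            / s.total_bucket_count AS decimal(5,2)) AS pct_buckets_used,
       s.avg_chain_length,
       s.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats s
JOIN sys.indexes i
  ON i.object_id = s.object_id
 AND i.index_id = s.index_id
WHERE s.object_id = OBJECT_ID('BucketDemo');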

To my mind, a correct bucket estimation for your production data will make or break your system. Underestimate and you will be 'scanning' a higher number of rows per bucket; overestimate and too much memory will be allocated. It is only 8 bytes per unused bucket, but that could add up to a fair amount.

Jul 12

Hekaton CTP1 – Row order limitation

UPDATE: Good news, this post relates to CTP1 only; the 8000-row limit has been lifted in CTP2.

If you have been reading about Hekaton and compiled stored procedures you are probably full of aspirations to move all your data into memory and magically have your system run faster.

Slow down just a minute, though, and carefully look through the list of supported and non-supported functionality here. That is quite a long list, and I'm sure that, given time, many items will move into the 'supported' arena.

However, there is one restriction not listed that I have stumbled over: ordering is limited to 8000 rows. If you are planning to do anything which requires ordering (ORDER BY, GROUP BY, TOP etc), there is a hard limit of 8000 rows that can be ordered.

Just for japes, let's give it a try…

CREATE TABLE Hek1
(
--See the section on bucket_count for more details on setting the bucket count.
col1 INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
col2 INT NOT NULL,
col3 INT NOT NULL,
INDEX [idx] NONCLUSTERED HASH ([col3]) WITH(BUCKET_COUNT = 100000)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO
 
INSERT INTO Hek1
select top(8001) row_number() over(order by (select null)),
row_number() over(order by (select null)),
row_number() over(order by (select null))
from sys.columns a cross join sys.columns b
go
 
Drop PROCEDURE LargeSort
go
CREATE PROCEDURE LargeSort
WITH NATIVE_COMPILATION, EXECUTE AS OWNER, SCHEMABINDING
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
select col3,count(*) from dbo.Hek1
group by col3
end
go

exec LargeSort

You should now have the error:

What if we run:

CREATE PROCEDURE HekatonGroup
WITH NATIVE_COMPILATION, EXECUTE AS OWNER, SCHEMABINDING
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
select col1,count(*) from dbo.Hek1
group by col1
end

Well, that works. Why? Just as in a regular (i.e. non-compiled) query, a sort is not needed here: the aggregation is on col1, our primary key, so every value is unique. It simply doesn't matter.

So, does this limit really matter? Well, as this functionality is aimed at an OLTP environment, I'm saying no. If you need to order that many rows you are *probably* doing something wrong.

As an aside, did you notice the order, or the lack of it? That's right, it is completely unordered; all SQL Server has done is scan the hash buckets in turn. The rows are always going to be in a 'random' order, so the adage of “without ORDER BY, order is not guaranteed” is true now more than ever.

Jul 08

Examining Hekaton Code

In my previous post, I made reference to a Microsoft Research paper on Hekaton internals. This states that the SQL engine produces C code which is then compiled to a DLL. That would lead you to the assumption that there is a C file somewhere that can be looked at to find out some more about Hekaton, its internals and how it processes data. Well, there is 🙂

Under the directory 'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\xtp' (assuming that you have installed SQL Server itself to C:) there will be some subdirectories, with one further assumption: that you have compiled at least one natively compiled stored procedure. The folders here are the database ids. In them you will find a series of files with a naming convention of xtp_p_5_629577281, 5 here being the database id and 629577281 being the procedure's object id.

The file of most interest is the .c file, the C code itself. The .MAT file is the required metadata for the procedure.
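The ids embedded in the file name can be translated back into names with the usual metadata functions; using the example name above:

-- Translate the ids from a file name such as xtp_p_5_629577281 back into names
SELECT DB_NAME(5)                AS database_name,   -- the 5 in the file name
       OBJECT_NAME(629577281, 5) AS procedure_name;  -- object id, scoped to database id 5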

Have fun reading the C code, it can make your ears bleed; remember that this code is not meant for human consumption.

Reading this code can give you an insight into what operations the engine is going to perform on your behalf, in much the same way that a query plan would. In CTP1 there is no execution plan for compiled code, but as Ben Nevarez blogs here (http://www.benjaminnevarez.com/2013/07/hekaton-transforming-query-plans-into-c-code/) they are coming, so in time we won't have to go to the C files for that.

Have fun….

Jul 03

First steps into Hekaton – XE Slow parameter passing

If you are anything like me then you will have already downloaded and installed the CTP1 release of SQL Server 2014. There are a lot of things of interest here; probably top of most people's list is Hekaton, the new in-memory, transactionally compliant OLTP engine.

There are some bold claims made for Hekaton, but to add a touch of realism to the mix the MSDN page Support SQL Server Features [for Hekaton] (http://msdn.microsoft.com/en-us/library/dn133181(v=sql.120).aspx) is an invaluable resource for the ins and outs of the tripwires on the way to a Hekaton deployment.

So, you have downloaded and installed SQL Server 2014 CTP1 and have run the demo script supplied at http://msdn.microsoft.com/en-us/library/dn133079(v=sql.120).aspx; what else do you need to know about Hekaton? Firstly, you would be foolish to overlook Kalen Delaney's whitepaper, an excellent resource: http://download.microsoft.com/download/F/5/0/F5096A71-3C31-4E9F-864E-A6D097A64805/SQL_Server_Hekaton_CTP1_White_Paper.pdf. Perhaps less accessible is this paper from Microsoft Research: http://research.microsoft.com/pubs/193594/Hekaton%20-%20Sigmod2013%20final.pdf. I'm sure that there will be lots of caveats and techniques popping up on the web from here on too.

SQL Server 2014 also features a slew of new extended events; one that caught my eye is “hekaton_slow_parameter_passing” in the Analytic channel. As the name would suggest, there is a 'fast' and a 'slow' way to pass parameters into a natively compiled stored procedure.

So let's modify the example provided in the link above to have a parameter:

CREATE PROCEDURE ParamTest @col3 integer
WITH NATIVE_COMPILATION, EXECUTE AS OWNER, SCHEMABINDING
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
SELECT col1, col2, col3 FROM dbo.Destination where col3 = @Col3;
END
GO

 

Then, in SSMS, run an XE session with this event and 'natively_compiled_proc_execution_started' (which, as the name implies, is raised when a natively compiled procedure is executed).
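The session itself is only a couple of statements. A minimal sketch, assuming both events live in the sqlserver package and using a made-up session name, would be something like the following, which you can then watch via the Live Data viewer in SSMS:

-- Minimal sketch of a session for the two events discussed here
-- (package names are an assumption; adjust to whatever your build exposes)
CREATE EVENT SESSION [HekatonParamTest] ON SERVER
ADD EVENT sqlserver.natively_compiled_proc_execution_started,
ADD EVENT sqlserver.hekaton_slow_parameter_passing
ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION [HekatonParamTest] ON SERVER STATE = START;
GO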

 

If you call the procedure with:

ParamTest 1

 

In the live data window you should now see the call to the stored procedure registered.

So what is involved in a 'slow' parameter call?

 

Well try:

ParamTest @Col3 = 1

 

You should now see:

 

Yup, naming the parameter has sent the code down a 'slow' route. Another reason given is 'parameter_conversion', i.e. the data type has had to be converted.

 

This is simple to see:

ParamTest '1'

 

Is there any real, significant performance difference between the two? Good question. I would say yes, otherwise why have a 'fast' code route at all? Once I get a good test-bed of data it is something I will certainly try.

 

It is interesting to think that this goes against accepted best practice: there is now a valid reason for not naming procedure parameters on the call. Also, think of RPC calls; they all specify a parameter name, right?

Jun 30

SQLRelay – Cardiff and London – Estimation and Statistics

The basis of query optimization is estimation and statistics. Without a decent estimation of the number of rows to be processed, the entire query plan could well be 'wrong'. These estimations are built via statistics, and the interaction between the two is a vital piece of understanding.

With this in mind it's always a great pleasure to present my “Estimation and Statistics” session, to spread the knowledge and watch as a few pennies start to drop in the audience whilst revealing the internals. Last week I was able to do this twice, at the Cardiff and London legs of the SQL Relay chain of events that tracked its way around the country; a big hat tip to all the organisers, committee members and helpers for putting on a great series of events.

As promised, the slides and scripts from my session are now available here:

http://sdrv.ms/1aYY3tb

 

Hope that you find them useful, and if it is still all a bit of a mystery, feel free to mail me: dave@clearskysql.co.uk