Aug 22

Who did what to my database and when…

One of the most popular questions on forums, SO, etc. is, “How can I find out who dropped a table, truncated a table, dropped a procedure…?” I’m sure we have all been there: something changes (maybe schema or data) and we have no way of telling who did it or when.

SQL auditing can get you there, and I’m not suggesting you shouldn’t use it, but what if it isn’t set up to monitor the objects you are interested in?

If you look back through my posts, I have been quite interested in ScriptDom and TSQL parsing,  so if you have a trace of all the actions that have taken place over a time period, you can parse the statements to find the activity.

Drawing inspiration (cough, stealing) from Ed Elliot’s dacpac explorer, I have created a Git repo for a simple parser built using T4 templates. T4 templates are an ideal fit here, as the full language domain can be exploded out automatically and you, as the developer, can then add your own code to cherry-pick the parts you are interested in.

At this time the project has hooks to report on SchemaObjectName objects, so any time an object is referenced, be that in a CREATE, SELECT, DROP, MERGE…, it will fall into the function OnSchemaObjectName; there is also an OnDropTableStatement hook that is hit whenever DROP TABLE is used.
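
For anyone who hasn’t used ScriptDom before, the underlying pattern is straightforward even without the T4 generation: parse the captured SQL and override the visitor methods for the fragment types you care about. Below is a minimal hand-rolled sketch of that idea (not the generated code from the repo), assuming a reference to the Microsoft.SqlServer.TransactSql.ScriptDom assembly:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.SqlServer.TransactSql.ScriptDom;

class AuditVisitor : TSqlFragmentVisitor
{
    // Fires for every object name referenced anywhere in the batch (CREATE, SELECT, DROP, MERGE...).
    public override void Visit(SchemaObjectName node)
    {
        Console.WriteLine("Referenced: " + string.Join(".", node.Identifiers.Select(i => i.Value)));
    }

    // Fires specifically for DROP TABLE statements.
    public override void Visit(DropTableStatement node)
    {
        foreach (var table in node.Objects)
            Console.WriteLine("DROP TABLE: " + string.Join(".", table.Identifiers.Select(i => i.Value)));
    }
}

class Program
{
    static void Main()
    {
        var parser = new TSql130Parser(true);   // true = QUOTED_IDENTIFIER on
        IList<ParseError> errors;
        var fragment = parser.Parse(
            new StringReader("DROP TABLE dbo.Orders; SELECT * FROM dbo.Customers;"),
            out errors);
        fragment.Accept(new AuditVisitor());    // walk the parse tree, firing the overrides above
    }
}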

This is not intended to be a full end-to-end solution, not least because the SQL to be parsed could be coming from any number of sources, but if you have the skills to run it as is, you probably have enough to tailor it to your requirements.

As ever, let me know what you think; any comments are gratefully received.

The GitHub repo is: https://github.com/davebally/TSQLParse

 

Jun 17

Why is creating Excel sheets from SSIS so hard?

If there is one process that should be simpler than it is out of the box, it is creating Excel spreadsheets from SSIS. Over the years I’ve tried doing it a number of ways: the built-in component, Interop, OLE DB, etc. All suck to one degree or another, being either unreliable, too slow or simply unusable.

A Twitter conversation a) proved I wasn’t alone and b) pointed me in the direction of EPPlus.

Over on SSC there is already a post on using EPPlus with SSIS, some of which, such as putting EPPlus into the GAC, is still relevant for this post.

However, right now I have a big love of BIML. Simply put, I think this is what SSIS should have been in the first place, and I personally find all the pointing and clicking a real time sink. Additionally, in BIML, once you have written a package to do something, i.e. a simple dataflow, it’s a snip to repeat that over 10, 20, 50, 100 or 1000s of tables. But the real time saver for me is when you need to re-architect, i.e. turn sequential dataflows into a parallel dataflow; it’s only really a case of changing where you iterate in your BIML code.

Anyway, I’ve combined these two pain points to create a BIML routine that uses EPPlus to output multi-sheeted Excel spreadsheets reliably and fast.

At the moment it’s very basic: take SQL statements and output the data to an Excel file. In time, though, I’m hoping to add some metadata to start ‘getting the crayons out’ and make the output look a bit prettier.
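
If you just want a feel for the EPPlus side without the BIML wrapper, the pattern is roughly the following. This is a minimal standalone sketch rather than the code generated by the repo, and the connection string, query and output path are placeholders:

using System.Data;
using System.Data.SqlClient;
using System.IO;
using OfficeOpenXml;   // EPPlus (v4-era API)

class ExcelDump
{
    static void Main()
    {
        // Hypothetical connection string and query, for illustration only.
        var table = new DataTable();
        using (var conn = new SqlConnection("Server=.;Database=AdventureWorks;Integrated Security=SSPI"))
        using (var da = new SqlDataAdapter("SELECT TOP 100 * FROM Sales.SalesOrderHeader", conn))
        {
            da.Fill(table);   // Fill opens and closes the connection itself
        }

        using (var package = new ExcelPackage())
        {
            // One worksheet per result set; LoadFromDataTable writes headers and rows in one call.
            var sheet = package.Workbook.Worksheets.Add("SalesOrderHeader");
            sheet.Cells["A1"].LoadFromDataTable(table, true);
            package.SaveAs(new FileInfo(@"C:\temp\output.xlsx"));
        }
    }
}

In the BIML version, code broadly along these lines ends up inside the generated script task, repeated for each query/sheet.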

Code is on GitHub at https://github.com/davebally/BIML-SSIS-Excel-Output; I hope this is of use to someone.

 

 

Mar 24

.Net 4.6.1 and MultiSubnetFailover

I’m not a big clustering/HA expert, but I obviously understand the need for such things to exist 🙂

However, the standard SQL tooling, SSMS etc., seems to go out of its way to make the MultiSubnetFailover option as inaccessible and convoluted as possible. Quite why this option even exists and is not on by default escapes me; clients should be entirely agnostic to the server they are talking to.

After going through this pain again, this time deploying SSIS packages to an AG, I stumbled over this MSDN blog:

https://blogs.msdn.microsoft.com/alwaysonpro/2015/12/01/improved-multisubnet-listener-behavior-with-newly-released-sql-client-provider-in-net-4-6-1/

So .NET 4.6.1 does default to MultiSubnetFailover=True, which is great.

There is a rather cryptic note at the end, though:

Note that this isn’t what ended up being shipped.  Instead a new default property TransparentNetworkIPResolution was introduced to SqlConnection.

I’ve asked for some clarification on what that means exactly, as I’m unable to find any other reference to that property, but in my testing the effect seems to be the same :S

So, if you have MultiSubnetFailover=True pain, give .NET 4.6.1 a whirl; it should resolve it.
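
If you are on an older framework, or simply want to be explicit rather than rely on a provider default, the option can still be set in the connection string itself. A minimal sketch (the listener and database names are made up):

using System.Data.SqlClient;

class Example
{
    static void Main()
    {
        // "AGListener" is a hypothetical availability group listener name.
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = "AGListener",
            InitialCatalog = "MyDatabase",
            IntegratedSecurity = true,
            MultiSubnetFailover = true   // try all listener IPs in parallel rather than serially
        };

        using (var conn = new SqlConnection(builder.ConnectionString))
        {
            conn.Open();
        }
    }
}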

 

Mar 20

A journey to a movie recommender in U-SQL

I haven’t gotten too involved in the whole “Big Data” movement; I’ve kept an arm’s-length watch and played around with a few of those technologies, but never really jumped in and got down and dirty. However, with the public preview of Azure Data Lake, I thought it was probably time to get past the simple parrot-style examples and tutorials and try something from scratch.

What is compelling for me about ADL over other technologies is that I don’t have to learn a platform. I never have to set up, commission or provision any hardware. As a data guy, just to get to creating and setting up a data storage area in other tech there are any number of hoops to be jumped through, software to be installed and VMs to be built. With ADL, all I have to do is give my card details to Microsoft (YMMV on whether that is a good or bad thing) and I can start playing.

In a more practical sense, ADL also provides massive scale (in my pay-as-you-go subscription I’m limited to 50 AUs, where an AU, I believe, is effectively 2 cores, and many, many more are available if required) but none of the hardship of managing the machines. Not only do you have access to that power, you also have control over where to use it. If you have a single job that requires 2000 processors, then you only have to pay for them while that job is running; for every other job you can pay for just the 20 that are needed.

The integration with Visual Studio is neat, and there is now no longer any need to submit jobs to Azure simply to try them out, as a local emulator is used instead. Also, the PowerShell functionality looks to be fairly wide-ranging and will help in automating tasks.

What follows is an overview of my experiments, which I have published into a GitHub repo. The “Examples” folder contains what I would term “simple learnings”, and “Full Scripts” contains scripts that to a lesser or greater extent do something “useful”. I’m also not suggesting that anything here is “best practice” or that method A performs better than method B; I simply do not have the required size of data to make that call. My aim was to learn the language.

TLDR: Check out the script MovieLens09-CosineSimilarityFromCSVWithMax.usql for a U-SQL movie recommender.

The movie recommender is somewhat the “Hello World” of big data systems and is also an interesting subject that everyone can relate to, so here are my attempts at exploring Azure Data Lake and Data Store with this final goal in mind.

The first matter is to find a decent dataset, and for this I’ll be using the MovieLens 1m dataset. This presents an interesting challenge in simply consuming the data, as it’s not in a CSV format that ADL can use directly. ADL does support the use of “custom extractors”, so that was the logical place to start, and I had a few interesting experiments defining an extractor using a few different possibilities, i.e. using a ‘hardcoded’ string output or using the metadata of the EXTRACT statement itself. The latter is done by simply interrogating the output.Schema array:

// Inside the custom extractor: inspect the requested output column's type
// and set the value accordingly.
var s = output.Schema[x];
if (s.Type.Name.Equals("String", StringComparison.InvariantCultureIgnoreCase))
{
    output.Set(x, strings[x].ToString());
}
if (s.Type.Name.Equals("Int32", StringComparison.InvariantCultureIgnoreCase))
{
    try
    {
        output.Set(x, Int32.Parse(strings[x].ToString()));
    }
    catch
    {
        output.Set(x, -1);   // fall back to a sentinel value if the parse fails
    }
}

In the end, though, I considered an extractor a bit ‘cumbersome’ and opaque for my purposes here, but in a productionised system I can certainly see the use of one, especially if you are consuming more complex data types.

Instead I reverted to a string-split method, again useful for understanding how to call the integrated .NET functionality, as this required setting up a string array inline to pass in the two-character separator used by the dataset. It necessitates a three-part operation: read the file as a series of long strings (@UserLine), split on ‘::’ into an array (@UserSplit), and reform into a properly formed, typed and named rowset (@Users).

@UserLine =
EXTRACT UserLine string
FROM @USERSInp
USING Extractors.Tsv( rowDelimiter: "\n");
 
@UserSplit =
SELECT new SQL.ARRAY<string>(UserLine.Split(new string[]{"::"},StringSplitOptions.None)) AS UserArr
FROM @UserLine;
 
@Users =
SELECT int.Parse(UserArr[0]) AS UserID,
       UserArr[1] AS Gender,
       int.Parse(UserArr[2])  AS Age,
       int.Parse(UserArr[3]) AS Occupation,
       UserArr[4] AS Zip
FROM @UserSplit;

This is functionality that has been long overdue in T-SQL, and it’s good that we now have STRING_SPLIT to do this; however, joining a string back together is still not (well) supported.

In U-SQL it’s a snap: first use ARRAY_AGG and then use .NET (string.Join) to form the string.

@UserWithMovies =
SELECT u.*,ARRAY_AGG(r.MovieID) AS MoviesArr
FROM @Users AS u
     JOIN
     @Ratings AS r
    ON u.UserID == r.UserID
GROUP BY u.UserID,
         u.Gender,
         u.Age,
         u.Occupation,
         u.Zip;
 
@outer =
SELECT UserID,MoviesArr.Count AS MoviesCount,string.Join(":",MoviesArr) AS MovieList
FROM @UserWithMovies;

For the maths behind the recommender itself I chose cosine similarity to match the people who are most alike, as it is good enough for my purposes and simple enough to understand. I also match people on Age and Occupation and filter out ‘exact’ similarities.
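
For reference, this is the standard cosine similarity formula (the textbook definition rather than a line-by-line transcription of the script), where $r_{u,i}$ is user $u$’s rating of movie $i$ and the sums run over the movies being compared:

\[
\text{sim}(a,b) \;=\; \frac{\sum_{i} r_{a,i}\, r_{b,i}}{\sqrt{\sum_{i} r_{a,i}^{2}}\;\sqrt{\sum_{i} r_{b,i}^{2}}}
\]

A value of 1 means two users’ rating vectors point in exactly the same direction, which is presumably what the ‘exact’ similarity filter is removing.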

I won’t repeat the full code here verbatim, but I’ve added comments, and if you have an ADL account you should simply be able to download the MovieLens dataset, cut and paste the script, and run it.

Please take a look over the other scripts in the repo; they may not be as well commented, but the intention should be clear.

In terms of cost, how much did it cost me to get this far? Peanuts: £1.50 was my bill. The next step is to take the 20m dataset and repeat this exercise; hopefully with data of that size I might see some measurable differences between using flat CSVs, tables etc., and experiment with the best way to increase vertices.

Mar 13

A first look at SQL Server 2016 – STRING_SPLIT

I’ve been developing T-SQL for about 20 years now, and almost since day one there has been functionality that, IMO, has been missing: grouped string concatenation (i.e. forming a delimited string) and string split (breaking up a delimited string). When I say missing, I really mean some form of succinct functionality that can do it, and not some form of hideous kludge that can cause a religious argument between various members of the community about why their method is better than anyone else’s 🙂

My hopes were not high for SQL Server 2016 fixing this either when this article https://blogs.msdn.microsoft.com/sqlserverstorageengine/2015/09/08/passing-arrays-to-t-sql-procedures-as-json/ was published. I’ve never been convinced by complex (i.e. XML) datatypes within the relational database, as there are better ways if that is what you have to do, but, <sarcasm> yippee </sarcasm>, with JSON baked in, here is another method of string splitting rather than doing it properly. Honestly, it really does make me want to cry.

But, with the release of SQL Server 2016 RC0, snuck into the release notes was this: [screenshot: the release notes entry announcing STRING_SPLIT]

Woohoo, a STRING_SPLIT function, finally, at long last; let’s break out the bunting.

So, let’s have a quick investigation into how to use it and whether there are any caveats around it.

I won’t repeat the official documentation https://msdn.microsoft.com/en-gb/library/mt684588.aspx, but to recap, using it is really simple: STRING_SPLIT(string, delimiter), where string is the delimited string to split and delimiter is the character to split it on. Notably, the delimiter can only be a single character; that’s not a deal breaker by any means, but it could be an annoyance.

Right, I’m now assuming that you are familiar with the official docs, so is there anything else to know about this function before we go forward and immediately replace all our previous kludges with it?

Firstly, anyone that knows me will know that I’m interested in estimation (and the wider optimization area), so the first question is: how many rows are estimated to be returned from the TVF?

Here is a script to test this:

Create Table Strings
(
String varchar(max)
 
)
go
insert into Strings values('one,two,three')
go
Select *
from  Strings
Cross Apply string_split(strings.String,',')

The plan is as you would expect from a CLR TVF; in fact, I suspect that this is just a CLR TVF implementation, but I’m not gonna state that as fact 🙂

[screenshot: execution plan showing the table-valued function]

And the detail for the TVF?

[screenshot: TVF operator properties showing the estimated number of rows]

50: the estimate of the rows output is 50. After a number of tests this hasn’t changed. In line with similar TVF functionality this isn’t surprising news, but I was curious as to what the number was.

Secondly, since I’m suspecting that this is just CLR: I’ve found in the past that the optimizer plays it ‘safe’, and as TVFs are opaque to the optimizer, Halloween protection is implemented to protect against the possibility that the same data is being read as is being updated.

Does this happen here? The test code is:

Create Table Strings
(
String varchar(max)
 
)
go
Create Table Dest
(
String varchar(max)
 
)
go
insert into Strings values('one,two,three')
go
Insert into Dest(String)
Select Value
from  Strings
Cross Apply string_split(strings.String,',')

Notice that I’m inserting into a different table.

And….

[screenshot: execution plan for the insert, containing a table spool]

Awesome, a table spool. So the same does apply here: we have Halloween protection, and on large result sets that spooling is going to hurt.

If you need to insert the result somewhere, then SELECT INTO won’t need this, as it knows you are creating a new table so there is no chance of a double update. But if you SELECT INTO a table and then INSERT those rows into the table you intended to populate in the first place, that is effectively the same thing as a spool, so it is more than likely not worth the effort.

Finally, to finish this post off, here is a query that gets a parallel plan:

SELECT sod.SalesOrderDetailID,sod.ModifiedDate,p.Style
FROM Sales.SalesOrderDetail sod
INNER JOIN Production.Product p ON sod.ProductID = p.ProductID
--cross apply string_split(style,',')
ORDER BY Style

And here is the same query with the CROSS APPLY uncommented, which gets a serial plan:

SELECT sod.SalesOrderDetailID,sod.ModifiedDate,p.Style
FROM Sales.SalesOrderDetail sod
INNER JOIN Production.Product p ON sod.ProductID = p.ProductID
cross apply string_split(style,',')
ORDER BY Style

Right <sigh>, so it inhibits a parallel plan being built, sheesh. Also, FYI, there is no NonParallelPlanReason specified in the plan.

That’ll do for now, but I will be running some performance tests soon to see if it is worth the trouble.

UPDATE 2016-03-21: For performance tests, Aaron Bertrand has done a comprehensive write-up.

Feb 28

My favourite wait state – ASYNC_NETWORK_IO

Sometimes it does feel that a problem is a database problem until proven otherwise. When this occurs, I have had to put a lot of time and effort into demonstrating the hows and whys. One of the easiest (and most common) scenarios like this relates to ASYNC_NETWORK_IO waits.

Simply put, ASYNC_NETWORK_IO waits occur when SQL Server is waiting on the client to consume the output that it has ‘thrown’ down the wire. SQL Server cannot run any faster; it has done the work required and is now waiting on the client to say that it is done with the data.

Naturally there can be many reasons why the client is not consuming the results fast enough; a slow client application, network bandwidth saturation, and too-wide or too-long result sets are the most common, and in this post I would like to show you how I go about diagnosing and demonstrating these issues.

Firstly, let’s make a query execute 10 times faster with no effort, seriously no effort: a single tick box. That tick box is “Discard results after execution” in SSMS.

[screenshot: the ‘Discard results after execution’ option in SSMS query options]

SSMS is surprisingly slow at painting the text data on the screen; in fact, it seems woefully slow. With this tick box on, SQL Server does exactly the same work as without it: it throws the data down the wire as normal and waits for the client to respond, again as normal. SSMS, however, reads all the bytes off the wire but does not show you the data. This enables you to run queries at the speed SQL Server can deliver them, without being crippled by the client application’s speed. Yes, SSMS is not SQL Server; it is just another client application.
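
Conceptually, the discard option turns SSMS into a client that drains the result set as fast as it arrives without rendering it. A rough sketch of that kind of test harness (the connection string and query are placeholders, not anything shipped with SSMS):

using System;
using System.Data.SqlClient;
using System.Diagnostics;

class DrainReader
{
    static void Main()
    {
        var sw = Stopwatch.StartNew();
        using (var conn = new SqlConnection("Server=.;Database=AdventureWorks;Integrated Security=SSPI"))
        using (var cmd = new SqlCommand("SELECT * FROM Sales.SalesOrderHeader", conn))
        {
            conn.Open();
            long rows = 0;
            using (var reader = cmd.ExecuteReader())
            {
                // Read every row but do nothing with it: the server is rarely left
                // waiting on us, so ASYNC_NETWORK_IO waits stay minimal.
                while (reader.Read())
                {
                    rows++;
                }
            }
            Console.WriteLine($"{rows} rows drained in {sw.ElapsedMilliseconds} ms");
        }
    }
}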

So let’s test this theory with the simplest query possible, “SELECT * FROM Sales.SalesOrderHeader”. Here is the Profiler output from running that query 5 times:

[screenshot: Profiler output for the five executions]

An average of ~2.4 ish seconds.

Now let’s turn on ‘Discard results after execution’ (this is the only thing I have changed), re-run the query 5 times and…

[screenshot: Profiler output for the five executions with discard enabled]

Running at ~240ms, a 10-times speed improvement. Reads and CPU are in the same ballpark; the only difference is that SQL Server hasn’t been throttled by SSMS having to present the result set to the user. If you are trying to performance-tune a query, are you taking this into account? Maybe the problem is not where you think it is.

So why is the ASYNC_NETWORK_IO wait my favourite? Well, at the global level we can now find out how much of an effect this is having.

If I clear the waits and use Paul Randal’s “Tell me where it hurts” query when running without discard, then on ASYNC_NETWORK_IO alone SQL Server has ‘waited’ 10.44 seconds.

[screenshot: wait stats output showing 10.44 seconds of ASYNC_NETWORK_IO]

With discard on:

Nothing, no rows returned.  There were no waits.

Clearly this demonstrates that client application performance is a serious consideration in our scenario here.

Incidentally, where you execute a query from will make a hell of a difference. I was once involved in a performance issue where the DBAs could not reconcile the duration shown in a Profiler trace against the application with the same query executed in SSMS. The big differentiator was that SSMS was being run on the server itself (RDP’d in <facepalm>); the query was fine. If SSMS had been run over the saturated network connection that the applications were using, then the same ‘performance problem’ would have been witnessed. A ‘fair test’ is something we should always be looking to achieve.

SSMS has another often-underused feature (well, it has loads 🙂): Client Statistics.

This is enabled by this icon in SSMS.

[screenshot: the Client Statistics toolbar button in SSMS]

What this enables you to do is compare different executions across a number of different attributes:

[screenshot: Client Statistics comparison across several executions]

As you can see, my “Wait time on server replies”, the inverse if you like of ASYNC_NETWORK_IO, is 3 milliseconds.

What I particularly like here is the “Bytes received from server” metric. It is self-evident what that means, but in my experience it is a question rarely asked in performance tuning, and here we have an exact answer to it. With this simple number we can now ask, “Well, how long would it take to copy a file of the same size from server to client?” When the result set size is in GBs, this simple question will result in a fairly rapid re-architecting of a system.

In conclusion, ASYNC_NETWORK_IO is my favourite wait state as it means (fairly mendaciously 🙂) it’s not a SQL Server problem; SQL Server has done its work. Now, let’s talk about doing it right.

If you are interested in more real-world stories and hints and tips like this, then you’ll hear them in my precon at SQLSaturday Exeter on 11th March 2016; it would be great to see you there.

Feb 14

SQL Saturday 496 Exeter – PreCon Agenda

Hi

Unfortunately, due to matters outside of my control, SQL Saturday 502 in Edinburgh has been cancelled; I would like to thank the team, though, for selecting my pre-con. Let me know if you would have liked to attend, and if it is a viable option I would be more than happy to still give the session at some other time.

Also, after a few requests, I thought it best to publish a fuller agenda for the precon at Exeter.

Agenda AM

Plan and optimizer basics.   We start the day with an overview of the optimizer and touch on many different aspects of it: the functionality involved, why it can sometimes do ‘crazy’ things, and how to remediate such things. Overall it will look like this:

  • Physical and Logical Operators
  • Things the optimizer can do to help you
  • Things you can do to help the optimizer
  • Why schema definition is important
  • Implied knowledge vs explicit knowledge

Estimation and Statistics  We will build on the knowledge previously gained to understand how SQL Server ‘costs’ a plan. The basis of this is statistics, and we will drill down into the estimation algorithms and the ‘guesswork’ and assumptions that come with them.

  • The metrics used to cost plans
  • What does estimation mean ?
  • What are statistics and how are they used ?
  • What effect do constants and variables have?
  • Stored procedure effect
  • Remediation for when the assumptions made are wrong.

All being well, lunch 🙂

Agenda PM

SQL2014/16 – The new cardinality estimator The cardinality estimator had a significant rebuild in SQL Server 2014; this section will look at those changes and their effects. These are sometimes subtle and sometimes not. Significantly, though, if you are running SQL Server 2014 you will need to understand why your resource usage has changed compared to previous versions.

  • Cardinality Estimator was rewritten in SQL2014
  • Many changes that have subtle and not-so subtle effects
  • Incrementing statistics
    • Statistics at the partition level
  • SQL2016 – Query Store
    • Alerting you to plan changes

Query Optimizer Internals Finally, we will get further into the inner workings of the optimizer than you ever thought possible. If you wish to know why one candidate plan has been preferred over another, then this is where to look. This is a deep-dive session exposing inner workings and how to interpret some extremely verbose logging.

  • Deep dive using documented and undocumented trace flags
  • Gaining further insight into the optimizer operations
  • Examining Stages, The Memo, Trees and Rules

If this sounds like a fun way to spend the day, I would love your company 🙂

http://sqlsatexeter.azurewebsites.net/?page_id=5561 contains all the signup details.

Jan 17

SQL From South West to Scotland : SQLSaturdays #496 & #502

I’m happy, honoured and humbled to be able to say that I will be presenting pre-cons at both SQL Saturday 496 in Exeter and SQL Saturday 502 in Edinburgh.

SQL Saturday 496 Exeter is on Friday 11th March and Saturday 12th March

The event home page is here and signup for my class is:

http://sqlsatexeter.azurewebsites.net/?page_id=5561

SQL Saturday 502 Edinburgh is on Friday 10th and Saturday 11th June

The event home page is here and signup for my class is:

https://www.eventbrite.com/e/sql-server-2016-masterclass-query-processing-and-internals-tickets-20699726439

Both will follow the same format and general material, so there is no need to worry that you’ll miss out on something if you can only attend one and not the other, though I suspect that distance will be the better differentiator 🙂  We will be looking at the query optimizer and delving into its internals to find out how it builds a plan, what its limits and limitations are, and how to recognize that we have a ‘bad plan’.

Hope that you can make one or the other.

 

Jan 05

A first look at Redgate Reference Data Manager

Over the past few years, one of the big drives and innovations in the SQL Server world has been to bring database schema changes into a full ALM (Application Lifecycle Management) process. There are many competing products in this space, but all of them (well, that I know of) only solve database schema change problems. Databases are all about the data, and without data, applications are pretty useless. Most systems have some form of standing reference data: countries, currencies, name prefixes etc. Reference Data Manager is a (private) beta product from Redgate whose aim is to treat data as a first-class citizen in the ALM cycle.

Mostly, even if this data is ‘managed’ at all, it is managed by large MERGE (or MERGE-esque) scripts that validate the whole of the dataset at deploy time. The problem here is that these scripts are large by nature and the intent of any individual change is unclear; it’s all or nothing, and as human eyes do generally like to look over a deployment script before it runs, they make the process longer and more error-prone.

Essentially, Reference Data Manager is Redgate bringing the already mature technology of Data Compare into Microsoft’s SSDT environment using deployment contributors. This, Redgate working within SSDT, is interesting in itself, and I do hope that Redgate will bring more of their products into SSDT in the future. Also, and more pertinently, it means that the data is now maintainable within SSDT right next to the schema and is an integral part of the deployment, not an extra ‘cog’ somewhere.

So, let’s take a look; as with all Redgate tools, it’s pretty simple. I’ve defined a table, “Country”, which is nothing special:

CREATE TABLE [dbo].[Country]
(
	CountryId INT NOT NULL PRIMARY KEY,
	CountryName varchar(255) not null,
	Currency    char(3) not null 
)

Now, to add data, I simply right-click on “Country.Sql” in Solution Explorer and see a new “Add Reference Data” option:

That adds a child file to Country.Sql called Country.refdata.sql; this is the file in which you define your data. Presently this is a simple SQL script containing inserts:

insert into [dbo].[Country](CountryID,CountryName,Currency)
Values (1,'UK','GBP');
 
insert into [dbo].[Country](CountryID,CountryName,Currency)
Values (2,'USA','USA');
 
insert into [dbo].[Country](CountryID,CountryName,Currency)
Values (3,'France','EUR');
 
insert into [dbo].[Country](CountryID,CountryName,Currency)
Values (4,'Germany','EUR');
 
insert into [dbo].[Country](CountryID,CountryName,Currency)
Values (5,'Austrlia','AUD');

In the fullness of time I would expect a grid UI here, but as I say it’s beta ATM.

So now when I deploy my solution, Reference Data Manager kicks in and adds the relevant data changes to the deployment script. As this is a new table, these are obviously just inserts.

[screenshot: the generated deployment script containing the inserts]

What about adding a new country, modifying ‘USA’ to be ‘U.S.A.’ and correcting the currency error from ‘USA’ to ‘USD’? I modify the refdata.sql file to be:

insert into [dbo].[Country](CountryID,CountryName,Currency)
Values (1,'UK','GBP');
 
insert into [dbo].[Country](CountryID,CountryName,Currency)
Values (2,'U.S.A.','USD');
 
insert into [dbo].[Country](CountryID,CountryName,Currency)
Values (3,'France','EUR');
 
insert into [dbo].[Country](CountryID,CountryName,Currency)
Values (4,'Germany','EUR');
 
insert into [dbo].[Country](CountryID,CountryName,Currency)
Values (5,'Austrlia','AUD');
 
insert into [dbo].[Country](CountryID,CountryName,Currency)
Values (6,'New Zealand','NZD');

When I now deploy the solution there is no schema change, and the script contains only the data changes.

[screenshot: the generated deployment script containing only the data changes]

As you can see, it only publishes the required changes rather than re-testing everything with MERGE or whatever at deployment time. Obviously this is a really trivial example, but how many reference data tables does your system have? And how many rows are in each of those? Is the intent a lot clearer?

Personally, I think that by treating data as a first-class citizen, Reference Data Manager will not only enable data to be source-controlled but also simplify the deployment pipeline by being very explicit about the changes that are to be made.

If you would like to get involved, you can sign up for the beta here: http://www.red-gate.com/ssdt

Oct 04

SQL Saturday 467 – Precon – Query processing and internals

I’m not doing a tremendous amount of public speaking this year, concentrating instead on more professional matters; however, I’m pleased to say that on Friday 5th December I will be presenting a day-long pre-con on query processing and internals at SQL Saturday 467 in Southampton.

We will spend the whole day looking at the optimizer and picking apart its internals to show how it generates a plan from your given SQL. This will cover a wide range of topics, starting from execution plan basics and working through to demonstrating internals with undocumented trace flags.

Other precons are available if this doesn’t float your boat.   Full details, pricing and hotel discount codes are available from http://www.sqlsaturday.com/467/eventhome.aspx