Mar 13

A first look at SQL Server 2016 – STRING_SPLIT

I’ve been developing T-SQL for about 20 years now, and almost since day one there has been functionality that, IMO, has been missing. That is grouped string concatenation (i.e. forming a delimited string) and string split (breaking up a delimited string). When I say missing, I really mean some form of succinct functionality that can do it, not some form of hideous kludge that can cause a religious argument between various members of the community about why their method is better than anyone else’s 🙂

My hopes were not high for SQL Server 2016 fixing this either when this article https://blogs.msdn.microsoft.com/sqlserverstorageengine/2015/09/08/passing-arrays-to-t-sql-procedures-as-json/ was published. I’ve never been convinced by complex (i.e. XML) datatypes within the relational database, there are better ways if that is what you have to do, but, <sarcasm> yippee </sarcasm>, with JSON baked in, here is another method of string splitting rather than doing it properly. Honestly, it really does make me want to cry.
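
For context, the JSON route from that article amounts to something like this (my own minimal sketch, not the article’s exact code):

-- Pass the 'array' as a JSON string and shred it with OPENJSON
-- (needs database compatibility level 130)
DECLARE @list nvarchar(max) = N'["one","two","three"]';

SELECT [value]
FROM OPENJSON(@list);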

But, with the release of SQL Server 2016 RC0, snuck into the release notes was this:

[image: RC0 release notes announcing the new STRING_SPLIT function]

Woohoo, a STRING_SPLIT function, finally, at last; let’s break out the bunting.

So, let’s have a quick investigation into how to use it and whether there are any caveats around it.

I won’t repeat the official documentation https://msdn.microsoft.com/en-gb/library/mt684588.aspx but to recap, using it is really simple: STRING_SPLIT(string, delimiter), where string is the delimited string to split and delimiter is the character to split it on. Notably, the delimiter can only be a single character; that’s not a deal breaker by any means, but it could be an annoyance.
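
In its simplest form it looks like this:

-- requires database compatibility level 130
SELECT [value]
FROM STRING_SPLIT('one,two,three', ',');
-- returns three rows: one, two, three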

Right, I’m now assuming that you are familiar with the official docs, so is there anything else to know about this function before we go ahead and immediately replace all our previous kludges with it?

Firstly, anyone that knows me will know that I’m interested in estimations (and the wider optimization area), so the first thing is: how many rows are estimated to be returned from the TVF?

Here is a script to test this:

Create Table Strings
(
    String varchar(max)
)
go
insert into Strings values('one,two,three')
go
Select *
from Strings
Cross Apply string_split(strings.String,',')
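
For reference, the CROSS APPLY returns one row per element alongside the original column, so the result set looks like this:

/*
String          value
--------------  -----
one,two,three   one
one,two,three   two
one,two,three   three
*/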

The plan is as you would expect from a CLR TVF; in fact I suspect that this is just a CLR TVF implementation, but I’m not gonna state that as such 🙂

[image: execution plan showing a Table Valued Function operator]

and the detail for the TVF?

[image: Table Valued Function operator properties]

50; the estimate of the rows output is 50. After a number of tests this hasn’t changed. In line with similar TVF functionality, this isn’t surprising news, but I was curious as to what the number was.
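
If you want to reproduce the check, one way (a sketch; hovering over the operator in SSMS works just as well) is to capture the plan and look at the EstimateRows attribute on the Table Valued Function node:

SET STATISTICS XML ON;
GO
SELECT *
FROM Strings
CROSS APPLY STRING_SPLIT(Strings.String, ',');
GO
SET STATISTICS XML OFF;
GO
-- In the returned showplan XML, the Table Valued Function RelOp
-- shows EstimateRows="50", whatever the input string (in my tests).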

Secondly, as I’m suspecting that this is just CLR: I’ve found in the past that the optimizer plays it ‘safe’, and as TVFs are opaque to the optimizer, Halloween protection is implemented to protect against the possibility that the same data is being read as is being updated.

Does this happen here? Test code is:

Create Table Strings
(
    String varchar(max)
)
go
Create Table Dest
(
    String varchar(max)
)
go
insert into Strings values('one,two,three')
go
Insert into Dest(String)
Select Value
from Strings
Cross Apply string_split(strings.String,',')

Notice that I’m inserting into a different table.

And….

[image: execution plan showing a Table Spool operator]

Awesome, a table spool; so the same does apply here, we have Halloween protection, and on large result sets that spooling is going to hurt.

If you need to insert the result somewhere, then SELECT INTO won’t need this, as it knows you are creating a new table so there is no chance of a double update. But if you SELECT INTO a table and then INSERT those rows into the table you intended in the first place, that is effectively the same thing as a spool, so more than likely not worth the effort.
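
As a sketch, that SELECT INTO variant would look like this (Dest2 is just a stand-in name for the brand new table the statement creates):

SELECT [value]
INTO Dest2  -- a new table, so in theory no spool is needed
FROM Strings
CROSS APPLY STRING_SPLIT(Strings.String, ',');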

Finally, to finish this blog off, here is a query with a parallel plan (note the cross apply is commented out):

SELECT sod.SalesOrderDetailID,sod.ModifiedDate,p.Style
FROM Sales.SalesOrderDetail sod
INNER JOIN Production.Product p ON sod.ProductID = p.ProductID
--cross apply string_split(style,',')
ORDER BY Style

Here is a query with a serial plan:

SELECT sod.SalesOrderDetailID,sod.ModifiedDate,p.Style
FROM Sales.SalesOrderDetail sod
INNER JOIN Production.Product p ON sod.ProductID = p.ProductID
cross apply string_split(style,',')
ORDER BY Style

Right <sigh>, so it inhibits a parallel plan being built, sheesh. Also, FYI, there is no NonParallelPlanReason specified in the plan.
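
If the surrounding query really needs parallelism, one workaround worth trying (I haven’t benchmarked it, and #styles is just a stand-in name) is to materialise the split output first so the main statement is free to go parallel:

-- Serial statement: do the split on its own
SELECT p.ProductID, ss.[value] AS StylePart
INTO #styles
FROM Production.Product p
CROSS APPLY STRING_SPLIT(p.Style, ',') ss;

-- Main statement: no TVF involved, so a parallel plan is possible
SELECT sod.SalesOrderDetailID, sod.ModifiedDate, s.StylePart
FROM Sales.SalesOrderDetail sod
INNER JOIN #styles s ON sod.ProductID = s.ProductID
ORDER BY s.StylePart;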

That’ll do for now, but I will be running some performance tests soon to see if it is worth the trouble.

UPDATE 2016-03-21: for performance tests, Aaron Bertrand has done a comprehensive write-up.

Mar 29

T4 Support in SSDT

Sometimes a piece of technology passes you by completely; T4 templating happens to have done that for me. On Tuesday night, at SQL Supper, Geoff Clark (a UK MCM) (t) gave a fantastic presentation on end-to-end data warehouse loading, a portion of which was on T4 templating. The question arose about SSDT and T4 templating, and by coincidence literally the next day the SSDT team put out the March 2014 release. Having my interest already piqued, I thought I’d have a play 🙂

So… let’s add a TSQL Template

[image: Add New Item dialog showing the TSQL Template item]

That will now add a base TSQL template into your project. However, there is a slight snag: it’s broken 🙁

[image: error shown when running the template]

This is confirmed and will be fixed in the next release, as stated in this MSDN forum thread. Bugger!!!

However, until then, all is not lost: although the DacFX portion is broken, T4 templating still works, so a more ‘generic’ solution is available. The basis of this is remarkably similar to TSQL Smells: iterate through a project adding the .sql files into a model, then examine the DOM for interesting ‘stuff’.

After a bit of playing around, very much as a T4 noob, this is the script I came up with:

<#@ template language="C#" debug="true" hostspecific="true" #>
<#@ assembly name="Microsoft.VisualStudio.Shell.Interop.8.0" #>
<#@ assembly name="EnvDTE" #>
<#@ assembly name="EnvDTE80" #>
<#@ assembly name="VSLangProj" #>
<#@ assembly name="C:\Program Files (x86)\Microsoft SQL Server\120\SDK\Assemblies\Microsoft.SqlServer.TransactSql.ScriptDom.dll" #>
<#@ assembly name="C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\IDE\Extensions\Microsoft\SQLDB\DAC\120\Microsoft.SqlServer.Dac.dll" #>
<#@ assembly name="C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\IDE\Extensions\Microsoft\SQLDB\DAC\120\Microsoft.SqlServer.Dac.Extensions.dll" #>
<#@ import namespace="Microsoft.VisualStudio.Shell.Interop" #>
<#@ import namespace="EnvDTE" #>
<#@ import namespace="EnvDTE80" #>
<#@ import namespace="Microsoft.VisualStudio.TextTemplating" #>
<#@ import namespace="Microsoft.SqlServer.Dac" #>
<#@ import namespace="Microsoft.SqlServer.Dac.Model" #>
<#@ import namespace="System.IO" #>
<#@ import namespace="System.Collections.Generic" #>
<#@ output extension=".sql" #>
 
-- Dynamic File generated by db
<#
    var hostServiceProvider = (IServiceProvider)this.Host;
    var dte = (DTE)hostServiceProvider.GetService(typeof(DTE));
 
    using (TSqlModel model = new TSqlModel(SqlServerVersion.Sql110, new TSqlModelOptions { }))
    {
        foreach(Project project in dte.Solution)
        {
            IterateThroughProject(project.ProjectItems,model);
        }
 
        List<TSqlObject> allTables = GetAllTables(model);
        foreach (var table in allTables)
        {
#>
--				Table <#= table.Name.Parts[0] #>.<#= table.Name.Parts[1] #>
<#
 
        }
    }
 
#>
-- File Done
<#+
 
    public List<TSqlObject> GetAllTables(TSqlModel model)
    {
        List<TSqlObject> allTables = new List<TSqlObject>();
 
        var tables = model.GetObjects(DacQueryScopes.All, ModelSchema.Table);
        if (tables != null)
        {
            allTables.AddRange(tables);
        }
        return allTables;
    }
 
 
 
    private void IterateThroughProject(ProjectItems PrjItems,TSqlModel model)
    {
        foreach(ProjectItem PrjItem in  PrjItems)
        {
 
            if(PrjItem.Name.EndsWith(".tt", StringComparison.OrdinalIgnoreCase)){ // Dont Load the files we want to build
                continue;
 
            }
            if(PrjItem.ProjectItems!=null)
            {
                IterateThroughProject(PrjItem.ProjectItems,model);
            }
            if(//PrjItem.Object.GetType().ToString() == "Microsoft.VisualStudio.Data.Tools.Package.Project.DatabaseFileNode" && 
                PrjItem.Name.EndsWith(".sql", StringComparison.OrdinalIgnoreCase))
            {
#>
--				This is a sql file and will be processed
--				<#= PrjItem.FileNames[0] #>
<#+
                if (!PrjItem.Saved)
                {
                    PrjItem.Save();
                }
                StreamReader Reader = new StreamReader(PrjItem.FileNames[0]);
 
                string Script = Reader.ReadToEnd();
                model.AddObjects(Script);
            }
        }
    }
 
#>

This should all be fairly self-explanatory: we iterate through the project in the member function IterateThroughProject, adding any found .sql files to the model, then use the model.GetObjects member to find all tables in the model and iterate over that list, printing schema and table name.
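
To make that concrete: for a project containing a single .sql file defining a table dbo.Strings, the generated output would look roughly like this (the path here is obviously machine-specific):

-- Dynamic File generated by db
--				This is a sql file and will be processed
--				C:\Dev\MyDbProject\Strings.sql
--				Table dbo.Strings
-- File Done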

I have seen a few clunky TSQL generation routines written in TSQL, but I think porting those into T4 templating and having them built directly into a dacpac will be a big boon.

Have fun

Mar 13

What’s This Command Bar?

One of the challenges I faced while creating my TSQL Smells Visual Studio (SSDT) add-in was trying to find out exactly what the names of the Command Bar objects internal to Visual Studio were.

Most of the advice on the internet is fairly haphazard, i.e. guess / trial and error, but there is a simpler way.

Here http://1drv.ms/1fsR3n3 is a C# project that will add the Command Bar name into every command bar. After dropping that into your addins directory you should see something like this:

[image: Visual Studio command bars each displaying their own name]

The code itself is pretty simple:

public void OnConnection(object application, ext_ConnectMode connectMode, object addInInst, ref Array custom)
{
    _applicationObject = (DTE2)application;
    _addInInstance = (AddIn)addInInst;
    if(connectMode == ext_ConnectMode.ext_cm_UISetup)
    {
        object []contextGUIDS = new object[] { };
        string toolsMenuName = "Tools";
 
        Commands2 commands = (Commands2)_applicationObject.Commands;
        Microsoft.VisualStudio.CommandBars.CommandBars Bars = (Microsoft.VisualStudio.CommandBars.CommandBars)_applicationObject.CommandBars;
 
       // Walk every command bar, adding to each a button whose caption is the bar's own name
       for(int x=1;x<Bars.Count;x++){
           Microsoft.VisualStudio.CommandBars.CommandBar Cb = Bars[x];
           try
           {
                Command command = commands.AddNamedCommand2(_addInInstance,"Addin"+x.ToString(),
                    "This CmdBar is ["+Cb.Name+"]" , "Executes the command for My Addin",
                    true, 59, ref contextGUIDS,
                    (int)vsCommandStatus.vsCommandStatusSupported +
                    (int)vsCommandStatus.vsCommandStatusEnabled,
                    (int)vsCommandStyle.vsCommandStylePictAndText,
                    vsCommandControlType.vsCommandControlTypeButton);
 
                if ((command != null))
                {
                     // Place the new button first on the bar so it is easy to spot
                     CommandBarControl ctrl =
                    (CommandBarControl)command.AddControl(Cb, 1);
 
                    ctrl.TooltipText = "Executes the command for MyAddin";
                }
            }
            catch
            {
                // Some bars refuse new controls; skip them silently
            }
        }
    }
}
 
public void QueryStatus(string commandName, vsCommandStatusTextWanted neededText, ref vsCommandStatus status, ref object commandText)
{
    if(neededText == vsCommandStatusTextWanted.vsCommandStatusTextWantedNone)
    {
        if(commandName.StartsWith("WhatsThisCommandBar.Connect.Addin"))
        {
             status = (vsCommandStatus)vsCommandStatus.vsCommandStatusSupported|vsCommandStatus.vsCommandStatusEnabled;
             return;
        }
    }
}

Hope this helps someone struggling to find the correct command bar name.

Mar 04

SSDT – Cut n Paste substituting CMDVARs

Another thing that slows me down with SSDT is testing stored procedures that use CMDVARs.

Consider this proc:

[image: Prc1, a stored procedure referencing $(CmdSwitch) and $(Otherdb)]
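
The screenshot won’t reproduce here, but the proc is along these lines (dbo.SomeTable is a stand-in name):

CREATE PROCEDURE dbo.Prc1
AS
BEGIN
    -- $(CmdSwitch) and $(Otherdb) are SQLCMD variables resolved at deploy time
    IF '$(CmdSwitch)' = 'True'
    BEGIN
        SELECT *
        FROM [$(Otherdb)].dbo.SomeTable;
    END
END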

If I wish to test that in SSMS, I can cut-n-paste it in, then use SQLCMD mode and add in :setvar statements, or simply replace the $(CmdSwitch) and $(Otherdb) with the appropriate values. OK for one or two values, but what if there are lots? What if there were a simpler way? Well, now there is:
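
In SQLCMD mode that means prefixing the pasted proc with something like this (the values are examples):

:setvar CmdSwitch True
:setvar Otherdb MyOtherDatabase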

Introducing “CmdVar Cut”

[image: context menu showing the CmdVar Cut option]

Selecting this option will place the text, with the values substituted, directly into your clipboard, ready for pasting into SSMS. As with Fast Deploy, the values are defined in Tools->Options->TSQL Smells->Deploy CMDVars.

The download for this is available from here http://dataidol.com/davebally/ssdt-tsql-smells-add-in/

Have fun

Mar 01

SSDT – Fast Deploy

Let me state first off that I really do like SSDT; integration of a source control system of your choice and all its extra features make its use a real boon.

However……..

On my past few assignments SSDT has been heavily used, and deployment is done via Build/Deploy PowerShell scripts. This is all very well and good and makes for predictable, repeatable deployments, but it does slow you down rather a lot: you make a change to a SQL artefact, then build, which takes a minute or two, then deploy, which once again takes a minute or two. Over the course of a day this adds up to rather a lot of time. It also means that mostly development is done in SSMS and then cut-n-pasted into SSDT on completion, which really negates a lot of the advantages of SSDT and makes it no more useful than a historic system using SourceSafe as source control.

This got me to thinking that there should be a faster way to deploy during development, so I have now extended TSQL Smells to do exactly that. Right-clicking on a file (or folder) will now show you the “Fast Deploy” option:

[image: context menu showing the Fast Deploy option]

This will deploy the selected scripts directly to your selected SQL host without the kerfuffle of a full build-deploy cycle. User feedback is given in the Output->TSQL Smells window; if nothing appears to be happening, this should tell you why.

CMDVars are supported in Tools->Options->TSQL Smells->Deploy CMDVars; just enter them as key-value pairs. These are kept in the registry so they will persist across SSDT sessions; you also set the database to connect to here.

You can download the latest version from here : http://1drv.ms/MEOp6H

Install instructions etc are available here : http://dataidol.com/davebally/ssdt-tsql-smells-add-in/

Let me know how you get on : parser@clearskysql.co.uk