Changing In-Memory Table definition without down time

One of the restrictions of a “memory optimised” table is that you cannot modify its definition once it has been created. Once your table has been in use for a while it may become apparent that you don’t need one of the hash indexes, or that you made a hash (pardon the pun) of the BUCKET_COUNT, or that you need to change a hash index into a range index because the average row chain is too long due to the low cardinality of the indexed columns.
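
As a quick diagnostic, the hash index statistics DMV shows bucket usage and chain lengths, so you can see whether a BUCKET_COUNT or cardinality problem actually exists. Something along these lines (a sketch, run against your own database) will do:

select	object_name( hs.object_id )	as table_name,
	i.name				as index_name,
	hs.total_bucket_count,
	hs.empty_bucket_count,
	hs.avg_chain_length,		-- high averages with few empty buckets suggest
	hs.max_chain_length		-- too few buckets or low cardinality keys
from sys.dm_db_xtp_hash_index_stats hs
	inner join sys.hash_indexes i on i.object_id = hs.object_id
				     and i.index_id  = hs.index_id;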

Consider the table definition below:

create table mytable (
	id	uniqueidentifier not null 
		primary key nonclustered hash( id ) with ( bucket_count = 8192 ),
 
	account_id		int not null,
	trans_amount	decimal( 20, 2 ) not null,
	entry_date		datetime2 not null default( current_timestamp ),
 
		index nc_account_id nonclustered ( account_id ),
		index nc_entry_date nonclustered ( entry_date )
 
	)  WITH ( MEMORY_OPTIMIZED=ON, 
			  DURABILITY=SCHEMA_AND_DATA 
			  );

Once the table has been created, if you need to adjust the BUCKET_COUNT you’d have to copy the data out, drop and recreate the table, and copy the data back in, and that causes down time for your application.

By utilising a feature that has been in the product for as long as I can remember, Partitioned Views, we can write some simple code that lets you create the new table alongside the old one and move rows across to the new table while the application is still alive and writing data.

Essentially we will use two tables (new and old), a flag to indicate which table we should be writing to for Inserts, Updates and Deletes, and some INSTEAD OF triggers.

Unfortunately you need to put this in place when you first set up your original table, so that you pre-empt the likelihood that you’ll have to do some form of maintenance later on.

We need a control table to flag which partition we should currently be writing to, i.e. which table is the master:

create table active_partition (
	part_id tinyint not null
	);
insert active_partition ( part_id )
    values ( 1 );

To apply this method to our example table at the top of this post we need two versions of the table, each containing a new part_id column, plus a view that takes the original table’s name (the view replaces the single mytable table).

create table mytable_p1 (
	part_id	tinyint not null default( 1 ),
 
	id	uniqueidentifier not null 
		primary key nonclustered hash( id ) with ( bucket_count = 8192 ),
 
	account_id		int not null,
	trans_amount	decimal( 20, 2 ) not null,
	entry_date		datetime2 not null default( current_timestamp ),
 
		index nc_account_id nonclustered ( account_id ),
		index nc_entry_date nonclustered ( entry_date )
 
	)  WITH ( MEMORY_OPTIMIZED=ON, 
			  DURABILITY=SCHEMA_AND_DATA 
			  );
 
create table mytable_p2 (
	part_id	tinyint not null default( 2 ),
 
	id	uniqueidentifier not null 
		primary key nonclustered hash( id ) with ( bucket_count = 16384 ),	-- note: _p2 carries the revised, larger bucket count
 
	account_id		int not null,
	trans_amount	decimal( 20, 2 ) not null,
	entry_date		datetime2 not null default( current_timestamp ),
 
		index nc_account_id nonclustered ( account_id ),
		index nc_entry_date nonclustered ( entry_date )
 
	)  WITH ( MEMORY_OPTIMIZED=ON, 
			  DURABILITY=SCHEMA_AND_DATA 
			  );
go
 
create view mytable
as
	select part_id, id, account_id, 
		   trans_amount, entry_date
	from mytable_p1
	union all
	select part_id, id, account_id, 
		   trans_amount, entry_date
	from mytable_p2;
go

We are not quite ready yet: we need INSTEAD OF triggers to intercept each Insert, Update or Delete and, depending on which partition is flagged as the master in the control table ( active_partition ), write to the correct physical memory optimised table.

create trigger trg_mytable_ins on mytable
	instead of insert
as
begin
	set nocount on;
 
	if ( select part_id
		 from active_partition
		 ) = 1
 
		INSERT mytable_p1 ( id, account_id, trans_amount, entry_date )
			SELECT id, account_id, trans_amount, entry_date
			FROM inserted
 
	else
		INSERT mytable_p2 ( id, account_id, trans_amount, entry_date )
			SELECT id, account_id, trans_amount, entry_date
			FROM inserted
 
end
go
 
create trigger trg_mytable_upd on mytable
	instead of update
as
begin
	set nocount on;
 
	UPDATE p
		SET account_id = i.account_id,
		    trans_amount = i.trans_amount,
		    entry_date = i.entry_date
	FROM mytable_p1 p with ( snapshot )	
		INNER JOIN inserted i on i.id = p.id
	UPDATE p
		SET account_id = i.account_id,
		    trans_amount = i.trans_amount,
		    entry_date = i.entry_date
	FROM mytable_p2 p with ( snapshot )	
		INNER JOIN inserted i on i.id = p.id
 
end
go
 
create trigger trg_mytable_del on mytable
	instead of delete
as
begin
	set nocount on;
 
	DELETE mytable_p1 with ( snapshot )
	WHERE id IN ( SELECT id FROM deleted )
 
	DELETE mytable_p2 with ( snapshot )
	WHERE id IN ( SELECT id FROM deleted )
 
end
go

That’s the INSERT, UPDATE and DELETE statements covered and working across the correct partitions. If it hasn’t already occurred to you, this is also a technique you can use to mix a memory-optimised table and a storage based table, for example in a current v archive type tiered structure (more on that in another post).
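
As a quick sanity check you can exercise the view exactly as the application would. A sketch (the values are made up; every view column is supplied explicitly, which is the safest thing to do when an INSTEAD OF trigger sits on a view, and the trigger ignores the supplied part_id and routes on active_partition):

-- with active_partition still set to 1 this row should land in mytable_p1
INSERT mytable ( part_id, id, account_id, trans_amount, entry_date )
	VALUES ( 1, NEWID(), 42, 99.95, CURRENT_TIMESTAMP );

-- confirm which physical partition the rows are sitting in
SELECT part_id, COUNT(*) AS row_cnt
FROM mytable
GROUP BY part_id;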

The next thing needed is the ability to move rows from one partition to another: basically we are going to move all the rows from, say, _p1 into _p2 so that we can drop _p1 and recreate it with the correct properties, and then move all the rows back from _p2 into _p1.

Moving Rows

The trick here is not to run out of memory! Hekaton uses MVCC, and the MV (Multi Version) is the important part here: when you DELETE a row it is not actually removed straight away; it is flagged for delete in the Checkpoint File Pair (CFP) delta file, while in memory the row’s transaction end timestamp is set. Garbage collection kicks in to remove the row from physical memory, but that is not immediate. You might end up with double the amount of data, plus any row versions that are floating around because of updates too!
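
It is worth checking how much memory the table currently takes so you know roughly what the temporary doubling-up will cost; one way is the table memory statistics DMV (a sketch, assuming the tables live in the dbo schema):

-- memory currently allocated/used by the partition we are about to copy
select	object_name( object_id )	as table_name,
	memory_allocated_for_table_kb,
	memory_used_by_table_kb,
	memory_allocated_for_indexes_kb,
	memory_used_by_indexes_kb
from sys.dm_db_xtp_table_memory_stats
where object_id = object_id( N'dbo.mytable_p1' );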

The first thing to do is to re-route row INSERTs into the new partition:

UPDATE active_partition SET part_id = 2

Depending on the size of your data you might want to batch/trickle the rows across over a period of a few hours; for simplicity I’m doing it in one go (see below), and a batched sketch follows the one-shot version.

SELECT id
INTO #to_move
FROM mytable_p1;
 
BEGIN TRAN;
 
INSERT mytable_p2 ( id, part_id, account_id, trans_amount, entry_date )
	SELECT id, 2, account_id, trans_amount, entry_date	-- part_id is written as 2 so the row reflects its new partition
	FROM mytable_p1 WITH ( SNAPSHOT )
	WHERE id IN ( SELECT id
		      FROM #to_move );
 
DELETE mytable_p1 WITH ( SNAPSHOT )
WHERE id IN ( SELECT id
	      FROM #to_move );
 
COMMIT TRAN;
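
If the table is large and you would rather trickle the rows across in batches, a minimal sketch of the same move (instead of the single transaction above, reusing #to_move, with a purely illustrative batch size of 10,000 ids) might look like this:

DECLARE @batch TABLE ( id uniqueidentifier primary key );

WHILE 1 = 1
BEGIN
	DELETE FROM @batch;

	-- grab the next slice of ids still waiting to be moved
	INSERT @batch ( id )
		SELECT TOP ( 10000 ) id
		FROM #to_move;

	IF @@ROWCOUNT = 0
		BREAK;

	BEGIN TRAN;

	INSERT mytable_p2 ( id, part_id, account_id, trans_amount, entry_date )
		SELECT id, 2, account_id, trans_amount, entry_date
		FROM mytable_p1 WITH ( SNAPSHOT )
		WHERE id IN ( SELECT id FROM @batch );

	DELETE mytable_p1 WITH ( SNAPSHOT )
	WHERE id IN ( SELECT id FROM @batch );

	DELETE #to_move
	WHERE id IN ( SELECT id FROM @batch );

	COMMIT TRAN;
END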

The partition mytable_p1 should now be empty – make sure you check because you are about to drop the table!
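
A quick check before dropping anything:

-- this should return 0 before you go anywhere near DROP TABLE
SELECT COUNT(*) AS rows_left
FROM mytable_p1;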

The reference to mytable_p1 now needs to be removed. The first thing to do is to change the partitioned view so that it no longer references mytable_p1; however, triggers exist on the view, so you need to drop those first and then alter the view:

drop trigger trg_mytable_del;
drop trigger trg_mytable_ins;
drop trigger trg_mytable_upd;
go
alter view mytable
as
	select part_id, id, account_id, 
		   trans_amount, entry_date
	from mytable_p2;

The reference to mytable_p1 has now been removed, so you are free to drop and recreate that table however you please.
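
For example, to fix the original problem you might recreate mytable_p1 with the larger BUCKET_COUNT (the rest of the definition is unchanged):

drop table mytable_p1;
go

create table mytable_p1 (
	part_id	tinyint not null default( 1 ),

	id	uniqueidentifier not null 
		primary key nonclustered hash( id ) with ( bucket_count = 16384 ),

	account_id		int not null,
	trans_amount	decimal( 20, 2 ) not null,
	entry_date		datetime2 not null default( current_timestamp ),

		index nc_account_id nonclustered ( account_id ),
		index nc_entry_date nonclustered ( entry_date )

	)  WITH ( MEMORY_OPTIMIZED=ON, 
			  DURABILITY=SCHEMA_AND_DATA 
			  );
go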

Putting rows back is just a case of putting the triggers and view back as described at the start of the article; remember to change the part_id in active_partition back to 1, and lastly move the rows from _p2 back into _p1.
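
A condensed sketch of the reverse leg (a sketch only: the three triggers are recreated exactly as shown earlier and are not repeated, and #to_move_back is just an illustrative name):

-- put mytable_p1 back into the partitioned view
alter view mytable
as
	select part_id, id, account_id, 
		   trans_amount, entry_date
	from mytable_p1
	union all
	select part_id, id, account_id, 
		   trans_amount, entry_date
	from mytable_p2;
go

-- recreate trg_mytable_ins, trg_mytable_upd and trg_mytable_del as before,
-- then point new INSERTs back at partition 1
UPDATE active_partition SET part_id = 1;

-- finally move the rows back, the same pattern as the _p1 to _p2 move
SELECT id
INTO #to_move_back
FROM mytable_p2;

BEGIN TRAN;

INSERT mytable_p1 ( id, part_id, account_id, trans_amount, entry_date )
	SELECT id, 1, account_id, trans_amount, entry_date
	FROM mytable_p2 WITH ( SNAPSHOT )
	WHERE id IN ( SELECT id FROM #to_move_back );

DELETE mytable_p2 WITH ( SNAPSHOT )
WHERE id IN ( SELECT id FROM #to_move_back );

COMMIT TRAN;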

That’s about it. One thing to note – this will not work with natively compiled stored procedures (SQL 2014 RTM).

UPDATE 12th June 2014, 09:23

I forgot to put the WITH ( SNAPSHOT ) hint on the UPDATE and DELETE INSTEAD OF triggers – it is required because statements that access a memory optimised table need it when they are not running as auto-commit transactions.