April 23, 2014

# SQL Server bitmap operators, bitmasks and bit arrays ^{1}

In this post I cover what bitmap and bitmasks are, before I can do that I need to make sure you are up-to speed on binary, base 2 and how the bits are layed out in data.

**Binary**

Binary (Base 2) is used throughout computer systems, the Windows platform amongst others relies on it.

Base 2 is represented by one of more bits, each bit can be 0 or 1, when a bit is set to 1 it means that bit is enabled/on/set/flipped – however you want to put it!

Note, the least significant bit (the one on the far right) is the start of the bit-array, we will align with the fact that array bounds usually start at 0 and move from right to left with each bit travelling left having successive increasing value.

*Example:*

bit-number: 2 1 0

decimal if set: 4 2 1

Dec Binary

0 000

1 001

2 010

3 011

4 100

5 101

6 110

Looking at the above example, to encode decimal 4 we simply set bit 3 to 1 and the others to 0.

**Binary AND / OR**

SQL Server uses the symbol & to indicate AND, and | to indicate OR; Logical rules determine that:

1 | 1 is 1

1 | 0 is 1

0 | 1 is 1

0 | 0 is 0

1 & 1 is 1

1 & 0 is 0

0 & 1 is 0

0 & 0 is 0

**Bit-array**

There are 8 bits in a byte, for the fun of it there are 4 bits in a nibble! I’m not going to talk about words and double words – most folk nowadays would probably give you a blank look, but, everybody knows bytes – KISS (Keep it Simple Stupid) applies!

A bit-array is simply that: an array of bits, just like you’d have an array of integers, it’s all in the data type. Ok, in storage terms because the minimum storage is 8 bytes and everything works in bytes and multiples thereof then storing 4 bits will still actually take 8 bits.

**Bitmap**

A bitmap simply “maps” a domain of values e.g. the integer values 0 to 255 to a range of bits e.g. {128, 64, 32, 16, 8, 4, 2, 1}.

Given a domain of values, for example Genders {Male, Female, Unknown}; the bitmap will be stored as a bit-array, thus, each unique value has a single entry in that array – what on earth do I mean by that?

To store 3 unique values as decimal i.e. 1, 2 and 3 we need just 2 bits i.e. 01 (decimal 1), 10 (decimal 2) and 11 (decimal 3).

To store 3 unique values in a bitmap (bit-array) you need 3 bits i.e. bitarray position 0 (decimal 1), bit array position 1 (decimal 2), bit array position 2 (decimal 4) which gives the 3 unique values 001, 010 and 100.

**BitMask**

A bitmask is simply a way of masking off a number of bits in a bitmap you wish to compare using a logical AND – matching elements between two arrays, for example:

Element BitMap BitMask

0 0 0

1 1 1

2 0 0

3 0 0

4 1 1

5 0 0

6 0 0

7 1 1

We can see that the BitMap matches the BitMask, so the target value in the “bit-mask” is true.

You may already have gathered that you must test all elements of the array otherwise you may get a partial match resulting in a false positive, for example if you only tested the first 4 elements of the array.

(SFTW) SQL Server Links 25/04/14 • John Sansom

April 24, 2014@ 6:06 pm[…] SQL Server bitmap operators, bitmasks and bit arrays – Tony Rogerson (Blog|Twitter) […]