SQL Server bitmap operators, bitmasks and bit arrays 1

In this post I cover what bitmap and bitmasks are, before I can do that I need to make sure you are up-to speed on binary, base 2 and how the bits are layed out in data.

Binary

Binary (Base 2) is used throughout computer systems, the Windows platform amongst others relies on it.

Base 2 is represented by one of more bits, each bit can be 0 or 1, when a bit is set to 1 it means that bit is enabled/on/set/flipped – however you want to put it!

Note, the least significant bit (the one on the far right) is the start of the bit-array, we will align with the fact that array bounds usually start at 0 and move from right to left with each bit travelling left having successive increasing value.

Example:

bit-number:         2           1           0
decimal if set:     4            2           1

Dec     Binary

0         000
1          001
2          010
3          011
4          100
5          101
6          110

Looking at the above example, to encode decimal 4 we simply set bit 3 to 1 and the others to 0.

Binary AND / OR

SQL Server uses the symbol & to indicate AND, and | to indicate OR; Logical rules determine that:

1 | 1 is 1
1 | 0 is 1
0 | 1 is 1
0 | 0 is 0

1 & 1 is 1
1 & 0 is 0
0 & 1 is 0
0 & 0 is 0

Bit-array

There are 8 bits in a byte, for the fun of it there are 4 bits in a nibble! I’m not going to talk about words and double words – most folk nowadays would probably give you a blank look, but, everybody knows bytes – KISS (Keep it Simple Stupid) applies!

A bit-array is simply that: an array of bits, just like you’d have an array of integers, it’s all in the data type. Ok, in storage terms because the minimum storage is 8 bytes and everything works in bytes and multiples thereof then storing 4 bits will still actually take 8 bits.

Bitmap

A bitmap simply “maps” a domain of values e.g. the integer values 0 to 255 to a range of bits e.g. {128, 64, 32, 16, 8, 4, 2, 1}.

Given a domain of values, for example Genders {Male, Female, Unknown}; the bitmap will be stored as a bit-array, thus, each unique value has a single entry in that array – what on earth do I mean by that?

To store 3 unique values as decimal i.e. 1, 2 and 3 we need just 2 bits i.e. 01 (decimal 1), 10 (decimal 2) and 11 (decimal 3).

To store 3 unique values in a bitmap (bit-array) you need 3 bits i.e. bitarray position 0 (decimal 1), bit array position 1 (decimal 2), bit array position 2 (decimal 4) which gives the 3 unique values 001, 010 and 100.

BitMask

A bitmask is simply a way of masking off a number of bits in a bitmap you wish to compare using a logical AND – matching elements between two arrays, for example:

Element      BitMap        BitMask

0                        0                             0
1                        1                             1
2                        0                             0
3                        0                             0
4                        1                             1
5                        0                             0
6                        0                             0
7                        1                             1

We can see that the BitMap matches the BitMask, so the target value in the “bit-mask” is true.

You may already have gathered that you must test all elements of the array otherwise you may get a partial match resulting in a false positive, for example if you only tested the first 4 elements of the array.