Bitsets

You don’t always want the most general representation for everything. If your domain is sufficiently restricted, you can use a special purpose representation for collections of your things. Bitsets are a classic example.

Introduction

If you need to represent a dictionary where the values are small integers only, use an array where the indexes are the keys and the corresponding values are the array contents.

If you need to represent a set where the values are small integers only, you can use a boolean array.

Example: The set {1,5,6} can be represented with the array

    ┌───────┐
  0 │ false │
    ├───────┤
  1 │ true  │
    ├───────┤
  2 │ false │
    ├───────┤
  3 │ false │
    ├───────┤
  4 │ false │
    ├───────┤
  5 │ true  │
    ├───────┤
  6 │ true  │
    ├───────┤
  7 │ false │
    └───────┘

Of course you don't want to do this when the keys are arbitrarily large integers. (Though there are interesting data structures for sparse arrays.)

Motivation

When representing sets, you don't really need one machine word to store a boolean: you only need one bit. We can pack the bits like so:

Example: The set {3,6,14,30} can be represented by

    01000000000000000100000001001000

which is 1073750088 in decimal. If the range of values goes beyond 32 bits, a ByteArray will suffice.

Operations

Operations are incredibly fast. All operate in constant time provided the bitset fits in one word. If unbounded then, well, they're linear, but with a pretty small constant factor, making them useful even for large sets.

Add items (union): use bitwise OR
Remove items (difference): use bitwise AND-NOT
Intersection: use bitwise AND
Complement: use bitwise NOT

Exercise: Create a bitset class.

Summary

We’ve covered:

What a bitset is
Examples of bitsets
When bitsets can be used
How bitset operations are carried out with logical operations