bitvec
A newtype over Bool
with a better Vector
instance: 8x less memory, up to 1000x faster.
The vector
package represents unboxed arrays of Bool
spending 1 byte (8 bits) per boolean.
This library provides a newtype wrapper Bit and a custom instance
of unboxed Vector, which packs bits densely,
achieving 8x less memory footprint.
The performance stays mostly the same;
the most significant degradation happens for random writes
(up to 10% slower).
On the other hand, for certain bulk bit operations
Vector Bit is up to 1000x faster than Vector Bool.
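To make the dense packing concrete, here is a minimal sketch that views a single machine word as a vector of bits. It assumes that Data.Bit exports castFromWords and cloneToWords alongside listBits, and that bits are numbered from the least significant one; check both against the version of the package you have installed.

```haskell
import Data.Bit (castFromWords, cloneToWords, listBits)
import qualified Data.Vector.Unboxed as U

-- One machine word holding the bit pattern 0b101 (bits 0 and 2 set),
-- reinterpreted as a vector of bits.
packed :: U.Vector Bit'
packed = castFromWords (U.singleton 5)

-- Indices of the set bits, counted from the least significant bit.
setIndices :: [Int]
setIndices = listBits packed

-- Copying the packed payload back out as words.
roundTrip :: U.Vector Word
roundTrip = cloneToWords packed

-- Local alias so that the signatures above read naturally;
-- Bit' is a name introduced for this sketch only.
type Bit' = Data.Bit.Bit
```

The vector has as many elements as a Word has bits (64 on most current platforms), yet it occupies a single word of payload.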
Thread safety
Data.Bit is faster, but writes and flips are thread-unsafe.
This is because naive updates are not atomic:
read the whole word from memory,
then modify a bit, then write the whole word back.
Data.Bit.ThreadSafe is slower (up to 20%),
but writes and flips are thread-safe.
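The read-modify-write hazard described above can be illustrated without the library at all. The sketch below uses only base; naiveSetBit, atomicSetBit and lostUpdate are hypothetical names, and an IORef stands in for one word of a packed vector. It is not how bitvec is implemented, only a picture of why a non-atomic update can lose a bit.

```haskell
import Data.Bits (setBit)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef, writeIORef)
import Data.Word (Word64)

-- Non-atomic update: three separate steps that another thread can interleave with.
naiveSetBit :: IORef Word64 -> Int -> IO ()
naiveSetBit ref i = do
  w <- readIORef ref        -- 1. read the whole word
  let w' = setBit w i       -- 2. modify one bit
  writeIORef ref w'         -- 3. write the whole word back

-- Atomic alternative: the read, the modification and the write happen as one step.
atomicSetBit :: IORef Word64 -> Int -> IO ()
atomicSetBit ref i = atomicModifyIORef' ref (\w -> (setBit w i, ()))

-- A hand-written interleaving of two naive writers, A and B.
-- Both read the same stale word, so B's write clobbers A's bit.
lostUpdate :: IO Word64
lostUpdate = do
  ref <- newIORef 0
  a <- readIORef ref              -- A reads 0
  b <- readIORef ref              -- B reads 0
  writeIORef ref (setBit a 0)     -- A writes bit 0
  writeIORef ref (setBit b 1)     -- B writes bit 1, dropping bit 0
  readIORef ref
```

In lostUpdate the final word has only bit 1 set, although two different bits were written. With Vector Bool each element occupies its own byte, so writes to different indices do not share a word in this way.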
Quick start
Consider the following (very naive) implementation of
the sieve of Eratosthenes. It returns a vector with True
at prime indices and False
at composite indices.
import Control.Monad
import Control.Monad.ST
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as MU
eratosthenes :: U.Vector Bool
eratosthenes = runST $ do
  let len = 100
  -- Start by assuming that every index is prime.
  sieve <- MU.replicate len True
  MU.write sieve 0 False
  MU.write sieve 1 False
  forM_ [2 .. floor (sqrt (fromIntegral len))] $ \p -> do
    isPrime <- MU.read sieve p
    when isPrime $
      -- Cross out every proper multiple of p.
      forM_ [2 * p, 3 * p .. len - 1] $ \i ->
        MU.write sieve i False
  U.unsafeFreeze sieve
We can switch from Bool
to Bit
just by adding newtype constructors:
import Data.Bit
import Control.Monad
import Control.Monad.ST
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as MU
eratosthenes :: U.Vector Bit
eratosthenes = runST $ do
  let len = 100
  -- Start by assuming that every index is prime.
  sieve <- MU.replicate len (Bit True)
  MU.write sieve 0 (Bit False)
  MU.write sieve 1 (Bit False)
  forM_ [2 .. floor (sqrt (fromIntegral len))] $ \p -> do
    Bit isPrime <- MU.read sieve p
    when isPrime $
      -- Cross out every proper multiple of p.
      forM_ [2 * p, 3 * p .. len - 1] $ \i ->
        MU.write sieve i (Bit False)
  U.unsafeFreeze sieve
The Bit-based implementation requires 8x less memory to store
the vector. For large sizes this allows crunching more data in RAM
without swapping. For smaller arrays it helps to fit into
CPU caches.
> listBits eratosthenes
[2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97]
There are several high-level helpers, digesting bits in bulk,
which makes them up to 64x faster than their respective counterparts
for Vector Bool. One can query the population count (popcount)
of a vector (giving us the prime-counting function):
> countBits eratosthenes
25
And vice versa, query the address of the n-th set bit
(which corresponds to the n-th prime number here):
> nthBitIndex (Bit True) 10 eratosthenes
Just 29
One may notice that the order of the inner traversal by i
does not matter and be tempted to run it in several parallel threads.
In this case it is vital to switch from Data.Bit
to Data.Bit.ThreadSafe,
because the former is thread-unsafe with regard to writes.
There is a moderate performance penalty (up to 20%)
for using the thread-safe interface.
Sets
Bit vectors can be used as a blazingly fast representation of sets
as long as their elements are enumerable (instances of Enum)
and sufficiently dense, leaving IntSet far behind.
For example, consider three possible representations of a set of Word16:
* As an IntSet with a readily available union function.
* As a 64k-long unboxed Vector Bool, implementing union as zipWith (||).
* As a 64k-long unboxed Vector Bit, implementing union as zipBits (.|.).
In our benchmarks (see the bench folder) for not-too-sparse sets
the union of Vector Bit evaluates 24x-36x faster than the union of IntSet
and stunningly outperforms Vector Bool by 500x-1000x.
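The Vector Bit representation of a set can be sketched as follows. The helper names fromListSet, unionSet and intersectionSet are hypothetical and introduced only for illustration; the sketch assumes zipBits accepts any polymorphic bitwise operator from Data.Bits, as the union via zipBits (.|.) suggests.

```haskell
import Data.Bit (Bit (..), listBits, zipBits)
import Data.Bits ((.&.), (.|.))
import qualified Data.Vector.Unboxed as U

-- Build a set over the universe [0 .. size - 1] from a list of members.
-- Every member must lie inside the universe, otherwise (//) fails.
fromListSet :: Int -> [Int] -> U.Vector Bit
fromListSet size xs =
  U.replicate size (Bit False) U.// [(x, Bit True) | x <- xs]

-- Union and intersection process the packed bits in bulk.
unionSet :: U.Vector Bit -> U.Vector Bit -> U.Vector Bit
unionSet = zipBits (.|.)

intersectionSet :: U.Vector Bit -> U.Vector Bit -> U.Vector Bit
intersectionSet = zipBits (.&.)

-- Two small sets over a 16-element universe.
exampleUnion :: [Int]
exampleUnion = listBits (unionSet (fromListSet 16 [1, 3, 5]) (fromListSet 16 [3, 4]))

exampleIntersection :: [Int]
exampleIntersection =
  listBits (intersectionSet (fromListSet 16 [1, 3, 5]) (fromListSet 16 [3, 4]))
```

For a set of Word16 the universe size would be 65536, and members would be mapped to indices with fromIntegral or fromEnum.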
Binary polynomials
Binary polynomials are polynomials with coefficients modulo 2.
Their applications include coding theory and cryptography.
While one can successfully implement them with the poly package,
operating on UPoly Bit,
this package provides even faster arithmetic routines
exposed via the F2Poly data type and its instances.
> :set -XBinaryLiterals
> -- (1 + x) (1 + x + x^2) = 1 + x^3 (mod 2)
> 0b11 * 0b111 :: F2Poly
F2Poly {unF2Poly = [1,0,0,1]}
Use fromInteger / toInteger to convert binary polynomials
from Integer to F2Poly and back.
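A short sketch of the round trip through Integer, reusing the two polynomials from the GHCi session. The top-level names are hypothetical; the sketch relies only on the Num and Integral instances of F2Poly and sticks to non-negative integers.

```haskell
import Data.Bit (F2Poly)

-- 3 = 0b11 encodes 1 + x, and 7 = 0b111 encodes 1 + x + x^2.
p, q :: F2Poly
p = fromInteger 3
q = fromInteger 7

-- Multiplication is carry-less: (1 + x) (1 + x + x^2) = 1 + x^3, i.e. 0b1001.
productAsInteger :: Integer
productAsInteger = toInteger (p * q)

-- Addition of coefficients modulo 2 is a bitwise XOR of the encodings.
sumAsInteger :: Integer
sumAsInteger = toInteger (p + q)
```

Here productAsInteger is 9 and sumAsInteger is 4, matching the polynomial identities in the comments.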
Package flags
This package supports the following flags to facilitate dependency management.
Disabling them does not diminish bitvec's capabilities, but makes certain operations slower.

Flag integer-gmp, enabled by default.
Depend on the integer-gmp package and use it to speed up operations on binary polynomials.
Normally integer-gmp is shipped with core libraries anyway, so there is little to gain
from disabling it, unless you use a custom build of GHC.

Flag libgmp, enabled by default.
Link against the GMP library and use it for ultimate performance of
zipBits, invertBits and countBits. GMP is readily available on most machines
(brew install gmp on macOS), but you may find it useful to disable this flag
when working with an exotic setup.

Flag bmi2, disabled by default, experimental.
Depend on the bits-extra package and use it for nthBitIndex.
This is supposed to be faster, but has not been properly polished yet.
Similar packages

bv and bv-little do not offer mutable vectors.

array is memory-efficient for Bool, but lacks
a handy Vector interface and is not thread-safe.