hw-rankselect: Rank-select

[ bsd3, conduit, data, library, program ] [ Propose Tags ]

Please see README.md


[Skip to Readme]
Versions 0.0.0.1, 0.0.0.2, 0.0.0.3, 0.0.0.4, 0.0.0.5, 0.0.0.6, 0.1.0.0, 0.1.0.1, 0.2.0.0, 0.2.0.1, 0.3.0.0, 0.4.0.0, 0.5.0.0, 0.6.0.0, 0.7.0.0, 0.8.0.0, 0.8.0.1, 0.8.0.2, 0.9.0.0, 0.10.0.0, 0.10.0.1, 0.10.0.2, 0.10.0.3, 0.11.0.0
Dependencies base (>=4.7 && <5), deepseq, directory, hw-balancedparens (>=0.1.0.0), hw-bits (>=0.6.0.0), hw-prim (>=0.4.0.3), hw-rankselect, hw-rankselect-base (>=0.2.0.0), mmap, vector [details]
License BSD-3-Clause
Copyright 2016 John Ky
Author John Ky
Maintainer newhoggy@gmail.com
Category Data, Conduit
Home page http://github.com/haskell-works/hw-rankselect#readme
Bug tracker https://github.com/haskell-works/hw-rankselect/issues
Source repo head: git clone https://github.com/haskell-works/hw-rankselect
Uploaded by haskellworks at Mon Apr 9 13:19:07 UTC 2018
Distributions LTSHaskell:0.10.0.3, NixOS:0.11.0.0, Stackage:0.10.0.3
Executables hw-rankselect-exe
Downloads 3803 total (25 in the last 30 days)
Rating 2.0 (votes: 1) [estimated by rule of succession]
Your Rating
  • λ
  • λ
  • λ
Status Docs available [build log]
Last success reported on 2018-04-13 [all 1 reports]
Hackage Matrix CI

Modules

[Index]

Flags

NameDescriptionDefaultType
bmi2

Enable bmi2 instruction set

DisabledAutomatic

Use -f <flag> to enable a flag, or -f -<flag> to disable that flag. More info

Downloads

Maintainer's Corner

For package maintainers and hackage trustees


Readme for hw-rankselect-0.11.0.0

[back to package description]

hw-rankselect

CircleCI

Efficient rank and select operations on large bit-vectors.

This library will use support for some BMI2 CPU instructions on some x86 based CPUs if compiled with the appropriate flags on ghc-8.4.1 or later.

Rank and select

This library provides the following functions on indexed bit-vectors:

  • rank1
  • rank0
  • select1
  • select0

The supported indexed bit-vector types are:

  • Poppy512
  • CsPoppy

Constructing and using an indexed bit-vector in the repl

The example below constructs an indexed bit-vector from a string and runs rank and select query operations on it. The bits in a string are in little-endian and can be of arbitrary length. The resulting bit-vector will be padded with 0-bits until the next 64-bit boundary.

$ stack repl  --flag bits-extra:bmi2 --flag hw-rankselect-base:bmi2 --flag hw-rankselect:bmi2
λ> import HaskellWorks.Data.Bits.BitRead
λ> let bs = fromJust $ bitRead "10010010" :: CsPoppy
bs :: CsPoppy
λ> select1 bs 1
1
λ> select1 bs 2
4
λ> select1 bs 3
7
λ> rank1 bs 7
3
λ> rank1 bs 4
2
λ> rank1 bs 1
1

A valid bit string contains zero or more characters. Characters other than 1 and 0 are permitted, but are ignored. For example spaces can be used to group bits for clarity.

λ> let bs = fromJust $ bitRead "" :: CsPoppy
bs :: CsPoppy
λ> let bs = fromJust $ bitRead "10010010 10010010" :: CsPoppy
bs :: CsPoppy

Whilst the use of a bit string is convenient for the repl, for performance reasons, it is more typical to construct an indexed bit-vector from a 64-bit word vector:

> import qualified Data.Vector.Storable as DVS
λ> let bs = makeCsPoppy (DVS.fromList [23, 73, 55])
bs :: CsPoppy

Compilation

It is sufficient to build, test and benchmark the library as follows for basic performance. The library will be compiled to use broadword implementation of rank & select, which has reasonable performance.

stack build
stack test
stack bench

To target the BMI2 instruction set, add the bmi2 flag:

stack build --flag bits-extra:bmi2 --flag hw-rankselect-base:bmi2 --flag hw-rankselect:bmi2
stack test  --flag bits-extra:bmi2 --flag hw-rankselect-base:bmi2 --flag hw-rankselect:bmi2
stack bench --flag bits-extra:bmi2 --flag hw-rankselect-base:bmi2 --flag hw-rankselect:bmi2

Benchmark results

The following benchmark shows the kinds of performance gain that can be expected from enabling the BMI2 instruction set for CPU targets that support them:

benchmarking data/48-mb-bitvector/CsPoppy Select1
time                 3.636 μs   (3.613 μs .. 3.669 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 3.618 μs   (3.607 μs .. 3.636 μs)
std dev              46.03 ns   (30.79 ns .. 78.23 ns)
benchmarking data/48-mb-bitvector/CsPoppy Select1
time                 1.969 μs   (1.959 μs .. 1.982 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.973 μs   (1.963 μs .. 1.986 μs)
std dev              38.41 ns   (26.87 ns .. 59.08 ns)
variance introduced by outliers: 21% (moderately inflated)```

References