data-dispersal: Space-efficient and privacy-preserving data dispersal algorithms.


Given a ByteString of length D, we encode it as a list of n Fragments, each containing a ByteString of length O(D/m), where m <= n is the reconstruction threshold. Each fragment can then be stored on a separate machine to obtain fault-tolerance: even if all but m of these machines crash, we can still reconstruct the original ByteString from the remaining m fragments. The total space requirement of any m fragments is m * O(D/m) = O(D), which is asymptotically space-optimal. The total space required for all n fragments is O((n/m)*D). Since m and n can be chosen to be of the same order, good fault-tolerance increases the asymptotic storage overhead by only a constant factor.
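The space accounting above can be made concrete with a small sketch. It assumes that each fragment carries exactly ceiling(D/m) payload bytes and ignores any per-fragment metadata; the names fragmentSize, totalSize and overhead are hypothetical helpers and are not part of this package.

```haskell
-- Back-of-the-envelope space accounting for an (m, n) dispersal.
-- Assumption: a fragment holds exactly ceiling (d / m) payload bytes.

-- | Payload bytes per fragment for threshold m and a message of d bytes.
fragmentSize :: Int -> Int -> Int
fragmentSize m d = (d + m - 1) `div` m

-- | Total payload bytes stored across all n fragments.
totalSize :: Int -> Int -> Int -> Int
totalSize m n d = n * fragmentSize m d

-- | Storage overhead relative to the original message (roughly n / m).
overhead :: Int -> Int -> Int -> Double
overhead m n d = fromIntegral (totalSize m n d) / fromIntegral d
```

For a 24-byte message with m = 5 and n = 15, fragmentSize 5 24 is 5 and totalSize 5 15 24 is 75, so the overhead is 3.125, close to n/m = 3.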

GHCi Example:

> :m + Data.IDA
> let msg = Data.ByteString.Char8.pack "my really important data"
> let fragments = encode 5 15 msg
-- Now we could distribute the fragments to different sites to add some
-- fault-tolerance.
> let frags' = drop 5 $ take 10 fragments -- let's pretend that 10 machines crashed
-- Let's look at the 5 fragments that we have left:
> mapM_ print frags'
-- Space-efficiency: Note that the length of each of the 5 fragments is 5
-- and our original message has length 24.
> decode frags'
"my really important data"
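To make the round trip above less of a black box, here is a toy sketch of polynomial-based dispersal in the style of Rabin's information dispersal. It is an illustration only and is not the implementation used by Data.IDA: it works over the prime field GF(257) on lists of Integers instead of ByteStrings, and the names toyEncode and toyDecode are hypothetical. Each chunk of m values is read as the coefficients of a polynomial; fragment x stores the value of every chunk polynomial at the point x, and any m fragments recover the coefficients by Lagrange interpolation.

```haskell
import Data.List (transpose)

-- Toy dispersal over GF(257); assumes 1 <= m <= n <= 256 and values in 0..255.
p :: Integer
p = 257

powMod :: Integer -> Integer -> Integer
powMod _ 0 = 1
powMod b e
  | even e    = let h = powMod b (e `div` 2) in h * h `mod` p
  | otherwise = b * powMod b (e - 1) `mod` p

-- | Modular inverse via Fermat's little theorem (argument must be nonzero mod p).
inv :: Integer -> Integer
inv a = powMod a (p - 2)

-- | Evaluate a polynomial (lowest-degree coefficient first) at x, Horner style.
evalPoly :: [Integer] -> Integer -> Integer
evalPoly cs x = foldr (\c acc -> (c + x * acc) `mod` p) 0 cs

-- | Split into chunks of m values, zero-padding the last chunk.
chunks :: Int -> [Integer] -> [[Integer]]
chunks _ [] = []
chunks m xs = take m (h ++ repeat 0) : chunks m t
  where (h, t) = splitAt m xs

-- | Fragment x holds the value of every chunk polynomial at the point x.
toyEncode :: Int -> Int -> [Integer] -> [(Integer, [Integer])]
toyEncode m n msg = [ (x, [evalPoly c x | c <- cs]) | x <- [1 .. fromIntegral n] ]
  where cs = chunks m msg

addPoly :: [Integer] -> [Integer] -> [Integer]
addPoly (a:as) (b:bs) = (a + b) `mod` p : addPoly as bs
addPoly as [] = as
addPoly [] bs = bs

-- | Multiply a polynomial by (X - r).
mulLinear :: Integer -> [Integer] -> [Integer]
mulLinear r cs = addPoly (0 : cs) (map (\c -> negate r * c `mod` p) cs)

-- | Lagrange interpolation: recover the coefficients from m distinct points.
interpolate :: [(Integer, Integer)] -> [Integer]
interpolate pts = foldr addPoly [] (map term pts)
  where
    term (xj, y) = map (\c -> y * inv denom * c `mod` p) numer
      where
        others = [xi | (xi, _) <- pts, xi /= xj]
        numer  = foldr mulLinear [1] others
        denom  = product [xj - xi | xi <- others] `mod` p

-- | Rebuild a message of the given length from the first m fragments supplied.
toyDecode :: Int -> Int -> [(Integer, [Integer])] -> [Integer]
toyDecode m len frags = take len (concatMap interpolate perChunk)
  where
    perChunk = transpose [ [ (x, y) | y <- ys ] | (x, ys) <- take m frags ]
```

With msg = map (fromIntegral . fromEnum) "my really important data", the fragments from toyEncode 5 15 msg each hold 5 values, and toyDecode 5 24 applied to any 5 of them returns msg. Fragment values can reach 256, so a real implementation would need a field whose elements fit into bytes.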

Encrypted Fragments:

The module Data.IDA contains an information dispersal algorithm that produces space-optimal fragments. However, knowledge of one or more fragments might allow an adversary to deduce some information about the original data. The module Crypto.IDA combines information dispersal with secret sharing: knowledge of up to m-1 fragments does not leak any information about the original data.

This is useful when data must be stored at untrusted storage sites: we store one encrypted fragment at each site. As long as at most m-1 of these untrusted sites collude, they are unable to obtain any information about the original data. The added security comes at the price of a slightly larger fragment (a constant 32 additional bytes) and additional running time for encoding and decoding. The algorithm is fully described in the module Crypto.IDA.
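The threshold property of secret sharing can be illustrated with a toy Shamir-style sketch. This is not the scheme implemented in Crypto.IDA (see that module for the actual algorithm); it only shows how a small secret can be split so that m shares reconstruct it. The names share and reconstruct are hypothetical, the field GF(257) is chosen for readability, and the random coefficients are passed in by the caller, who is assumed to draw them from a secure random source.

```haskell
-- Toy Shamir-style threshold sharing over GF(257); illustration only.
q :: Integer
q = 257

powMod :: Integer -> Integer -> Integer
powMod _ 0 = 1
powMod b e
  | even e    = let h = powMod b (e `div` 2) in h * h `mod` q
  | otherwise = b * powMod b (e - 1) `mod` q

-- | Modular inverse via Fermat's little theorem (argument must be nonzero mod q).
inv :: Integer -> Integer
inv a = powMod a (q - 2)

-- | Split a secret into n shares with threshold m = length coeffs + 1.
-- The coefficients must be chosen uniformly at random by the caller.
share :: Integer -> [Integer] -> Int -> [(Integer, Integer)]
share secret coeffs n = [ (x, eval x) | x <- [1 .. fromIntegral n] ]
  where eval x = foldr (\c acc -> (c + x * acc) `mod` q) 0 (secret : coeffs)

-- | Recover the secret from m shares by Lagrange interpolation at 0.
reconstruct :: [(Integer, Integer)] -> Integer
reconstruct pts = sum [ y * basisAtZero xj | (xj, y) <- pts ] `mod` q
  where
    basisAtZero xj =
      product [ negate xi * inv (xj - xi) `mod` q | (xi, _) <- pts, xi /= xj ]
```

For example, with shares = share 42 [7, 99] 5 (threshold 3), reconstruct applied to any three of the five shares yields 42.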


Suppose that we have N machines and encode our data as 2log(N) fragments with reconstruction threshold m = log(N). Let's assume that we store each fragment on a separate machine and each machine fails (independently) with probability at most 0.5.
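A setup like this can be explored numerically with the exact binomial tail. The sketch below assumes that machines fail independently with the same probability pf and takes m and n directly, since the base of the logarithm is not fixed by the text; choose and survival are hypothetical helper names, not part of this package.

```haskell
-- | Binomial coefficient "n choose k" (exact, using Integer arithmetic).
choose :: Integer -> Integer -> Integer
choose n k = product [n - k + 1 .. n] `div` product [1 .. k]

-- | Probability that at least m of n fragments survive when each machine
-- fails independently with probability pf.
survival :: Integer -> Integer -> Double -> Double
survival m n pf =
  sum [ fromIntegral (choose n k) * (1 - pf) ^ k * pf ^ (n - k) | k <- [m .. n] ]
```

For the parameters of the GHCi example (m = 5, n = 15) and pf = 0.5, survival 5 15 0.5 is roughly 0.94. Lower failure probabilities or a larger ratio n/m push the value closer to 1.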

This library is based on the following works:


Change log None available
Dependencies AES (>=0.2.9), array (>=, base (>=4.6 && <5), binary (>=, bytestring (>=, entropy (>=0.3.2), finite-field (>=0.8.0), matrix (>=, secret-sharing (>=, syb (>=0.4.0), vector (>=
License LGPL-2.1-only
Copyright Peter Robinson 2014
Author Peter Robinson
Category Data, Cryptography
Uploaded by PeterRobinson at 2014-10-05T17:24:30Z



