name:                data-dispersal

-- The package version. See the Haskell package versioning policy (PVP)
-- for standards guiding when and how versions should be incremented.
-- http://www.haskell.org/haskellwiki/Package_versioning_policy
-- PVP summary:      +-+------- breaking API changes
--                   | | +----- non-breaking API additions
--                   | | | +--- code changes with no API change
version:             1.0.0.0

synopsis:            Space-efficient and privacy-preserving data dispersal algorithms.

description:
  This library provides space-efficient (m,n)-information dispersal algorithms (IDAs).
  .
  Given a ByteString @bstr@ of length @D@, we encode @bstr@ as a list @fs@ of @n@
  'Fragment's, each containing a ByteString of length @O(D\/m)@. Each fragment in
  @fs@ can then be stored on a separate machine for fault-tolerance. Even if up to
  @n-m@ of these machines crash, we can still reconstruct the original ByteString
  from the remaining (at least @m@) fragments. The total space required for the @n@
  fragments is @O((n\/m)*D)@. Note that @m@ and @n@ are of the same order, so the
  storage overhead for good fault-tolerance grows only by a constant factor.
  .
  The module @Data.IDA@ contains the basic information dispersal algorithm. The
  module @Crypto.IDA@ augments the dispersal scheme by combining it with secret
  sharing, i.e., knowledge of up to @m-1@ fragments does not leak any information
  about the original data. See "Crypto.IDA" for details.
  .
  /GHCi Example:/
  .
  > > :m + Data.IDA
  > > let msg = Data.ByteString.Char8.pack "my really important data"
  > > let fragments = encode 5 15 msg
  > -- Now we could distribute the fragments across different sites to add some
  > -- fault-tolerance.
  > > let frags' = drop 5 $ take 10 fragments -- let's pretend that 10 machines crashed
  > > decode frags'
  > "my really important data"
  .
  /Fault-Tolerance:/
  .
  Suppose that we have @N@ machines and encode our data as @n = 2*log(N)@ fragments
  with reconstruction threshold @m = log(N)@. Assume that we store each fragment on
  a separate machine and that each machine fails (independently) with probability
  at most 0.5.
  .
  * What is the probability of our data being safe?
    @Pr[ at most n-m machines crash ] >= 1 - 0.5^(log(N)) = 1 - N^(-1)@.
  .
  * What is the space overhead that we pay for this level of fault-tolerance?
    We have @n@ fragments, each of size @D\/m@, so the total space is
    @n * D\/m = 2D@. In other words, we can guarantee that the data survives with
    high probability while increasing the required space only by a constant factor.
  .
  This library is based on the following works:
  .
  * \"Efficient Dispersal of Information for Security, Load Balancing, and Fault Tolerance\", Michael O. Rabin, JACM 1989.
  .
  * \"How to Share a Secret\", Adi Shamir, Communications of the ACM 22 (11): 612-613, 1979.
  .
  * \"Secret Sharing Made Short\", Hugo Krawczyk,
    CRYPTO 1993: 136-146.

license:             LGPL-2.1
license-file:        LICENSE
author:              Peter Robinson
maintainer:          peter.robinson@monoid.at
copyright:           Peter Robinson 2014
category:            Data, Cryptography
build-type:          Simple
cabal-version:       >=1.8
homepage:            http://monoid.at/code

library
  hs-source-dirs:    src
  exposed-modules:   Data.IDA
                     Data.IDA.Internal
                     Data.IDA.FiniteField
                     Crypto.IDA
  build-depends:     base ==4.6.*
                   , array >= 0.4.0.1
                   , vector >= 0.10.11.0
                   , binary >= 0.7.2.1
                   , bytestring >= 0.10.0.2
                   , syb >= 0.4.0
                   , finite-field >= 0.8.0
                   , matrix >= 0.3.4.0
                   , AES >= 0.2.9
                   , entropy >= 0.3.2
                   , secret-sharing >= 1.0.0.0
  ghc-options:       -Wall

test-suite Main
  type:              exitcode-stdio-1.0
  x-uses-tf:         true
  build-depends:     base >= 4 && < 5
                   , QuickCheck >= 2.4
                   , test-framework >= 0.4.1
                   , test-framework-quickcheck2
                   , array >= 0.4.0.1
                   , vector >= 0.10.11.0
                   , spool >= 0.1
                   , binary >= 0.7.2.1
                   , bytestring >= 0.10.0.2
                   , syb >= 0.4.0
  hs-source-dirs:    src, tests
  main-is:           Tests.hs
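
-- A minimal usage sketch, kept as a comment and not built by this package:
-- it round-trips a message through Data.IDA's 'encode' and 'decode', mirroring
-- the GHCi example in the description above. The module name Example and the
-- parameters (m = 5, n = 15) are illustrative assumptions; 'decode' is assumed
-- to return the reconstructed ByteString, as shown in the GHCi session.
--
--   module Example where
--
--   import qualified Data.ByteString.Char8 as C
--   import Data.IDA (encode, decode)
--
--   main :: IO ()
--   main = do
--     let msg       = C.pack "my really important data"
--         fragments = encode 5 15 msg            -- threshold m = 5, n = 15 fragments
--         survivors = drop 5 (take 10 fragments) -- pretend 10 of the 15 machines crashed
--     print (decode survivors == msg)            -- True if the round-trip succeeded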