streamly-lmdb: Stream data to or from LMDB databases using the streamly library.

[ bsd3, database, library, streaming, streamly ]

Please see the README on GitHub at https://github.com/shlok/streamly-lmdb#readme


Versions 0.0.1, 0.0.1.1, 0.1.0, 0.2.0, 0.2.1, 0.3.0
Change log ChangeLog.md
Dependencies async (>=2.2.2 && <2.3), base (>=4.7 && <5), bytestring (>=0.10.10.0 && <0.11), streamly (==0.8.*)
License BSD-3-Clause
Copyright 2021 Shlok Datye
Author Shlok Datye
Maintainer sd-haskell@quant.is
Category Database, Streaming, Streamly
Home page https://github.com/shlok/streamly-lmdb#readme
Bug tracker https://github.com/shlok/streamly-lmdb/issues
Source repo head: git clone https://github.com/shlok/streamly-lmdb
Uploaded by shlok at 2021-07-24T11:10:14Z
Distributions NixOS:0.2.1
Downloads 460 total (17 in the last 30 days)
Rating (no votes yet) [estimated by Bayesian average]
Docs uploaded by user
Build status unknown [no reports yet]

Readme for streamly-lmdb-0.3.0

streamly-lmdb

Stream data to or from LMDB databases using the Haskell streamly library.

Requirements

Install LMDB on your system:

  • Debian Linux: sudo apt-get install liblmdb-dev
  • macOS: brew install lmdb

Quick start

{-# LANGUAGE OverloadedStrings #-}

module Main where

import Streamly.External.LMDB
    (Limits (mapSize), WriteOptions (writeTransactionSize), defaultLimits,
    defaultReadOptions, defaultWriteOptions, getDatabase, openEnvironment,
    readLMDB, tebibyte, writeLMDB)
import qualified Streamly.Prelude as S

main :: IO ()
main = do
    -- Open an environment. A file or directory must already exist at
    -- the given path. (It can be empty for a new environment.)
    env <- openEnvironment "/path/to/lmdb-database" $
            defaultLimits { mapSize = tebibyte }

    -- Get the main database.
    -- Note: It is common practice with LMDB to create the database
    -- once and reuse it for the remainder of the program’s execution.
    db <- getDatabase env Nothing

    -- Stream key-value pairs into the database.
    let fold' = writeLMDB db defaultWriteOptions { writeTransactionSize = 1 }
    let writeStream = S.fromList [("baz", "a"), ("foo", "b"), ("bar", "c")]
    _ <- S.fold fold' writeStream

    -- Stream key-value pairs out of the
    -- database, printing them along the way.
    -- Output:
    --     ("bar","c")
    --     ("baz","a")
    --     ("foo","b")
    let unfold' = readLMDB db defaultReadOptions
    let readStream = S.unfold unfold' undefined
    S.mapM_ print readStream
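
The stream obtained from readLMDB is an ordinary streamly stream, so the usual Streamly.Prelude combinators can be applied to it. The following is a minimal sketch under that assumption, reusing only the calls from the quick start above together with S.filter and S.length; the helper keyHasPrefix and the prefix "ba" are hypothetical names chosen for illustration. The resulting count depends on what the database already contains.

```haskell
{-# LANGUAGE OverloadedStrings #-}

module Main where

import qualified Data.ByteString as B
import Streamly.External.LMDB
    (Limits (mapSize), WriteOptions (writeTransactionSize), defaultLimits,
    defaultReadOptions, defaultWriteOptions, getDatabase, openEnvironment,
    readLMDB, tebibyte, writeLMDB)
import qualified Streamly.Prelude as S

-- Hypothetical helper: does the key of a pair start with the given prefix?
keyHasPrefix :: B.ByteString -> (B.ByteString, B.ByteString) -> Bool
keyHasPrefix prefix (key, _) = prefix `B.isPrefixOf` key

main :: IO ()
main = do
    -- Same setup as in the quick start.
    env <- openEnvironment "/path/to/lmdb-database" $
            defaultLimits { mapSize = tebibyte }
    db <- getDatabase env Nothing

    -- Write a few pairs, as in the quick start.
    let fold' = writeLMDB db defaultWriteOptions { writeTransactionSize = 1 }
    _ <- S.fold fold' $ S.fromList [("baz", "a"), ("foo", "b"), ("bar", "c")]

    -- Read the pairs back, keep those whose key starts
    -- with "ba", and count them without collecting a list.
    let readStream = S.unfold (readLMDB db defaultReadOptions) undefined
    n <- S.length (S.filter (keyHasPrefix "ba") readStream)
    putStrLn ("Pairs with key prefix \"ba\": " ++ show n)
```

Keeping the predicate as a separate pure function makes it easy to check on its own, independently of any LMDB environment.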

Benchmarks

See bench/README.md. Summary (with rough figures from our machine):

  • Reading. For reading a fully cached LMDB database, this library (when unsafeReadLMDB is used instead of readLMDB) adds about 10 ns/pair of overhead compared to plain Haskell IO code, which in turn adds about 10 ns/pair compared to C. (That the first two are so close is the promise of streamly and stream fusion being fulfilled.) It follows that if your total workload per pair takes longer than about 20 ns, the choice of this library over C will not be your bottleneck.
  • Writing. Writing with plain Haskell IO code and with this library is, respectively, 10% and 20% slower than writing with C. We have not dug further into these differences because this write performance is currently good enough for our purposes.
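
The arithmetic behind the reading figures can be sketched as follows. This is only an illustration using the rough numbers quoted above; the function overheadFraction and the sample workloads are hypothetical, and the sketch assumes that "workload per pair" means the per-pair cost of an equivalent C program.

```haskell
module Main where

-- Rough per-pair overheads from the summary above, in nanoseconds.
libraryOverPlainHaskell :: Double
libraryOverPlainHaskell = 10

plainHaskellOverC :: Double
plainHaskellOverC = 10

-- Combined per-pair overhead of this library relative to C.
libraryOverC :: Double
libraryOverC = libraryOverPlainHaskell + plainHaskellOverC

-- Fraction of the total per-pair time that is library overhead, for a
-- hypothetical workload that would cost 'workNs' nanoseconds per pair in C.
overheadFraction :: Double -> Double
overheadFraction workNs = libraryOverC / (workNs + libraryOverC)

main :: IO ()
main = mapM_ report [20, 100, 1000]
  where
    report w =
        putStrLn (show w ++ " ns/pair workload: overhead fraction "
                  ++ show (overheadFraction w))
```

At a 20 ns/pair workload the overhead accounts for half of the total time, and its share shrinks as the workload per pair grows.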

Benchmark machine: Linode, Debian 10, Dedicated 32GB (16 CPUs, 640 GB storage, 32 GB RAM).