name: unagi-chan version: 0.1.0.0 synopsis: Fast and scalable concurrent queues for x86, with a Chan-like API description: This library provides implementations of concurrent FIFO queues (for both general boxed and primitive unboxed values) that are fast, perform well under contention, and offer a Chan-like interface. The library may be of limited usefulness outside of x86 architectures where the fetch-and-add instruction is not available. . Here is an example benchmark measuring the time taken to write and then read 100,000 messages, with work divided amongst increasing number of readers and writers (time in ms), comparing against the top-performing queues in the standard libraries. . <> . And here is a view on just the unagi implementations. . <> . license: BSD3 license-file: LICENSE author: Brandon Simmons maintainer: brandon.m.simmons@gmail.com category: Concurrency build-type: Simple cabal-version: >=1.10 -- currently uploaded to imgur; move to this eventually --extra-doc-files: images/*.png --cabal-version: >=1.18 flag dev default: False manual: True source-repository head type: git location: https://github.com/jberryman/unagi-chan.git branch: master library hs-source-dirs: src exposed-modules: Control.Concurrent.Chan.Unagi , Control.Concurrent.Chan.Unagi.Unboxed other-modules: Control.Concurrent.Chan.Unagi.Internal , Control.Concurrent.Chan.Unagi.Unboxed.Internal , Utilities , Data.Atomics.Counter.Fat ghc-options: -Wall -funbox-strict-fields build-depends: base < 5 , atomic-primops==0.6.0.5 , primitive>=0.5.3 default-language: Haskell2010 -- We'll need some additional barriers for correctness: if !arch(i386) && !arch(x86_64) cpp-options: -DNOT_x86 -- Potential implementations roadmap (or we might just stick with this design -- for this package): -- -- - fixed size MutableArray of purely-functional dequeues ("Tako") (fetch-and-add, then CAS) -- - variant replacing CAS with blocking turn-taking, also play with leap-frogging cache-lines -- - variant in STM (how to safely do the initial incrCounter at most once though?) -- would also let us separate read and write buckets. -- - bounded Tako variant -- - maybe implement "Fast Concurrent Queues for x86 Processors" by Morrison & Afek (non-blocking, probably more clever) -- - Also looks like a similar (but lockfree, as above) counter-based queue has been developed by FB: -- https://github.com/facebook/folly/blob/master/folly/MPMCQueue.h -- - boxed Unagi variant avoiding CAS with read simply spinning a few times and then calling yield, or something else -- Please just build tests and run: -- $ time ./dist/build/test/test -- Doing `cabal test` takes forever for some reason. test-suite test type: exitcode-stdio-1.0 ghc-options: -Wall -funbox-strict-fields ghc-options: -O2 -rtsopts -threaded -with-rtsopts=-N ghc-options: -fno-ignore-asserts -- I guess we need to put 'src' here to get access to Internal modules hs-source-dirs: tests, src main-is: Main.hs build-depends: base , primitive>=0.5.3 , atomic-primops==0.6.0.5 , containers default-language: Haskell2010 -- compare benchmarks with Chan, TQueue, and (eventually) lockfree-queue? flag compare-benchmarks default: False manual: True benchmark single type: exitcode-stdio-1.0 ghc-options: -Wall -O2 -threaded -funbox-strict-fields -fforce-recomp -rtsopts hs-source-dirs: benchmarks default-language: Haskell2010 default-extensions: CPP build-depends: base , unagi-chan , criterion if flag(compare-benchmarks) cpp-options: -DCOMPARE_BENCHMARKS build-depends: stm -- , lockfree-queue main-is: single.hs ghc-options: -with-rtsopts=-N1 -- To run comparison benchmark used in graph above, run: -- $ cabal configure --enable-benchmarks -fcompare-benchmarks -- $ cabal bench multi --benchmark-option=-omulti3.html --benchmark-option='Demo' benchmark multi type: exitcode-stdio-1.0 ghc-options: -Wall -O2 -threaded -funbox-strict-fields -fforce-recomp -rtsopts hs-source-dirs: benchmarks default-language: Haskell2010 default-extensions: CPP build-depends: base , unagi-chan , criterion if flag(compare-benchmarks) cpp-options: -DCOMPARE_BENCHMARKS build-depends: stm -- , lockfree-queue main-is: multi.hs ghc-options: -with-rtsopts=-N build-depends: async -- for profiling, checking out core, etc executable dev-example -- for n in `find dist/build/core-example/core-example-tmp -name '*dump-simpl'`; do cp $n "core-example/$(basename $n).$(git rev-parse --abbrev-ref HEAD)"; done if !flag(dev) buildable: False ghc-options: -ddump-to-file -ddump-simpl -dsuppress-module-prefixes -dsuppress-uniques -ddump-core-stats -ddump-inlinings ghc-options: -O2 -rtsopts -- Either do threaded for eventlogging and simple timing... --ghc-options: -threaded -with-rtsopts=-N2 --ghc-options: -eventlog -- ...or do non-threaded runtime ghc-prof-options: -fprof-auto --Relevant profiling RTS settings: -xt -- TODO also check out +RTS -A10m, and look at output of -sstderr hs-source-dirs: core-example main-is: Main.hs build-depends: base , stm , unagi-chan default-language: Haskell2010