streamly: Beautiful Streaming, Concurrent and Reactive Composition

[ bsd3, concurrency, control, library, reactivity, streaming ]

Streamly, short for streaming concurrently, is a simple yet powerful streaming library with concurrent merging and concurrent nested looping support. A stream is just like a list except that it is a list of monadic actions rather than pure values. Streamly streams can be generated, consumed, combined, or transformed serially or concurrently. We can loop over a stream serially or concurrently. We can also have serial or concurrent nesting of loops. For those familiar with the list transformer concept, streamly is a concurrent list transformer. Streamly uses standard composition abstractions. Concurrent composition is just the same as serial composition except that we use a simple combinator to request a concurrent composition instead of a serial one. The programmer does not have to be aware of threads, locking or synchronization to write scalable concurrent programs.

Streamly provides functionality that is equivalent to streaming libraries like pipes and conduit but with a simple list like API. The streaming API of streamly is close to the monadic streams API of the vector package and similar in concept to the streaming package. In addition to the streaming functionality, streamly subsumes the functionality of list transformer libraries like pipes or list-t and also the logic programming library logict. On the concurrency side, it subsumes the functionality of the async package. Because it supports streaming with concurrency we can write FRP applications similar in concept to Yampa or reflex.

Streamly has excellent performance; see streaming-benchmarks for a comparison of popular streaming libraries on micro-benchmarks. For file IO, the library currently provides only one API, which streams the lines of a file as Strings. Future versions will provide better streaming file IO options. Streamly interworks with the popular streaming libraries; see the interworking section in Streamly.Tutorial.

Versions 0.1.0, 0.1.1, 0.1.2, 0.2.0, 0.2.1, 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.5.1
Change log Changelog.md
Dependencies atomic-primops (==0.8.*), base (>=4.8 && <5), containers (==0.5.*), exceptions (>=0.8 && <0.11), lifted-base (==0.2.*), lockfree-queue (>=0.2.3 && <0.3), monad-control (>=1.0 && <2), mtl (>=2.2 && <3), semigroups (==0.18.*), stm (>=2.4.3 && <2.5), transformers (>=0.4 && <0.6), transformers-base (==0.4.*) [details]
License BSD-3-Clause
Copyright 2017 Harendra Kumar
Author Harendra Kumar
Maintainer harendra.kumar@gmail.com
Category Control, Concurrency, Streaming, Reactivity
Home page https://github.com/composewell/streamly
Bug tracker https://github.com/composewell/streamly/issues
Source repo head: git clone https://github.com/composewell/streamly
Uploaded by harendra at Sun May 13 22:54:12 UTC 2018
Distributions LTSHaskell:0.3.0, NixOS:0.5.1, Stackage:0.5.1
Executables CirclingSquare, AcidRain, MergeSort, ListDir, SearchQuery, chart-nested, chart-linear
Downloads 901 total (51 in the last 30 days)
Rating 2.25 (votes: 2) [estimated by rule of succession]
Status Docs available [build log]
Last success reported on 2018-05-13 [all 1 reports]

Flags

Name          Description                               Default   Type
dev           Build development version                 Disabled  Manual
examples      Build examples                            Disabled  Manual
examples-sdl  Include examples that use SDL dependency  Disabled  Manual

Use -f <flag> to enable a flag, or -f -<flag> to disable that flag.

Readme for streamly-0.2.0

Streamly

Streaming Concurrently

Streamly, short for streaming concurrently, is a simple yet powerful streaming library with concurrent merging and concurrent nested looping support. A stream is just like a list except that it is a list of monadic actions rather than pure values. Streamly streams can be generated, consumed, combined, or transformed serially or concurrently. We can loop over a stream serially or concurrently. We can also have serial or concurrent nesting of loops. For those familiar with the list transformer concept, streamly is a concurrent list transformer. Streamly uses standard composition abstractions. Concurrent composition is just the same as serial composition except that we use a simple combinator to request a concurrent composition instead of a serial one. The programmer does not have to be aware of threads, locking or synchronization to write scalable concurrent programs.

Streamly provides functionality that is equivalent to streaming libraries like pipes and conduit but with a list like API. The streaming API of streamly is close to the monadic streams API of the vector package and similar in concept to the streaming package. In addition to providing streaming functionality, streamly subsumes the functionality of list transformer libraries like pipes or list-t and also the logic programming library logict. On the concurrency side, it subsumes the functionality of the async package. Because it supports streaming with concurrency we can write FRP applications similar in concept to Yampa or reflex. To understand the streaming library ecosystem and where streamly fits in you may want to read streaming libraries as well. Also see the Comparison with Existing Packages section in the streamly tutorial.

Why use streamly?

  • Simple list-like streaming API: if you know how to use lists, you know how to use streamly.
  • Powerful yet simple and scalable concurrency. Concurrency is not intrusive; concurrent programs are written exactly the same way as non-concurrent ones. There is no other package that provides such high-level, simple and flexible concurrency support.
  • It is a general programming framework providing you all the necessary tools to solve a wide range of programming problems, unifying the functionality provided by several disparate packages in a concise and simple API.
  • Best in class performance. See streaming-benchmarks for a comparison of popular streaming libraries on micro-benchmarks.

For more information, see:

  • Streamly.Tutorial module in the haddock documentation for a detailed introduction
  • examples directory in the package for some simple practical examples

Streaming Pipelines

Unlike pipes or conduit, and like vector and streaming, streamly composes stream data instead of stream processors (functions). A stream is just like a list and is explicitly passed around to functions that process the stream. Therefore, no special operator is needed to join stages in a streaming pipeline; the standard forward ($) or reverse (&) function application operator is enough. Combinators are provided in Streamly.Prelude to transform or fold streams.

This snippet reads numbers from stdin, prints the squares of even numbers and exits if an even number more than 9 is entered.

import Streamly
import qualified Streamly.Prelude as S
import Data.Function ((&))

main = runStream $
       S.repeatM getLine
     & fmap read
     & S.filter even
     & S.takeWhile (<= 9)
     & fmap (\x -> x * x)
     & S.mapM print
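
Because a stream is handled like a list, the shape of the pipeline above can be tried out on an ordinary list first. The following is only a sketch using base: the name squaresOfEvens is hypothetical and not part of streamly, and the input list stands in for the numbers typed on stdin.

```haskell
import Data.Function ((&))

-- Pure-list counterpart of the stream pipeline: keep the even numbers,
-- stop at the first even number greater than 9, then square the rest.
-- `squaresOfEvens` is a hypothetical name used only for this illustration.
squaresOfEvens :: [Integer] -> [Integer]
squaresOfEvens xs =
    xs
  & filter even
  & takeWhile (<= 9)
  & map (\x -> x * x)

main :: IO ()
main = print (squaresOfEvens [1, 2, 3, 4, 8, 10, 6])
-- prints [4,16,64]
```

The stages are joined with the same (&) operator as in the stream version; only the combinators come from the Prelude instead of Streamly.Prelude.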

Serial and Concurrent Merging

Semigroup and Monoid instances can be used to fold streams serially or concurrently. In the following example we are composing ten actions in the stream each with a delay of 1 to 10 seconds, respectively. Since all the actions are concurrent we see one output printed every second:

import Streamly
import qualified Streamly.Prelude as S
import Control.Concurrent (threadDelay)

main = S.toList $ parallely $ foldMap delay [1..10]
 where delay n = S.once $ threadDelay (n * 1000000) >> print n

Streams can be combined together in many ways. Some examples are given below; see the tutorial for more. We will use the following delay function in the examples to demonstrate the concurrency aspects:

import Streamly
import qualified Streamly.Prelude as S
import Control.Concurrent

delay n = S.once $ do
    threadDelay (n * 1000000)
    tid <- myThreadId
    putStrLn (show tid ++ ": Delay " ++ show n)

Serial

main = runStream $ delay 3 <> delay 2 <> delay 1

ThreadId 36: Delay 3
ThreadId 36: Delay 2
ThreadId 36: Delay 1

Parallel

main = runStream . parallely $ delay 3 <> delay 2 <> delay 1

ThreadId 42: Delay 1
ThreadId 41: Delay 2
ThreadId 40: Delay 3

Nested Loops (aka List Transformer)

The monad instance composes like a list monad.

import Streamly
import qualified Streamly.Prelude as S

loops = do
    x <- S.fromFoldable [1,2]
    y <- S.fromFoldable [3,4]
    S.once $ putStrLn $ show (x, y)

main = runStream loops

(1,3)
(1,4)
(2,3)
(2,4)
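
For comparison, here is the same nested loop written in the ordinary list monad, using only base. It is a sketch to show the correspondence; the name pairs is chosen for illustration and the element type Int is an assumption.

```haskell
-- The nested loop in the plain list monad: every x is paired with every y,
-- with the inner loop (y) varying fastest.
pairs :: [(Int, Int)]
pairs = do
    x <- [1, 2]
    y <- [3, 4]
    return (x, y)

main :: IO ()
main = mapM_ print pairs
```

The pairs come out in the same order as in the serial stream version; the difference is that a stream can also run an IO action at each step.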

Concurrent Nested Loops

To run the above code with demand-driven, depth-first concurrency, in which each iteration of the loops can run concurrently depending on the consumer rate:

main = runStream $ asyncly $ loops

To run it with demand driven breadth first concurrency:

main = runStream $ wAsyncly $ loops

To run it with strict concurrency irrespective of demand:

main = runStream $ parallely $ loops

To run it serially but interleaving the outer and inner loop iterations (breadth first serial):

main = runStream $ wSerially $ loops
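
The variants above can be compared side by side in one program. The sketch below assumes the streamly 0.2.0 API used in this README. Two details are assumptions rather than part of the original example: loops is given an explicit polymorphic type signature so that a single definition can be used with every style, and it returns the pairs instead of printing them so that the results can be collected with S.toList.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Streamly
import qualified Streamly.Prelude as S

-- The explicit signature keeps `loops` polymorphic in the stream type `t`.
-- Without it, the monomorphism restriction would fix `t` to a single type.
loops :: (IsStream t, Monad (t IO)) => t IO (Int, Int)
loops = do
    x <- S.fromFoldable [1, 2]
    y <- S.fromFoldable [3, 4]
    return (x, y)

main :: IO ()
main = do
    -- Default serial, depth-first composition.
    S.toList loops >>= print
    -- Serial, but interleaving outer and inner iterations.
    (S.toList $ wSerially loops) >>= print
    -- Concurrent styles: the order of the pairs may vary between runs.
    (S.toList $ asyncly loops) >>= print
    (S.toList $ wAsyncly loops) >>= print
    (S.toList $ parallely loops) >>= print
```

Every style produces the same four pairs; only the order in which they arrive differs.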

Magical Concurrency

Streams can perform semigroup (<>) and monadic bind (>>=) operations concurrently using combinators like asyncly and parallely. For example, to concurrently generate squares of a stream of numbers and then concurrently sum the square roots of all combinations of two streams:

import Streamly
import qualified Streamly.Prelude as S

main = do
    s <- S.sum $ asyncly $ do
        -- Each square is performed concurrently, (<>) is concurrent
        x2 <- foldMap (\x -> return $ x * x) [1..100]
        y2 <- foldMap (\y -> return $ y * y) [1..100]
        -- Each addition is performed concurrently, monadic bind is concurrent
        return $ sqrt (x2 + y2)
    print s

Of course, the actions running in parallel could be arbitrary IO actions. For example, to concurrently list the contents of a directory tree recursively:

import Path.IO (listDir, getCurrentDir)
import Streamly
import qualified Streamly.Prelude as S

main = runStream $ asyncly $ getCurrentDir >>= readdir
   where readdir d = do
            (dirs, files) <- S.once $ listDir d
            S.once $ mapM_ putStrLn $ map show files
            -- read the subdirs concurrently, (<>) is concurrent
            foldMap readdir dirs

In the above examples we do not think in terms of threads, locking or synchronization; rather, we think in terms of what can run in parallel, and the rest is taken care of automatically. When using asyncly the programmer does not have to worry about how many threads are created; the number is adjusted automatically based on the demand of the consumer.

The concurrency facilities provided by streamly can be compared with OpenMP and Cilk but with a more declarative expression.

Reactive Programming (FRP)

Streamly is a foundation for first class reactive programming as well by virtue of integrating concurrency and streaming. See AcidRain.hs for a console based FRP game example and CirclingSquare.hs for an SDL based animation example.

Performance

Streamly has best-in-class performance. Although it generalizes streaming to concurrent composition, it does not sacrifice non-concurrent performance. See streaming-benchmarks for a detailed performance comparison with regular streaming libraries and an explanation of the benchmarks. The following graphs show a summary: the first measures how four pipeline stages in series perform, and the second measures the performance of individual stream operations. In both cases the stream processes a million elements:

[Figures: "Composing Pipeline Stages" and "All Operations at a Glance"]

Contributing

The code is available under BSD-3 license on github. Join the gitter chat channel for discussions. You can find some of the todo items on the github wiki. Please ask on the gitter channel or contact the maintainer directly for more details on each item. All contributions are welcome!

This library was originally inspired by the transient package authored by Alberto G. Corona.