# Streamly: The Swiss Army Knife for Haskell Streamly provides a comprehensive suite of functionality through a **consistent, concise, and modular interface**, backed by **high-performance APIs**. Designed from the ground up, it **minimizes abstractions** while **maximizing functionality and efficiency**. By unifying features from many disparate Haskell packages on Hackage—and often outperforming them—**Streamly acts as a Swiss army knife for Haskell programmers**. Importantly, Streamly does **not** aim to replicate functionality that already exists elsewhere. Instead, it focuses on creating **foundational building blocks** that naturally deliver **broad, unified, and concise capabilities** with **peak performance**. This approach embodies the Haskell principle of **“don’t repeat yourself” (DRY)**, addressing a gap often overlooked in the Haskell library ecosystem. Below, we provide an overview of Streamly’s functionality and show how its modular building blocks unify disparate features into a single, coherent framework. Streamly offers a unified alternative to several existing packages. For new projects, it can reduce dependency bloat, enable concise and expressive solutions to domain-specific problems, and often deliver better performance out of the box. ## Streaming Streamly offers **high-performance, feature-rich streaming** and data flow programming operations. We are briefly mentioning some of the key capabilities here among many others. ### Three Core Abstractions Streamly provides three types of streams, each one allows transformation and stateful processing of sequential streams of data but they differ in how they can be composed: - **Producers (`Stream`)** – these are essentially generators or source of data streams, multiple sources can be combined to create a single source. - **Transformers (`Scanl`)** – these are essentially pipes connecting sources to sources or sinks, multiple scans can be combined together such that a source stream can be split over multiple pipes and the resulting streams can be the combined back into a single source stream. - **Consumers (`Fold`)** – these are essentially final consumers or sinks of data, multiple consumers can be combined together into a single consumer such that a source stream can be distributed to all consumers and the results can be combined. **Parsers** are a specialization of folds that support backtracking and errors handling. Folds and parsers are **first-class building blocks** for stateful data processing. Grouping, chunking and many other operations over streams, scans and folds are implemented using folds and parsers — making the library extremely versatile, modular, and yet high-performing due to stream fusion. ### Nested and Structured Stream Processing Streamly supports **nested stream processing** (the functional equivalent of nested loops) in several styles: - **Depth-first** – inner loops run to completion before the next outer loop iteration. - **Breadth-first** – all outer loop iterations advance one step at a time. - **Fair interleaving** – inner and outer loops advance fairly in parallel. With `concatIterate`, the output of a stream can be fed back into its input, allowing recursion and control flow at the stream level. It can be used for traversing **trees and graphs breadth-first or depth-first**. `mergeMapWith` merges streams pairwise to form balanced merge trees, for example you can merge sorted data from many files using this combinator. ### Sliding Windows and Sessions **Sliding windows** are an integral part of scans and folds, enabling incremental window based computations for use cases such as **time-series analysis, statistical metrics, and trading indicators**—without recomputing the results over the entire dataset. Elements in streams can be **classified by keys** to route them to different sessions, fold each substream differently in its own session, and generate a stream of end results from these sessions. ### Expressiveness and Performance These modular abstractions allow complex problems to be expressed in **succinct and idiomatic way**. For example, merge sort in Streamly is implemented in just a few lines of idiomatic code using only general-purpose building blocks—yet the performance matches or exceeds the highly optimized list implementation in base. Streamly provides a unified, expressive, and high-level alternative to stream processing packages like `streaming`, `conduit`, `pipes`, and `foldl`, with unmatched performance. ## Concurrency Streamly is built for concurrency, all abstractions are designed for concurrency, it provides native concurrent evaluation of streams, scans and folds. All the ways in which stream processing pipelines can be composed with serial processing can also be composed with concurrent processing. For example, you can: - Map a function over a stream concurrently. - generate multiple streams in parallel and merge the results in a single stream. - Split streams, scan each branch concurrently, and combine them back into a single stream. - Distribute a stream to different concurrent folds and merge back the results. - Traverse trees or graphs, feed output streams back to input for recursive processing, concurrently. - Perform time-based operations such as sampling, throttling, or debouncing. - Group items over time intervals or intersperse actions in a stream periodically. Concurrent streaming with time based operations can be used to implement programs based on functional reactive programming model. The operations you can perform with Streamly often overlap with what libraries like **Apache Flink** provide, but in a **lightweight, unified, and purely functional way**. With all these facilities streamly allows you to express your domain specific problems much more succinctly and effortlessly. For example, the concurrent directory traversal written with streamly is just a few lines of code, built from low level basic building blocks, and outperforms rust based implementations. Streamly is much more higher level alternative to low level packages like async, pipes-concurrency, streamling-concurrency etc. Take a look at the `Streamly.Data.Stream.Prelude`, `Streamly.Data.Scanl.Prelude`, `Streamly.Data.Fold.Prelude` modules in the streamly package. ## Reactive Programming Streamly supports reactive (time domain) programming because of its support for declarative concurrency. Please see the `Streamly.Data.Stream.Prelude` module for time-specific and time based sampling combinators. Reactive programming is modeled beautifully using concurrent streaming in streamly. It involves generation of streams of events, merging concurrent streams and processing events concurrently. Streamly provides native high-level facilities to do all this easily. The examples [AcidRain.hs][] and [CirclingSquare.hs][] demonstrate reactive programming using [Streamly][]. [AcidRain.hs]: https://github.com/composewell/streamly-examples/tree/master/examples/AcidRain.hs [CirclingSquare.hs]: https://github.com/composewell/streamly-examples/tree/master/examples/CirclingSquare.hs ## Parsing Parsing in Streamly is first-class, high-performance, and integrated with streams. Streamly treats parsers as core building blocks for data processing. Many tasks—such as **sorting, chunking, and structured transformations**—are expressed naturally with folds and parsers. Parsers are not restricted to byte streams, any type of streams can be parsed. It provides full-fledged parsing functionality equivalent to the standard parser combinator libraries like `parsec`. Performance is equivalent or better than some of the best performing libraries like `attoparsec`. ## Prompt Resource Cleanup One of the many challenges in a concurrent environment is safe, correct and timely resource cleanup in all cases and races. Streamly provides built-in prompt resource cleanup operations in a concurrent environment. It provides basic IO level building blocks as well as stream based resource management combinators, everything that you can do with `resourcet` is built into the library. ## Lists and Strings Streamly modules `Streamly.Data.Stream` and `Streamly.Data.Fold` cover the entire functionality of `Data.List` from the `base` package. While lists are sufficient and easy to use for many cases, it can be problematic when you need to **interleave IO effects** such as tracing or debug prints. Streamly streams provide the same interface as lists but allow you to **inject IO effects anywhere** without buffering the entire dataset in memory. Additionally, **Streamly streams often outperform lists** when combining transformations. By virtue of its modularity it is able to express the string (stream) splitting operations similar to the `split` package without any custom implementation. In the same manner it implements search operations similar to the `stringsearch` package. For string search, `splitOnSeq` and `takeEndBySeq` in Streamly use the **Rabin-Karp algorithm** for fast pattern matching. Benchmarks show performance comparable to the Rust `ripgrep` tool. `Streamly.Unicode.String` also provides string interpolation. ## Non-determinism and List Transformers Streamly provides full list transformer functionality (e.g. equivalent to the `ListT` from list-t package) and more. It does not provide a Monad instance for streams, as there are many possible ways to define the bind operation. Instead, it offers bind-style operations such as 'concatFor', 'concatForM', and their variants (e.g. fair interleaving and breadth-first nesting). These can be used for convenient ListT-style stream composition. Additionally, it provides applicative-style cross product operations like 'cross' and its variants which are many times faster than the monad style operations. ## Logic Programming Streamly does not provide a 'LogicT'-style Monad instance, but it offers comprehensive logic programming functionality. Operations like 'fairCross' and 'fairConcatFor' nest outer and inner streams fairly, ensuring that no stream is starved when exploring cross products. This enables balanced exploration across all dimensions in backtracking problems, while also supporting infinite streams without starvation. It effectively replaces the core functionality of 'LogicT' from the @logict@ package, with significantly better performance. In particular, it avoids the quadratic slowdown seen with @observeMany@, and the applicative 'fairCross' runs many times faster, achieving loop nesting performance comparable to C. ## Arrays Streamly provides comprehensive array functionality. Arrays are equal partners to streams in general purpose programming. Arrays in streamly can be mutable or immutable, boxed or unboxed, pinned or unpinned. Thus they unify the functionality provided by the `bytestring`, `text` and `vector` packages. The API is integrated with streams, it is concise and modular because all the processing is done via streams. There is no need of thinking about lazy bytestring vs strict bytestring, bytestring vs text, text vs lazy text. Lazy in streamly is equivalent to stream processing or stream of arrays, and strict is equivalent to arrays. The `streamly-bytestring` package provides interoperation of streamly arrays with `bytestring`. `streamly-text` package provides interoperability with `text`. Streamly also implements ring arrays which are essentially mutable circular buffers. Ring Arrays help in implementing sliding windows efficiently and conveniently. Streamly can potentially replace most array like package for example `array`, `primitive`, `vector`, `vector-algorithms`, `ring-buffer`, `bytestring`. ## Serialization Binary serialization in streamly is built into arrays. Unboxed arrays are in fact store data in binary serialized form. The `Unbox` and `Serialize` type classes provide fast binary serialization and deserialization. The `MutByteArray`, `MutArray` and `Array` modules provide functions that can serialize Haskell data types using the `Unbox` and `Serialize` type classes. Thus arrays in streamly are high performance alternative to serialization packages like `binary`, `cereal`, `store` etc. ## File System Streamly provides native, high-performance streaming APIs for file system operations. There is also a module for file path representation which leverages streamly arrays for high performance and safe path operations. Path module is designed to allow gradual typing, you can choose to use untyped path, absolute vs relative distinction, file vs dir distinction or all four. The typed versions are not released yet but available as internal modules. Paths and directory modules in streamly can be used as an alternative for `filepath`, `directory`, and `path` packages. ## Network Streamly provides native streaming APIs for network operations as a higher level alternative to the `network` package functionality. ## Clock and Time Streamly provides clock, time and timer APIs for high-performance streaming use. See the `Streamly.Internal.Data.Time.Clock`, `Streamly.Internal.Data.Time.Units` modules. These are currently internal but will be released in a future release. ## Compatibility Streamly can interwork with other packages in the Haskell ecosystem providing similar functionality. Following is a list of examples or packages providing interconversion: | Interop with | Example, package | |----------------|-----------------------------------------------------------------------| | streaming | [streamly-examples](https://github.com/composewell/streamly-examples) | | pipes | [streamly-examples](https://github.com/composewell/streamly-examples) | | conduit | [streamly-examples](https://github.com/composewell/streamly-examples) | | bytestring | [streamly-bytestring](https://github.com/psibi/streamly-bytestring) | | text | [streamly-text](https://github.com/composewell/streamly-text) | | filepath | [streamly-filepath](https://github.com/composewell/streamly-filepath) | See the `streamly-ecosystem` chapter for more details.