legion- Distributed, stateful, homogeneous microservice framework.

Safe Haskell: None




Legion is a mathematically sound framework for writing horizontally scalable user applications. Historically, horizontal scalability has been achieved via the property of statelessness. Programmers would design their applications to be free of any kind of persistent state, avoiding the problem of distributed state management. This almost never turns out to really be possible, so programmers achieve "statelessness" by delegating application state management to some kind of external, shared database -- which ends up having its own scalability problems.

In addition to scalability problems, which modern databases (especially NoSQL databases) have done a good job of solving, there is another, more fundamental problem facing these architectures: The application is not really stateless.

Legion is a Haskell framework that abstracts state partitioning, data replication, request routing, and cluster rebalancing, making it easy to implement large and robust distributed data applications.

Examples of services that rely on partitioning include ElasticSearch, Riak, DynamoDB, and others. In other words, almost all scalable databases.


Using Legion

Starting the Legion Runtime

While this section is being worked on, you can check out the legion-discovery project for an example of a stateful web service that takes advantage of Legion's ability to let you define your own operations on your data. Take a look at `Network.Legion.Discovery.App` to see where the magic of defining a Legion application happens. The rest of the code is mostly a standard HTTP interface written in Haskell, plus the requests it sends to the Legion runtime.

forkLegionary


:: (LegionConstraints e o s, MonadLoggerIO io) 
=> Persistence e o s

The persistence layer used to back the legion framework.

-> RuntimeSettings

Settings and configuration of the legion framework.

-> StartupMode 
-> io (Runtime e o) 

Forks the legion framework in a background thread, and returns a way to send user requests to it and retrieve the responses to those requests.

  • e is the type of request your application will handle. e stands for "event".
  • o is the type of response produced by your application. o stands for "output".
  • s is the type of state maintained by your application. More precisely, it is the type of the individual partitions that make up your global application state. s stands for "state".

data StartupMode

This defines the various ways a node can be spun up.



Indicates that we should bootstrap a new cluster at startup. The persistence layer may be safely pre-populated because the new node will claim the entire keyspace.

JoinCluster SockAddr

Indicates that the node should try to join an existing cluster, either by starting fresh, or by recovering from a shutdown or crash.

data Runtime e o

This type represents a handle to the runtime environment of your Legion application. This allows you to make requests and access the partition index.

Runtime is an opaque structure. Use makeRequest to access it.

Runtime Configuration

The legion framework has several operational parameters which can be controlled using configuration. These include the address binding used to expose the cluster management service endpoint and what file to use for cluster state journaling.

data RuntimeSettings

Settings used when starting up the legion framework runtime.




  • peerBindAddr :: SockAddr

    The address on which the legion framework will listen for rebalancing and cluster management commands.

  • joinBindAddr :: SockAddr

    The address on which the legion framework will listen for cluster join requests.

  • adminHost :: HostPreference

    The host address on which the admin service should run.

  • adminPort :: Port

    The host port on which the admin service should run.

Making Runtime Requests

makeRequest :: MonadIO io => Runtime e o -> PartitionKey -> e -> io o

Send a user request to the legion runtime.

search :: MonadIO io => Runtime e o -> SearchTag -> Source io IndexRecord

Send a search request to the legion runtime. Returns results that are strictly greater than the provided SearchTag.

Implementing a Legion Application

Whenever you use Legion to develop a distributed application, your application is going to be divided into two major parts, the stateless part, and the stateful part. The stateless part is going to be the context in which a legion node is running -- probably a web server if you are exposing your application as a web service. Legion itself is focused mainly on the stateful part, and it will do all the heavy lifting on that side of things. However, it is worth mentioning a few things about the stateless part before we move on.

The unit of state that Legion knows about is called a "partition". Each partition is identified by a PartitionKey, and it is replicated across the cluster. Each partition acts as the unit of state for handling stateful user requests which are routed to it based on the PartitionKey associated with the request. What the stateful part of Legion is not able to do is figure out what partition key is associated with the request in the first place. This is a function of the stateless part of the application. Generally speaking, the stateless part of your application is going to be responsible for

  • Starting up the Legion runtime using forkLegionary.
  • Identifying the partition key to which a request should be applied (e.g. maybe this is some component of a URL, or else an identifier stashed in a browser cookie).
  • Marshalling application requests into requests to the Legion runtime.
  • Marshalling the Legion runtime response into an application response.

Legion doesn't really address any of these things, mainly because there are already plenty of great ways to write stateless services. What Legion does provide is a runtime that can be embedded in the stateless part of your application, that transparently handles all of the hard stateful stuff, like replication, rebalancing, request routing, etc.
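As an illustration of the second responsibility in the list above, here is a minimal sketch of how the stateless part might pull an identifier out of a request path. The route layout `/users/<id>/...` and the function name `partitionIdFromPath` are hypothetical. Turning the extracted string into a PartitionKey is left out, because this page does not show how a PartitionKey is constructed.

```haskell
import Data.List (stripPrefix)

-- Hypothetical helper: extract the <id> segment from a path of the
-- form "/users/<id>/...". Returns Nothing when the path does not
-- match or when the identifier segment is empty.
partitionIdFromPath :: String -> Maybe String
partitionIdFromPath path = do
  rest <- stripPrefix "/users/" path
  let ident = takeWhile (/= '/') rest
  if null ident then Nothing else Just ident
```

A web handler could call this on the request path, reject the request when it gets Nothing, and otherwise derive the partition key from the identifier before handing the request to the runtime.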

The only thing required to implement a legion service is to implement a few typeclasses and call forkLegionary. The state-aware part of your application will live mostly within the request handler, which is implemented via a typeclass Event.

class Event e o s | e -> s o where
  apply :: e -> s -> (o, s)

If you look at apply, you will see that it is abstract over the type variables e, o, and s. These are the types your application has to fill in. e stands for "event", which is the type of requests your application accepts; o stands for "output", which is the type of responses your application will generate in response to those requests, and s stands for "state", which is the application state that each partition can assume.
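To make the roles of e, o, and s concrete, the following sketch fills them in for a simple counter. The types `CounterEvent`, `CounterOutput`, and `Counter` are hypothetical. The `Event` class is re-declared locally, copied from the declaration above, only so that the sketch compiles on its own; a real application would import it from the legion package.

```haskell
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}

-- Local copy of the class shown above (assumption: a real
-- application imports this from legion instead).
class Event e o s | e -> s o where
  apply :: e -> s -> (o, s)

-- Hypothetical request type: what a counter partition accepts.
data CounterEvent
  = Increment Int
  | Reset
  | Read
  deriving (Show, Eq)

-- Hypothetical response type: the counter value after the event.
newtype CounterOutput = CounterOutput Int
  deriving (Show, Eq)

-- Hypothetical partition state.
newtype Counter = Counter Int
  deriving (Show, Eq)

-- Every event maps the old state to a response and a new state.
instance Event CounterEvent CounterOutput Counter where
  apply (Increment n) (Counter c) = (CounterOutput (c + n), Counter (c + n))
  apply Reset         _           = (CounterOutput 0, Counter 0)
  apply Read          (Counter c) = (CounterOutput c, Counter c)
```

Because of the functional dependency `e -> s o`, choosing `CounterEvent` as the event type fixes both the output and the state type, so call sites never need type annotations for them.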

Implementing a request handler is pretty straightforward, but there is a little bit more to it than meets the eye. If you look at forkLegionary, you will see a constraint named LegionConstraints e o s, which is shorthand for a long list of typeclasses that your e, o, and s types are going to have to implement.

The persistence layer provides the framework with a way to store the various partition states. This allows you to choose any number of persistence strategies, including only in memory, on disk, or in some external database.

See newMemoryPersistence and diskPersistence if you need to get started quickly with an in-memory or on-disk persistence layer.


Legion gives you a way to index your partitions so that you can find partitions that have certain characteristics without having to know the partition key a priori. Conceptually, the "index" is a single, global, ordered list of IndexRecords. The search function allows you to scroll forward through this list at will.
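The "single, global, ordered list" can be pictured with a small pure model. This is only a conceptual sketch, not how the runtime implements the index: `Tag`, `PartitionKey`, and `IndexRecord` are local stand-ins for the real types, the ordering by tag and then by key is an assumption, and the `Maybe IndexRecord` cursor is a hypothetical substitute for SearchTag, whose structure is not shown here.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- Local stand-ins for the real legion types (assumptions).
newtype Tag = Tag String deriving (Show, Eq, Ord)
newtype PartitionKey = PartitionKey Int deriving (Show, Eq, Ord)

data IndexRecord = IndexRecord
  { irTag :: Tag
  , irKey :: PartitionKey
  } deriving (Show, Eq, Ord)

-- The conceptual index. The derived Ord instance orders records
-- by tag first and by partition key second.
type Index = Set IndexRecord

-- Hypothetical cursor: Nothing starts at the beginning of the index,
-- Just r yields only the records strictly greater than r.
scrollFrom :: Maybe IndexRecord -> Index -> [IndexRecord]
scrollFrom Nothing index = Set.toAscList index
scrollFrom (Just cursor) index =
  let (_, greater) = Set.split cursor index
  in Set.toAscList greater
```

Scrolling forward then amounts to calling `scrollFrom` again with the last record received as the new cursor.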

Indexing is implemented by instantiating the Indexable typeclass for your state type.

class Indexable s where
  indexEntries :: s -> Set Tag

The tags returned by indexEntries are used to construct a set of zero or more IndexRecords. For each Tag returned by indexEntries, an IndexRecord is generated such that:

IndexRecord {irTag = <your tag>, irKey = <partition key>}
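A minimal sketch of an Indexable instance follows. The `Profile` state type and the tag naming scheme are hypothetical. `Tag` and `Indexable` are re-declared locally so that the sketch compiles on its own; a real application would import them from legion. The string literal for the tag relies on the IsString instance that Tag is documented to have.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Set (Set)
import qualified Data.Set as Set
import Data.String (IsString (..))

-- Local stand-in for legion's Tag (assumption).
newtype Tag = Tag String deriving (Show, Eq, Ord)

instance IsString Tag where
  fromString = Tag

-- Local copy of the class shown above.
class Indexable s where
  indexEntries :: s -> Set Tag

-- Hypothetical partition state: a user profile.
data Profile = Profile
  { country  :: String
  , verified :: Bool
  } deriving (Show, Eq)

-- Every profile is indexed by country; verified profiles get one
-- additional tag. An unverified profile yields a single tag.
instance Indexable Profile where
  indexEntries p =
    Set.fromList (Tag ("country:" ++ country p) : ["verified" | verified p])
```

With this instance, a partition holding a verified profile would produce two IndexRecords, one per tag, both carrying that partition's key.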

class Indexable s where

This typeclass provides the ability to index partition states.

Minimal complete definition



indexEntries :: s -> Set Tag

A way of indexing partitions so that they can be found without knowing the partition key. An index entry for the partition will be created under each of the tags returned by this function.

type LegionConstraints e o s = (Event e o s, Indexable s, Default s, Binary e, Binary o, Binary s, Show e, Show o, Show s, Eq e)

This is a more convenient way to write the somewhat unwieldy set of constraints:

  Event e o s, Indexable s, Default s, Binary e, Binary o, Binary s,
  Show e, Show o, Show s, Eq e

data Persistence e o s

The type of a user-defined persistence strategy used to persist partition states. See newMemoryPersistence or diskPersistence if you need to get started quickly.




class Event e o s | e -> s o where

The class which allows for event application.

Minimal complete definition



apply :: e -> s -> (o, s)

Apply an event to a state value. *This function MUST be total!!!*
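One way to keep apply total is to report failure cases in the output type rather than calling a partial function such as `head` or `error`. The sketch below shows this for a hypothetical queue partition. The types `QueueEvent`, `QueueOutput`, and `Queue` are assumptions, and the `Event` class is re-declared locally so that the sketch compiles on its own.

```haskell
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}

-- Local copy of the class documented above (assumption: a real
-- application imports this from legion instead).
class Event e o s | e -> s o where
  apply :: e -> s -> (o, s)

-- Hypothetical request, response, and state types.
data QueueEvent = Push Int | Pop

data QueueOutput = Ack | Popped Int | QueueEmpty
  deriving (Show, Eq)

newtype Queue = Queue [Int]
  deriving (Show, Eq)

-- Every combination of event and state is covered by a clause,
-- so apply never throws.
instance Event QueueEvent QueueOutput Queue where
  apply (Push x) (Queue xs)       = (Ack, Queue (xs ++ [x]))
  apply Pop      (Queue (x : xs)) = (Popped x, Queue xs)
  -- Popping an empty queue is an ordinary response, not an exception.
  apply Pop      (Queue [])       = (QueueEmpty, Queue [])
```

Compiling with `-Wincomplete-patterns` lets GHC warn about a missing case, which helps to catch accidental partiality early.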

newtype Tag

A tag is a value associated with a partition state that can be used to look up a partition key.





Instances

  • Eq Tag
  • Ord Tag
  • Show Tag
  • IsString Tag
  • Binary Tag

Other Types

data SearchTag

This data structure describes where in the index to start scrolling.



data PartitionPowerState e o s

This is an opaque representation of your application's partition state. Internally, this represents the complete, nondeterministic set of states the partition can be in as a result of concurrency, eventual consistency, and all the other distributed systems reasons your partition state might have more than one value.

You can save these values to disk in your Persistence layer by using the Binary instance of PartitionPowerState.


newMemoryPersistence :: IO (Persistence e o s)

A convenient memory-based persistence layer. Good for testing or for applications (like caches) that don't have durability requirements.

diskPersistence


:: (Binary e, Binary s) 
=> FilePath

The directory under which partition states will be stored.

-> Persistence e o s 

A convenient way to persist partition states to disk.