Safe Haskell | None |
---|---|
Language | Haskell2010 |
Legion is a mathematically sound framework for writing horizontally scalable user applications. Historically, horizontal scalability has been achieved via the property of statelessness. Programmers would design their applications to be free of any kind of persistent state, avoiding the problem of distributed state management. This almost never turns out to really be possible, so programmers achieve "statelessness" by delegating application state management to some kind of external, shared database -- which ends up having its own scalability problems.
In addition to scalability problems, which modern databases (especially NoSQL databases) have done a good job of solving, there is another, more fundamental problem facing these architectures: The application is not really stateless.
Legion is a Haskell framework that abstracts state partitioning, data replication, request routing, and cluster rebalancing, making it easy to implement large and robust distributed data applications.
Examples of services that rely on partitioning include ElasticSearch, Riak, DynamoDB, and others. In other words, almost all scalable databases.
- forkLegionary :: (LegionConstraints e o s, MonadLoggerIO io) => Persistence e o s -> RuntimeSettings -> StartupMode -> io (Runtime e o)
- data StartupMode
- data Runtime e o
- data RuntimeSettings = RuntimeSettings {}
- makeRequest :: MonadIO io => Runtime e o -> PartitionKey -> e -> io o
- search :: MonadIO io => Runtime e o -> SearchTag -> Source io IndexRecord
- class Indexable s where
- type LegionConstraints e o s = (Binary e, Binary o, Binary s, Default s, Eq e, Event e o s, Indexable s, Show e, Show o, Show s, ToJSON s)
- data Persistence e o s = Persistence {
- getState :: PartitionKey -> IO (Maybe (PartitionPowerState e o s))
- saveState :: PartitionKey -> Maybe (PartitionPowerState e o s) -> IO ()
- list :: Source IO (PartitionKey, PartitionPowerState e o s)
- class Event e o s | e -> s o where
- newtype Tag = Tag {
- unTag :: ByteString
- data SearchTag = SearchTag {
- stTag :: Tag
- stKey :: Maybe PartitionKey
- data IndexRecord = IndexRecord {
- irTag :: Tag
- irKey :: PartitionKey
- newtype PartitionKey = K {}
- type PartitionPowerState e o s = PowerState PartitionKey s Peer e o
- newMemoryPersistence :: IO (Persistence e o s)
- diskPersistence :: (Binary e, Binary s) => FilePath -> Persistence e o s
Using Legion
Starting the Legion Runtime
While this section is being worked on, you can check out the legion-discovery project for an example of a stateful web service that takes advantage of Legion's ability to let you define your own operations on your data. Take a look at `Network.Legion.Discovery.App` to see where the magic of defining a Legion application happens. The rest of the code is mostly a standard HTTP interface written in Haskell, plus requests sent to the Legion runtime.
forkLegionary
:: (LegionConstraints e o s, MonadLoggerIO io) | |
=> Persistence e o s | The persistence layer used to back the legion framework. |
-> RuntimeSettings | Settings and configuration of the legion framework. |
-> StartupMode | |
-> io (Runtime e o) |
Forks the legion framework in a background thread, and returns a way to send user requests to it and retrieve the responses to those requests.
- `e` is the type of request your application will handle. `e` stands for "event".
- `o` is the type of response produced by your application. `o` stands for "output".
- `s` is the type of state maintained by your application. More precisely, it is the type of the individual partitions that make up your global application state. `s` stands for "state".
data StartupMode
This defines the various ways a node can be spun up.
NewCluster | Indicates that we should bootstrap a new cluster at startup. The persistence layer may be safely pre-populated because the new node will claim the entire keyspace. |
JoinCluster SockAddr | Indicates that the node should try to join an existing cluster, either by starting fresh, or by recovering from a shutdown or crash. |
This type represents a handle to the runtime environment of your Legion application. This allows you to make requests and access the partition index.
`Runtime` is an opaque structure. Use `makeRequest` to access it.
Runtime Configuration
The legion framework has several operational parameters which can be controlled using configuration. These include the address binding used to expose the cluster management service endpoint and what file to use for cluster state journaling.
data RuntimeSettings
Settings used when starting up the legion framework runtime.
RuntimeSettings | |
|
Making Runtime Requests
makeRequest :: MonadIO io => Runtime e o -> PartitionKey -> e -> io o
Send a user request to the legion runtime.
search :: MonadIO io => Runtime e o -> SearchTag -> Source io IndexRecord
Send a search request to the legion runtime. Returns results that are strictly greater than the provided `SearchTag`.
Implementing a Legion Application
Whenever you use Legion to develop a distributed application, your application is going to be divided into two major parts, the stateless part, and the stateful part. The stateless part is going to be the context in which a legion node is running -- probably a web server if you are exposing your application as a web service. Legion itself is focused mainly on the stateful part, and it will do all the heavy lifting on that side of things. However, it is worth mentioning a few things about the stateless part before we move on.
The unit of state that Legion knows about is called a "partition". Each partition is identified by a `PartitionKey`, and it is replicated across the cluster. Each partition acts as the unit of state for handling stateful user requests, which are routed to it based on the `PartitionKey` associated with the request. What the stateful part of Legion is not able to do is figure out what partition key is associated with the request in the first place. This is a function of the stateless part of the application. Generally speaking, the stateless part of your application is going to be responsible for:
- Starting up the Legion runtime using `forkLegionary`.
- Identifying the partition key to which a request should be applied (e.g. maybe this is some component of a URL, or else an identifier stashed in a browser cookie).
- Marshalling application requests into requests to the Legion runtime.
- Marshalling the Legion runtime response into an application response.
Legion doesn't really address any of these things, mainly because there are already plenty of great ways to write stateless services. What Legion does provide is a runtime that can be embedded in the stateless part of your application, that transparently handles all of the hard stateful stuff, like replication, rebalancing, request routing, etc.
The only thing required to implement a legion service is to implement a few typeclasses and call `forkLegionary`. The state-aware part of your application will live mostly within the request handler, which is implemented via a typeclass `Event`.
class Event e o s | e -> s o where apply :: e -> s -> (o, s)
If you look at `apply`, you will see that it is abstract over the type variables `e`, `o`, and `s`. These are the types your application has to fill in. `e` stands for "event", which is the type of requests your application accepts; `o` stands for "output", which is the type of responses your application will generate in response to those requests; and `s` stands for "state", which is the application state that each partition can assume.
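As a rough illustration of how these three types fit together, here is a minimal sketch of a counter service. The class is redeclared locally so the example compiles without the legion package; `CounterEvent`, `CounterOutput`, and `CounterState` are hypothetical names invented for this example and are not part of the library.

```haskell
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}

-- Local copy of the class shown above, so this sketch compiles without
-- the legion package. A real application would import it instead.
class Event e o s | e -> s o where
  apply :: e -> s -> (o, s)

-- Hypothetical request type: the "e" of this example.
data CounterEvent
  = Increment Int  -- add the given amount to the counter
  | Reset          -- set the counter back to zero
  | Get            -- report the counter without changing it
  deriving (Eq, Show)

-- Hypothetical response type: the "o" of this example.
newtype CounterOutput = CounterOutput Int
  deriving (Eq, Show)

-- Hypothetical partition state: the "s" of this example.
newtype CounterState = CounterState Int
  deriving (Eq, Show)

-- The request handler: a pure function from a request and the current
-- partition state to a response and the next partition state.
instance Event CounterEvent CounterOutput CounterState where
  apply (Increment n) (CounterState c) =
    (CounterOutput (c + n), CounterState (c + n))
  apply Reset _ =
    (CounterOutput 0, CounterState 0)
  apply Get (CounterState c) =
    (CounterOutput c, CounterState c)
```

Because of the functional dependency `e -> s o`, choosing the request type fixes the response and state types, so a call such as `apply (Increment 5) (CounterState 2)` needs no type annotations.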
Implementing a request handler is pretty straightforward, but there is a little bit more to it than meets the eye. If you look at `forkLegionary`, you will see a constraint named `LegionConstraints e o s`, which is short-hand for a long list of typeclasses that your `e`, `o`, and `s` types are going to have to implement.
The persistence layer provides the framework with a way to store the various partition states. This allows you to choose any number of persistence strategies, including only in memory, on disk, or in some external database.
See `newMemoryPersistence` and `diskPersistence` if you need to get started quickly with an in-memory or on-disk persistence layer.
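To give a feel for what a custom persistence strategy involves, the following sketch builds an in-memory store with the same get/save/list shape. It is not the library's implementation: `SimplePersistence`, `newSimpleMemoryPersistence`, and `listAll` are hypothetical stand-ins, the key and value types are left generic, and a plain list takes the place of the streaming `Source`. It also assumes that saving `Nothing` means the partition should be forgotten.

```haskell
import Data.IORef (newIORef, readIORef, atomicModifyIORef')
import qualified Data.Map.Strict as Map

-- Simplified stand-in for the library's Persistence record. The real
-- record is specialised to PartitionKey and PartitionPowerState, and
-- its list field is a streaming Source rather than a plain list.
data SimplePersistence k v = SimplePersistence
  { getState  :: k -> IO (Maybe v)
  , saveState :: k -> Maybe v -> IO ()
  , listAll   :: IO [(k, v)]
  }

-- Build a store backed by a Map held in an IORef.
-- Assumption: saving Nothing removes the partition from the store.
newSimpleMemoryPersistence :: Ord k => IO (SimplePersistence k v)
newSimpleMemoryPersistence = do
  ref <- newIORef Map.empty
  pure SimplePersistence
    { getState  = \key -> Map.lookup key <$> readIORef ref
    , saveState = \key mval ->
        atomicModifyIORef' ref (\m -> (Map.alter (const mval) key m, ()))
    , listAll   = Map.toAscList <$> readIORef ref
    }
```

A disk-backed or database-backed strategy would keep the same three operations and swap the `IORef` for file or network I/O.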
Indexing
Legion gives you a way to index your partitions so that you can find partitions that have certain characteristics without having to know the partition key a priori. Conceptually, the "index" is a single, global, ordered list of `IndexRecord`s. The `search` function allows you to scroll forward through this list at will.
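The "ordered list" view of the index can be modelled with a plain `Data.Set`. The sketch below is only a conceptual model, not how Legion stores or queries the index: `Record`, `Cursor`, and `searchFrom` are hypothetical stand-ins that mirror the field names of `IndexRecord` and `SearchTag`, and the ordering rule (tag first, then key, with a missing key sorting before every key) is an assumption made for illustration.

```haskell
import qualified Data.Set as Set

-- Stand-ins for the library's opaque types, for illustration only.
type Tag = String
type Key = Int

-- Mirrors the shape of IndexRecord. The derived Ord instance orders
-- records by tag first, then by key.
data Record = Record { irTag :: Tag, irKey :: Key }
  deriving (Eq, Ord, Show)

-- Mirrors the shape of SearchTag: a position in the index.
data Cursor = Cursor { stTag :: Tag, stKey :: Maybe Key }
  deriving (Eq, Show)

-- All records strictly after the cursor, in index order.
-- Assumption: a cursor without a key sorts before every record that
-- carries the same tag (Nothing < Just k in the Maybe ordering).
searchFrom :: Cursor -> Set.Set Record -> [Record]
searchFrom (Cursor t k) = dropWhile (not . after) . Set.toAscList
  where
    after r = (irTag r, Just (irKey r)) > (t, k)
```

Under this model, scrolling forward amounts to calling `searchFrom` again with a cursor built from the last record received.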
Indexing is implemented by instantiating the `Indexable` typeclass for your state type.
class Indexable s where indexEntries :: s -> Set Tag
The tags returned by `indexEntries` are used to construct a set of zero or more `IndexRecord`s. For each `Tag` returned by `indexEntries`, an `IndexRecord` is generated such that:
IndexRecord {irTag = <your tag>, irKey = <partition key>}
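As an illustration, the sketch below indexes a hypothetical user-profile state by country and by interest. `Tag` and `Indexable` are redeclared locally, following the declarations on this page, so the example compiles without the legion package; `Profile` and the `country:`/`interest:` tag prefixes are invented for this example.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BS
import Data.Set (Set)
import qualified Data.Set as Set

-- Local copies of the declarations shown on this page, so the sketch
-- compiles on its own. A real application would import them instead.
newtype Tag = Tag { unTag :: ByteString }
  deriving (Eq, Ord, Show)

class Indexable s where
  indexEntries :: s -> Set Tag

-- Hypothetical partition state: a user profile.
data Profile = Profile
  { country   :: String
  , interests :: [String]
  }

-- One tag for the country plus one tag per distinct interest.
-- BS.pack keeps only the low 8 bits of each character, so this
-- sketch assumes ASCII input.
instance Indexable Profile where
  indexEntries p = Set.fromList
    ( Tag (BS.pack ("country:" ++ country p))
    : [ Tag (BS.pack ("interest:" ++ i)) | i <- interests p ]
    )
```

Since the result is a `Set`, duplicate interests collapse into a single tag, and a state that returns the empty set simply does not appear in the index.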
class Indexable s where
This typeclass provides the ability to index partition states.
indexEntries :: s -> Set Tag
A way of indexing partitions so that they can be found without knowing the partition key. An index entry for the partition will be created under each of the tags returned by this function.
type LegionConstraints e o s = (Binary e, Binary o, Binary s, Default s, Eq e, Event e o s, Indexable s, Show e, Show o, Show s, ToJSON s)
This is a more convenient way to write the somewhat unwieldy set of constraints
( Binary e, Binary o, Binary s, Default s, Eq e, Event e o s, Indexable s, Show e, Show o, Show s, ToJSON s )
The `ToJSON s` requirement is strictly for servicing the admin web endpoints.
data Persistence e o s
The type of a user-defined persistence strategy used to persist partition states. See `newMemoryPersistence` or `diskPersistence` if you need to get started quickly.
Persistence | |
|
A tag is a value associated with a partition state that can be used to look up a partition key.
Tag | |
|
Other Types
This data structure describes where in the index to start scrolling.
data IndexRecord
This data structure describes a record in the index.
IndexRecord | |
|
newtype PartitionKey
This is how partitions are identified and referenced.
type PartitionPowerState e o s = PowerState PartitionKey s Peer e o
A representation of all possible partition states.
Utils
newMemoryPersistence :: IO (Persistence e o s)
A convenient memory-based persistence layer. Good for testing or for applications (like caches) that don't have durability requirements.
diskPersistence
:: (Binary e, Binary s) | |
=> FilePath | The directory under which partition states will be stored. |
-> Persistence e o s |
A convenient way to persist partition states to disk.