Safe Haskell | None |
---|---|
Language | Haskell2010 |
This module provides a CRDT data structure that collects and applies operations (called "events") that mutate an underlying data structure (like folding).
In addition to mutating the underlying data, each operation can also produce an output that can be obtained by the client. The output can be either totally consistent across all replicas (which is slower), or it can be returned immediately and possibly reflect an inconsistent state.
The EventFold name derives from a loose analogy to folding over a list of events using plain old foldl. The component parts of foldl are:
- A binary operator, analogous to apply.
- An accumulator value, analogous to infimumValue.
- A list of values to fold over, loosely analogous to "the list of all future calls to event".
- A return value. There is no real analogy for the "return value".
Similarly to how you never actually obtain a return value if you try to foldl over an infinite list, EventFolds are meant to be long-lived objects that accommodate an infinite number of calls to event. What you can do is inspect the current value of the accumulator using infimumValue, or the "projected" value of the accumulator using projectedValue (where "projected" means "taking into account all of the currently known calls to event that have not yet been folded into the accumulator, and which may yet turn out to have other events inserted into the middle or beginning of the list").
The EventFold value itself can be thought of as an intermediate, replicated, current state of the fold of an infinite list of events that has not yet been fully generated. So you can, for instance, check the current accumulator value.
In a little more detail, consider the type signature of foldl (for lists).

    foldl
      :: (b -> a -> b) -- Analogous to 'apply', where 'a' is your 'Event'
                       -- instance, and 'b' is 'State a'.
      -> b             -- Loosely analogous to 'infimumValue' where
                       -- progressive applications are accumulated. (I
                       -- know that in the type signature of 'foldl'
                       -- this is the "starting value", but imagine that
                       -- for a recursive implementation of 'foldl',
                       -- the child call's "starting value" is the parent
                       -- call's accumulated value.)
      -> [a]           -- Analogous to all outstanding or future calls to
                       -- 'event'.
      -> b
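The analogy can be made concrete with a toy fold over plain lists. This is only an illustrative sketch, not how the library is implemented: the `Add` event and the helpers `applyAdd`, `infimumOf`, and `projectedOf` are hypothetical names invented here, and "acknowledged" versus "pending" events are modelled as two ordinary lists.

```haskell
-- Hypothetical illustration of the foldl analogy; not library code.
import Data.List (foldl')

-- | A toy event: add a number to a running total.
newtype Add = Add Int

-- | Plays the role of 'apply': fold one event into the accumulator.
applyAdd :: Int -> Add -> Int
applyAdd acc (Add n) = acc + n

-- | Fold the events that every participant has acknowledged. The
-- result plays the role of the infimum value.
infimumOf :: Int -> [Add] -> Int
infimumOf = foldl' applyAdd

-- | Continue the fold over events that are known locally but not yet
-- acknowledged everywhere. The result plays the role of the
-- projected value.
projectedOf :: Int -> [Add] -> [Add] -> Int
projectedOf start acknowledged pending =
  foldl' applyAdd (infimumOf start acknowledged) pending
```

In this toy model the pending list is fixed, whereas in the real structure other events may still be inserted ahead of the pending ones, which is why the projected value is only provisional.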
Synopsis
- new :: (Default (State e), Ord p) => o -> p -> EventFold o p e
- event :: (Ord p, Event e) => p -> e -> EventFold o p e -> (Output e, EventId p, UpdateResult o p e)
- fullMerge :: (Eq (Output e), Eq e, Eq o, Event e, Ord p) => p -> EventFold o p e -> EventFold o p e -> Either (MergeError o p e) (UpdateResult o p e)
- data UpdateResult o p e = UpdateResult {
- urEventFold :: EventFold o p e
- urOutputs :: Map (EventId p) (Output e)
- urNeedsPropagation :: Bool
- events :: Ord p => p -> EventFold o p e -> Diff o p e
- diffMerge :: (Eq (Output e), Eq e, Eq o, Event e, Ord p) => p -> EventFold o p e -> Diff o p e -> Either (MergeError o p e) (UpdateResult o p e)
- data MergeError o p e
- = DifferentOrigins o o
- | DiffTooNew (EventFold o p e) (Diff o p e)
- | DiffTooSparse (EventFold o p e) (Diff o p e)
- participate :: forall o p e. (Ord p, Event e) => p -> p -> EventFold o p e -> (EventId p, UpdateResult o p e)
- disassociate :: forall o p e. (Event e, Ord p) => p -> EventFold o p e -> (EventId p, UpdateResult o p e)
- class Event e where
- type Output e
- type State e
- apply :: e -> State e -> EventResult e
- data EventResult e
- = SystemError (Output e)
- | Pure (Output e) (State e)
- isBlockedOnError :: EventFold o p e -> Bool
- projectedValue :: Event e => EventFold o p e -> State e
- infimumValue :: EventFold o p e -> State e
- infimumId :: EventFold o p e -> EventId p
- infimumParticipants :: EventFold o p e -> Set p
- allParticipants :: Ord p => EventFold o p e -> Set p
- projParticipants :: Ord p => EventFold o p e -> Set p
- origin :: EventFold o p e -> o
- divergent :: forall o p e. Ord p => EventFold o p e -> Map p (EventId p)
- data EventFold o p e
- data EventId p
- data Diff o p e
Basic API
Creating new CRDTs
new
:: (Default (State e), Ord p) | |
=> o | The "origin", identifying the historical lineage of this CRDT. |
-> p | The initial participant. |
-> EventFold o p e |
Construct a new EventFold with the given origin and initial participant.
Adding new events
event :: (Ord p, Event e) => p -> e -> EventFold o p e -> (Output e, EventId p, UpdateResult o p e)
Coordinating replica updates
Functions in this section are used to help merge foreign copies of the CRDT, and transmit our own copy. (This library does not provide any kind of transport support, except that all the relevant types have Binary instances. Actually arranging for these things to get shipped across a wire is left to the user.)
In principle, the only function you need is fullMerge. Everything else in this section is an optimization. You can ship the full EventFold value to a remote participant and it can incorporate any changes using fullMerge, and vice versa: you can receive an EventFold value from another participant and incorporate its changes locally using fullMerge.
However, if your underlying data structure is large, it may be more efficient to just ship a sort of diff containing the information that the local participant thinks the remote participant might be missing. That is what events and diffMerge are for.
fullMerge
:: (Eq (Output e), Eq e, Eq o, Event e, Ord p) | |
=> p | The participant doing the merge. |
-> EventFold o p e | |
-> EventFold o p e | |
-> Either (MergeError o p e) (UpdateResult o p e) |
Monotonically merge the information in two EventFolds. The resulting EventFold may have a higher infimum value, but it will never have a lower one. Only EventFolds that originated from the same new call can be merged. If the origins are mismatched, or if some other programming error is detected, then an error will be returned.
Returns the new EventFold value, along with the output for all of the events that can now be considered "fully consistent".
data UpdateResult o p e
The result of updating the EventFold, containing the new EventFold value, the outputs of events that have reached the infimum as a result of the update (i.e. "totally consistent outputs"), and a flag indicating whether the other participants need to hear about the changes.
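One way a caller might act on the three pieces of an update result is sketched below. To keep the snippet compilable without the package, it declares a local stand-in record that mirrors the field names urEventFold, urOutputs, and urNeedsPropagation with simplified type parameters; the `Action` type and the `plan` function are hypothetical names introduced only for this illustration.

```haskell
import qualified Data.Map as Map

-- Local stand-in mirroring the documented fields of UpdateResult.
-- The type parameters are simplified for this sketch: 'fold' stands
-- for the EventFold, 'eid' for the EventId, 'out' for the Output.
data UpdateResult fold eid out = UpdateResult
  { urEventFold :: fold
  , urOutputs :: Map.Map eid out
  , urNeedsPropagation :: Bool
  }

-- | Hypothetical follow-up work for the caller.
data Action fold eid out
  = Store fold       -- keep the new EventFold as the local copy
  | Deliver eid out  -- hand a totally consistent output to its waiter
  | Broadcast        -- tell the other participants about the changes
  deriving (Eq, Show)

-- | Turn an update result into the list of things to do next.
plan :: UpdateResult fold eid out -> [Action fold eid out]
plan ur =
  Store (urEventFold ur)
    : [Deliver i o | (i, o) <- Map.toList (urOutputs ur)]
    ++ [Broadcast | urNeedsPropagation ur]
```

Describing the follow-up work as data keeps the sketch pure; a real application would interpret these actions with whatever storage and transport it uses, since the library leaves transport to the user.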
events :: Ord p => p -> EventFold o p e -> Diff o p e
Get the outstanding events that need to be propagated to a particular participant.
diffMerge :: (Eq (Output e), Eq e, Eq o, Event e, Ord p) => p -> EventFold o p e -> Diff o p e -> Either (MergeError o p e) (UpdateResult o p e)
data MergeError o p e
This is the exception type for illegal merges. These errors indicate serious programming bugs.
DifferentOrigins o o | The |
DiffTooNew (EventFold o p e) (Diff o p e) | The |
DiffTooSparse (EventFold o p e) (Diff o p e) | The |
Instances
(Show (Output e), Show o, Show p, Show e, Show (State e)) => Show (MergeError o p e) | |
Defined in Data.CRDT.EventFold showsPrec :: Int -> MergeError o p e -> ShowS # show :: MergeError o p e -> String # showList :: [MergeError o p e] -> ShowS # |
Participation
participate
:: forall o p e. (Ord p, Event e) | |
=> p | The local participant. |
-> p | The participant being added. |
-> EventFold o p e | |
-> (EventId p, UpdateResult o p e) |
Allow a participant to join in the distributed nature of the EventFold. Return the EventId at which the participation is recorded, and the resulting EventFold. The purpose of returning the EventId is so that you can use it to tell when the participation event has reached the infimum. See also: infimumId.
disassociate
:: forall o p e. (Event e, Ord p) | |
=> p | The peer removing itself from participation. |
-> EventFold o p e | |
-> (EventId p, UpdateResult o p e) |
Indicate that a participant is removing itself from participating in the distributed EventFold.
Defining your state and events
class Event e where
Instances of this class define the particular "events" being "folded" over in a distributed fashion. In addition to the event type itself, there are a couple of type families which define the State into which folded events are accumulated, and the Output which application of a particular event can generate.
TL;DR: This is how users define their own custom operations.
apply :: e -> State e -> EventResult e
Apply an event to a state value. **This function MUST be total!!!**
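A minimal sketch of a user-defined event follows. So that it compiles without the package installed, it declares local stand-ins that mirror the Event class and the EventResult type as documented here; with the real package these would be imported from Data.CRDT.EventFold instead. The `Account` event type and its constructors are hypothetical.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Local stand-ins mirroring the documented class and result type.
data EventResult e
  = SystemError (Output e)
  | Pure (Output e) (State e)

class Event e where
  type Output e
  type State e
  apply :: e -> State e -> EventResult e

-- | Hypothetical event type: operations on an account balance.
data Account
  = Deposit Int
  | Withdraw Int

instance Event Account where
  -- Each event reports either a rejection reason or the new balance.
  type Output Account = Either String Int
  -- The folded state is the current balance.
  type State Account = Int

  -- 'apply' is total: every event and every balance yields a result.
  apply (Deposit n) balance =
    Pure (Right (balance + n)) (balance + n)
  apply (Withdraw n) balance
    | n > balance = Pure (Left "insufficient funds") balance
    | otherwise = Pure (Right (balance - n)) (balance - n)
```

Note that a rejected withdrawal is modelled as an ordinary Pure result that leaves the state unchanged, not as a SystemError, because the rejection is well defined business logic that every participant computes identically.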
data EventResult e
The result of applying an event.
Morally speaking, events are always pure functions. However, mundane issues like finite memory constraints and finite execution time can cause referentially opaque behavior. In a normal Haskell program, this usually leads to a crash or an exception, and the crash or exception can itself, in a way, be thought of as being referentially transparent, because there is no way for it to both happen and, simultaneously, not happen.
However, in our case we are replicating computations across many different pieces of hardware, so there most definitely is a way for these aberrant system failures to both happen and not happen simultaneously. What happens if the computation of the event runs out of memory on one machine, but not on another?
There exists a strategy for dealing with these problems: if the computation of an event experiences a failure on every participant, then the event is pushed into the infimum as a failure (i.e. a no-op), but if any single participant successfully computes the event then all other participants can (somehow) request a "Full Merge" from the successful participant. The Full Merge will include the infimum value computed by the successful participant, which will include the successful application of the problematic event. The error participants can thus bypass computation of the problem event altogether, and can simply overwrite their infimum with the infimum provided by the Full Merge.
Doing a full merge can be much more expensive than doing a simple Diff merge, because it requires transmitting the full value of the EventFold instead of just the outstanding operations.
This type represents how computation of the event finished: either with a pure result, or with some kind of system error.
In general SystemError is probably only ever useful when your event type somehow executes untrusted code (for instance when your event type is a Turing-complete DSL that allows users to submit their own custom-programmed "events") and you want to limit the resources that can be consumed by such user-generated code. It is much less useful when you are encoding some well defined business logic directly in Haskell.
SystemError (Output e) | |
Pure (Output e) (State e) |
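As a rough sketch of the resource-limiting use case, the event below refuses to run more than a fixed number of steps and reports a SystemError instead. The class and result type are again local stand-ins mirroring the ones documented here, and the `Iterate` event and the `stepBudget` limit are assumptions made up for this example.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Local stand-ins mirroring the documented class and result type.
data EventResult e
  = SystemError (Output e)
  | Pure (Output e) (State e)

class Event e where
  type Output e
  type State e
  apply :: e -> State e -> EventResult e

-- | Hypothetical user-submitted event: double the state n times.
newtype Iterate = Iterate Int

-- | Assumed resource limit for this sketch.
stepBudget :: Int
stepBudget = 1000

instance Event Iterate where
  type Output Iterate = String
  type State Iterate = Integer

  apply (Iterate n) s
    -- Refuse work that exceeds the budget instead of running it.
    | n > stepBudget = SystemError "step budget exceeded"
    -- Negative counts are clamped to zero so 'apply' stays total.
    | otherwise = Pure "ok" (iterate (* 2) s !! max 0 n)
```

Because this budget check is deterministic, it would fail identically on every participant, which corresponds to the case described above where the event is pushed into the infimum as a failure.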
Inspecting the EventFold
isBlockedOnError :: EventFold o p e -> Bool
Return True if progress on the EventFold is blocked on a SystemError.
The implication here is that if the local copy is blocked on a SystemError, it needs to somehow arrange for remote copies to send full EventFolds, not just Diffs. A diffMerge is not sufficient to get past the block. Only a fullMerge will suffice.
If your system is not using SystemError or else not using Diffs, then you don't ever need to worry about this function.
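The decision described above can be sketched as a small pure function. The `SyncRequest` type and `chooseSync` are hypothetical names for this illustration; the Bool argument is assumed to be the result of calling isBlockedOnError on the local EventFold.

```haskell
-- | What the local replica asks its peers to send (hypothetical type).
data SyncRequest
  = RequestDiff  -- a Diff is enough, to be applied with diffMerge
  | RequestFull  -- a full EventFold is needed, to be applied with fullMerge
  deriving (Eq, Show)

-- | Choose what to request, given whether the local copy is blocked
-- on a SystemError (assumed to come from 'isBlockedOnError').
chooseSync :: Bool -> SyncRequest
chooseSync blocked
  | blocked = RequestFull
  | otherwise = RequestDiff
```

How the request actually reaches a peer is up to the application, since the library provides no transport.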
projectedValue :: Event e => EventFold o p e -> State e
Return the current projected value of the EventFold.
infimumValue :: EventFold o p e -> State e
Return the current infimum value of the EventFold.
infimumParticipants :: EventFold o p e -> Set p
Gets the known participants at the infimum.
allParticipants :: Ord p => EventFold o p e -> Set p
Get all known participants. This includes participants that are projected for removal.
projParticipants :: Ord p => EventFold o p e -> Set p
Get all the projected participants. This does not include participants that are projected for removal.
divergent :: forall o p e. Ord p => EventFold o p e -> Map p (EventId p)
Returns the participants that we think might be diverging. In this context, a participant is "diverging" if there is an event that the participant has not acknowledged but we are expecting it to acknowledge. Along with the participant, return the last known EventId which that participant has acknowledged.
Underlying Types
data EventFold o p e
This type is a CRDT into which participants can add Events that are folded into a base State. You can also think of the "events" as operations that mutate the base state, and the point of this CRDT is to coordinate the application of the operations across all participants so that they are applied consistently even if the operations themselves are not commutative, idempotent, or monotonic.
Variables are:
- o - Origin
- p - Participant
- e - Event
The Origin is a value that is more or less meant to identify the "thing" being replicated, and in particular identify the historical lineage of the EventFold. The idea is that it is meaningless to try and merge two EventFolds that do not share a common history (identified by the origin value) and doing so is a programming error. It is only used to try and check for this type of programming error and throw an exception if it happens instead of producing undefined (and difficult to detect) behavior.
data EventId p
EventId is a monotonically increasing, totally ordered identification value which allows us to lend the attribute of monotonicity to event application operations which would not naturally be monotonic.
Instances
Eq p => Eq (EventId p) | |
Ord p => Ord (EventId p) | |
Defined in Data.CRDT.EventFold | |
Show p => Show (EventId p) | |
Generic (EventId p) | |
Binary p => Binary (EventId p) | |
Default (EventId p) | |
Defined in Data.CRDT.EventFold | |
type Rep (EventId p) | |
Defined in Data.CRDT.EventFold type Rep (EventId p) = D1 ('MetaData "EventId" "Data.CRDT.EventFold" "crdt-event-fold-1.1.0.0-inplace" 'False) (C1 ('MetaCons "BottomEid" 'PrefixI 'False) (U1 :: Type -> Type) :+: C1 ('MetaCons "Eid" 'PrefixI 'False) (S1 ('MetaSel ('Nothing :: Maybe Symbol) 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedLazy) (Rec0 Word256) :*: S1 ('MetaSel ('Nothing :: Maybe Symbol) 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedLazy) (Rec0 p))) |
data Diff o p e
A package containing events that can be merged into an event fold.