monoidmap: Monoidal map type
library monoidmap
library monoidmap:internal
library monoidmap:examples
Downloads
- monoidmap-0.0.1.6.tar.gz [browse] (Cabal source package)
- Package description (as included in the package)
Maintainer's Corner
For package maintainers and hackage trustees
Candidates
- No Candidates
Versions [RSS] | 0.0.0.0, 0.0.0.1, 0.0.1.0, 0.0.1.1, 0.0.1.2, 0.0.1.3, 0.0.1.4, 0.0.1.5, 0.0.1.6 |
---|---|
Change log | CHANGELOG.md |
Dependencies | base (>=4.14.3.0 && <4.21), containers (>=0.6.5.1 && <0.8), deepseq (>=1.4.4.0 && <1.6), groups (>=0.5.3 && <0.6), monoid-subclasses (>=1.2.3 && <1.3), monoidmap, nonempty-containers (>=0.3.4.4 && <0.4), nothunks (>=0.1.3 && <0.4) [details] |
License | Apache-2.0 |
Copyright | 2022–2024 Jonathan Knowles |
Author | Jonathan Knowles |
Maintainer | mail@jonathanknowles.net |
Category | Data Structures |
Bug tracker | https://github.com/jonathanknowles/monoidmap/issues |
Source repo | head: git clone https://github.com/jonathanknowles/monoidmap |
Uploaded | by JonathanKnowles at 2024-09-06T04:30:58Z |
Distributions | LTSHaskell:0.0.1.6, Stackage:0.0.1.6 |
Downloads | 917 total (52 in the last 30 days) |
Rating | (no votes yet) [estimated by Bayesian average] |
Your Rating |
|
Status | Docs available [build log] Last success reported on 2024-09-06 [all 1 reports] |
Readme for monoidmap-0.0.1.6
[back to package description]monoidmap
Overview
This library provides a MonoidMap
type that:
- models a total function with finite support from keys to monoidal values, with a default value of
mempty
. - encodes key-value mappings with a minimal encoding that only
includes values not equal to
mempty
. - provides a comprehensive set of monoidal operations for transforming, combining, and comparing maps.
- provides a general basis for building more specialised monoidal data structures.
Relationship between keys and values
A map of type MonoidMap k v
associates every possible key of type k
with a value of type v
:
MonoidMap.get :: (Ord k, Monoid v) => k -> MonoidMap k v -> v
The empty
map associates every key k
with a default value of mempty
:
∀ k. MonoidMap.get k MonoidMap.empty == mempty
Comparison with standard Map
type
The MonoidMap
type differs from the standard containers
Map
type in how it relates keys to values:
Type | Models a total function with finite support |
---|---|
Map k v |
from keys of type k to values of type Maybe v . |
MonoidMap k v |
from keys of type k to values of type v . |
This difference can be illustrated by comparing the type signatures of operations to query a key for its value, for both types:
Map.lookup :: k -> Map k v -> Maybe v
MonoidMap.get :: Monoid v => k -> MonoidMap k v -> v
Whereas a standard Map
has a default value of Nothing
, a MonoidMap
has a default value of mempty
:
∀ k. Map.lookup k Map.empty == Nothing
∀ k. MonoidMap.get k MonoidMap.empty == mempty
In practice, the standard Map
type uses Maybe
to indicate the presence or absence of a value for a particular key. This representation is necessary because the Map
type imposes no constraints on value types.
However, monoidal types already have a natural way to represent null or empty values: the mempty
constant, which represents the neutral or identity element of a Monoid
.
Consequently, using a standard Map
with a monoidal value type gives rise to two distinct representations for null or empty values:
Map.lookup k m |
Interpretation |
---|---|
Nothing |
Map m has no entry for key k . |
Just mempty |
Map m has an entry for key k , but the value is empty. |
In constrast, the MonoidMap
type provides a single, canonical representation for null or empty values, according to the following conceptual mapping:
Map.lookup k m |
⟼ | MonoidMap.get k m |
---|---|---|
Nothing |
⟼ | mempty |
Just v | v == mempty |
⟼ | mempty |
Just v | v /= mempty |
⟼ | v |
Advantages of using a canonical representation
A canonical representation for mempty
values can make it easier to correctly implement operations that compare or combine pairs of maps.
When comparing or combining maps of the standard containers
Map
type, there are two cases to consider for each key k
in each map:
With a pair of maps, there are four possible cases to consider for each key.
For maps with monoidal values, and in contexts that assume or require a default value of mempty
, there are now three cases to consider for each map:
Map
m
associatesk
withNothing
.Map
m
associatesk
withJust v
wherev == mempty
.Map
m
associatesk
withJust v
wherev /= mempty
.
With a pair of maps, there are now nine possible cases to consider for each key.
Mishandling cases such as these can give rise to subtle bugs that manifest in unexpected places. For maps with more complex value types (such as maps that nest other maps), the number of cases requiring consideration can easily multiply further, making it even easier to introduce bugs.
Since all MonoidMap
operations provide a canonical representation for mempty
values, it's possible to write functions that compare or combine maps without having to consider Nothing
and Just mempty
as separate cases.
Encoding
A MonoidMap
only encodes mappings from keys to values that are not equal to mempty
.
The total function \(T\) modelled by a MonoidMap
is encoded as a support map \(S\), where \(S\) is the finite subset of key-value mappings in \(T\) for which values are not equal to mempty
(denoted by \(\varnothing\)):
$S = \{ (k, v) \in T \ |\ v \ne \varnothing \} $
Automatic minimisation
All MonoidMap
operations perform automatic minimisation of the support map, so that mempty
values do not appear in:
Constraints on values
MonoidMap
operations require the monoidal value type to be an instance of MonoidNull
.
Instances of MonoidNull
must provide a null
indicator function that satisfies the following law:
null v == (v == mempty)
MonoidMap
operations use the null
indicator function to detect and exclude mempty
values from the support map.
Note that it is not generally necessary for the value type to be an instance of Eq
.
The set of monoidal types that admit a
MonoidNull
instance is strictly larger than the set of monoidal types that admit anEq
instance.For any type
v
that is an instance of bothEq
andMonoid
, it is always possible to define aMonoidNull
instance:instance MonoidNull v where null = (== mempty)
However, there are monoidal types for which it is possible to define a
MonoidNull
instance, but not practical (or possible) to define a lawfulEq
instance.For example, consider the following type:
Maybe (String -> Sum Natural)
Requiring a
MonoidNull
constraint instead of anEq
constraint allowsMonoidMap
to be usable with a greater range of monoidal value types.
Examples of automatic minimisation
mempty
Consider the following operation, which constructs a map of type
MonoidMap Int String
:>>> m0 = fromList [(1, "hello"), (2, "brave"), (3, "new"), (4, "world")] >>> m0 fromList [(1, "hello"), (2, "brave"), (3, "new"), (4, "world")]
The
Monoid
instance forString
definesmempty
to be the emptyString
""
.If we update the map to associate key
3
with value""
, that association will no longer appear when encoding the map:>>> m1 = MonoidMap.set 3 "" m0 >>> m1 fromList [(1, "hello"), (2, "brave"), (4, "world")]
However, we can still read the updated value for key
3
:>>> MonoidMap.get 3 m1 ""
mempty
values
Consider the following operation, which constructs a map of type
MonoidMap Char (Sum Natural)
:>>> m = fromList [('a', Sum 0), ('b', Sum 1), ('c', Sum 2), ('d', Sum 3)]
The
Monoid
instance forSum Natural
definesmempty
to beSum 0
.The original list contained a mapping from key
'a'
to valueSum 0
, but that association will not appear when encoding the map:>>> m fromList [('b', Sum 1), ('c', Sum 2), ('d', Sum 3)]
Nevertheless, we can still read the value for key
'a'
:>>> MonoidMap.get 'a' m Sum 0
mempty
values
Consider the following operations, which construct two maps of type
MonoidMap Char (Sum Natural)
:>>> m1 = fromList [('a', Sum 1), ('b', Sum 1 )] >>> m2 = fromList [('a', Sum 1), ('b', Sum (-1))]
The
Semigroup
instance forSum Natural
defines<>
as equivalent to ordinary addition.If we add both maps together with
<>
, then each key in the resulting map will be associated with the result of applying<>
to each matching pair of values in the original maps. However, adding together the values for key'b'
with<>
producesSum 0
, so this value will not appear in the encoding:>>> m1 <> m2 fromList [('a', Sum 2)]
Nevertheless, we can still read the value for key
'b'
:>>> MonoidMap.get 'b' (m1 <> m2) Sum 0
Advantages of automatic minimisation
Consistency
Automatic exclusion of mempty
values can help to ensure consistency when encoding to or decoding from other formats such as JSON, CBOR, or YAML.
For example, you may wish to ensure that:
- When encoding a map, no
mempty
values appear in the encoded result. - When decoding a map, no
mempty
values appear in the decoded result.
Performance
Automatic exclusion of mempty
values makes it possible to perform certain operations in constant time, rather than in linear time, as it is never necessary to traverse the entire map in order to determine which values may or may not be mempty
:
Operation | With Automatic Minimisation |
Without Automatic Minimisation |
---|---|---|
null |
$O(1)$ | $O(n)$ |
nonNull |
$O(1)$ | $O(n)$ |
nonNullCount |
$O(1)$ | $O(n)$ |
toMap |
$O(1)$ | $O(n)$ |
Memory usage
Automatic minimisation makes it easier to reason about the memory usage of a MonoidMap
, as memory is not required to encode mappings from keys to empty values.
This is a useful property for large, long-lived map structures that are subject to multiple updates over their lifetimes, where it's important to not retain large numbers of mappings from keys to empty values.
Simplicity
Some total map data types only perform minimisation when explicitly demanded to.
For example, the TMap
data type provides a trim
operation that traverses the map and removes any values that are equal to the default value. This approach has some advantages, such the ability to provide a lawful Functor
instance.
However, this approach also has some disadvantages:
- It might not be obvious when it's necessary to call
trim
. For example, consider the following operation:m1 <> m2
. Could this operation produce a map where some keys map to default values? (Answer: it depends on the choice of default value and the underlying value type.) - Calling
trim
when it isn't necessary might adversely affect performance, sincetrim
must traverse the entire data structure. - Not calling
trim
when it is necessary might affect correctness. The compiler will not help here, as trimmed and untrimmed maps share the same type. - Even if
trim
is a semantic no-op, default values can still be made visible by operations that encode maps to other types.
Since all MonoidMap
operations perform automatic minimisation when appropriate, it's not necessary for users to reason about when or whether it's necessary to "trim" the map.
Furthermore, for nested maps such as MonoidMap k1 (MonoidMap k2 v)
, automatic minimisation of inner maps enables seamless automatic minimisation of outer maps. See the NestedMonoidMap
type for an example of this.
Limitations of automatic minimisation
The MonoidMap
type has no Functor
instance, as the requirement to exclude mempty
values means that the map
operation must remove mempty
values from its result. Therefore, map
does not unconditionally satisfy the functor composition law:
map (f . g) == map f . map g
Consider the following MonoidMap
m
:
m :: MonoidMap String String
m = singleton "k" "v"
And the following functions f
and g
:
f :: a -> String
f = const "z"
g :: Monoid a => b -> a
g = const mempty
By substituting the above definitions into the left-hand side of the functor composition law, we obtain:
map (f . g) m = map (const "z" . const mempty) (singleton "k" "v")
= map (const "z" ) (singleton "k" "v")
= (singleton "k" "z")
By substituting the above definitions into the right-hand side of the functor composition law, we obtain:
map f (map g m) = map (const "z") (map (const mempty) (singleton "k" "v"))
= map (const "z") mempty
= mempty
This leads to the following inequality between the left-hand side and right-hand side:
singleton "k" "z" /= mempty
Therefore, for this example, the functor composition law is not satisfied.
However, if applying function f
to mempty
produces mempty
, the functor composition law is satisfied:
(f mempty == mempty) ==> (∀ g. map (f . g) == map f . map g)
Monoidal operations
The MonoidMap
type provides a comprehensive set of monoidal operations for transforming, combining, and comparing maps.
Instances for several subclasses of Semigroup
and Monoid
are provided, including classes from the following libraries:
At the root of this hierarchy of subclasses is the Semigroup
class, whose instance for MonoidMap
is defined in terms of the underlying value type, so that applying <>
to a pair of maps is equivalent to applying <>
to all pairs of values for matching keys:
∀ k. MonoidMap.get k (m1 <> m2) == MonoidMap.get k m1 <> get k m2
In general, operations for subclasses of Semigroup
and Monoid
are defined analogously to the Semigroup
instance, so that:
- unary operations on individual maps are defined in terms of their distributive application to all values.
- binary operations on pairs of maps are defined in terms of their distributive application to all pairs of values for matching keys.
Unary monoidal operations typically satisfy a property similar to:
∀ k. MonoidMap.get k (f m) == f (MonoidMap.get k m)
Binary monoidal operations typically satisfy a property similar to:
∀ k. MonoidMap.get k (f m1 m2) == f (MonoidMap.get k m1) (MonoidMap.get k m2)
Defining monoidal operations in this way makes it possible to transform, combine, and compare maps in ways that are consistent with the algebraic properties of the underlying monoidal value type.
Examples of monoidal operations and their properties
MonoidMap operation |
Required class constraint |
Equivalent class method |
Distributive relationship |
---|---|---|---|
append |
Semigroup |
(<>) |
∀ k. get k (m1 <> m2) ≡ get k m1 <> get k m2 |
minus |
Group |
(~~) |
∀ k. get k (m1 ~~ m2) ≡ get k m1 ~~ get k m2 |
monus |
Monus |
(<\>) |
∀ k. get k (m1 <\> m2) ≡ get k m1 <\> get k m2 |
intersection |
GCDMonoid |
gcd |
∀ k. get k (m1 `gcd` m2) ≡ get k m1 `gcd` get k m2 |
union |
LCMMonoid |
lcm |
∀ k. get k (m1 `lcm` m2) ≡ get k m1 `lcm` get k m2 |
Examples of monoidal operations applied to values
MonoidMap k (Set Integer)
For maps with Set
-based values, MonoidMap.union
and MonoidMap.intersection
compute the Set.union
and Set.intersection
of each pair of matching values, respectively.
Consider the following maps of type MonoidMap Char (Set Integer)
:
>>> m1 = fromList [('a', Set.fromList [0, 1]), ('b', Set.fromList [3, 4])]
>>> m2 = fromList [('a', Set.fromList [0, 2]), ('b', Set.fromList [3, 5])]
The MonoidMap.union
of maps m1
and m2
is a map that associates every key k
with the Set.union
of the corresponding sets for k
in m1
and m2
:
>>> m1 `union` m2
fromList [('a', Set.fromList [0,1,2]), ('b', Set.fromList [3,4,5])]
The MonoidMap.intersection
of maps m1
and m2
is a map that associates every key k
with the Set.intersection
of the corresponding sets for k
in m1
and m2
:
>>> m1 `intersection` m2
fromList [('a', Set.fromList [0]), ('b', Set.fromList [3])]
MonoidMap k (Sum Integer)
Consider the following maps of type MonoidMap Char (Sum Integer)
:
>>> m1 = fromList [('a', Sum 10), ('b', Sum 20), ('c, Sum 40)]
>>> m2 = fromList [('a', Sum 40), ('b', Sum 20), ('c, Sum 10)]
The MonoidMap.invert
operation produces a new map where every key is associated with the negation of its value in the original map:
>>> invert m1
fromList [('a', Sum (-10)), ('b', Sum (-20)), ('c, Sum (-40))]
>>> invert m2
fromList [('a', Sum (-40)), ('b', Sum (-20)), ('c, Sum (-10))]
The MonoidMap.minus
operation, when applied to maps m1
and m2
, produces a new map where every key k
is associated with the value of k
in m1
minus the value of k
in m2
:
>>> m1 `minus` m2
fromList [('a', Sum (-30)), ('c', Sum 30)]
>>> m2 `minus` m1
fromList [('a', Sum 30), ('c', Sum (-30))]
MonoidMap k (Sum Natural)
For maps with Sum Natural
values, MonoidMap.union
and MonoidMap.intersection
compute the maximum and minimum of each pair of matching values, respectively:
>>> m1 = fromList [('a', Sum 10), ('b', Sum 20)]
>>> m2 = fromList [('a', Sum 20), ('b', Sum 10)]
>>> m1 `union` m2
fromList [('a', Sum 20), ('b', Sum 20)]
>>> m1 `intersection` m2
fromList [('a', Sum 10), ('b', Sum 10)]
MonoidMap k (Product Natural)
For maps with Product Natural
values, MonoidMap.union
and MonoidMap.intersection
compute the lowest common multiple (LCM) and greatest common divisor (GCD) of each pair of matching values, respectively:
>>> m1 = fromList [('a', Product 6), ('b', Product 15), ('c', Product 35)]
>>> m2 = fromList [('a', Product 15), ('b', Product 35), ('c', Product 77)]
>>> m1 `union` m2
fromList [('a', Product 30), ('b', Product 105), ('c', Product 385)]
>>> m1 `intersection` m2
fromList [('a', Product 3), ('b', Product 5), ('c', Product 7)]
General basis for more specialised map types
The MonoidMap
type can be used as a general basis for building other more specialised map types.
If you have a Map
-based data type with an invariant that values must not be mempty
, then by expressing this type in terms of MonoidMap
, MonoidMap
will handle the invariant for you:
- newtype SomeMap k v = SomeMap ( Map k (SomeMonoidalContainer v))
+ newtype SomeMap k v = SomeMap (MonoidMap k (SomeMonoidalContainer v))
If you're already using a specialised non-empty container type to enforce the invariant that values must not be empty, then MonoidMap
makes it possible to replace the use of the specialised non-empty container type with its ordinary equivalent:
Example transformations:
-- Non-empty lists:
- newtype ListMap k v = ListMap ( Map k (NonEmpty v))
+ newtype ListMap k v = ListMap (MonoidMap k [v])
-- Non-empty sets:
- newtype SetMap k v = SetMap ( Map k (NonEmptySet v))
+ newtype SetMap k v = SetMap (MonoidMap k (Set v))
-- Non-empty sequences:
- newtype SeqMap k v = SeqMap ( Map k (NonEmptySeq v))
+ newtype SeqMap k v = SeqMap (MonoidMap k (Seq v))
Using MonoidMap
can simplify the implementation of such types, as special handling code for empty values can often be greatly simplified or even eliminated.
Real-world examples from the Haskell ecosystem
Example: SignedMultiSet
(a signed multiset type)
The
signed-multiset
library provides theSignedMultiSet
type, which is internally defined as aMap
from elements to signed integer occurrence counts:newtype SignedMultiset a = SMS {unSMS :: Map a Int}
All
SignedMultiSet
operations maintain an invariant that the internalMap
must not contain any mappings to0
(zero). This requiresSignedMultiSet
functions to detect and eliminate values of0
.For example, the
insertMany
operation:insertMany :: Ord a => a -> Int -> SignedMultiset a -> SignedMultiset a insertMany x n = SMS . Map.alter f x . unSMS where f Nothing = Just n f (Just m) = let k = m + n in if k == 0 then Nothing else Just k
Let's redefine
SignedMultiSet
in terms ofMonoidMap
:- newtype SignedMultiset a = SMS {unSMS :: Map a Int } + newtype SignedMultiset a = SMS {unSMS :: MonoidMap a (Sum Int)}
Here we've used the
Sum
wrapper type, whoseMonoid
instance definesmempty
asSum 0
, and<>
as ordinary addition.Now we can redefine
insertMany
(and similar operations) in a simpler way:insertMany :: Ord a => a -> Int -> SignedMultiset a -> SignedMultiset a + insertMany x n = SMS . MonoidMap.adjust (+ Sum n) x . unSMS - insertMany x n = SMS . Map.alter f x . unSMS - where - f Nothing = Just n - f (Just m) = let k = m + n in if k == 0 then Nothing else Just k
Since the
MonoidMap.adjust
operation performs automatic minimisation, values ofSum 0
are automatically excluded from the internal data structure, and there is no need to handle them differently from non-zero values.
Example: SetMultiMap
(a set-based multimap type)
The
multi-containers
library provides theSetMultiMap
type, which is internally defined as aMap
from keys to (possibly-empty) sets of values, together with aSize
parameter that records the total number of elements in the map (counting duplicates):newtype SetMultimap k a = SetMultimap (Map k (Set a), Size) type Size = Int
All
SetMultiMap
operations maintain an invariant that the internalMap
must not contain any mappings to empty sets. This requiresSetMultiMap
functions to detect and eliminate values ofSet.empty
(indicated by theSet.null
function).For example, the
alterWithKey
operation detects if the updated set is empty, and if so, performs a deletion instead of an insertion:alterWithKey :: Ord k => (k -> Set a -> Set a) -> k -> SetMultimap k a -> SetMultimap k a alterWithKey f k mm@(SetMultimap (m, _)) | Set.null as = fromMap (Map.delete k m) | otherwise = fromMap (Map.insert k as m) where as = f k (mm ! k) fromMap :: Map k (Set a) -> SetMultimap k a fromMap m = SetMultimap (m', sum (fmap Set.size m')) where m' = Map.filter (not . Set.null) m
Let's redefine
SetMultiMap
in terms ofMonoidMap
:- newtype SetMultimap k a = SetMultimap ( Map k (Set a), Size) + newtype SetMultimap k a = SetMultimap (MonoidMap k (Set a), Size)
Now we can provide a simpler definition for
alterWithKey
(and other operations):alterWithKey :: Ord k => (k -> Set a -> Set a) -> k -> SetMultimap k a -> SetMultimap k a alterWithKey f k (SetMultimap (m, size)) = SetMultiMap (MonoidMap.set k new m, size - Set.size old + Set.size new) where old = MonoidMap.get k m new = f k old
Since the
MonoidMap.set
operation performs automatic minimisation, empty sets are automatically excluded from the internal data structure, and there is no need to handle them any differently from non-empty sets.
Example: MultiMap
(a list-based multimap type)
The
multi-containers
library provides theMultiMap
type, which is internally defined as aMap
from keys to non-empty lists of values, together with aSize
parameter that records the total number of elements in the map (counting duplicates):newtype Multimap k a = Multimap (Map k (NonEmpty a), Size) type Size = Int
All
MultiMap
operations maintain the invariant that the internalMap
must not contain any mappings to empty lists. This invariant is handled rather nicely by the use of theNonEmpty
list type, which disallows empty lists by construction. As a result, it's arguably more difficult to make a mistake in the implementation than it would be ifMultiMap
were defined in terms of ordinary lists.However, certain operations still need to differentiate between the empty and non-empty case, and it's still necessary to handle each case specially.
For example, the
alterWithKey
operation detects if the updated list is empty, and if so, performs a deletion instead of an insertion:alterWithKey :: Ord k => (k -> [a] -> [a]) -> k -> Multimap k a -> Multimap k a alterWithKey f k mm@(Multimap (m, _)) = case nonEmpty (f k (mm ! k)) of Just as' -> fromMap (Map.insert k as' m) Nothing -> fromMap (Map.delete k m) fromMap :: Map k (NonEmpty a) -> Multimap k a fromMap m = Multimap (m, sum (fmap length m))
Let's redefine
MultiMap
in terms ofMonoidMap
and ordinary lists:- newtype Multimap k a = Multimap ( Map k (NonEmpty a), Size) + newtype Multimap k a = Multimap (MonoidMap k [a], Size)
Now we can provide a simpler definition for
alterWithKey
(and other operations):alterWithKey :: Ord k => (k -> [a] -> [a]) -> k -> Multimap k a -> Multimap k a alterWithKey f k (Multimap (m, size)) = MultiMap (MonoidMap.set k new m, size - List.length old + List.length new) where old = MonoidMap.get k m new = f k old
Since the
MonoidMap.set
operation performs automatic minimisation:
- empty lists are automatically excluded from the internal data structure.
- there is no need to use a specialised
NonEmpty
type.- there is no need to handle empty lists differently from non-empty lists.
Example: MultiAsset
(a nested map type)
The
cardano-ledger
library provides theMultiAsset
type, which models a nested mapping fromPolicyID
keys toAssetName
keys toInteger
values:newtype MultiAsset c = MultiAsset (Map (PolicyID c) (Map AssetName Integer))
Each
Integer
value represents the value of an asset on the Cardano blockchain, where each asset is uniquely identified by the combination of aPolicyID
and anAssetName
. (Multiple assets can share the samePolicyID
.)All
MultiAsset
operations maintain a dual invariant that:
- there must be no mappings from
PolicyID
keys to empty maps; and that- there must be no mappings from
AssetName
keys toInteger
values of0
.To satisfy this invariant,
MultiAsset
operations use a variety of helper functions to ensure thatMultiAsset
values are always in a canonical form.For example, consider the
Semigroup
instance forMultiAsset
:instance Semigroup (MultiAsset c) where MultiAsset m1 <> MultiAsset m2 = MultiAsset (canonicalMapUnion (canonicalMapUnion (+)) m1 m2)
The above definition of
<>
performs pointwise addition of all pairs of values for matching assets.For example, if:
MultiAsset
m1
maps asseta
to a value of10
;MultiAsset
m2
maps asseta
to a value of20
;Then:
MultiAsset
m1 <> m2
will map asseta
to a value of30
.The definition of
<>
uses a function calledcanonicalMapUnion
, which does the heavy lifting work of performing a union while ensuring that each resultingMap
is in canonical form.Let's have a look at the definition of
canonicalMapUnion
:canonicalMapUnion :: (Ord k, CanonicalZero a) => (a -> a -> a) -> Map k a -> Map k a -> Map k a canonicalMapUnion _f t1 Tip = t1 canonicalMapUnion f t1 (Bin _ k x Tip Tip) = canonicalInsert f k x t1 canonicalMapUnion f (Bin _ k x Tip Tip) t2 = canonicalInsert f k x t2 canonicalMapUnion _f Tip t2 = t2 canonicalMapUnion f (Bin _ k1 x1 l1 r1) t2 = case Map.splitLookup k1 t2 of (l2, mb, r2) -> case mb of Nothing -> if x1 == zeroC then link2 l1l2 r1r2 else link k1 x1 l1l2 r1r2 Just x2 -> if new == zeroC then link2 l1l2 r1r2 else link k1 new l1l2 r1r2 where new = f x1 x2 where !l1l2 = canonicalMapUnion f l1 l2 !r1r2 = canonicalMapUnion f r1 r2
The
canonicalMapUnion
function in turn relies oncanonicalInsert
, which handles individual insertions:canonicalInsert :: (Ord k, CanonicalZero a) => (a -> a -> a) -> k -> a -> Map k a -> Map k a canonicalInsert f !kx x = go where go Tip = if x == zeroC then Tip else Map.singleton kx x go (Bin sy ky y l r) = case compare kx ky of LT -> link ky y (go l) r GT -> link ky y l (go r) EQ -> if new == zeroC then link2 l r else Bin sy kx new l r where new = f y x
Similarly, the
insertMultiAsset
function, which "inserts" the value of an individual asset into aMultiAsset
value, has the following definition:insertMultiAsset :: (Integer -> Integer -> Integer) -> PolicyID c -> AssetName -> Integer -> MultiAsset c -> MultiAsset c insertMultiAsset combine pid aid new (MultiAsset m1) = case Map.splitLookup pid m1 of (l1, Just m2, l2) -> case Map.splitLookup aid m2 of (v1, Just old, v2) -> if n == 0 then let m3 = link2 v1 v2 in if Map.null m3 then MultiAsset (link2 l1 l2) else MultiAsset (link pid m3 l1 l2) else MultiAsset (link pid (link aid n v1 v2) l1 l2) where n = combine old new (_, Nothing, _) -> MultiAsset ( link pid ( if new == 0 then m2 else Map.insert aid new m2 ) l1 l2 ) (l1, Nothing, l2) -> MultiAsset ( if new == 0 then link2 l1 l2 else link pid (Map.singleton aid new) l1 l2 )
A notable feature of all the above functions is that they completely eschew the use of
Map.merge
. Instead, they directly manipulate constructors exported fromMap.Internal
. This approach was probably taken for performance reasons.However, it's clear that maintaining the invariant in this way comes at a cost: the code is rather complex, and it were not for a comprehensive test suite, it would probably be very easy to introduce a regression.
In the spirit of demonstration, let's see what happens if we redefine the
MultiAsset
type in terms ofMonoidMap
:- newtype MultiAsset c = MultiAsset (Map (PolicyID c) ( Map AssetName Integer)) + newtype MultiAsset c = MultiAsset (MonoidMap (PolicyID c) (MonoidMap AssetName (Sum Integer))
Note that we have replaced
Integer
withSum Integer
, whoseMonoid
instance definesmempty
asSum 0
, and whoseSemigroup
instance defines<>
as equivalent to ordinary integer addition.Recall that all
MonoidMap
operations automatically take care of the invariant that values cannot bemempty
. For theMultiAsset
type, this means that:
- outer maps are now prevented from including any mappings from
PolicyID
to empty inner maps.- inner maps are now prevented from including any mappings from
AssetName
to values ofSum 0
.As a result, we can remove virtually all code that deals with canonicalisation.
For example, we can now simplify the
Semigroup
instance forMultiAsset
, dispensing entirely with the need to callcanonicalMapUnion
:instance Semigroup (MultiAsset c) where MultiAsset m1 <> MultiAsset m2 = - MultiAsset (canonicalMapUnion (canonicalMapUnion (+)) m1 m2) + m1 <> m2
Given that the above instance is trivial, we can even derive the
Semigroup
andMonoid
instances automatically:newtype MultiAsset c = MultiAsset (MonoidMap (PolicyID c) (MonoidMap AssetName (Sum Integer)) + deriving newtype (Semigroup, Monoid)
We can also simplify the
insertMultiAsset
function:insertMultiAsset :: (Integer -> Integer -> Integer) -> PolicyID c -> AssetName -> Integer -> MultiAsset c -> MultiAsset c insertMultiAsset combine pid aid new (MultiAsset m1) = + MultiAsset $ + MonoidMap.adjust + (MonoidMap.adjust (\(M.Sum old) -> M.Sum (combine old new)) aid) pid m1 - ... - ### 27 lines deleted ### - ...
Finally, since
MonoidMap
already providesEq
andGroup
instances that are defined in terms of the underlying monoidal value type, we can automatically deriveEq
andGroup
instances forMultiAsset
:newtype MultiAsset c = MultiAsset (MonoidMap (PolicyID c) (MonoidMap AssetName (Sum Integer)) - deriving newtype (Semigroup, Monoid) + deriving newtype (Eq, Semigroup, Monoid, Group) - instance Eq (MultiAsset c) where - MultiAsset x == MultiAsset y = pointWise (pointWise (==)) x y - - instance Group (MultiAsset c) where - invert (MultiAsset m) = - MultiAsset (canonicalMap (canonicalMap ((-1 :: Integer) *)) m)
Many other simplifications are also possible. (Left as an exercise for readers!)
Comparison with other generalised map types
The Haskell ecosystem has several different types for maps with monoidal properties, and several different types that model total functions from keys to values. Each type comes with its own set of advantages and limitations.
Here's a comparison between the MonoidMap
type provided by this library and types provided by other libraries:
Type |
Features | Class Instances | |||||
---|---|---|---|---|---|---|---|
Models total functions |
Performs automatic minimisation |
Eq
|
Monoid subclasses |
Group
|
Functor
|
Applicative
|
|
monoidmap
MonoidMap
(this library) |
:heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: |
monoid‑map
MonoidMap
|
:x: | :heavy_check_mark: | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: | :x: |
monoidal‑containers
MonoidalMap
|
:x: | :x: | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: | :x: |
total‑map
TMap
|
:heavy_check_mark: | :x: | :x: | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: |
total‑maps
TotalSparseMap
|
:heavy_check_mark: | :x: | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: |
defaultable‑map
DefaultableMap
|
:x: | :x: | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: |
chatter
DefaultMap
|
:heavy_check_mark: | :x: | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: | :x: |
stack
MonoidMap
|
:x: | :x: | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: | :x: |