ixset-1.1.1: Efficient relational queries on Haskell sets.

Data.IxSet

Description

An efficient implementation of queryable sets.

Assume you have a type like:

data Entry = Entry Author [Author] Updated Id Content
newtype Updated = Updated EpochTime
newtype Id = Id Int64
newtype Content = Content String
newtype Author = Author Email
type Email = String
1. Decide what parts of your type you want indexed and make your type an instance of Indexable. Use ixFun and ixGen to build indexes:
instance Indexable Entry where
empty = ixSet
[ ixGen (Proxy :: Proxy Author)        -- out of order
, ixGen (Proxy :: Proxy Id)
, ixGen (Proxy :: Proxy Updated)
, ixGen (Proxy :: Proxy Test)          -- bogus index
]
1. Use insert, delete, updateIx, deleteIx and empty to build up an IxSet collection:
entries = foldr insert empty [e1,e2,e3,e4]
entries' = foldr delete entries [e1,e3]
entries'' = update e4 e5 entries
1. Use the query functions below to grab data from it:
entries @= (Author "john@doe.com") @< (Updated t1)

Statement above will find all items in entries updated earlier than t1 by john@doe.com.

1. Text index

If you want to do add a text index create a calculated index. Then if you want all entries with either word1 or word2, you change the instance to:

getWords (Entry _ _ _ _ (Content s)) = map Word $words s instance Indexable Entry where empty = ixSet [ ... , ixFun getWords ] Now you can do this query to find entries with any of the words: entries @+ [Word "word1", Word "word2"] And if you want all entries with both: entries @* [Word "word1", Word "word2"] 1. Find only the first author If an Entry has multiple authors and you want to be able to query on the first author only, define a FirstAuthor datatype and create an index with this type. Now you can do: newtype FirstAuthor = FirstAuthor Email getFirstAuthor (Entry author _ _ _ _) = [FirstAuthor author] instance Indexable Entry where ... empty = ixSet [ ... , ixFun getFirstAuthor ] entries @= (FirstAuthor "john@doe.com") -- guess what this does Synopsis # Set type data IxSet a Source # Set with associated indexes. Instances  (Data ctx a, Data ctx [a], Sat (ctx (IxSet a)), Sat (ctx [a]), Typeable IxSet, Indexable a, Data a, Ord a) => Data ctx (IxSet a) Source # Instance detailsDefined in Data.IxSet Methodsgfoldl :: Proxy ctx -> (forall b c. Data ctx b => w (b -> c) -> b -> w c) -> (forall g. g -> w g) -> IxSet a -> w (IxSet a) #gunfold :: Proxy ctx -> (forall b r. Data ctx b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c (IxSet a) #toConstr :: Proxy ctx -> IxSet a -> Constr #dataTypeOf :: Proxy ctx -> IxSet a -> DataType #dataCast1 :: Typeable t => Proxy ctx -> (forall b. Data ctx b => w (t b)) -> Maybe (w (IxSet a)) #dataCast2 :: Typeable t => Proxy ctx -> (forall b c. (Data ctx b, Data ctx c) => w (t b c)) -> Maybe (w (IxSet a)) # (Eq a, Ord a, Typeable a) => Eq (IxSet a) Source # Instance detailsDefined in Data.IxSet Methods(==) :: IxSet a -> IxSet a -> Bool #(/=) :: IxSet a -> IxSet a -> Bool # Data a => Data (IxSet a) Source # Instance detailsDefined in Data.IxSet Methodsgfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> IxSet a -> c (IxSet a) #gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c (IxSet a) #toConstr :: IxSet a -> Constr #dataTypeOf :: IxSet a -> DataType #dataCast1 :: Typeable t => (forall d. Data d => c (t d)) -> Maybe (c (IxSet a)) #dataCast2 :: Typeable t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c (IxSet a)) #gmapT :: (forall b. Data b => b -> b) -> IxSet a -> IxSet a #gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> IxSet a -> r #gmapQr :: (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> IxSet a -> r #gmapQ :: (forall d. Data d => d -> u) -> IxSet a -> [u] #gmapQi :: Int -> (forall d. Data d => d -> u) -> IxSet a -> u #gmapM :: Monad m => (forall d. Data d => d -> m d) -> IxSet a -> m (IxSet a) #gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> IxSet a -> m (IxSet a) #gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> IxSet a -> m (IxSet a) # (Eq a, Ord a, Typeable a) => Ord (IxSet a) Source # Instance detailsDefined in Data.IxSet Methodscompare :: IxSet a -> IxSet a -> Ordering #(<) :: IxSet a -> IxSet a -> Bool #(<=) :: IxSet a -> IxSet a -> Bool #(>) :: IxSet a -> IxSet a -> Bool #(>=) :: IxSet a -> IxSet a -> Bool #max :: IxSet a -> IxSet a -> IxSet a #min :: IxSet a -> IxSet a -> IxSet a # (Ord a, Read a, Typeable a, Indexable a) => Read (IxSet a) Source # Instance detailsDefined in Data.IxSet MethodsreadsPrec :: Int -> ReadS (IxSet a) #readList :: ReadS [IxSet a] # (Ord a, Show a) => Show (IxSet a) Source # Instance detailsDefined in Data.IxSet MethodsshowsPrec :: Int -> IxSet a -> ShowS #show :: IxSet a -> String #showList :: [IxSet a] -> ShowS # (Indexable a, Typeable a, Ord a) => Semigroup (IxSet a) Source # Instance detailsDefined in Data.IxSet Methods(<>) :: IxSet a -> IxSet a -> IxSet a #sconcat :: NonEmpty (IxSet a) -> IxSet a #stimes :: Integral b => b -> IxSet a -> IxSet a # (Indexable a, Typeable a, Ord a) => Monoid (IxSet a) Source # Instance detailsDefined in Data.IxSet Methodsmempty :: IxSet a #mappend :: IxSet a -> IxSet a -> IxSet a #mconcat :: [IxSet a] -> IxSet a # (SafeCopy a, Ord a, Typeable a, Indexable a) => SafeCopy (IxSet a) Source # Instance detailsDefined in Data.IxSet Methodsversion :: Version (IxSet a) #kind :: Kind (IxSet a) #getCopy :: Contained (Get (IxSet a)) #errorTypeName :: Proxy (IxSet a) -> String # class Indexable a where Source # Defines objects that can be members of IxSet. Minimal complete definition empty Methods Defines what an empty IxSet for this particular type should look like. It should have all necessary indexes. Use the ixSet function to create the set and fill it in with ixFun and ixGen. data Proxy a Source # Constructors  Proxy noCalcs :: t -> () Source # Function to be used for calcs in inferIxSet when you don't want any calculated values. inferIxSet :: String -> Name -> Name -> [Name] -> Q [Dec] Source # Template Haskell helper function for automatically building an Indexable instance from a data type, e.g. data Foo = Foo Int String and $(inferIxSet "FooDB" ''Foo 'noCalcs [''Int,''String])

will build a type synonym

type FooDB = IxSet Foo

with Int and String as indexes.

WARNING: The type specified as the first index must be a type which appears in all values in the IxSet or toList, toSet and serialization will not function properly. You will be warned not to do this with a runtime error. You can always use the element type itself. For example:

\$(inferIxSet "FooDB" ''Foo 'noCalcs [''Foo, ''Int, ''String])

ixSet :: [Ix a] -> IxSet a Source #

Create an IxSet using a list of indexes. Typically used to create the empty method for an Indexable instance.

The list elements are generally created by using the ixFun and ixGen helper functions.

instance Indexable Type where
empty = ixSet [ ...
, ixFun getIndex1
, ixGen (Proxy :: Proxy Index2Type)
]

Every value in the IxSet must be reachable by the first index in this list, or you'll get a runtime error.

ixFun :: forall a b. (Ord b, Typeable b) => (a -> [b]) -> Ix a Source #

Create a functional index. Provided function should return a list of indexes where the value should be found.

getIndexes value = [...indexes...]
instance Indexable Type where
empty = ixSet [ ixFun getIndexes ]

This is the recommended way to create indexes.

ixGen :: forall a b. (Data a, Ord b, Typeable b) => Proxy b -> Ix a Source #

Create a generic index. Provided example is used only as type source so you may use a Proxy. This uses flatten to traverse values using their Data instances.

instance Indexable Type where
empty = ixSet [ ixGen (Proxy :: Proxy Type) ]

In production systems consider using ixFun in place of ixGen as the former one is much faster.

# Changes to set

type IndexOp = forall k a. (Ord k, Ord a) => k -> a -> Map k (Set a) -> Map k (Set a) Source #

change :: (Typeable a, Indexable a, Ord a) => IndexOp -> a -> IxSet a -> IxSet a Source #

Higher order operator for modifying IxSets. Use this when your final function should have the form a -> IxSet a -> IxSet a, e.g. insert or delete.

insert :: (Typeable a, Ord a, Indexable a) => a -> IxSet a -> IxSet a Source #

Inserts an item into the IxSet. If your data happens to have a primary key this function might not be what you want. See updateIx.

delete :: (Typeable a, Ord a, Indexable a) => a -> IxSet a -> IxSet a Source #

Removes an item from the IxSet.

updateIx :: (Indexable a, Ord a, Typeable a, Typeable k) => k -> a -> IxSet a -> IxSet a Source #

Will replace the item with index k. Only works if there is at most one item with that index in the IxSet. Will not change IxSet if you have more then 1 item with given index.

deleteIx :: (Indexable a, Ord a, Typeable a, Typeable k) => k -> IxSet a -> IxSet a Source #

Will delete the item with index k. Only works if there is at most one item with that index in the IxSet. Will not change IxSet if you have more then 1 item with given index.

# Creation

fromSet :: (Indexable a, Ord a, Typeable a) => Set a -> IxSet a Source #

Converts a Set to an IxSet.

fromList :: (Indexable a, Ord a, Typeable a) => [a] -> IxSet a Source #

Converts a list to an IxSet.

# Conversion

toSet :: Ord a => IxSet a -> Set a Source #

Converts an IxSet to a Set of its elements.

toList :: Ord a => IxSet a -> [a] Source #

Converts an IxSet to its list of elements.

toAscList :: forall k a. (Indexable a, Typeable a, Typeable k) => Proxy k -> IxSet a -> [a] Source #

Converts an IxSet to its list of elements.

List will be sorted in ascending order by the index k.

The list may contain duplicate entries if a single value produces multiple keys.

toDescList :: forall k a. (Indexable a, Typeable a, Typeable k) => Proxy k -> IxSet a -> [a] Source #

Converts an IxSet to its list of elements.

List will be sorted in descending order by the index k.

The list may contain duplicate entries if a single value produces multiple keys.

getOne :: Ord a => IxSet a -> Maybe a Source #

If the IxSet is a singleton it will return the one item stored in it. If IxSet is empty or has many elements this function returns Nothing.

getOneOr :: Ord a => a -> IxSet a -> a Source #

Like getOne with a user-provided default.

# Size checking

size :: Ord a => IxSet a -> Int Source #

Returns the number of unique items in the IxSet.

null :: IxSet a -> Bool Source #

Return True if the IxSet is empty, False otherwise.

# Set operations

(&&&) :: (Ord a, Typeable a, Indexable a) => IxSet a -> IxSet a -> IxSet a infixr 5 Source #

An infix intersection operation.

(|||) :: (Ord a, Typeable a, Indexable a) => IxSet a -> IxSet a -> IxSet a infixr 5 Source #

An infix union operation.

union :: (Ord a, Typeable a, Indexable a) => IxSet a -> IxSet a -> IxSet a Source #

Takes the union of the two IxSets.

intersection :: (Ord a, Typeable a, Indexable a) => IxSet a -> IxSet a -> IxSet a Source #

Takes the intersection of the two IxSets.

# Indexing

(@=) :: (Indexable a, Typeable a, Ord a, Typeable k) => IxSet a -> k -> IxSet a Source #

Infix version of getEQ.

(@<) :: (Indexable a, Typeable a, Ord a, Typeable k) => IxSet a -> k -> IxSet a Source #

Infix version of getLT.

(@>) :: (Indexable a, Typeable a, Ord a, Typeable k) => IxSet a -> k -> IxSet a Source #

Infix version of getGT.

(@<=) :: (Indexable a, Typeable a, Ord a, Typeable k) => IxSet a -> k -> IxSet a Source #

Infix version of getLTE.

(@>=) :: (Indexable a, Typeable a, Ord a, Typeable k) => IxSet a -> k -> IxSet a Source #

Infix version of getGTE.

(@><) :: (Indexable a, Typeable a, Ord a, Typeable k) => IxSet a -> (k, k) -> IxSet a Source #

Returns the subset with indexes in the open interval (k,k).

(@>=<) :: (Indexable a, Typeable a, Ord a, Typeable k) => IxSet a -> (k, k) -> IxSet a Source #

Returns the subset with indexes in [k,k).

(@><=) :: (Indexable a, Typeable a, Ord a, Typeable k) => IxSet a -> (k, k) -> IxSet a Source #

Returns the subset with indexes in (k,k].

(@>=<=) :: (Indexable a, Typeable a, Ord a, Typeable k) => IxSet a -> (k, k) -> IxSet a Source #

Returns the subset with indexes in [k,k].

(@+) :: (Indexable a, Typeable a, Ord a, Typeable k) => IxSet a -> [k] -> IxSet a Source #

Creates the subset that has an index in the provided list.

(@*) :: (Indexable a, Typeable a, Ord a, Typeable k) => IxSet a -> [k] -> IxSet a Source #

Creates the subset that matches all the provided indexes.

getOrd :: (Indexable a, Ord a, Typeable a, Typeable k) => Ordering -> k -> IxSet a -> IxSet a Source #

A function for building up selectors on IxSets. Used in the various get* functions. The set must be indexed over key type, doing otherwise results in runtime error.

getOrd2 :: (Indexable a, Ord a, Typeable a, Typeable k) => Bool -> Bool -> Bool -> k -> IxSet a -> IxSet a Source #

A function for building up selectors on IxSets. Used in the various get* functions. The set must be indexed over key type, doing otherwise results in runtime error.

getEQ :: (Indexable a, Typeable a, Ord a, Typeable k) => k -> IxSet a -> IxSet a Source #

Returns the subset with an index equal to the provided key. The set must be indexed over key type, doing otherwise results in runtime error.

getLT :: (Indexable a, Typeable a, Ord a, Typeable k) => k -> IxSet a -> IxSet a Source #

Returns the subset with an index less than the provided key. The set must be indexed over key type, doing otherwise results in runtime error.

getGT :: (Indexable a, Typeable a, Ord a, Typeable k) => k -> IxSet a -> IxSet a Source #

Returns the subset with an index greater than the provided key. The set must be indexed over key type, doing otherwise results in runtime error.

getLTE :: (Indexable a, Typeable a, Ord a, Typeable k) => k -> IxSet a -> IxSet a Source #

Returns the subset with an index less than or equal to the provided key. The set must be indexed over key type, doing otherwise results in runtime error.

getGTE :: (Indexable a, Typeable a, Ord a, Typeable k) => k -> IxSet a -> IxSet a Source #

Returns the subset with an index greater than or equal to the provided key. The set must be indexed over key type, doing otherwise results in runtime error.

getRange :: (Indexable a, Typeable k, Ord a, Typeable a) => k -> k -> IxSet a -> IxSet a Source #

Returns the subset with an index within the interval provided. The bottom of the interval is closed and the top is open, i. e. [k1;k2). The set must be indexed over key type, doing otherwise results in runtime error.

groupBy :: (Typeable k, Typeable t) => IxSet t -> [(k, [t])] Source #

Returns lists of elements paired with the indexes determined by type inference.

groupAscBy :: (Typeable k, Typeable t) => IxSet t -> [(k, [t])] Source #

Returns lists of elements paired with the indexes determined by type inference.

The resulting list will be sorted in ascending order by k. The values in '[t]' will be sorted in ascending order as well.

groupDescBy :: (Typeable k, Typeable t) => IxSet t -> [(k, [t])] Source #

Returns lists of elements paired with the indexes determined by type inference.

The resulting list will be sorted in descending order by k.

NOTE: The values in '[t]' are currently sorted in ascending order. But this may change if someone bothers to add toDescList. So do not rely on the sort order of '[t]'.

# Index creation helpers

flatten :: (Typeable a, Data a, Typeable b) => a -> [b] Source #

Generically traverses the argument to find all occurences of values of type b and returns them as a list.

This function properly handles String as String not as [Char].

flattenWithCalcs :: (Data c, Typeable a, Data a, Typeable b) => (a -> c) -> a -> [b] Source #

Generically traverses the argument and calculated values to find all occurences of values of type b and returns them as a list. Equivalent to:

flatten (x,calcs x)

This function properly handles String as String not as [Char].

# Debugging and optimization

stats :: Ord a => IxSet a -> (Int, Int, Int, Int) Source #

Statistics about IxSet. This function returns quadruple consisting of 1. total number of elements in the set 2. number of declared indexes 3. number of keys in all indexes 4. number of values in all keys in all indexes. This can aid you in debugging and optimisation.