TCache-0.9: A Transactional cache with user-defined persistence

Data.TCache

Description

TCache is a transactional cache with configurable persistence that permits STM transactions with objects that synchronize synchronously or asynchronously with their user-defined storages. Default persistence in files is provided for testing purposes.

In this release some features have been removed without losing functionality. Dynamic interfaces are no longer needed since TCache can handle heterogeneous data. The new features in this release, besides the backward-compatible ones, are:

TCache now implements DBRefs. They are persistent STM references with a traditional readDBRef, writeDBRef Haskell interface, similar to TVars but with added persistence. Additionally, because DBRefs are serializable, they can be embedded in serializable registers. Because they are references, they can point to other serializable registers. This permits persistent mutable inter-object relations.

Triggers are user-defined hooks that are called back on register updates. They can be used to:

  • ease the work of keeping inter-object relations up to date
  • permit higher-level, customizable accesses

Data.TCache.IndexQuery implements a straightforward, pure Haskell, type-safe query language based on register field relations. This module must be imported separately. See Data.TCache.IndexQuery for further information.

The file persistence is now more reliable, and the embedded IO reads inside STM transactions are safe.

To ease the implementation of other user-defined persistence, Data.TCache.FilePersistence must be imported for deriving file persistence instances.

Inherited from Control.Concurrent.STM

atomically :: STM a -> IO a

Perform a series of STM actions atomically.

You cannot use atomically inside an unsafePerformIO or unsafeInterleaveIO. Any attempt to do so will result in a runtime error. (Reason: allowing this would effectively allow a transaction inside a transaction, depending on exactly when the thunk is evaluated.)

However, see newTVarIO, which can be called inside unsafePerformIO, and which allows top-level TVars to be allocated.
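As a minimal sketch of what atomically provides, the following uses only the stm package (no TCache). The transfer function and the account TVars are hypothetical names chosen for illustration.

```haskell
import Control.Concurrent.STM

-- Move an amount between two accounts in a single transaction.
-- Both writes become visible together, or not at all.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
  balance <- readTVar from
  -- 'check' retries the transaction until the condition holds.
  check (balance >= amount)
  writeTVar from (balance - amount)
  modifyTVar' to (+ amount)

main :: IO ()
main = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  atomically (transfer a b 30)
  balances <- atomically ((,) <$> readTVar a <*> readTVar b)
  print balances
```

Because transfer stays in STM, it can be combined with other STM actions before being run with a single atomically.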

data STM a

A monad supporting atomic memory transactions.

Instances

Monad STM 
Functor STM 
Typeable1 STM 
MonadPlus STM 
(Typeable reg, IResource reg) => Select (reg -> a) (STM [DBRef reg]) (STM [a]) 
(Typeable reg, IResource reg, Typeable reg', IResource reg', Select (reg -> a) (STM [DBRef reg]) (STM [a]), Select (reg' -> b) (STM [DBRef reg']) (STM [b])) => Select (reg -> a, reg' -> b) (STM (JoinData reg reg')) (STM [([a], [b])]) 
(Typeable reg, IResource reg, Select (reg -> a) (STM [DBRef reg]) (STM [a]), Select (reg -> b) (STM [DBRef reg]) (STM [b])) => Select (reg -> a, reg -> b) (STM [DBRef reg]) (STM [(a, b)]) 
(Typeable reg, IResource reg, Select (reg -> a) (STM [DBRef reg]) (STM [a]), Select (reg -> b) (STM [DBRef reg]) (STM [b]), Select (reg -> c) (STM [DBRef reg]) (STM [c])) => Select (reg -> a, reg -> b, reg -> c) (STM [DBRef reg]) (STM [(a, b, c)]) 
(Typeable reg, IResource reg, Select (reg -> a) (STM [DBRef reg]) (STM [a]), Select (reg -> b) (STM [DBRef reg]) (STM [b]), Select (reg -> c) (STM [DBRef reg]) (STM [c]), Select (reg -> d) (STM [DBRef reg]) (STM [d])) => Select (reg -> a, reg -> b, reg -> c, reg -> d) (STM [DBRef reg]) (STM [(a, b, c, d)]) 

Operations with cached database references

DBRefs are persistent cached database references in the STM monad with read/write primitives, so the traditional syntax of Haskell STM references can be used for interfacing with databases. As expected, the DBRefs are transactional, because they operate in the STM monad.

DBRefs are references to cached database objects. A DBRef is associated with its referred object and its key. Since DBRefs are serializable, they can be elements of mutable objects. They could point to other mutable objects and so on, so DBRefs can act as hardwired relations from mutable objects to other mutable objects in the database/cache. Their referred objects are loaded, saved and flushed to and from the cache automatically, depending on the cache handling policies and the access needs.

DBRefs are uniquely identified by the keys of the objects they point to, so they can be compared, ordered and so on. The creation of a DBRef through getDBRef is pure. This permits efficient lazy marshalling of registers with references, such as indexes that are queried for some fields but not others.

Example: Car registers have references to Person registers

data Person= Person {pname :: String} deriving  (Show, Read, Eq, Typeable)
data Car= Car{owner :: DBRef Person , cname:: String} deriving (Show, Read, Eq, Typeable)

Here the Car register points to the Person register through the owner field.

To permit persistence and to be referred to by DBRefs, define the Indexable instance for these two register types:

instance Indexable Person where key Person{pname= n} = "Person " ++ n
instance Indexable Car where key Car{cname= n} = "Car " ++ n

Now we create a DBRef to a Person whose name is "Bruce"

>>> let bruce =   getDBRef . key $ Person "Bruce" :: DBRef Person
>>> show bruce
>"DBRef \"Person Bruce\""
>>> atomically (readDBRef bruce)
>Nothing

getDBRef is pure and creates the reference, but not the referred object. To create both the object and its DBRef, use newDBRef. Let's create two Cars and their two Car DBRefs with bruce as owner:

>>> cars <- atomically $  mapM newDBRef [Car bruce "Bat Mobile", Car bruce "Porsche"]
>>> print cars
>[DBRef "Car Bat Mobile",DBRef "Car Porsche"]
>>> carRegs<- atomically $ mapM readDBRef cars
>>> print carRegs
> [Just (Car {owner = DBRef "Person Bruce", cname = "Bat Mobile"})
> ,Just (Car {owner = DBRef "Person Bruce", cname = "Porsche"})]

Try to write with writeDBRef:

>>> atomically . writeDBRef bruce $ Person "Other"
>*** Exception: writeDBRef: law of key conservation broken: old , new= Person Bruce , Person Other

DBRefs cannot be written with objects of different keys:

>>> atomically . writeDBRef bruce $ Person "Bruce"
>>> let Just carReg1= head carRegs

Now, from the Car register, it is possible to recover the owner's register:

>>> atomically $ readDBRef ( owner carReg1)
>Just (Person {pname = "Bruce"})

Once the pointed-to object has been looked up in the cache and found at creation, a DBRef does not perform any further cache lookups. Reads and writes from/to DBRefs are therefore faster than *Resource(s) calls, which perform a cache lookup every time.

DBRefs and *Resource(s) primitives are completely interoperable. The latter operate implicitly with DBRefs.

data DBRef a

Instances

Typeable1 DBRef 
Eq (DBRef a) 
Ord (DBRef a) 
(IResource a, Typeable a) => Read (DBRef a) 
Show (DBRef a) 
SetOperations [DBRef a] [DBRef a] [DBRef a] 
SetOperations [DBRef a] (JoinData a a') (JoinData a a') 
Queriable reg a => RelationOps (reg -> a) a [DBRef reg] 
(Typeable reg, IResource reg) => Select (reg -> a) (STM [DBRef reg]) (STM [a]) 
(Typeable reg, IResource reg, Typeable reg', IResource reg', Select (reg -> a) (STM [DBRef reg]) (STM [a]), Select (reg' -> b) (STM [DBRef reg']) (STM [b])) => Select (reg -> a, reg' -> b) (STM (JoinData reg reg')) (STM [([a], [b])]) 
(Typeable reg, IResource reg, Select (reg -> a) (STM [DBRef reg]) (STM [a]), Select (reg -> b) (STM [DBRef reg]) (STM [b])) => Select (reg -> a, reg -> b) (STM [DBRef reg]) (STM [(a, b)]) 
SetOperations (JoinData a a') [DBRef a'] (JoinData a a') 
SetOperations (JoinData a a') [DBRef a] (JoinData a a') 
(Queriable reg a, Queriable reg' a) => RelationOps (reg -> a) (reg' -> a) (JoinData reg reg') 
(Typeable reg, IResource reg, Select (reg -> a) (STM [DBRef reg]) (STM [a]), Select (reg -> b) (STM [DBRef reg]) (STM [b]), Select (reg -> c) (STM [DBRef reg]) (STM [c])) => Select (reg -> a, reg -> b, reg -> c) (STM [DBRef reg]) (STM [(a, b, c)]) 
(Typeable reg, IResource reg, Select (reg -> a) (STM [DBRef reg]) (STM [a]), Select (reg -> b) (STM [DBRef reg]) (STM [b]), Select (reg -> c) (STM [DBRef reg]) (STM [c]), Select (reg -> d) (STM [DBRef reg]) (STM [d])) => Select (reg -> a, reg -> b, reg -> c, reg -> d) (STM [DBRef reg]) (STM [(a, b, c, d)]) 

getDBRef :: (Typeable a, IResource a) => String -> DBRef a

Get the reference to the object in the cache. If it does not exist, the reference is created empty. Every execution of getDBRef returns the same unique reference to this key, so it can be safely considered pure. This property is useful because deserialization of objects with unused embedded DBRefs does not need to marshal them eagerly. This also avoids unnecessary cache lookups of the pointed-to objects.

keyObjDBRef :: DBRef a -> String

Return the key of the object pointed to by the DBRef.

newDBRef :: (IResource a, Typeable a) => a -> STM (DBRef a)

Create the object passed as parameter (if it does not exist) and return its reference in the STM monad. If an object with the same key already exists, it is returned as is. If not, the reference is created with the new value. If you want to update in any case, use getDBRef and writeDBRef combined.

readDBRef :: (IResource a, Typeable a) => DBRef a -> STM (Maybe a)

Return the reference value. If it is not in the cache, it is fetched from the database.

writeDBRef :: (IResource a, Typeable a) => DBRef a -> a -> STM ()

Write a value in the reference. The new key must be the same as the key of the previously stored object; otherwise, a "law of key conservation broken" error will be raised.
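The key conservation rule can be illustrated with a small model built on plain TVars. This is only a sketch, not TCache's implementation: Ref, writeRef and personKey are hypothetical names, and the violation is reported as a Left value here instead of raising an error.

```haskell
import Control.Concurrent.STM

-- Hypothetical stand-in for a DBRef: a fixed key plus a mutable slot.
data Ref a = Ref { refKey :: String, refSlot :: TVar (Maybe a) }

newtype Person = Person { pname :: String } deriving (Show, Eq)

personKey :: Person -> String
personKey p = "Person " ++ pname p

-- Write only if the new value maps to the key the reference was created with.
writeRef :: (a -> String) -> Ref a -> a -> STM (Either String ())
writeRef keyOf ref x
  | keyOf x /= refKey ref =
      return (Left ("key conservation broken: " ++ refKey ref ++ " vs " ++ keyOf x))
  | otherwise = Right <$> writeTVar (refSlot ref) (Just x)

main :: IO ()
main = do
  slot <- newTVarIO Nothing
  let bruce = Ref "Person Bruce" slot
  r1 <- atomically (writeRef personKey bruce (Person "Other"))
  r2 <- atomically (writeRef personKey bruce (Person "Bruce"))
  print r1
  print r2
  readTVarIO slot >>= print
```

Rejecting the write keeps the reference and the stored object in agreement about the key, which is what lets a reference be compared and ordered by key alone.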

delDBRef :: (IResource a, Typeable a) => DBRef a -> STM ()

Delete the content of the DBRef from the cache and from permanent storage.

IResource class

Cached objects must be instances of IResource. Such instances can be implicitly derived through auxiliary classes for file persistence.

class IResource a where

An IResource instance must be defined for every object being cached. There is a set of implicit IResource instances available through utility classes (see below).

Methods

keyResource

Arguments

:: a 
-> String

must be defined

readResourceByKey :: String -> IO (Maybe a)

readResourceByKey implements the database access and the marshalling of the object. While the database access must be strict, the marshalling must be lazy if, as is often the case, some parts of the object are not really accessed. Moreover, if the object contains DBRefs, this avoids unnecessary cache lookups. This method is called inside atomically blocks and thus may be interrupted. Since STM transactions retry, readResourceByKey may be called twice in unusual situations, so it must be idempotent, not only in its result but also in its effect on the database.

writeResource :: a -> IO ()

The write operation in persistent storage. It must be strict. Since STM transactions may retry, writeResource must be idempotent, not only in its result but also in its effect on the database. All the new objects are written to the database on synchronization, so writeResource must not autocommit. Commit code must be located in the postcondition (see setConditions).
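To make the idempotence requirement concrete, here is a minimal sketch that contrasts a write keyed by the object's key with an append-only write. The in-memory Store and both function names are assumptions for illustration and do not come from TCache.

```haskell
import Data.IORef
import qualified Data.Map.Strict as Map

-- Hypothetical in-memory stand-in for a database table.
type Store = IORef (Map.Map String String)

-- Idempotent: storing the same key/value twice leaves the store unchanged.
writeByKey :: Store -> String -> String -> IO ()
writeByKey store k v = modifyIORef' store (Map.insert k v)

-- Not idempotent: every repeated call adds another row.
appendRow :: IORef [String] -> String -> IO ()
appendRow rows v = modifyIORef' rows (v :)

main :: IO ()
main = do
  store <- newIORef Map.empty
  rows  <- newIORef []
  -- Simulate the same write being executed twice after a retry.
  mapM_ (\_ -> writeByKey store "Car Porsche" "owner=Bruce") [1, 2 :: Int]
  mapM_ (\_ -> appendRow rows "Car Porsche owner=Bruce") [1, 2 :: Int]
  Map.size <$> readIORef store >>= print
  length <$> readIORef rows >>= print
```

A keyed upsert tolerates being repeated, whereas an append or an auto-increment insert would leave duplicate rows behind.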

delResource :: a -> IO ()

Called synchronously. It must autocommit.

Operations with cached objects

Operations with DBRef's can be performed implicitly with the "traditional" TCache operations available in older versions.

In this example "buy" is a transaction where the user buys an item. The amount spent is increased and the stock of the product is decreased:

data  Data=   User{uname::String, uid::String, spent:: Int} |
              Item{iname::String, iid::String, price::Int, stock::Int}
              deriving (Read, Show, Typeable)  -- Typeable is required by withResources

instance Indexable Data where
        key   User{uid=id}= id
        key   Item{iid=id}= id

user `buy` item=  withResources[user,item] buyIt
 where
    buyIt[Just us,Just it]
       | stock it > 0= [us',it']
       | otherwise   = error "stock is empty for this product"
      where
       us'= us{spent=spent us + price it}
       it'= it{stock= stock it-1}
    buyIt _ = error "either the user or the item (or both) does not exist"

resources :: Resources a ()Source

Empty resources: resources= Resources [] [] ()

withSTMResources

Arguments

:: (IResource a, Typeable a) 
=> [a]

the list of resources to be retrieved

-> ([Maybe a] -> Resources a x)

The function that processes the resources found and returns a Resources structure

-> STM x

The return value in the STM monad.

This is the main function for the *Resource(s) calls. All the rest derive from it. The results are kept in the STM monad so it can be part of a larger STM transaction involving other DBRefs. The Resources register returned by the user-defined function is interpreted as follows:

  • toAdd: the content of this field will be added/updated to the cache
  • toDelete: the content of this field will be removed from the cache and from permanent storage
  • toReturn: the content of this field will be returned by withSTMResources
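The interpretation of the three fields can be sketched with a small model over a Map held in a TVar. This is not TCache's code: Res, Cache and withRes are hypothetical names, the sketch looks objects up by key strings instead of prototype objects, it has no Retry case, and it touches no persistent storage.

```haskell
import Control.Concurrent.STM
import qualified Data.Map.Strict as Map

-- Hypothetical mirror of the Resources record described above.
data Res a b = Res { toAdd :: [a], toDelete :: [a], toReturn :: b }

type Cache a = TVar (Map.Map String a)
type Item = (String, Int)  -- (name, stock); the name doubles as the key

withRes :: (a -> String) -> Cache a -> [String] -> ([Maybe a] -> Res a b) -> STM b
withRes keyOf cache keys f = do
  m <- readTVar cache
  let res    = f (map (`Map.lookup` m) keys)
      -- toAdd entries are inserted or overwritten by key
      added  = foldr (\x -> Map.insert (keyOf x) x) m (toAdd res)
      -- toDelete entries are removed by key
      pruned = foldr (Map.delete . keyOf) added (toDelete res)
  writeTVar cache pruned
  return (toReturn res)

main :: IO ()
main = do
  let initial = Map.fromList [("apple", ("apple", 3)), ("pear", ("pear", 0))]
                  :: Map.Map String Item
  cache <- newTVarIO initial
  before <- atomically $ withRes fst cache ["apple", "pear"] $ \found ->
    case found of
      [Just (_, n), Just pear] ->
        Res { toAdd = [("apple", n - 1)], toDelete = [pear], toReturn = n }
      _ -> Res [] [] 0
  print before
  readTVarIO cache >>= print . Map.toList
```

Since the whole update is a single STM action, the additions, deletions and returned value are all computed from the same snapshot of the cache.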

data Resources a b

Resources data definition used by withSTMResources

Constructors

Retry

forces a retry

Resources 

Fields

toAdd :: [a]

resources to be inserted back in the cache

toDelete :: [a]

resources to be deleted from the cache and from permanent storage

toReturn :: b

result to be returned

withResources :: (IResource a, Typeable a) => [a] -> ([Maybe a] -> [a]) -> IO ()

Atomically add or modify many objects in the cache.

withResource

Arguments

:: (IResource a, Typeable a) 
=> a

prototype of the object to be retrieved, from which keyResource can be derived

-> (Maybe a -> a)

update function that returns another full object

-> IO () 

Update of a single object in the cache.

withResource r f= withResources [r] (\[mr]-> [f mr])

getResources :: (IResource a, Typeable a) => [a] -> IO [Maybe a]

getResource :: (IResource a, Typeable a) => a -> IO (Maybe a)

Read a resource from the cache.

deleteResources :: (IResource a, Typeable a) => [a] -> IO ()

Delete the list of resources from the cache and from persistent storage.

deleteResource :: (IResource a, Typeable a) => a -> IO ()

Delete the resource from the cache and from persistent storage. deleteResource r= deleteResources [r]

Trigger operations

Triggers are called just before an object of the given type is created, modified or deleted. The called trigger function has two parameters: the DBRef being accessed (which still contains the old value) and the new value. If the content of the DBRef is being deleted, the second parameter is Nothing. If the DBRef contains Nothing, then the object is being created.

Example:

Every time a car is added or deleted, the owner's list of cars is updated. This is done by the user-defined trigger addCar:

import Data.List (delete, nub)

-- Trigger: a Just value means the car is being created or modified,
-- Nothing means it is being deleted.
addCar pcar (Just(Car powner _ )) = addToOwner powner pcar
addCar pcar Nothing  = readDBRef pcar >>= \(Just car) -> deleteOwner (owner car) pcar

addToOwner powner pcar=do
    Just owner <- readDBRef powner
    writeDBRef powner owner{cars= nub $ pcar : cars owner}

deleteOwner powner pcar= do
   Just owner <- readDBRef powner
   writeDBRef powner owner{cars= delete  pcar $ cars owner}

main= do
    addTrigger addCar
    putStrLn "create bruce's register with no cars"
    bruce <- atomically . newDBRef $ Person "Bruce" []
    putStrLn "add two car registers with \"bruce\" as owner using the reference to bruce's register"
    let newcars= [Car bruce "Bat Mobile" , Car bruce "Porsche"]
    -- add the new cars to the cache (no objects are read, so the argument is ignored)
    withResources [] (const newcars)
    Just bruceData <- atomically $ readDBRef bruce
    putStrLn "the trigger automatically updated the car references of the Bruce register"
    print . length $ cars bruceData
    print bruceData

produces:

 main
 2
 Person {pname = "Bruce", cars = [DBRef "Car Porsche",DBRef "Car Bat Mobile"]}

addTrigger :: (IResource a, Typeable a) => (DBRef a -> Maybe a -> STM ()) -> IO ()

Add a user-defined trigger to the list of triggers. Triggers are called just before an object of the given type is created, modified or deleted. The called trigger function has two parameters: the DBRef being accessed (which still contains the old value) and the new value. If the content of the DBRef is being deleted, the second parameter is Nothing. If the DBRef contains Nothing, then the object is being created.
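The calling convention described above (old value still in the reference, new value passed as a Maybe) can be modelled with plain TVars. This is a sketch under assumptions: Trigger, updateWith and classify are hypothetical names, and a TVar (Maybe a) stands in for a DBRef.

```haskell
import Control.Concurrent.STM

-- A trigger sees the slot (still holding the old value) and the new value.
type Trigger a = TVar (Maybe a) -> Maybe a -> STM ()

-- Run the trigger first, then perform the update.
updateWith :: Trigger a -> TVar (Maybe a) -> Maybe a -> STM ()
updateWith trigger slot new = do
  trigger slot new
  writeTVar slot new

-- Example trigger: record which kind of change is about to happen.
classify :: TVar [String] -> Trigger Int
classify logVar slot new = do
  old <- readTVar slot
  let event = case (old, new) of
        (Nothing, Just _)  -> "created"
        (Just _,  Just _)  -> "modified"
        (_,       Nothing) -> "deleted"
  modifyTVar' logVar (++ [event])

main :: IO ()
main = do
  logVar <- newTVarIO []
  slot   <- newTVarIO Nothing
  atomically $ mapM_ (updateWith (classify logVar) slot) [Just 1, Just 2, Nothing]
  readTVarIO logVar >>= print
```

Running the trigger inside the same transaction as the write means the trigger's own updates are committed or rolled back together with the change that caused them.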

Cache control

flushDBRef :: (IResource a, Typeable a) => DBRef a -> STM ()

Deletes the pointed-to object from the cache, but not from the database (see delDBRef). Useful for cache invalidation when the database is modified by another process.

flushAll :: STM ()

drops the entire cache.

setCache :: Cache -> IO ()

Set the cache. This is useful for hot-loaded modules that will update an existing cache. Experimental.

newCache :: IO (Ht, Integer)

newCache creates a new cache. Experimental

syncCache :: IO ()

Force the atomic write of all cached objects modified since the last save into permanent storage. Cache writes always save a coherent state.

setConditions :: IO () -> IO () -> IO ()

Establishes the procedures to call before and after saving with syncCache, clearSyncCache or clearSyncCacheProc. The postcondition of database persistence should be a commit.

clearSyncCache :: (Integer -> Integer -> Integer -> Bool) -> Int -> IO ()

Saves the unsaved elements of the cache. Cache writes always save a coherent state. Deletes some elements of the cache when the number of elements exceeds sizeObjects. The deletion depends on the check criteria; defaultCheck is the one implemented.

numElems :: IO Int

Return the total number of DBRefs in the cache. For debugging purposes. This does not count the number of objects in the cache, since many of the DBRefs may not have the pointed-to object loaded. It is O(n).

clearSyncCacheProc

Arguments

:: Int

number of seconds between checks. Objects not yet written to disk are written

-> (Integer -> Integer -> Integer -> Bool)

The user-defined check-for-cleanup-from-cache for each object. defaultCheck is an example

-> Int

The maximum number of objects in the cache; if it is exceeded, the cleanup starts

-> IO ThreadId

Identifier of the thread created

Start the thread that periodically calls clearSyncCache to clean the cache and write to persistent storage. Otherwise, syncCache must be invoked explicitly or no persistence will exist. Cache writes always save a coherent state.
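The general shape of such a background thread can be sketched with forkIO and threadDelay. This is an illustration only, not the library's implementation: periodically is a hypothetical helper, the interval is given in microseconds (threadDelay's unit) instead of seconds, and a counter stands in for the actual save.

```haskell
import Control.Concurrent (ThreadId, forkIO, killThread, threadDelay)
import Control.Concurrent.STM
import Control.Monad (forever)

-- Fork a thread that runs the action after every interval (in microseconds).
periodically :: Int -> IO () -> IO ThreadId
periodically micros action = forkIO (forever (threadDelay micros >> action))

main :: IO ()
main = do
  saves <- newTVarIO (0 :: Int)
  -- The counter increment stands in for "save and clean the cache".
  tid <- periodically 10000 (atomically (modifyTVar' saves (+ 1)))
  -- Block until the background thread has run at least three times.
  atomically (readTVar saves >>= check . (>= 3))
  killThread tid
  putStrLn "synced at least three times"
```

Returning the ThreadId, as the documented signature also does, lets the caller stop the periodic work when it is no longer needed.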

defaultCheck

Arguments

:: Integer

current time in seconds

-> Integer

last access time for a given object

-> Integer

last cache synchronization (with the persistent storage)

-> Bool

return True for all the elements not accessed since half the time between now and the last sync

This is the default cache clearance check. It forces dropping from the cache all the elements not accessed since half the time between now and the last sync. If it returns True, the object will be discarded from the cache. It is invoked when the cache size exceeds the number of objects configured in clearSyncCacheProc or clearSyncCache.
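Read literally, the rule above can be written as a small pure function. This is a sketch reconstructed from the description, not the library's source; checkSketch is a hypothetical name, and whether the boundary case counts as accessed is an assumption.

```haskell
-- True means "drop this object from the cache".
-- Arguments follow the documented order: now, last access, last sync.
checkSketch :: Integer -> Integer -> Integer -> Bool
checkSketch now lastAccess lastSync = lastAccess <= halfway
  where
    -- the midpoint between the last sync and now
    halfway = now - (now - lastSync) `div` 2

main :: IO ()
main = do
  -- last sync at t=80, now t=100, so the midpoint is t=90
  print (checkSketch 100 95 80)  -- accessed after the midpoint: kept
  print (checkSketch 100 85 80)  -- not accessed since the midpoint: dropped
```

A user-defined check with the same three-argument shape could instead apply a fixed time-to-live or ignore the last sync time entirely.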

Auxiliary file operations used for default persistence in files