traildb-0.1.0.2: TrailDB bindings for Haskell

Safe HaskellNone
LanguageHaskell2010

System.TrailDB

Contents

Description

Haskell bindings to trailDB library.

Minimal program that lists a TrailDB:

  import qualified Data.ByteString as B
  import System.TrailDB

  main :: IO ()
  main = do
    tdb <- openTrailDB "wikipedia-history-small.tdb"
    forEachTrailID tdb $ tid -> print =<< getTrailBytestring tdb tid

Example program that reads a TrailDB using low-level (faster) cursor API:

  import qualified Data.ByteString as B
  import qualified Data.Vector.Unboxed as V
  import System.TrailDB

  main :: IO ()
  main = do
    tdb <- openTrailDB "some-trail-db"
    number_of_trails <- getNumTrails tdb

    let arbitrarily_chosen_trail_id = 12345 `mod` number_of_trails

    cursor <- makeCursor tdb
    setCursor cursor arbitrarily_chosen_trail_id

    -- Read the first event in the arbitrary chosen trail
    crumb <- stepCursor cursor
    case crumb of
      Nothing -> putStrLn "Cannot find this particular trail."
      Just (timestamp, features) ->
        V.forM_ features $ feature -> do
          field_name <- getFieldName tdb (feature^.field)
          putStr "Field: "
          B.putStr field_name
          putStr " contains value "
          value <- getValue tdb feature
          B.putStrLn value

Another example program that writes a TrailDB:

  {-# LANGUAGE OverloadedStrings #-}

  import System.TrailDB

  main :: IO ()
  main = do
    cons <- newTrailDBCons "some-trail-db" (["currency", "order_amount", "item"] :: [String])
    addTrail cons ("aaaaaaaaaaaaaa00")   -- UUIDs are 16 bytes in length
                  1457049455             -- This is timestamp
                  ["USD", "10.14", "Bacon & Cheese" :: String]
    addTrail cons ("aaaaaaaaaaaaaa00")   -- Same UUID as above, same customer ordered more
                  1457051221
                  ["USD", "8.90", "Avocado Sandwich" :: String]
    addTrail cons ("aaaaaaaaaaaaaa02")
                  1457031239
                  ["JPY", "2900", "Sun Lotion" :: String]
    closeTrailDBCons cons

    -- TrailDB has been written to 'some-trail-db'

Synopsis

Constructing new TrailDBs

newTrailDBCons :: (FieldNameLike a, MonadIO m) => FilePath -> [a] -> m TdbCons Source

Create a new TrailDB and return TrailDB construction handle.

Close it with closeTrailDBCons. Garbage collector will close it eventually if you didn't do it yourself. You won't be receiving FinalizationFailure exception though if that fails when using the garbage collector.

closeTrailDBCons :: MonadIO m => TdbCons -> m () Source

Close a TdbCons

Does nothing if it's closed already.

withTrailDBCons :: (FieldNameLike a, MonadIO m, MonadMask m) => FilePath -> [a] -> (TdbCons -> m b) -> m b Source

Runs an IO action with an opened TdbCons. The TdbCons is closed after the action has been executed.

addTrail :: (MonadIO m, ToTdbRow r) => TdbCons -> UUID -> UnixTime -> r -> m () Source

Add a cookie with timestamp and values to TdbCons.

appendTdbToTdbCons :: MonadIO m => Tdb -> TdbCons -> m () Source

Appends a Tdb to an open TdbCons.

finalizeTrailDBCons :: MonadIO m => TdbCons -> m () Source

Finalizes a TdbCons.

You usually don't need to call this manually because this is called automatically by closeTrailDBCons before actually closing TdbCons.

Supplying values to construction

class ToTdbRow r where Source

Class of things that can be turned into rows and added with addTrail.

The native type inside traildb is the ByteString. The use of this typeclass can eliminate some noise when converting values to ByteString.

Methods

toTdbRow :: r -> [ByteString] Source

class ToTdbRowField f where Source

Class of things that can be turned into a field in a TrailDB.

data TdbConsRow a b Source

Convenience type that lets you arbitrarily make heterogenous list of things that implement ToTdbRowField and subsequently ToTdbRow. Use this if plain lists are not suitable (because they are monotyped) or your tuples are too long to implement ToTdbRow.

Constructors

(:.) a b infixr 7 

Instances

newtype TdbShowable a Source

Convenience newtype to put things that you can Show in TrailDB. It implements ToTdbRowField.

Constructors

TdbShowable a 

pattern TShow :: t -> TdbShowable t Source

Short-cut pattern synonym for TdbShowable

Opening existing TrailDBs

openTrailDB :: MonadIO m => FilePath -> m Tdb Source

Opens an existing TrailDB.

It can open file TrailDBs and also directory TrailDBs.

In case of files you can use format "traildb.tdb" or just "traildb".

closeTrailDB :: MonadIO m => Tdb -> m () Source

Closes a TrailDB.

Does nothing if Tdb is already closed.

dontneedTrailDB :: MonadIO m => Tdb -> m () Source

Hints that Tdb will not be accessed in near future.

Internally may invoke system call 'madvise' behind the scenes to operating system.

This has no effect on semantics, only performance.

willneedTrailDB :: MonadIO m => Tdb -> m () Source

Hints that Tdb will be walked over in near future.

Internally may invoke system call 'madvise' behind the scenes to operating system.

This has no effect on semantics, only performance.

withTrailDB :: (MonadIO m, MonadMask m) => FilePath -> (Tdb -> m a) -> m a Source

Opens a Tdb and then closes it after action is over.

Accessing TrailDBs

High-level, slow, access

class FromTrail a where Source

Class of things that can be result from getTrail.

Methods

fromBytestringList :: [(UnixTime, [(ByteString, ByteString)])] -> a Source

Makes a result from a list of trails.

One crumb of a trail has timestamp and associative list of fields.

    [(timestamp, [(fieldname1, value1), (fieldname2, value2), ...]
    , ... ]

Instances

FromTrail [[ByteString]] Source

Throws away timestamps and field names.

FromTrail [(UnixTime, [(ByteString, ByteString)])] Source 
FromTrail [Map ByteString ByteString] Source

Throws away timestamps.

FromTrail (Set ByteString) Source

Set of all values that appear in the trail.

FromTrail (Vector (Map ByteString ByteString)) Source

Vector version.

getTrail :: (FromTrail a, MonadIO m) => Tdb -> TrailID -> m a Source

Convenience function that returns a full trail in human-readable format.

This is quite a bit slower than using a cursor (makeCursor, setCursor, stepCursor) but is simpler. If you don't need to go through bazillions of data really fast you might want to use this one.

See FromTrail for things that can be taken as a result.

getTrailBytestring :: MonadIO m => Tdb -> TrailID -> m [(UnixTime, [(ByteString, ByteString)])] Source

Same as getTrail but not polymorphic in output.

Meant to be used for very quick throw-away programs where you don't want to spell out the type to be used (e.g. if you all you do is print the trail).

Lowerish-level, fast, access

makeCursor :: MonadIO m => Tdb -> m Cursor Source

Creates a cursor to a trailDB.

stepCursor :: MonadIO m => Cursor -> m (Maybe Crumb) Source

Steps cursor forward in its trail.

Returns Nothing if there are no more crumbs in the trail.

setCursor :: MonadIO m => Cursor -> TrailID -> m () Source

Puts cursor at the start of some trail.

Iterating over TrailDB

forEachTrailID :: (Applicative m, MonadIO m) => Tdb -> (TrailID -> m ()) -> m () Source

Convenience function that runs a function for each TrailID in TrailDB.

forEachTrailIDUUID :: (Applicative m, MonadIO m) => Tdb -> (TrailID -> UUID -> m ()) -> m () Source

Same as forEachTrailID but passes UUID as well.

traverseEachTrailID :: (Applicative m, MonadIO m) => (TrailID -> m ()) -> Tdb -> m () Source

Same as forEachTrailID but arguments flipped.

traverseEachTrailIDUUID :: (Applicative m, MonadIO m) => (TrailID -> UUID -> m ()) -> Tdb -> m () Source

Same as traverseEachTrailID but passes UUID as well.

foldTrailDB :: MonadIO m => (a -> TrailID -> m a) -> a -> Tdb -> m a Source

Fold TrailDB for each TrailID.

This is like traverseEachTrailID but lets you carry a folding value.

foldTrailDBUUID :: MonadIO m => (a -> TrailID -> UUID -> m a) -> a -> Tdb -> m a Source

Same as foldTrailDB but passes UUID as well.

Basic querying

getNumTrails :: MonadIO m => Tdb -> m Word64 Source

Returns the number of cookies in Tdb

getNumEvents :: MonadIO m => Tdb -> m Word64 Source

Returns the number of events in Tdb

getNumFields :: MonadIO m => Tdb -> m Word64 Source

Returns the number of fields in Tdb

getMinTimestamp :: MonadIO m => Tdb -> m UnixTime Source

Returns the minimum timestamp in Tdb

getMaxTimestamp :: MonadIO m => Tdb -> m UnixTime Source

Returns the maximum timestamp in Tdb

UUID handling

getUUID :: MonadIO m => Tdb -> TrailID -> m UUID Source

Finds a uuid by trail ID.

getTrailID :: MonadIO m => Tdb -> UUID -> m TrailID Source

Finds a trail ID by uuid.

Fields

getFieldName :: MonadIO m => Tdb -> FieldID -> m FieldName Source

Given a field ID, returns its human-readable field name.

getFieldID :: (FieldNameLike a, MonadIO m) => Tdb -> a -> m FieldID Source

Given a field name, returns its FieldID.

getItemByField :: (FieldNameLike a, MonadIO m) => Tdb -> a -> ByteString -> m Feature Source

Same as getItem but uses a resolved field name rather than raw FieldID.

This is implemented in terms of getFieldID and getItem inside.

getValue :: MonadIO m => Tdb -> Feature -> m ByteString Source

Given a Feature, returns a string that describes it.

Values in a TrailDB are integers which need to be mapped back to strings to be human-readable.

getItem :: MonadIO m => Tdb -> FieldID -> ByteString -> m Feature Source

Given a field ID and a human-readable value, turn it into Feature for that field ID.

Time handling

dayToUnixTime :: Day -> UnixTime Source

Converts Day to UnixTime.

The time will be the first second of the Day.

C interop

withRawTdb :: MonadIO m => Tdb -> (Ptr TdbRaw -> IO a) -> m a Source

Run an action with a raw pointer to Tdb.

The Tdb is guaranteed not to be garbage collected while the given action is running.

getRawTdb :: MonadIO m => Tdb -> m (Ptr TdbRaw) Source

Returns the raw pointer to a TrailDB.

You can pass this pointer to C code and use the C API of TrailDB to use it.

You become responsible for ensuring Haskell doesn't clean up and close the managed Tdb handle. You can use touchTdb or withRawTdb to deal with this.

touchTdb :: MonadIO m => Tdb -> m () Source

Touch a Tdb.

Ensures that Tdb has not been garbage collected at the point touchTdb is invoked. Has no other effect.

withRawTdbCons :: MonadIO m => TdbCons -> (Ptr TdbConsRaw -> IO a) -> m a Source

Run an action with a raw pointer to TdbCons.

The TdbCons is guaranteed not to be garbage collected while the given action is running.

getRawTdbCons :: MonadIO m => TdbCons -> m (Ptr TdbConsRaw) Source

Returns the raw pointer to a TrailDB construction handle.

Just as with getRawTdb, this pointer can be passed to C and used with the TrailDB C API.

Use touchTdbCons or withRawTdbCons to ensure the pointer is not garbage collected while you are using it.

touchTdbCons :: MonadIO m => TdbCons -> m () Source

Touch a TdbCons.

Ensures that TdbCons has not been garbage collected at the point touchTdbCons is invoked. Has no other effect.

data TdbRaw Source

Represents the raw TrailDB handle as used in C.

data TdbConsRaw Source

Represents the raw TrailDB construction as used in C.

Taking apart Feature

(^.) :: s -> Getting a s a -> a

Data types

type UUID = ByteString Source

UUIDs should be 16 bytes in size. It can be converted to TrailID within a traildb.

type TrailID = Word64 Source

TrailID indexes a trail in a traildb. It can be converted to and back to UUID within a traildb.

type FieldID = TdbField Source

FieldID is indexes a field number.

type Crumb = (UnixTime, Vector Feature) Source

A single crumb is some event at certain time.

The vector always has length as told by getNumFields.

data Feature Source

Feature is a value in traildb. getValue can turn it into a human-readable value within a traildb.

Instances

Eq Feature Source 
Data Feature Source 
Ord Feature Source 
Read Feature Source 
Show Feature Source 
Generic Feature Source 
Storable Feature Source 
Unbox Feature Source

Feature is isomorphic to Word64 so it's safe to coerce between them. (see featureWord).

Vector Vector Feature Source 
MVector MVector Feature Source 
type Rep Feature Source 
data Vector Feature = V_Feature (Vector Word64) Source 
data MVector s Feature = VM_Feature (MVector s Word64) Source 

type TdbVersion = Word64 Source

TrailDB version

type FieldName = ByteString Source

Fields names are bytestring and can contain nulls.

class FieldNameLike a where Source

Class of things that can be used as a field name.

The strict bytestring is the native type. Other types are converted, encoding with UTF-8.

data Tdb Source

Instances

Time

type UnixTime = Word64 Source

The type of time used in traildbs.

getUnixTime :: MonadIO m => m UnixTime Source

A helper function to get the current unix time.

May be useful when building TrailDBs if you don't have timestamps already.

Exceptions

data TrailDBException Source

Exceptions that may happen with TrailDBs.

Some programming errors may throw with error instead.

Constructors

CannotAllocateTrailDBCons

Failed to allocate TdbCons.

CannotAllocateTrailDB

Failed to allocate Tdb.

TrailDBError !CInt String

Errors reported by error code from TrailDB C library. includes numerical error and human-readable error.

NoSuchTrailID

A TrailID was used that doesn't exist in Tdb.

NoSuchUUID

A UUID was used that doesn't exist in Tdb.

NoSuchFieldID

A FieldID was used that doesn't exist in Tdb.

NoSuchValue

A Feature was used that doesn't contain a valid value.

NoSuchFeature

Attempted to find Feature for human readable name that doesn't exist.

FinalizationFailure

For some reason, finalizing a TdbCons failed.

Multiple TrailDBs

Operations in this section are conveniences that may be useful if you have many TrailDB directories you want to query.

findTrailDBs Source

Arguments

:: (MonadIO m, MonadMask m) 
=> FilePath 
-> Bool

Follow symbolic links?

-> m [FilePath] 

Given a directory, find all valid TrailDB paths inside it, recursively.

filterTrailDBDirectories :: (MonadIO m, MonadMask m) => [FilePath] -> m [FilePath] Source

Given a list of directories, filters it, returning only directories that are valid TrailDB directories.

Used internally by findTrailDBs but can be useful in general so we export it.