Safe Haskell | None |
---|---|
Language | Haskell98 |
- data Bucket = Bucket {}
- openBucket :: FilePath -> IOMode -> IO Bucket
- hBucket :: Handle -> IO Bucket
- fromFiles :: [FilePath] -> (Array B Bucket -> IO b) -> IO b
- fromFiles' :: (Bulk l FilePath, Target l Bucket) => Array l FilePath -> (Array l Bucket -> IO b) -> IO b
- fromDir :: FilePath -> (Array B Bucket -> IO b) -> IO b
- fromSplitFile :: Int -> (Word8 -> Bool) -> FilePath -> (Array B Bucket -> IO b) -> IO b
- fromSplitFileAt :: Int -> (Word8 -> Bool) -> FilePath -> Integer -> (Array B Bucket -> IO b) -> IO b
- toFiles :: [FilePath] -> (Array B Bucket -> IO b) -> IO b
- toFiles' :: (Bulk l FilePath, Target l Bucket) => Array l FilePath -> (Array l Bucket -> IO b) -> IO b
- toDir :: Int -> FilePath -> (Array B Bucket -> IO b) -> IO b
- toDirs :: Int -> [FilePath] -> (Array (E B DIM2) Bucket -> IO b) -> IO b
- bClose :: Bucket -> IO ()
- bIsOpen :: Bucket -> IO Bool
- bAtEnd :: Bucket -> IO Bool
- bSeek :: Bucket -> SeekMode -> Integer -> IO ()
- bGetArray :: Bucket -> Integer -> IO (Array F Word8)
- bPutArray :: Bucket -> Array F Word8 -> IO ()
Documentation
A bucket represents a portion of a whole data set on disk, and contains a file handle that points to the next piece of data to be read or written.
The bucket could be created from a portion of a single flat file, or be one file of a pre-split data set. The main advantage over a plain Handle is that a Bucket can represent a small portion of a single large file.
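The idea of a handle restricted to part of a file can be sketched with plain System.IO. The Slice type and its functions below are hypothetical illustrations and are not the library's Bucket, whose fields are hidden. The sketch assumes a portion is described by a start offset and an optional length.

```haskell
import System.IO

-- | Hypothetical stand-in for a bucket: a handle plus the extent of the
--   portion of the file it is allowed to cover.
data Slice = Slice
  { sliceHandle :: Handle         -- ^ Handle positioned inside the file.
  , sliceStart  :: Integer        -- ^ Offset of the first byte of the portion.
  , sliceLength :: Maybe Integer  -- ^ Nothing means "up to the end of the file".
  }

-- | Open a file and seek to the start of the requested portion.
openSlice :: FilePath -> Integer -> Maybe Integer -> IO Slice
openSlice path start mlen = do
  h <- openBinaryFile path ReadMode
  hSeek h AbsoluteSeek start
  return (Slice h start mlen)

-- | True once the handle has reached the end of the portion
--   (or the end of the underlying file, whichever comes first).
sliceAtEnd :: Slice -> IO Bool
sliceAtEnd (Slice h start mlen) = do
  eof <- hIsEOF h
  pos <- hTell h
  return (eof || maybe False (\len -> pos >= start + len) mlen)
```

Several such slices over the same file can then be consumed independently, which is what a single plain Handle cannot express.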
hBucket :: Handle -> IO Bucket
Wrap an existing file handle as a bucket.
The starting position is set to 0.
Reading
fromFiles :: [FilePath] -> (Array B Bucket -> IO b) -> IO b
Open some files as buckets and use them as Sources.
fromFiles'
:: (Bulk l FilePath, Target l Bucket) | |
=> Array l FilePath | Files to open. |
-> (Array l Bucket -> IO b) | Consumer. |
-> IO b |
Like fromFiles, but take an array of file paths instead of a list.
fromDir :: FilePath -> (Array B Bucket -> IO b) -> IO b
Open all the files in a directory as separate buckets.
This operation may fail with the same exceptions as getDirectoryContents.
fromSplitFile
:: Int | Number of buckets. |
-> (Word8 -> Bool) | Detect the end of a record. |
-> FilePath | File to open. |
-> (Array B Bucket -> IO b) | Consumer. |
-> IO b |
Open a file containing atomic records and split it into the given number of evenly sized buckets.
The records are separated by a special terminating character, which the given predicate detects. The file is split cleanly on record boundaries, so each bucket contains a whole number of records. As the records can vary in size, the buckets are not guaranteed to have exactly the same length, in either records or bytes, though we try to give them approximately the same number of bytes.
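The splitting rule can be illustrated with a small pure function. This is only a sketch of the idea and not the library's implementation: splitOffsets is a hypothetical name, it works on an in-memory list of bytes instead of a file, and it assumes at least one bucket is requested.

```haskell
import Data.Word (Word8)

-- | Hypothetical helper: compute the (start offset, length) of each of
--   @n@ buckets so that every bucket ends just after a record terminator.
splitOffsets :: Int -> (Word8 -> Bool) -> [Word8] -> [(Int, Int)]
splitOffsets n isEnd bytes = zip starts (zipWith (-) ends starts)
  where
    len = length bytes

    -- Offsets just after each terminator. The end of the data is always
    -- a valid place to cut, even if the last record is unterminated.
    cuts = [i + 1 | (i, b) <- zip [0 ..] bytes, isEnd b] ++ [len]

    -- Where bucket k would end if all buckets had the same number of bytes.
    nominal k = (k * len) `div` n

    -- Move a nominal end forward to the next valid cut.
    snap p = case dropWhile (< p) cuts of
      (c : _) -> c
      []      -> len

    ends   = [snap (nominal k) | k <- [1 .. n]]
    starts = 0 : init ends
```

For the 13 bytes of "aa\nbbbb\nc\ndd\n" with newline (byte 10) as the terminator and four buckets, this yields the pairs (0,3), (3,5), (8,2) and (10,3): each bucket holds whole records, but the lengths differ.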
fromSplitFileAt
:: Int | Number of buckets. |
-> (Word8 -> Bool) | Detect the end of a record. |
-> FilePath | File to open. |
-> Integer | Starting offset. |
-> (Array B Bucket -> IO b) | Consumer. |
-> IO b |
Like fromSplitFile, but start at the given offset.
Writing
toFiles :: [FilePath] -> (Array B Bucket -> IO b) -> IO b
Open some files for writing as individual buckets and pass them to the given consumer.
toDir
:: Int | Number of buckets to create. |
-> FilePath | Path to directory. |
-> (Array B Bucket -> IO b) | Consumer. |
-> IO b |
Create a new directory of the given name, containing the given number of buckets.
If the directory is named somedir then the files are named somedir/000000, somedir/000001, somedir/000002 and so on.
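The naming scheme can be reproduced with a short helper. This is a sketch under assumptions: bucketPaths is a hypothetical name, the six-digit zero padding is inferred from the example names above, and "/" is assumed as the path separator.

```haskell
import Text.Printf (printf)

-- | Hypothetical helper: the file paths for a directory of @n@ buckets,
--   numbered from zero and padded to six digits.
bucketPaths :: Int -> FilePath -> [FilePath]
bucketPaths n dir = [printf "%s/%06d" dir i | i <- [0 .. n - 1]]
```

For example, bucketPaths 3 "somedir" lists somedir/000000, somedir/000001 and somedir/000002.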
toDirs
:: Int | Number of buckets to create per directory. |
-> [FilePath] | Paths to directories. |
-> (Array (E B DIM2) Bucket -> IO b) | Consumer. |
-> IO b |
Given a list of directories, create those directories and open the given number of output files per directory.
In the resulting array of buckets, the outer dimension indexes each directory, and the inner one indexes each file in its directory.
For each directory somedir the files are named somedir/000000, somedir/000001, somedir/000002 and so on.
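The two-dimensional layout can be pictured with nested lists standing in for the array of buckets. This is a sketch only: bucketGrid and bucketAt are hypothetical names, paths stand in for open buckets, and the six-digit padding and "/" separator are assumptions taken from the example names.

```haskell
import Text.Printf (printf)

-- | Hypothetical helper: one row of paths per directory. Row d belongs to
--   the d-th directory, and column f is the f-th file inside it.
bucketGrid :: Int -> [FilePath] -> [[FilePath]]
bucketGrid n dirs =
  [ [printf "%s/%06d" dir f | f <- [0 .. n - 1]] | dir <- dirs ]

-- | Look up a path by (directory index, file index), outer index first.
bucketAt :: [[FilePath]] -> (Int, Int) -> FilePath
bucketAt grid (d, f) = (grid !! d) !! f
```

With two files per directory and the directories a and b, index (1, 0) refers to b/000000: the first file of the second directory.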