Copyright | (c) Matthew Mosior 2022 |
---|---|
License | BSD-style |
Maintainer | mattm.github@gmail.com |
Portability | portable |
Safe Haskell | Safe-Inferred |
Language | Haskell2010 |
Full-text Minute-space index (FM-index)
Users will get the most mileage by first compressing to a BWT on the initial ByteString or Text input before compressing to an FMIndexB or FMIndexT.
To do this, users can use the bytestringToBWTToFMIndexB and bytestringToBWTToFMIndexT functions, as well as the textToBWTToFMIndexB and textToBWTToFMIndexT functions.
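A minimal usage sketch of the helper path described above. It assumes the functions are importable from a module named Data.FMIndex (inferred from the Data.FMIndex.Internal name on this page) and that the bytestring package is available. The names roundTrips and sample are hypothetical and exist only for illustration.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC
-- Assumption: the helpers listed in the synopsis are exported by Data.FMIndex.
import Data.FMIndex (bytestringToBWTToFMIndexB, bytestringFromBWTFromFMIndexB)

-- | Hypothetical helper: build an FMIndexB via the BWT, convert it back,
-- and compare the result with the original input.
roundTrips :: ByteString -> Bool
roundTrips input =
  bytestringFromBWTFromFMIndexB (bytestringToBWTToFMIndexB input) == input

-- | Hypothetical sample input.
sample :: ByteString
sample = BC.pack "abracadabra"
```

Evaluating roundTrips sample in GHCi is a quick way to confirm that the index built through the BWT helper can be converted back to the text it was built from.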
The base functions for ByteString, bytestringToFMIndexB and bytestringToFMIndexT, can be used to convert a Seq (Maybe ByteString) to an FMIndexB and FMIndexT, respectively. Each also takes a BWTMatrix ByteString as its first argument.
Likewise, the base functions for Text, textToFMIndexB and textToFMIndexT, can be used to convert a Seq (Maybe Text) to an FMIndexB and FMIndexT, respectively. Each also takes a BWTMatrix Text as its first argument.
There are various other lower-level functions for interacting with the FMIndex implementation on ByteString and Text as well.
Operations
The count operation is supported by the bytestringFMIndexCount function for ByteStrings and the textFMIndexCount function for Text.
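To make the count operation concrete, here is a standalone sketch of the textbook backward-search technique that FM-index counting is generally based on. It is not the library's implementation: it works on plain String, recomputes occurrence counts naively, and every name in it (bwtLast, cTable, occ, countOccurrences) is hypothetical. It only needs base and containers.

```haskell
import Data.List (sort)
import qualified Data.Map.Strict as Map

-- | Last column of the sorted rotations of the text plus a sentinel.
-- Nothing is the sentinel and sorts before every Just value.
bwtLast :: String -> [Maybe Char]
bwtLast s = map last (sort (take n (iterate rot t)))
  where
    t = map Just s ++ [Nothing]
    n = length t
    rot xs = drop 1 xs ++ take 1 xs

-- | C table: for each symbol, how many symbols in the text are strictly smaller.
cTable :: [Maybe Char] -> Map.Map (Maybe Char) Int
cTable l = Map.fromList (zip (Map.keys freq) (scanl (+) 0 (Map.elems freq)))
  where
    freq = Map.fromListWith (+) [(c, 1 :: Int) | c <- l]

-- | Number of occurrences of a symbol among the first i entries of the last column.
occ :: Maybe Char -> Int -> [Maybe Char] -> Int
occ c i = length . filter (== c) . take i

-- | Count occurrences of a pattern by backward search over the half-open
-- interval [lo, hi) of sorted rotations. An empty pattern counts as 0 here,
-- which is a convention chosen for this sketch.
countOccurrences :: String -> String -> Int
countOccurrences pat text
  | null pat  = 0
  | otherwise = go (reverse (map Just pat)) 0 (length l)
  where
    l = bwtLast text
    c = cTable l
    go [] lo hi = hi - lo
    go (x : xs) lo hi =
      case Map.lookup x c of
        Nothing   -> 0  -- symbol never appears in the text
        Just base ->
          let lo' = base + occ x lo l
              hi' = base + occ x hi l
          in if lo' >= hi' then 0 else go xs lo' hi'
```

The pattern is consumed from its last symbol to its first, and each step narrows the interval of rotations that begin with the suffix matched so far. The final interval width is the number of occurrences, overlapping ones included.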
Internal
Data.FMIndex.Internal contains efficient and stateful implementations of the FMIndex and Inverse FMIndex algorithms.
Synopsis
- bytestringToBWTToFMIndexB :: ByteString -> FMIndexB
- bytestringToBWTToFMIndexT :: ByteString -> FMIndexT
- textToBWTToFMIndexB :: Text -> FMIndexB
- textToBWTToFMIndexT :: Text -> FMIndexT
- textBWTToFMIndexB :: BWTMatrix Word8 -> TextBWT -> FMIndexB
- bytestringBWTToFMIndexB :: BWTMatrix Word8 -> BWT Word8 -> FMIndexB
- textBWTToFMIndexT :: BWTMatrix Word8 -> TextBWT -> FMIndexT
- bytestringBWTToFMIndexT :: BWTMatrix Word8 -> BWT Word8 -> FMIndexT
- textToFMIndexB :: BWTMatrix Text -> Seq (Maybe Text) -> FMIndexB
- bytestringToFMIndexB :: BWTMatrix ByteString -> Seq (Maybe ByteString) -> FMIndexB
- textToFMIndexT :: BWTMatrix Text -> Seq (Maybe Text) -> FMIndexT
- bytestringToFMIndexT :: BWTMatrix ByteString -> Seq (Maybe ByteString) -> FMIndexT
- bytestringFromBWTFromFMIndexB :: FMIndexB -> ByteString
- bytestringFromBWTFromFMIndexT :: FMIndexT -> ByteString
- textFromBWTFromFMIndexB :: FMIndexB -> Text
- textFromBWTFromFMIndexT :: FMIndexT -> Text
- textBWTFromFMIndexT :: FMIndexT -> BWT Text
- bytestringBWTFromFMIndexT :: FMIndexT -> BWT ByteString
- textBWTFromFMIndexB :: FMIndexB -> BWT Text
- bytestringBWTFromFMIndexB :: FMIndexB -> BWT ByteString
- textFromFMIndexB :: FMIndexB -> Seq (Maybe Text)
- bytestringFromFMIndexB :: FMIndexB -> Seq (Maybe ByteString)
- textFromFMIndexT :: FMIndexT -> Seq (Maybe Text)
- bytestringFromFMIndexT :: FMIndexT -> Seq (Maybe ByteString)
- bytestringFMIndexCount :: ByteString -> ByteString -> CIntB
- textFMIndexCount :: Text -> Text -> CIntT
To FMIndex functions
bytestringToBWTToFMIndexB :: ByteString -> FMIndexB
Helper function for converting a ByteString to an FMIndexB via a BWT first.
bytestringToBWTToFMIndexT :: ByteString -> FMIndexT
Helper function for converting a ByteString to an FMIndexT via a BWT first.
textToBWTToFMIndexB :: Text -> FMIndexB
textToBWTToFMIndexT :: Text -> FMIndexT
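The helpers above go "via a BWT first". As an illustration of what that transform does, the following is a naive, standalone sketch of the Burrows-Wheeler transform on a plain String. It is not the library's code, and the names bwt and rotations are hypothetical. Using Nothing as the end-of-text sentinel is a choice made for this sketch. Whether the Maybe in the library's Seq (Maybe ByteString) plays the same role is an assumption.

```haskell
import Data.List (sort)

-- | All cyclic rotations of a list.
rotations :: [a] -> [[a]]
rotations xs = take (length xs) (iterate rotate xs)
  where
    rotate ys = drop 1 ys ++ take 1 ys

-- | Naive BWT: append a sentinel, sort all rotations, keep the last column.
-- Nothing is the sentinel and compares smaller than any Just value.
bwt :: String -> [Maybe Char]
bwt s = map last (sort (rotations (map Just s ++ [Nothing])))
```

For example, bwt "banana" yields the last column usually written as annb$aa, with the sentinel $ represented by Nothing. Sorting every rotation takes quadratic space, so this is only suitable for small inputs.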
bytestringToFMIndexB :: BWTMatrix ByteString -> Seq (Maybe ByteString) -> FMIndexB
Takes a BWTMatrix and a Seq of ByteStrings and returns the FM-index (FMIndexB).
bytestringToFMIndexT :: BWTMatrix ByteString -> Seq (Maybe ByteString) -> FMIndexT
Takes a BWTMatrix and a Seq of ByteStrings and returns the FM-index (FMIndexT).
From FMIndex functions
bytestringFromBWTFromFMIndexB :: FMIndexB -> ByteString
Helper function for converting a BWTed FMIndexB back to the original ByteString.
bytestringFromBWTFromFMIndexT :: FMIndexT -> ByteString
Helper function for converting a BWTed FMIndexT back to the original ByteString.
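Getting back to the original input relies on the BWT being invertible. The standalone sketch below shows the naive textbook inversion on a last column that uses Nothing as its sentinel. It is an illustration only, not the library's inverse algorithm, and inverseBwt is a hypothetical name.

```haskell
import Data.List (sort)
import Data.Maybe (catMaybes)

-- | Naive inverse BWT: repeatedly prepend the last column to the rows and
-- re-sort. After n rounds the rows are the sorted rotations, and the first
-- row is the sentinel followed by the original text.
inverseBwt :: [Maybe Char] -> String
inverseBwt lastCol = catMaybes (concat (take 1 table))
  where
    n = length lastCol
    step rows = sort (zipWith (:) lastCol rows)
    table = iterate step (replicate n []) !! n
```

Each round costs a full sort, so practical implementations use the last-to-first mapping instead. The sketch favours clarity over speed.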
bytestringBWTFromFMIndexT :: FMIndexT -> BWT ByteString
Takes an FMIndexT and returns the BWT of ByteStrings.
bytestringBWTFromFMIndexB :: FMIndexB -> BWT ByteString
Takes an FMIndexB and returns the BWT of ByteStrings.
bytestringFromFMIndexB :: FMIndexB -> Seq (Maybe ByteString)
Takes an FMIndexB and returns the original Seq of ByteStrings.
bytestringFromFMIndexT :: FMIndexT -> Seq (Maybe ByteString)
Takes an FMIndexT and returns the original Seq of ByteStrings.
Count operations
bytestringFMIndexCount :: ByteString -> ByteString -> CIntB
Takes a pattern (ByteString) and an input ByteString and returns the number of occurrences of the pattern in the input ByteString.
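When experimenting with the count function, a naive reference counter can be handy for cross-checking results on small inputs. The sketch below uses only the bytestring package. The names naiveCount and exampleCount are hypothetical. It counts overlapping matches and treats an empty pattern as 0 occurrences. That the library follows the same conventions is an assumption to verify.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC

-- | Reference counter: number of positions in the input at which the
-- pattern starts (overlapping matches are counted separately).
naiveCount :: BS.ByteString -> BS.ByteString -> Int
naiveCount pat input
  | BS.null pat = 0
  | otherwise   = length (filter (pat `BS.isPrefixOf`) (BS.tails input))

-- | Pattern first, input second, mirroring the argument order documented above.
exampleCount :: Int
exampleCount = naiveCount (BC.pack "ana") (BC.pack "banana")
```

This scan is linear in the input for every query, whereas an FM-index is built once and then answers counts in time that depends on the pattern length.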