| Copyright | (c) Matthew Mosior 2022 |
|---|---|
| License | BSD-style |
| Maintainer | mattm.github@gmail.com |
| Portability | portable |
| Safe Haskell | Safe-Inferred |
| Language | Haskell2010 |
Data.FMIndex
Description
Full-text Minute-space index (FM-index)
For best results, first apply the Burrows-Wheeler transform (BWT)
to the initial ByteString or Text input, then build
an FMIndexB or FMIndexT from the result.
To do this, users can use the bytestringToBWTToFMIndexB and bytestringToBWTToFMIndexT functions,
as well as the textToBWTToFMIndexB and textToBWTToFMIndexT functions.
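As a hedged usage sketch of the BWT-first helpers, the following round-trips a ByteString through an FMIndexB. It relies only on the signatures listed in the Synopsis; the helper name roundTrip and the sample input are hypothetical, and it assumes the package exposing Data.FMIndex is available alongside bytestring.

```haskell
module Main where

import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BSC8
import Data.FMIndex (bytestringFromBWTFromFMIndexB, bytestringToBWTToFMIndexB)

-- | Hypothetical helper: build an FMIndexB via a BWT, then invert it again.
roundTrip :: ByteString -> ByteString
roundTrip = bytestringFromBWTFromFMIndexB . bytestringToBWTToFMIndexB

main :: IO ()
main = do
  let input = BSC8.pack "abracadabra" -- sample input, chosen for illustration
  -- Compare the restored ByteString with the original input.
  print (roundTrip input == input)
```

The same shape should apply to the FMIndexT and Text variants by swapping in the corresponding functions from the Synopsis.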
The base functions for ByteString, bytestringToFMIndexB and bytestringToFMIndexT, take a BWTMatrix ByteString and
a Seq (Maybe ByteString) and produce an FMIndexB and an FMIndexT, respectively.
Likewise, the base functions for Text, textToFMIndexB and textToFMIndexT, take a BWTMatrix Text and
a Seq (Maybe Text) and produce an FMIndexB and an FMIndexT, respectively.
There are various other lower-level functions for interacting with the FMIndex implementation on ByteString and Text as well.
Operation: Count
The count operation on ByteString, bytestringFMIndexCount, is implemented using the countFMIndexB function.
The count operation on Text, textFMIndexCount, is implemented using the countFMIndexT function.
Operation: Locate
The locate operation on ByteString, bytestringFMIndexLocate, is implemented using the locateFMIndexB function.
The locate operation on Text, textFMIndexLocate, is implemented using the locateFMIndexT function.
Internal
Data.FMIndex.Internal contains efficient and stateful implementations of the FMIndex and Inverse FMIndex algorithms.
Synopsis
- bytestringToBWTToFMIndexB :: ByteString -> FMIndexB
- bytestringToBWTToFMIndexT :: ByteString -> FMIndexT
- textToBWTToFMIndexB :: Text -> FMIndexB
- textToBWTToFMIndexT :: Text -> FMIndexT
- textBWTToFMIndexB :: BWTMatrix Word8 -> TextBWT -> FMIndexB
- bytestringBWTToFMIndexB :: BWTMatrix Word8 -> BWT Word8 -> FMIndexB
- textBWTToFMIndexT :: BWTMatrix Word8 -> TextBWT -> FMIndexT
- bytestringBWTToFMIndexT :: BWTMatrix Word8 -> BWT Word8 -> FMIndexT
- textToFMIndexB :: BWTMatrix Text -> Seq (Maybe Text) -> FMIndexB
- bytestringToFMIndexB :: BWTMatrix ByteString -> Seq (Maybe ByteString) -> FMIndexB
- textToFMIndexT :: BWTMatrix Text -> Seq (Maybe Text) -> FMIndexT
- bytestringToFMIndexT :: BWTMatrix ByteString -> Seq (Maybe ByteString) -> FMIndexT
- bytestringFromBWTFromFMIndexB :: FMIndexB -> ByteString
- bytestringFromBWTFromFMIndexT :: FMIndexT -> ByteString
- textFromBWTFromFMIndexB :: FMIndexB -> Text
- textFromBWTFromFMIndexT :: FMIndexT -> Text
- textBWTFromFMIndexT :: FMIndexT -> BWT Text
- bytestringBWTFromFMIndexT :: FMIndexT -> BWT ByteString
- textBWTFromFMIndexB :: FMIndexB -> BWT Text
- bytestringBWTFromFMIndexB :: FMIndexB -> BWT ByteString
- textFromFMIndexB :: FMIndexB -> Seq (Maybe Text)
- bytestringFromFMIndexB :: FMIndexB -> Seq (Maybe ByteString)
- textFromFMIndexT :: FMIndexT -> Seq (Maybe Text)
- bytestringFromFMIndexT :: FMIndexT -> Seq (Maybe ByteString)
- bytestringFMIndexCount :: [ByteString] -> ByteString -> Seq (ByteString, CIntB)
- textFMIndexCount :: [Text] -> Text -> Seq (Text, CIntT)
- bytestringFMIndexLocate :: [ByteString] -> ByteString -> Seq (ByteString, LIntB)
- textFMIndexLocate :: [Text] -> Text -> Seq (Text, LIntT)
To FMIndex functions
bytestringToBWTToFMIndexB :: ByteString -> FMIndexB
Helper function for converting a ByteString
to an FMIndexB via a BWT first.
bytestringToBWTToFMIndexT :: ByteString -> FMIndexT
Helper function for converting a ByteString
to an FMIndexT via a BWT first.
textToBWTToFMIndexB :: Text -> FMIndexB
textToBWTToFMIndexT :: Text -> FMIndexT
bytestringToFMIndexB :: BWTMatrix ByteString -> Seq (Maybe ByteString) -> FMIndexB
Takes a Seq of ByteStrings and returns the FM-index (FMIndexB).
bytestringToFMIndexT :: BWTMatrix ByteString -> Seq (Maybe ByteString) -> FMIndexT
Takes a Seq of ByteStrings and returns the FM-index (FMIndexT).
From FMIndex functions
bytestringFromBWTFromFMIndexB :: FMIndexB -> ByteString
Helper function for converting a BWTed FMIndexB
back to the original ByteString.
bytestringFromBWTFromFMIndexT :: FMIndexT -> ByteString
Helper function for converting a BWTed FMIndexT
back to the original ByteString.
bytestringBWTFromFMIndexT :: FMIndexT -> BWT ByteString
Takes an FMIndexT and returns
the BWT of ByteStrings.
bytestringBWTFromFMIndexB :: FMIndexB -> BWT ByteString
Takes an FMIndexB and returns
the BWT of ByteStrings.
bytestringFromFMIndexB :: FMIndexB -> Seq (Maybe ByteString)
Takes an FMIndexB and returns
the original Seq of ByteStrings.
bytestringFromFMIndexT :: FMIndexT -> Seq (Maybe ByteString)
Takes an FMIndexT and returns
the original Seq of ByteStrings.
Count operations
bytestringFMIndexCount :: [ByteString] -> ByteString -> Seq (ByteString, CIntB)
Takes a list of ByteString patterns
and an input ByteString
and returns the number of occurrences of each pattern
in the input ByteString.
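To illustrate what an FM-index count computes, here is a minimal, self-contained sketch of backward search over a naively built BWT. It is not the implementation in Data.FMIndex.Internal: it works on String rather than ByteString, uses linear-time scans where a real index would use precomputed tables, and every name in it (sentinel, bwt, lessThan, occ, countOccurrences) is hypothetical.

```haskell
module Main where

import Data.List (sort)

-- | Sentinel, assumed absent from the input and smaller than any other character.
sentinel :: Char
sentinel = '\0'

-- | Naive BWT: last column of the sorted rotations of the text plus sentinel.
bwt :: String -> String
bwt text = map last (sort rotations)
  where
    t = text ++ [sentinel]
    n = length t
    rotations = [take n (drop i (cycle t)) | i <- [0 .. n - 1]]

-- | C[c]: number of characters in the BWT strictly smaller than c.
lessThan :: String -> Char -> Int
lessThan l c = length (filter (< c) l)

-- | Occ(c, i): occurrences of c among the first i characters of the BWT.
occ :: String -> Char -> Int -> Int
occ l c i = length (filter (== c) (take i l))

-- | Count occurrences of a non-empty pattern by backward search.
--   The half-open row range [sp, ep) is narrowed once per pattern
--   character, processed from the last character to the first.
countOccurrences :: String -> String -> Int
countOccurrences pat text = go (reverse pat) 0 (length l)
  where
    l = bwt text
    go [] sp ep = ep - sp
    go (c : cs) sp ep
      | sp' >= ep' = 0
      | otherwise = go cs sp' ep'
      where
        sp' = lessThan l c + occ l c sp
        ep' = lessThan l c + occ l c ep

main :: IO ()
main = print (countOccurrences "ana" "banana") -- prints 2
```

Overlapping matches are counted separately, which is why "ana" is found twice in "banana".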
Locate operations
bytestringFMIndexLocate :: [ByteString] -> ByteString -> Seq (ByteString, LIntB)
Takes a list of ByteString patterns
and an input ByteString
and returns the indices of the occurrences of each pattern
in the input ByteString.
The output indices are 1-based
and are not sorted.
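The following self-contained sketch shows one way a locate operation can yield 1-based, unsorted indices: backward search narrows a range of suffix-array rows, and the matching rows are reported in suffix-array order rather than text order. This is an illustration under stated assumptions, not the library's implementation; it uses String and a full naive suffix array, and all names (sentinel, suffixArray, locateOccurrences) are hypothetical.

```haskell
module Main where

import Data.List (sort, tails)

-- | Sentinel, assumed absent from the input and smaller than any other character.
sentinel :: Char
sentinel = '\0'

-- | Naive suffix array: 0-based start offsets of the non-empty suffixes, sorted.
suffixArray :: String -> [Int]
suffixArray t = map snd (sort (zip (init (tails t)) [0 ..]))

-- | Locate all occurrences of a non-empty pattern, as 1-based offsets.
--   Results follow suffix-array order, so they are generally not sorted.
locateOccurrences :: String -> String -> [Int]
locateOccurrences pat text = [sa !! i + 1 | i <- [sp .. ep - 1]]
  where
    t = text ++ [sentinel]
    n = length t
    sa = suffixArray t
    -- BWT derived from the suffix array: the character preceding each suffix.
    l = [t !! ((i + n - 1) `mod` n) | i <- sa]
    lessThan c = length (filter (< c) l)
    occ c i = length (filter (== c) (take i l))
    -- foldr applies the step to the last pattern character first,
    -- which is the right-to-left order backward search requires.
    step c (lo, hi) = (lessThan c + occ c lo, lessThan c + occ c hi)
    (sp, ep) = foldr step (0, n) pat

main :: IO ()
main = print (locateOccurrences "ana" "banana") -- prints [4,2]
```

Here "ana" starts at positions 2 and 4 of "banana" (1-based), and the sketch reports them as [4,2] because the suffix "ana" sorts before "anana".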