kangaroo-0.4.0: Binary parsing with random access.

Portabilityto be determined.
Stabilityhighly unstable
MaintainerStephen Tetley <stephen.tetley@gmail.com>

Data.ParserCombinators.KangarooReader

Contents

Description

Kangaroo parse monad with env.

Synopsis

Parser tpye

type Kangaroo r a = GenKangaroo r aSource

ask :: Kangaroo r rSource

Retrieve the environment.

asks :: (r -> a) -> Kangaroo r aSource

local :: (r -> r) -> Kangaroo r a -> Kangaroo r aSource

Execute a computation in a modified environment.

Parser types

Region types

data RegionCoda Source

RegionCoda - represents three useful final positions:

  1. dalpunto - 'from the point' - Run the parser within a region and return to where you came from.
  2. alfermata - 'to the stop' - Run the parser within a region, the cursor remains wherever the parse finished.
  3. alfine - 'to the end' - Run the parser within a region and jump to the right-end of the region after the parse.

Constructors

Dalpunto 
Alfermata 
Alfine 

Lift IO actions

liftIOAction :: IO a -> GenKangaroo ust aSource

Lift an IO action into the Kangaroo monad.

Error reporting and exception handling

reportError :: ParseErr -> GenKangaroo ust aSource

Report a parse error.

Source position is appended to the supplied error message

substError :: GenKangaroo ust a -> ParseErr -> GenKangaroo ust aSource

substError : parser * error_msg -> parser

substError is equivalent to Parsec's <?> combinator.

Run the supplied parser, if the parse succeeds return the result, otherwise override the original error message with the supplied error_msg.

Primitive parsers

word8 :: GenKangaroo ust Word8Source

Parse a single byte.

If the cursor is beyond the end of the current region a parse-error is thrown with reportError.

satisfy :: (Word8 -> Bool) -> GenKangaroo ust Word8Source

satisfy : predicate -> parser

Parse a single byte and apply the predicate to it. On True return the parsed byte, on False throw a parse-error with reportError.

checkWord8 :: (Word8 -> Bool) -> GenKangaroo ust (Maybe Word8)Source

checkWord8 : predicate -> opt parser

Byte parser with backtracking when the match fails.

Parse a single byte and apply the predicate to the result. On success return (Just answer), on failure move the cursor position back one and return Nothing.

opt :: GenKangaroo ust a -> GenKangaroo ust (Maybe a)Source

Backtracking parser similar to Parsec's try.

Try the supplied parser, if the parse succeeds with no parse-errors return (Just answer). If a parse-error is generated, discard the parse-error, return the cursor to the initial position and return Nothing.

skip :: Int -> GenKangaroo ust ()Source

skip : num_bytes -> ()

Move the cursor forward by the supplied distance. The distance must be positive, negative distances are ignored.

skip performs no range checking. If the cursor is moved beyond the region boundary then the next parse will fail.

Query the cursor position

position :: GenKangaroo ust IntSource

position : -> cursor-position

Return the current cursor position

region :: GenKangaroo ust (Int, Int, Int)Source

region : -> (region-start, cursor-position, region-end)

Return the current parse region and the current position of the cursor within it.

atEnd :: GenKangaroo ust BoolSource

atEnd - is the cursor at the end of the current region?

lengthRemaining :: GenKangaroo ust IntSource

lengthRemaining : -> distance-to-region-end

Distance from the current cursor position to the end of the current region

regionSize :: GenKangaroo ust IntSource

regionSize : -> region-length

Size of the current region.

Parse within a region

intraparse :: RegionName -> RegionCoda -> RegionStart -> Int -> GenKangaroo ust a -> GenKangaroo ust aSource

intraparse : name * coda * abs_region_start * region_length * parser -> parser

Create a new region within the current one and run the supplied parser. The cursor position is moved to the start of the new region. The value of coda determines where the cursor is positioned after a successful parse.

intraparse throws a parse error if the supplied absolute-region-start is not located within the current region, or if the right-boundary of the new region (abs_region_start + region_length) extends beyond the right-boundary of the current region.

advance :: RegionName -> RegionCoda -> Int -> GenKangaroo ust a -> GenKangaroo ust aSource

advance : name * coda * abs_region_start * parser -> parser

A variation of intraparse - the new region starts at the supplied abs_region_start and continues to the end of the current region.

advance throws a parse error if the new start position is not within the current region.

advanceRelative :: RegionName -> RegionCoda -> Int -> GenKangaroo ust a -> GenKangaroo ust aSource

advanceRelative : name * coda * distance * parser -> parser

A variation of advance - the start of the new region is calculated from the current-cursor-position + the supplied distance.

advanceRelative throws a parse error if the new start position is not within the current region.

restrict :: RegionName -> RegionCoda -> Int -> GenKangaroo ust a -> GenKangaroo ust aSource

restrict : name * coda * distance * parser -> parser

A variation of intraparse - create a new region as a restriction of the current one and run the supplied parser. The new region starts at the current coursor position, the right-boundary is restricted to the current-cursor-position + the supplied distance.

restrict throws a parse error if the right-boundary of the new region extends beyond the current region.

restrictToPos :: RegionName -> RegionCoda -> Int -> GenKangaroo ust a -> GenKangaroo ust aSource

restrictToPos : region-name * coda * abs-end-pos * parser -> parser

A variantion of restrict - the new region takes the current cursor position for the left-boundary and the supplied absolute-end-position (abs-end-pos) as the right-boundary.

restrictToPos throws a parse error if the abs-end-pos extends beyond the right-boundary of the current region.

Debug

printHexAll :: GenKangaroo ust ()Source

printHexRange :: (Int, Int) -> GenKangaroo ust ()Source

printRegionStack :: GenKangaroo ust ()Source

manyTill :: GenKangaroo ust a -> GenKangaroo ust b -> GenKangaroo ust [a]Source

genericManyTill :: (a -> c -> c) -> c -> GenKangaroo ust a -> GenKangaroo ust b -> GenKangaroo ust cSource

manyTillPC :: GenKangaroo ust a -> (a -> Bool) -> GenKangaroo ust ([a], a)Source

genericManyTillPC :: (a -> b -> b) -> b -> GenKangaroo ust a -> (a -> Bool) -> GenKangaroo ust (b, a)Source

count :: Int -> GenKangaroo ust a -> GenKangaroo ust [a]Source

countPrefixed :: Integral i => GenKangaroo ust i -> GenKangaroo ust a -> GenKangaroo ust (i, [a])Source

genericCount :: (a -> b -> b) -> b -> Int -> GenKangaroo ust a -> GenKangaroo ust bSource

runOn :: GenKangaroo ust a -> GenKangaroo ust [a]Source

genericRunOn :: (a -> b -> b) -> b -> GenKangaroo ust a -> GenKangaroo ust bSource

postCheck :: GenKangaroo ust a -> (a -> Bool) -> String -> GenKangaroo ust aSource

Apply parse then apply the check, if the check fails report the error message.

buildWhile :: (a -> Bool) -> (a -> b -> b) -> (a -> b -> b) -> b -> GenKangaroo ust a -> GenKangaroo ust bSource

Build a value by while the test holds. When the test fails the position is not backtracked, instead we use the "failing" element with lastOp potentially still building the value with it.

buildPrimitive :: Int -> (Word8 -> Bool) -> (Word8 -> b -> b) -> b -> GenKangaroo ust bSource

Char and String parsers

char :: Char -> GenKangaroo ust CharSource

Attempt to parse the supplied single character (the supplied char must be in the ASCII range 0-255).

If the parse succeeds return the char, otherwise a parse-error will be thrown with reportError.

anyChar :: GenKangaroo ust CharSource

Parse any single character. The parser consumes one byte and uses chr to convert it.

text :: Int -> GenKangaroo ust StringSource

Parse a string of the supplied length n.

If n is less than or equal to zero the empty string is returned.

string :: String -> GenKangaroo ust StringSource

Parse the supplied string. All characters should be within the range 0-255.

If the parse succeeds return the char, otherwise a parse-error will be thrown with reportError.

cstring :: GenKangaroo ust StringSource

Parse a null-terminated C-style string.

Byte parsers

w8Zero :: GenKangaroo ust Word8Source

Parse the literal 0x00.

getBytes :: Integral a => a -> GenKangaroo ust [Word8]Source

Get n bytes.

If n is less than or equal to zero an empty list is returned.

Number parsers

int8 :: GenKangaroo ust Int8Source

Parse a single byte, returning it as an Int8.

The conversion from a byte (0-255) to an Int8 uses the Prelude function fromIntegral.

The conversion is summarized as:

 0..127   = 0..127
 128      = -128
 129      = -127
 130      = -126
 ...
 254      = -2
 255      = -1   

 wtoi :: Word8 -> Int8
 wtoi i | i < 128   = i
        | otherwise = -128 + (clearBit i 7)

Word - big endian

word16be :: GenKangaroo ust Word16Source

Parse a Word16 in big endian form.

word24be :: GenKangaroo ust Word32Source

Parse a "Word24" in big endian form.

3 bytes are read - the answer is returned as a Word32.

word32be :: GenKangaroo ust Word32Source

Parse a Word32 in big endian form.

word64be :: GenKangaroo ust Word64Source

Parse a Word64 in big endian form.

Word - little endian

word16le :: GenKangaroo ust Word16Source

Parse a Word16 in little endian form.

word24le :: GenKangaroo ust Word32Source

Parse a "Word24" in little endian form.

3 bytes are read - the answer is returned as a Word32.

word32le :: GenKangaroo ust Word32Source

Parse a Word32 in little endian form.

Int - big endian

int16be :: GenKangaroo ust Int16Source

Parse an Int16 in big endian form.

The ans is parsed as a Word16 (big endian) then converted to an Int16 using the Prelude function fromIntegral.

int32be :: GenKangaroo ust Int32Source

Parse an Int32 in big endian form.

The ans is parsed as a Word32 (big endian) then converted to an Int32 using the Prelude function fromIntegral.

Int - little endian

int16le :: GenKangaroo ust Int16Source

Parse an Int16 in little endian form.

The ans is parsed as a Word16 (little endian) then converted to an Int16 using the Prelude function fromIntegral.

int32le :: GenKangaroo ust Int32Source

Parse an Int32 in little endian form.

The ans is parsed as a Word32 (little endian) then converted to an Int32 using the Prelude function fromIntegral.

IEEE754 single precision float

ieeeFloatSP :: Fractional a => GenKangaroo ust aSource

Parse an 4-byte IEEE single precision float.

NOTE - THIS FUNCTION IS UNTESTED!