Safe Haskell | None |
---|---|
Language | GHC2021 |
Struct parser.
We do still have to do failure checking, because unlike C we check some types (e.g. bitfields). Hopefully inlining can remove those checks when unnecessary.
Synopsis
- type PureMode = Proxy# Void
- type IOMode = State# RealWorld
- type STMode s = State# s
- type ParserT# (st :: ZeroBitType) e a = ForeignPtrContents -> Addr# -> Int# -> st -> Res# st e a
- newtype ParserT (st :: ZeroBitType) e a = ParserT {
- runParserT# :: ParserT# st e a
- type Parser = ParserT PureMode
- type ParserIO = ParserT IOMode
- type ParserST s = ParserT (STMode s)
- type Res# (st :: ZeroBitType) e a = (# st, ResI# e a #)
- type ResI# e a = (# (# a #) | (# #) | (# e #) #)
- pattern OK# :: st -> a -> Res# st e a
- pattern Fail# :: st -> Res# st e a
- pattern Err# :: st -> e -> Res# st e a
- unsafeRunParserBs :: forall a e. ByteString -> Parser e a -> Result e a
- unsafeRunParserPtr :: forall a e. Ptr Word8 -> Parser e a -> Result e a
- unsafeRunParserFPtr :: forall a e. ForeignPtr Word8 -> Parser e a -> Result e a
- unsafeRunParser' :: forall a e. Addr# -> ForeignPtrContents -> Parser e a -> Result e a
- data Result e a
- constParse :: forall a (st :: ZeroBitType) e. a -> ParserT st e a
- sequenceParsers :: forall a b c (st :: ZeroBitType) e. Int -> (a -> b -> c) -> ParserT st e a -> ParserT st e b -> ParserT st e c
- prim :: forall a (st :: ZeroBitType) e. Prim' a => ParserT st e a
- lit :: forall a (st :: ZeroBitType) e. Eq a => a -> ParserT st e a -> ParserT st e ()
- withLit :: forall a (st :: ZeroBitType) e r. Eq a => Int# -> a -> ParserT st e a -> ParserT st e r -> ParserT st e r
- withLitErr :: forall a (st :: ZeroBitType) r. (Integral a, FiniteBits a) => Int# -> a -> Int -> (Addr# -> Int# -> a) -> ParserT st (Int, Word8) r -> ParserT st (Int, Word8) r
- firstNonMatchByteIdx :: FiniteBits a => a -> a -> Int
- unsafeByteAt :: (Num a, Bits a) => a -> Int -> a
Documentation
type ParserT# (st :: ZeroBitType) e a Source #
= ForeignPtrContents | pointer provenance (does not change) |
-> Addr# | base address (does not change) |
-> Int# | cursor offset from base |
-> st | state token |
-> Res# st e a | result |
newtype ParserT (st :: ZeroBitType) e a Source #
Like flatparse, but no buffer length (= no buffer overflow checking), and
no Addr#
on success (= no dynamic length parses).
Unlike flatparse, we separate base address from offset, rather than adding
them. This fits the unaligned Addr#
primops (added in GHC 9.10) better, and
in my head should hopefully assist in emitting immediates where possible for
offsets on the assembly level.
Combining them like in flatparse might be faster; but I really don't know how to find out, without doing both and comparing various examples. After a lot of scratching my head, I think this is most appropriate.
The ForeignPtrContents
is for keeping the Addr#
data in scope.
ParserT | |
|
type Res# (st :: ZeroBitType) e a = (# st, ResI# e a #) Source #
type ResI# e a = (# (# a #) | (# #) | (# e #) #) Source #
Primitive parser result.
Like flatparse, but no Addr#
on success.
pattern OK# :: st -> a -> Res# st e a Source #
Res#
constructor for a successful parse.
Contains the return value and a state token.
pattern Fail# :: st -> Res# st e a Source #
Res#
constructor for recoverable failure.
Contains only a state token.
pattern Err# :: st -> e -> Res# st e a Source #
Res#
constructor for errors which are by default non-recoverable.
Contains the error, plus a state token.
unsafeRunParserBs :: forall a e. ByteString -> Parser e a -> Result e a Source #
caller must guarantee that buffer is long enough for parser!!
unsafeRunParserPtr :: forall a e. Ptr Word8 -> Parser e a -> Result e a Source #
caller must guarantee that buffer is long enough for parser!!
unsafeRunParserFPtr :: forall a e. ForeignPtr Word8 -> Parser e a -> Result e a Source #
caller must guarantee that buffer is long enough for parser!!
unsafeRunParser' :: forall a e. Addr# -> ForeignPtrContents -> Parser e a -> Result e a Source #
caller must guarantee that buffer is long enough for parser!!
Higher-level boxed data type for parsing results.
constParse :: forall a (st :: ZeroBitType) e. a -> ParserT st e a Source #
can't provide via pure
as no Applicative
sequenceParsers :: forall a b c (st :: ZeroBitType) e. Int -> (a -> b -> c) -> ParserT st e a -> ParserT st e b -> ParserT st e c Source #
lit :: forall a (st :: ZeroBitType) e. Eq a => a -> ParserT st e a -> ParserT st e () Source #
parse literal
withLit :: forall a (st :: ZeroBitType) e r. Eq a => Int# -> a -> ParserT st e a -> ParserT st e r -> ParserT st e r Source #
parse literal (CPS)
withLitErr :: forall a (st :: ZeroBitType) r. (Integral a, FiniteBits a) => Int# -> a -> Int -> (Addr# -> Int# -> a) -> ParserT st (Int, Word8) r -> ParserT st (Int, Word8) r Source #
parse literal, return first (leftmost) failing byte on error (CPS)
This can be used to parse large literals via chunking, rather than byte-by-byte, while retaining useful error behaviour.
We don't check equality with XOR even though we use that when handling errors, because it's hard to tell if it would be faster with modern CPUs and compilers.
firstNonMatchByteIdx :: FiniteBits a => a -> a -> Int Source #
Given two non-equal words wActual
and wExpect
, return the index of the
first non-matching byte. Zero indexed.
If both words are equal, returns word_size (e.g. 4 for Word32
).