-- Hoogle documentation, generated by Haddock
-- See Hoogle, http://www.haskell.org/hoogle/

-- | An implementation of the web Document Object Model, and its rendering.
--
-- willow is the basis of a web browser suite, providing the
-- underlying types to represent various documents found on the internet.
-- It does not provide parsing algorithms for anything but the
-- simplest filetypes, instead expecting them to be outsourced to other
-- modules.
@package willow
@version 0.1.0.0

module Web.Willow.Common.Encoding.Character

-- | The Unicode character \xFFFD, safely (but unrecoverably)
-- representing an illegal, invalid, or otherwise unknown character.
replacementChar :: Char

-- | Infra: ASCII whitespace
--
-- The ASCII characters defined as whitespace in the HTML standard.
-- Unlike Haskell's isSpace and anything following that example, this
-- does not include \x0B (VT).
asciiWhitespace :: [Char]

-- | Infra: ASCII alpha
--
-- Test whether the character is an alphabetic character in the ASCII
-- range ([A-Za-z]).
isAsciiAlpha :: Char -> Bool

-- | Infra: ASCII alphanumeric
--
-- Test whether the character is either an alphabetic character or a
-- digit in the ASCII range ([A-Za-z0-9]).
isAsciiAlphaNum :: Char -> Bool

-- | Infra: ASCII whitespace
--
-- Test whether the character fits the spec's definition of
-- asciiWhitespace.
isAsciiWhitespace :: Char -> Bool

-- | Convert an uppercase, alphabetic, ASCII character to its lowercase
-- form. This has the same semantics within the ASCII range as
-- toLower, but leaves any non-ASCII characters unchanged.
--
-- >>> toAsciiLower 'A'
-- 'a'
--
-- >>> toAsciiLower 'Á'
-- 'Á'
toAsciiLower :: Char -> Char

-- | Convert a lowercase, alphabetic, ASCII character to its uppercase
-- form. This has the same semantics within the ASCII range as
-- toUpper, but leaves any non-ASCII characters unchanged.
--
-- >>> toAsciiUpper 'a'
-- 'A'
--
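-- As a rough illustration of the behaviour described for toAsciiLower and
-- toAsciiUpper (ASCII letters change case, every other character passes
-- through untouched), the sketch below uses only base. The names
-- asciiLowerSketch and asciiUpperSketch are hypothetical; this is not
-- the package's own implementation.

```haskell
import Data.Char (chr, ord)

-- Hypothetical stand-in for toAsciiLower: only 'A'..'Z' are shifted.
asciiLowerSketch :: Char -> Char
asciiLowerSketch c
    | c >= 'A' && c <= 'Z' = chr (ord c + 32)  -- 'a' is 32 code points above 'A'
    | otherwise = c                            -- non-ASCII and non-letters are kept

-- Hypothetical stand-in for toAsciiUpper: only 'a'..'z' are shifted.
asciiUpperSketch :: Char -> Char
asciiUpperSketch c
    | c >= 'a' && c <= 'z' = chr (ord c - 32)
    | otherwise = c
```

-- Unlike Data.Char.toLower and toUpper, these never touch characters
-- outside the ASCII range, which is the distinction the doctests show.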
-- >>> toAsciiUpper 'á'
-- 'á'
toAsciiUpper :: Char -> Char

-- | The existing parsing libraries are wonderful, but backtracking parsers
-- have a bad habit of being strict in their output; sure, you might be
-- able to operate over Data.ByteString.Lazy, but they all expect
-- to consume their entire input before handing you their result.
-- Data.Attoparsec's continuations fully lean into that: even
-- though you don't have to provide all the input in one block, you can't
-- get a value before closing it out. Text.Megaparsec does provide
-- a reentrant form in runParser', but it also comes with
-- comparatively heavyweight error and pretty-printing features.
--
-- For complicated formats, all of those can indeed be desirable. However,
-- the HTML algorithms have been optimized for minimal lookahead and
-- certainly no output revocation: once something is shipped out, it's
-- not going to be called back. Failing to take advantage of that with a
-- lazy output type means that parsing would always be subject to the
-- whims of slow or unreliable network connections. Moreover, the entire
-- complexity of the parsing algorithm is built around never reaching a
-- fatal failure condition, so error handling and especially recovery are
-- unnecessary overhead.
--
-- And so, a custom parsing framework must be defined.
module Web.Willow.Common.Parser

-- | Unlike most monad transformers, a Parser is built around the
-- concept of success and failure, so its "default" form is better
-- structured over Maybe than over Identity.
type Parser stream = ParserT stream Maybe

-- | Set the constructed parser loose on a given input. Returns both the
-- resulting value and the remaining contents of the Stream.
runParser :: Parser stream out -> stream -> Maybe (out, stream)

-- | Encapsulation of an operation for transforming the head of a
-- Stream into some other value.
-- Standard usage, with behaviour similar to other Text.Parsec-derived
-- parsers ("accept the first which matches"), may be obtained by
-- instantiating gather with Maybe, or non-deterministic parsing
-- ("accept any of these") through [].
--
-- Notably, this implementation is designed to allow laziness in both
-- input and output. For the best usage, therefore, consume as little
-- input at a time as possible (and so call runParser often).
--
-- As part of this simplification, all Text.Parsec-style
-- integrated state (use StateT) and Text.Megaparsec-style
-- error pretty-printing (build your position tracking into the
-- stream, and/or wrap the output in Either) have been
-- stripped out.
newtype ParserT stream gather out
ParserT :: (stream -> gather (out, stream)) -> ParserT stream gather out
[runParserT] :: ParserT stream gather out -> stream -> gather (out, stream)

-- | Purely a convenience of the package rather than the module, the state
-- machines described by the HTML standard all involve some degree of
-- persistence, and so are built over a deeper monad stack. This could
-- easily be one of the most common transformers to add, anyway, no matter
-- what input is being parsed.
type StateParser state stream = StateT state (Parser stream)

-- | Generalize the transformation of an input Stream into a more
-- meaningful value. This class provides the basic building blocks from
-- which more expressive parsers may be constructed.
--
-- See also the description of ParserT for some of the design
-- decisions.
class (Alternative m, Monad m, Stream stream token, Monoid stream) => MonadParser m stream token | m -> stream

-- | Runs the argument parser on the current input, without consuming any
-- of it; this has identical semantics to saving the input before running
-- the computation and restoring it afterwards, assuming the MonadState
-- instance runs over the input stream (see ParserT):
--
-- input <- get
-- a <- parser
-- put input
--
-- a <- lookAhead parser
lookAhead :: MonadParser m stream token => m out -> m out

-- | Succeeds if and only if the argument parser fails (the input is not
-- consumed).
avoiding :: MonadParser m stream token => m out -> m ()

-- | Retrieve the next token in the stream, whatever it may be. Identical
-- to uncons in all but type.
next :: MonadParser m stream token => m token

-- | Retrieve the next several tokens in the stream. Identical to
-- count (with a safer index type) in the case that
-- stream is a list [token].
--
-- If fewer tokens are in the input stream than asked for, returns what
-- does remain in the input stream.
nextChunk :: MonadParser m stream token => Word -> m stream

-- | Prepend a token to the input stream to be processed next. Identical to
-- operating on the stream directly through MonadState, if that
-- instance also exists.
--
-- stream <- get
-- put $ cons tok stream
--
-- push tok
push :: MonadParser m stream token => token -> m ()

-- | Concatenate the given sequence with the existing input, processing the
-- argument before the older stream.
pushChunk :: MonadParser m stream token => stream -> m ()

-- | Drop the remainder of the input, simulating an early end-of-stream.
-- Can be emulated through appropriate MonadState and
-- Monoid instances:
--
-- stream <- get
-- put mempty
-- return stream
--
-- abridge
abridge :: MonadParser m stream token => m stream

-- | Succeeds if and only if the input is empty.
end :: MonadParser trans stream token => trans ()

-- | Succeeds if and only if the value parsed by the argument parser
-- satisfies the predicate. No further input is consumed.
satisfying :: MonadParser trans stream token => (out -> Bool) -> out -> trans out

-- | Expect a specific token from the Stream, and fail if a
-- different token is found instead. Identical to running
-- satisfying with equality in the (by far most likely) case that
-- gather is a Monad in addition to an
-- Alternative:
--
-- tok <- next >>= satisfying (== desired)
--
-- tok <- token desired
token :: (MonadParser trans stream token, Eq token) => token -> trans token

-- | Expect a specific sequence of tokens from the Stream, and fail
-- if anything else is found instead, or if the Stream doesn't
-- have enough tokens before its end. Identical to running
-- satisfying with equality over nextChunk in the case that
-- stream is an Eq (which all provided instances are) and
-- can easily provide a length (which they do, unless the sequence
-- to test against also needs to be lazy).
--
-- stream <- nextChunk (length desired) >>= satisfying (== desired)
--
-- stream <- chunk desired
chunk :: (MonadParser trans stream token, Eq stream) => stream -> trans stream

-- | A sequence of values which may be processed via a MonadParser.
-- This class is essentially just a unification of the various list-like
-- interfaces (uncons == head, etc.) as Haskell's
-- abstractions are slightly lacking in that area.
--
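-- To make the idea of a unified list-like interface concrete, here is a
-- cut-down sketch of such a class with a single instance for plain lists.
-- Every name carrying the "Sketch" suffix is hypothetical and only mirrors
-- the shape of cons and uncons; it is not the class exported by this
-- module. The roundTrips helper checks that prepending a token and then
-- removing it gives back the original pieces.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}

-- A reduced, illustrative copy of a Stream-like interface.
-- The functional dependency says the stream type fixes the token type.
class Monoid stream => StreamSketch stream token | stream -> token where
    consSketch :: token -> stream -> stream
    unconsSketch :: stream -> Maybe (token, stream)

-- Plain lists already offer both operations.
instance StreamSketch [token] token where
    consSketch = (:)
    unconsSketch [] = Nothing
    unconsSketch (t : ts) = Just (t, ts)

-- Prepending a token and then taking it off again returns both parts.
roundTrips :: Char -> String -> Bool
roundTrips tok str = unconsSketch (consSketch tok str) == Just (tok, str)
```

-- Instances for ByteString or Text would follow the same pattern, using
-- the cons and uncons functions those libraries already export.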
-- >>> Just (tok, str) == uncons (cons tok str)
-- True
class Monoid stream => Stream stream token | stream -> token

-- | Prepend a token to the stream for proximate processing, before
-- everything already in it.
cons :: Stream stream token => token -> stream -> stream

-- | As cons, but prepend multiple tokens at once.
consChunk :: Stream stream token => stream -> stream -> stream

-- | Retrieve the next token from the stream.
--
-- This should only return Nothing if the stream is actually
-- empty; if the next value is not available yet due to slow IO or other
-- computation, uncons waits until it is.
uncons :: Stream stream token => stream -> Maybe (token, stream)

-- | Retrieve the next several tokens from the stream.
--
-- If fewer tokens are in the input stream than asked for, the left side
-- of the return value is the (shorter than requested) entire input
-- stream and the right is mempty.
unconsChunk :: Stream stream token => Word -> stream -> (stream, stream)

-- | The number of tokens remaining in the stream.
chunkLen :: Stream stream token => stream -> Word instance (GHC.Base.Alternative gather, GHC.Base.Monad gather, Web.Willow.Common.Parser.Stream stream token, GHC.Base.Monoid stream) => Web.Willow.Common.Parser.MonadParser (Web.Willow.Common.Parser.ParserT stream gather) stream token instance (Web.Willow.Common.Parser.MonadParser trans stream token, GHC.Base.Monoid accum, GHC.Base.MonadPlus trans) => Web.Willow.Common.Parser.MonadParser (Control.Monad.Trans.Accum.AccumT accum trans) stream token instance (Web.Willow.Common.Parser.MonadParser trans stream token, GHC.Base.Monoid except) => Web.Willow.Common.Parser.MonadParser (Control.Monad.Trans.Except.ExceptT except trans) stream token instance Web.Willow.Common.Parser.MonadParser trans stream token => Web.Willow.Common.Parser.MonadParser (Control.Monad.Trans.Identity.IdentityT trans) stream token instance Web.Willow.Common.Parser.MonadParser trans stream token => Web.Willow.Common.Parser.MonadParser (Control.Monad.Trans.Maybe.MaybeT trans) stream token instance Web.Willow.Common.Parser.MonadParser trans stream token => Web.Willow.Common.Parser.MonadParser (Control.Monad.Trans.Reader.ReaderT reader trans) stream token instance (Web.Willow.Common.Parser.MonadParser trans stream token, GHC.Base.MonadPlus trans) => Web.Willow.Common.Parser.MonadParser (Control.Monad.Trans.State.Lazy.StateT state trans) stream token instance (Web.Willow.Common.Parser.MonadParser trans stream token, GHC.Base.MonadPlus trans) => Web.Willow.Common.Parser.MonadParser (Control.Monad.Trans.State.Strict.StateT state trans) stream token instance (Web.Willow.Common.Parser.MonadParser trans stream token, GHC.Base.Monoid writer) => Web.Willow.Common.Parser.MonadParser (Control.Monad.Trans.Writer.Lazy.WriterT writer trans) stream token instance (Web.Willow.Common.Parser.MonadParser trans stream token, GHC.Base.Monoid writer) => Web.Willow.Common.Parser.MonadParser (Control.Monad.Trans.Writer.Strict.WriterT writer trans) stream token instance 
(Web.Willow.Common.Parser.MonadParser trans stream token, GHC.Base.Monoid writer, GHC.Base.MonadPlus trans) => Web.Willow.Common.Parser.MonadParser (Control.Monad.Trans.RWS.Strict.RWST reader writer state trans) stream token instance (Web.Willow.Common.Parser.MonadParser trans stream token, GHC.Base.Monoid writer, GHC.Base.MonadPlus trans) => Web.Willow.Common.Parser.MonadParser (Control.Monad.Trans.RWS.Lazy.RWST reader writer state trans) stream token instance Web.Willow.Common.Parser.Stream Data.ByteString.Lazy.Internal.ByteString GHC.Word.Word8 instance Web.Willow.Common.Parser.Stream Data.ByteString.Internal.ByteString GHC.Word.Word8 instance Web.Willow.Common.Parser.Stream Data.Text.Internal.Lazy.Text GHC.Types.Char instance Web.Willow.Common.Parser.Stream Data.Text.Internal.Text GHC.Types.Char instance Web.Willow.Common.Parser.Stream [token] token instance GHC.Base.Functor gather => GHC.Base.Functor (Web.Willow.Common.Parser.ParserT stream gather) instance GHC.Base.Monad gather => GHC.Base.Applicative (Web.Willow.Common.Parser.ParserT stream gather) instance (GHC.Base.Alternative gather, GHC.Base.Monad gather) => GHC.Base.Alternative (Web.Willow.Common.Parser.ParserT stream gather) instance (GHC.Base.Monad gather, GHC.Base.Semigroup out) => GHC.Base.Semigroup (Web.Willow.Common.Parser.ParserT stream gather out) instance (GHC.Base.Monad gather, GHC.Base.Monoid out) => GHC.Base.Monoid (Web.Willow.Common.Parser.ParserT stream gather out) instance GHC.Base.Monad gather => GHC.Base.Monad (Web.Willow.Common.Parser.ParserT stream gather) instance (GHC.Base.Alternative gather, GHC.Base.Monad gather) => GHC.Base.MonadPlus (Web.Willow.Common.Parser.ParserT stream gather) instance Control.Monad.Fail.MonadFail gather => Control.Monad.Fail.MonadFail (Web.Willow.Common.Parser.ParserT stream gather) instance Control.Monad.Error.Class.MonadError err gather => Control.Monad.Error.Class.MonadError err (Web.Willow.Common.Parser.ParserT stream gather) instance GHC.Base.Monad 
gather => Control.Monad.Fix.MonadFix (Web.Willow.Common.Parser.ParserT stream gather) instance Control.Monad.Trans.Class.MonadTrans (Web.Willow.Common.Parser.ParserT stream) instance GHC.Base.Monad gather => Control.Monad.Reader.Class.MonadReader stream (Web.Willow.Common.Parser.ParserT stream gather) instance GHC.Base.Monad gather => Control.Monad.State.Class.MonadState stream (Web.Willow.Common.Parser.ParserT stream gather) instance Control.Monad.IO.Class.MonadIO gather => Control.Monad.IO.Class.MonadIO (Web.Willow.Common.Parser.ParserT stream gather) instance Control.Monad.Cont.Class.MonadCont gather => Control.Monad.Cont.Class.MonadCont (Web.Willow.Common.Parser.ParserT stream gather) -- | Alternative instances can provide a form of pattern matching if -- given a fail-on-false combinator (e.g. when), however the exact -- behaviour isn't guaranteed; an underlying Maybe does provide a -- greedy match, but [] will match later overlapping tests even -- if they are intended to be masked; compare the masking to standard, -- cascading pattern guards. This module provides a means of formalizing -- that behaviour into a predictable form, no matter which -- Alternative winds up being used. module Web.Willow.Common.Parser.Switch -- | The building blocks for predictable pattern matches over -- Alternative. The constructors are distinguished along three -- axes (see also the examples in the documentation for switch): -- --
-- >>> uppercase = If_ isUpper $ return "uppercase"
--
-- >>> one = When_ (== '1') $ return "single '1'"
--
-- >>> alpha = If_ isAlpha $ return "ASCII letter"  -- Matches
--
-- >>> catchall = Else_ $ return "none of the above"  -- Matches
--
-- >>> switch [uppercase, one, alpha, catchall] 'a' :: [String]
-- ["ASCII letter"]
--
-- Non-masking cases don't interact with the masking calculations:
--
-- >>> uppercase = If_ isUpper $ return "uppercase"
--
-- >>> one = When_ (== '1') $ return "single '1'"  -- Matches
--
-- >>> alpha = If_ isAlpha $ return "ASCII letter"
--
-- >>> catchall = Else_ $ return "none of the above"  -- Matches
--
-- >>> switch [uppercase, one, alpha, catchall] '1' :: [String]
-- ["single '1'", "none of the above"]
--
-- Maybe always takes the earliest successful test:
--
-- >>> uppercase = If_ isUpper $ return "uppercase"
--
-- >>> one = When_ (== '1') $ return "single '1'"  -- Matches
--
-- >>> alpha = If_ isAlpha $ return "ASCII letter"
--
-- >>> catchall = Else_ $ return "none of the above"  -- Matches
--
-- >>> switch [uppercase, one, alpha, catchall] '1' :: Maybe String
-- Just "single '1'"
--
-- Always and Always_ function as a standard
-- Alternative computation:
--
-- >>> switch [Always a, Always b, Always_ c] tok == (a tok <|> b tok <|> c)
-- True
switch :: Alternative m => [SwitchCase test m out] -> test -> m out

module Web.Willow.Common.Parser.Util

-- | Test whether a given value falls within the range defined by the two
-- bounds, inclusive.
--
-- >>> range 1 2 3
-- False
--
-- >>> range 1 3 2
-- True
--
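-- The doctests here suggest the argument order is lower bound, upper
-- bound, then the value under test. Under that assumption, an inclusive
-- check can be sketched as below; rangeSketch is a hypothetical name and
-- this is not the package's own definition.

```haskell
-- Inclusive bounds check: lower bound, upper bound, value to test.
-- The argument order is inferred from the documented examples.
rangeSketch :: Ord a => a -> a -> a -> Bool
rangeSketch lo hi x = lo <= x && x <= hi
```

-- Because only Ord is required, the same check works for Char or Word8
-- bounds, which is the typical need when classifying code points or bytes.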
-- >>> range 1 2 2
-- True
range :: Ord a => a -> a -> a -> Bool

-- | Reduce a list of Alternatives, such that the first successful
-- instance will be run. If the list is empty, the resulting value will
-- always fail.
choice :: Alternative m => [m a] -> m a

-- | Scan through the stream, until the given parser succeeds (discarding
-- any tokens between the initial location and where the first success is
-- found). Fails if the parser does not succeed at any point in the
-- remainder of the stream.
findNext :: MonadParser parser stream token => parser out -> parser out

-- | The Encoding spec uses a conceptual model of an
-- "encoding" as being the function between Unicode values and bytes. As
-- this is a bit more complex than any content author wants to specify in
-- every document, HTML (and other interfaces) represent them as
-- semi-standardized but freeform text strings; the standard document
-- then collects the various strings authors have used across the web and
-- associates the most common as "labels" of those abstract encodings.
--
-- To refer to them internally, however, it also promotes one of the
-- labels of each encoding as the canonical form; this library implements
-- that set (with modifications to fit Haskell identifiers) in
-- Encoding. The labels are described via a reversible many-to-one
-- mapping with those names, which, as the reverse is rarely used, lends
-- itself well to being adapted as a lookup table. This then is a
-- machine-readable formatting of that table.
module Web.Willow.Common.Encoding.Labels

-- | Encoding: get an encoding
--
-- Given an encoding's case-insensitive label, try to retrieve an
-- appropriate Encoding. The set prescribed by the HTML
-- specification is smaller than that used by other registries for
-- security and interoperability reasons, and the lookup may not always
-- return the expected Encoding if an alternate one has been determined
-- to be more internet-compatible.
lookupEncoding :: Text -> Maybe Encoding instance GHC.Read.Read Web.Willow.Common.Encoding.Labels.EncodingDesc instance GHC.Show.Show Web.Willow.Common.Encoding.Labels.EncodingDesc instance GHC.Classes.Eq Web.Willow.Common.Encoding.Labels.EncodingDesc instance GHC.Read.Read Web.Willow.Common.Encoding.Labels.EncodingTable instance GHC.Show.Show Web.Willow.Common.Encoding.Labels.EncodingTable instance GHC.Classes.Eq Web.Willow.Common.Encoding.Labels.EncodingTable instance Data.Aeson.Types.FromJSON.FromJSON Web.Willow.Common.Encoding.Labels.EncodingTable instance Data.Aeson.Types.FromJSON.FromJSON Web.Willow.Common.Encoding.Labels.EncodingDesc -- | This module and the internal branch it heads implement the -- Encoding specification for translating text to and from -- UTF-8 and a selection of less-favoured but grandfathered encoding -- schemes. As the standard authors' primary goal has been security -- followed closely by compatibility with existing web pages, the -- algorithms described and the names associated with them do not -- perfectly match the descriptions originally given by the various -- original encoding specifications themselves. module Web.Willow.Common.Encoding -- | Encoding: encoding -- -- All character encoding schemes supported by the HTML standard, defined -- as a bidirectional map between characters and binary sequences. -- Utf8 is strongly encouraged for new content (including all -- encoding purposes), but the others are retained for compatibility with -- existing pages. -- -- Note that none of these are complete functions, to one degree or -- another, and that no guarantee is made that the mapping round-trips. data Encoding -- | The UTF-8 encoding for Unicode. Utf8 :: Encoding -- | The UTF-16 encoding for Unicode, in big endian order. -- -- No encoder is provided for this scheme. Utf16be :: Encoding -- | The UTF-16 encoding for Unicode, in little endian order. -- -- No encoder is provided for this scheme. 
Utf16le :: Encoding -- | Big5, primarily covering traditional Chinese characters. Big5 :: Encoding -- | EUC-JP, primarily covering Japanese as the union of JIS-0208 -- and JIS-0212. EucJp :: Encoding -- | EUC-KR, primarily covering Hangul. EucKr :: Encoding -- | The GB18030-2005 extension to GBK, with one tweak for web -- compatibility, primarily covering both forms of Chinese characters. -- -- Note that this encoding also includes a large number of four-byte -- sequences which aren't listed in the linked visualization. Gb18030 :: Encoding -- | GBK, primarily covering simplified Chinese characters. -- -- In practice, this is just Gb18030 with a restricted set of -- encodable characters; the decoder is identical. Gbk :: Encoding -- | DOS and OS/2 code page for Cyrillic characters. Ibm866 :: Encoding -- | A Japanese-focused implementation of the ISO 2022 meta-encoding, -- including both JIS-0208 and halfwidth katakana. Iso2022Jp :: Encoding -- | Latin-2 (Central European). Iso8859_2 :: Encoding -- | Latin-3 (South European and Esperanto) Iso8859_3 :: Encoding -- | Latin-4 (North European). Iso8859_4 :: Encoding -- | Latin/Cyrillic. Iso8859_5 :: Encoding -- | Latin/Arabic. Iso8859_6 :: Encoding -- | Latin/Greek (modern monotonic). Iso8859_7 :: Encoding -- | Latin/Hebrew (visual order). Iso8859_8 :: Encoding -- | Latin/Hebrew (logical order). Iso8859_8i :: Encoding -- | Latin-6 (Nordic). Iso8859_10 :: Encoding -- | Latin-7 (Baltic Rim). Iso8859_13 :: Encoding -- | Latin-8 (Celtic). Iso8859_14 :: Encoding -- | Latin-9 (revision of ISO 8859-1 Latin-1, Western European). Iso8859_15 :: Encoding -- | Latin-10 (South-Eastern European). Iso8859_16 :: Encoding -- | KOI-8 specialized for Russian Cyrillic. Koi8R :: Encoding -- | KOI-8 specialized for Ukrainian Cyrillic. Koi8U :: Encoding -- | Mac OS Roman. Macintosh :: Encoding -- | Mac OS Cyrillic (as of Mac OS 9.0) MacintoshCyrillic :: Encoding -- | The Windows variant (code page 932) of Shift JIS. 
ShiftJis :: Encoding -- | ISO 8859-11 Latin/Thai with Windows extensions in the C1 -- control character slots. -- -- Note that this encoding is always used instead of pure Latin/Thai. Windows874 :: Encoding -- | The Windows extension and rearrangement of ISO 8859-2 Latin-2. Windows1250 :: Encoding -- | Windows Cyrillic. Windows1251 :: Encoding -- | The Windows extension of ISO 8859-1 Latin-1, replacing most of -- the C1 control characters with printable glyphs. -- -- Note that this encoding is always used instead of pure Latin-1. Windows1252 :: Encoding -- | Windows Greek (modern monotonic). Windows1253 :: Encoding -- | The Windows extension of ISO 8859-9 Latin-5 (Turkish), -- replacing most of the C1 control characters with printable glyphs. -- -- Note that this encoding is always used instead of pure Latin-5. Windows1254 :: Encoding -- | The Windows extension and rearrangement of ISO 8859-8 -- Latin/Hebrew. Windows1255 :: Encoding -- | Windows Arabic. Windows1256 :: Encoding -- | Windows Baltic. Windows1257 :: Encoding -- | Windows Vietnamese. Windows1258 :: Encoding -- | The input is reduced to a single \xFFFD replacement -- character. -- -- No encoder is provided for this scheme. Replacement :: Encoding -- | Non-ASCII bytes (\x80 through \xFF) are mapped to a -- portion of the Unicode Private Use Area (\xF780 through -- \xF7FF). UserDefined :: Encoding -- | All the data which needs to be tracked for correct behaviour in -- decoding a binary stream into readable text. data DecoderState -- | Retrieve the encoding scheme currently used by the decoder to decode -- the binary document stream. decoderEncoding :: DecoderState -> Encoding -- | Any leftover bytes at the end of the binary stream, which require -- further input to be processed in order to correctly map to a character -- or error value. 
decoderRemainder :: DecoderState -> ShortByteString

-- | HTML: change the encoding
--
-- The data required to determine if a new encoding would produce an
-- identical output to what the current one has already done, and to
-- restart the parsing with the new one if the two are incompatible.
-- Values may be easily initialized via emptyReparseData.
data ReparseData

-- | All the data which needs to be tracked for correct behaviour in
-- encoding readable text into a binary stream.
data EncoderState

-- | The collection of data which, for any given encoding scheme, results
-- in behaviour according to the vanilla decoder before any bytes have
-- been read.
initialDecoderState :: Encoding -> DecoderState

-- | Instruct the decoder that the binary document stream is known
-- with certainty to be in the given encoding.
setEncodingCertain :: Encoding -> DecoderState -> DecoderState

-- | Store the given binary sequence as unparsable without further input,
-- to be prepended to the beginning of the stream on the next decode
-- or decode' call.
setRemainder :: ShortByteString -> DecoderState -> DecoderState

-- | The collection of data which, for any given encoding scheme, results
-- in behaviour according to the vanilla encoder before any characters
-- have been read.
initialEncoderState :: Encoding -> EncoderState

-- | Encoding: run an encoding's decoder with error
-- mode fatal
--
-- Given a character encoding scheme, transform an encoding-dependent
-- ByteString into portable Chars. If any byte sequences
-- are meaningless or illegal, they are returned verbatim for error
-- reporting; a Left should not be parsed further.
--
-- See decodeStep to decode only a minimal section, or
-- decode' for simple error replacement. Call
-- finalizeDecode on the returned DecoderState if no
-- further bytes will be added to the document stream.
decode :: DecoderState -> ByteString -> ([Either ShortByteString String], DecoderState)

-- | Encoding: decode
--
-- Given a character encoding scheme, transform an encoding-dependent
-- ByteString into a portable Text. If any byte sequences
-- are meaningless or illegal, they are replaced with the Unicode
-- replacement character \xFFFD.
--
-- See decodeStep' to decode only a minimal section, or
-- decode if the original data should be retained for custom error
-- reporting. Call finalizeDecode' on the returned
-- DecoderState if no further bytes will be added to the document
-- stream.
decode' :: DecoderState -> ByteString -> (Text, DecoderState)

-- | Encoding: BOM sniff
--
-- Checks for a "byte-order mark" signature character in various
-- encodings. If present, returns both the encoding found and the
-- remainder of the stream; otherwise returns the input unchanged.
byteOrderMark :: ByteString -> (Maybe Encoding, ByteString)

-- | Explicitly indicate that the input stream will not contain any further
-- bytes, and perform any finalization processing based on that.
--
-- See finalizeDecode' for simple error replacement.
finalizeDecode :: DecoderState -> [Either ShortByteString String]

-- | Explicitly indicate that the input stream will not contain any further
-- bytes, and perform any finalization processing based on that.
--
-- See finalizeDecode if the original data should be retained for
-- custom error reporting.
finalizeDecode' :: DecoderState -> Text

-- | Read a binary stream of UTF-8 encoded text. If the stream begins with
-- a UTF-8 byte-order mark, it's silently dropped (any other BOM is
-- ignored but remains in the output). Fails (returning a Left) if
-- the stream contains byte sequences which don't represent any
-- character, or which encode a surrogate character.
--
-- See decodeUtf8' for simple error replacement, or
-- decodeUtf8NoBom if the BOM should always be retained.
decodeUtf8 :: ByteString -> ([Either ShortByteString String], DecoderState) -- | Encoding: UTF-8 decode without BOM or fail -- -- Read a binary stream of UTF-8 encoded text. If the stream begins with -- a byte-order mark, it is kept as the first character of the output. -- Fails (returning a Left) if the stream contains byte sequences -- which don't represent any character, or which encode a surrogate -- character. -- -- See decodeUtf8NoBom' for simple error replacement, or -- decodeUtf8' if a redundant UTF-8 BOM should be dropped. decodeUtf8NoBom :: ByteString -> ([Either ShortByteString String], DecoderState) -- | Encoding: UTF-8 decode -- -- Read a binary stream of UTF-8 encoded text. If the stream begins with -- a UTF-8 byte-order mark, it's silently dropped (any other BOM is -- ignored but remains in the output). Any surrogate characters or -- invalid byte sequences are replaced with the Unicode replacement -- character \xFFFD. -- -- See decodeUtf8 if the original data should be retained for -- custom error reporting, or decodeUtf8NoBom' if the BOM should -- always be retained. decodeUtf8' :: ByteString -> (Text, DecoderState) -- | Encoding: UTF-8 decode without BOM -- -- Read a binary stream of UTF-8 encoded text. If the stream begins with -- a byte-order mark, it is kept as the first character of the output. -- Any surrogate characters or invalid byte sequences are replaced with -- the Unicode replacement character \xFFFD. -- -- See decodeUtf8NoBom if the original data should be retained for -- custom error reporting, or decodeUtf8' if a redundant UTF-8 BOM -- should be dropped. decodeUtf8NoBom' :: ByteString -> (Text, DecoderState) -- | Encoding: run an encoding's encoder with error -- mode fatal -- -- Given a character encoding scheme, transform a portable Text -- into a sequence of bytes representing those characters. 
-- If the encoding scheme does not define a binary representation for a
-- character in the input, the original Char is returned unchanged
-- for custom error reporting.
--
-- See encodeStep to encode only a minimal section, or
-- encode' for escaping with HTML-style character codes.
encode :: EncoderState -> Text -> ([Either Char ShortByteString], EncoderState)

-- | Encoding: encode
--
-- Given a character encoding scheme, transform a portable Text
-- into a sequence of bytes representing those characters. If the
-- encoding scheme does not define a binary representation for a
-- character in the input, it is replaced with an HTML-style escape
-- (e.g. "&#65533;").
--
-- See encodeStep' to encode only a minimal section, or
-- encode if the original data should be retained for custom error
-- reporting.
encode' :: EncoderState -> Text -> (ByteString, EncoderState)

-- | Encoding: UTF-8 encode
--
-- Transform a portable Text into a sequence of bytes according to
-- the UTF-8 encoding scheme.
encodeUtf8 :: Text -> (ByteString, EncoderState)

-- | Encoding: run an encoding's decoder with error
-- mode fatal
--
-- Read the smallest number of bytes from the head of the
-- ByteString which would leave the decoder in a re-enterable
-- state. If any byte sequences are meaningless or illegal, they are
-- returned verbatim for error reporting; a Left should not be
-- parsed further.
--
-- See decode to decode the entire string at once, or
-- decodeStep' for simple error replacement.
decodeStep :: DecoderState -> ByteString -> (Maybe (Either ShortByteString String), DecoderState, ByteString)

-- | Encoding: run an encoding's encoder with error
-- mode fatal
--
-- Read the smallest number of characters from the head of the
-- Text which would leave the encoder in a re-enterable state. If
-- the encoding scheme does not define a binary representation for a
-- character in the input, the original Char is returned unchanged
-- for custom error reporting.
-- -- See encode to encode the entire string at once, or -- encodeStep' for simple error replacement. encodeStep :: EncoderState -> Text -> Maybe (Either Char ShortByteString, EncoderState, Text) -- | Encoding: run an encoding's decoder with error -- mode replacement -- -- Read the smallest number of bytes from the head of the -- ByteString which would leave the decoder in a re-enterable -- state. Any byte sequences which are meaningless or illegal are -- replaced with the Unicode replacement character \xFFFD. -- -- See decode' to decode the entire string at once, or -- decodeStep if the original data should be retained for custom -- error reporting. decodeStep' :: DecoderState -> ByteString -> (Maybe String, DecoderState, ByteString) -- | Encoding: run an encoding's encoder with error -- mode html -- -- Read the smallest number of characters from the head of the -- Text which would leave the encoder in a re-enterable state. If -- the encoding scheme does not define a binary representation for a -- character in the input, it is replaced with an HTML-style escape -- (e.g. "&#xFFFD;"). -- -- See encode' to encode the entire string at once, or -- encodeStep if the original data should be retained for custom -- error reporting. encodeStep' :: EncoderState -> Text -> Maybe (ShortByteString, EncoderState, Text) -- | The union of all state variables tracked by the bytes-to-Char -- decoding algorithm of a single encoding scheme. data InnerDecoderState -- | The union of all state variables tracked by the Char-to-bytes -- encoding algorithm of a single encoding scheme.
data InnerEncoderState instance GHC.Read.Read Web.Willow.Common.Encoding.InnerEncoderState instance GHC.Show.Show Web.Willow.Common.Encoding.InnerEncoderState instance GHC.Classes.Eq Web.Willow.Common.Encoding.InnerEncoderState instance GHC.Read.Read Web.Willow.Common.Encoding.InnerDecoderState instance GHC.Show.Show Web.Willow.Common.Encoding.InnerDecoderState instance GHC.Classes.Eq Web.Willow.Common.Encoding.InnerDecoderState -- | In an ideal internet, every server would declare the binary encoding -- with which it is transmitting a file (actually, the true ideal -- would be for it to always be Utf8, but there are still a lot of -- legacy documents out there). However, that's not always the case. -- -- A good fallback would be for every document to declare itself what -- encoding it has been saved in. However, not every one does, and the -- ones that do may still get it wrong (take, for instance, the case of a -- server which does translate everything it sends to -- Utf8). -- -- And so, the HTML standard describes an algorithm for guessing -- the proper bytes-to-text translation to use in decode. While -- this does therefore assume some HTML syntax and specific tags, none of -- the semantics should cause an issue for other filetypes. module Web.Willow.Common.Encoding.Sniffer -- | Encoding: encoding -- -- All character encoding schemes supported by the HTML standard, defined -- as a bidirectional map between characters and binary sequences. -- Utf8 is strongly encouraged for new content (including all -- encoding purposes), but the others are retained for compatibility with -- existing pages. -- -- Note that none of these are complete functions, to one degree or -- another, and that no guarantee is made that the mapping round-trips. data Encoding -- | The UTF-8 encoding for Unicode. Utf8 :: Encoding -- | The UTF-16 encoding for Unicode, in big endian order. -- -- No encoder is provided for this scheme. 
Utf16be :: Encoding -- | The UTF-16 encoding for Unicode, in little endian order. -- -- No encoder is provided for this scheme. Utf16le :: Encoding -- | Big5, primarily covering traditional Chinese characters. Big5 :: Encoding -- | EUC-JP, primarily covering Japanese as the union of JIS-0208 -- and JIS-0212. EucJp :: Encoding -- | EUC-KR, primarily covering Hangul. EucKr :: Encoding -- | The GB18030-2005 extension to GBK, with one tweak for web -- compatibility, primarily covering both forms of Chinese characters. -- -- Note that this encoding also includes a large number of four-byte -- sequences which aren't listed in the linked visualization. Gb18030 :: Encoding -- | GBK, primarily covering simplified Chinese characters. -- -- In practice, this is just Gb18030 with a restricted set of -- encodable characters; the decoder is identical. Gbk :: Encoding -- | DOS and OS/2 code page for Cyrillic characters. Ibm866 :: Encoding -- | A Japanese-focused implementation of the ISO 2022 meta-encoding, -- including both JIS-0208 and halfwidth katakana. Iso2022Jp :: Encoding -- | Latin-2 (Central European). Iso8859_2 :: Encoding -- | Latin-3 (South European and Esperanto) Iso8859_3 :: Encoding -- | Latin-4 (North European). Iso8859_4 :: Encoding -- | Latin/Cyrillic. Iso8859_5 :: Encoding -- | Latin/Arabic. Iso8859_6 :: Encoding -- | Latin/Greek (modern monotonic). Iso8859_7 :: Encoding -- | Latin/Hebrew (visual order). Iso8859_8 :: Encoding -- | Latin/Hebrew (logical order). Iso8859_8i :: Encoding -- | Latin-6 (Nordic). Iso8859_10 :: Encoding -- | Latin-7 (Baltic Rim). Iso8859_13 :: Encoding -- | Latin-8 (Celtic). Iso8859_14 :: Encoding -- | Latin-9 (revision of ISO 8859-1 Latin-1, Western European). Iso8859_15 :: Encoding -- | Latin-10 (South-Eastern European). Iso8859_16 :: Encoding -- | KOI-8 specialized for Russian Cyrillic. Koi8R :: Encoding -- | KOI-8 specialized for Ukrainian Cyrillic. Koi8U :: Encoding -- | Mac OS Roman. 
Macintosh :: Encoding -- | Mac OS Cyrillic (as of Mac OS 9.0) MacintoshCyrillic :: Encoding -- | The Windows variant (code page 932) of Shift JIS. ShiftJis :: Encoding -- | ISO 8859-11 Latin/Thai with Windows extensions in the C1 -- control character slots. -- -- Note that this encoding is always used instead of pure Latin/Thai. Windows874 :: Encoding -- | The Windows extension and rearrangement of ISO 8859-2 Latin-2. Windows1250 :: Encoding -- | Windows Cyrillic. Windows1251 :: Encoding -- | The Windows extension of ISO 8859-1 Latin-1, replacing most of -- the C1 control characters with printable glyphs. -- -- Note that this encoding is always used instead of pure Latin-1. Windows1252 :: Encoding -- | Windows Greek (modern monotonic). Windows1253 :: Encoding -- | The Windows extension of ISO 8859-9 Latin-5 (Turkish), -- replacing most of the C1 control characters with printable glyphs. -- -- Note that this encoding is always used instead of pure Latin-5. Windows1254 :: Encoding -- | The Windows extension and rearrangement of ISO 8859-8 -- Latin/Hebrew. Windows1255 :: Encoding -- | Windows Arabic. Windows1256 :: Encoding -- | Windows Baltic. Windows1257 :: Encoding -- | Windows Vietnamese. Windows1258 :: Encoding -- | The input is reduced to a single \xFFFD replacement -- character. -- -- No encoder is provided for this scheme. Replacement :: Encoding -- | Non-ASCII bytes (\x80 through \xFF) are mapped to a -- portion of the Unicode Private Use Area (\xF780 through -- \xF7FF). UserDefined :: Encoding -- | HTML: confidence -- -- How likely the specified encoding is to be the actual stream encoding. -- -- The spec names a third confidence level irrelevant, to be -- used when the stream doesn't depend on any particular encoding scheme -- (i.e. it is composed directly of Chars rather than parsed from -- a binary stream). This has not been included in the sum type, as it -- makes little sense to have that as a parameter of the decoding -- stage. 
Use Maybe DecoderState to represent it -- instead. data Confidence -- | The binary stream is likely the named encoding, but more data may -- prove it to be something else. In the latter case, the -- ReparseData (if available) may be used to transition to the -- proper encoding, or restart the stream if necessary. Tentative :: Encoding -> ReparseData -> Confidence -- | The binary stream is confirmed to be of the given encoding. Certain :: Encoding -> Confidence -- | HTML: change the encoding -- -- The data required to determine if a new encoding would produce an -- identical output to what the current one has already done, and to -- restart the parsing with the new one if the two are incompatible. -- Values may be easily initialized via emptyReparseData. data ReparseData ReparseData :: HashMap ShortByteString Char -> ByteString -> ReparseData -- | The input binary sequences and the resulting characters which are -- already emitted to the output. [parsedChars] :: ReparseData -> HashMap ShortByteString Char -- | The complete binary sequence parsed thus far, in case it needs to be -- re-processed under a new, incompatible encoding. [streamStart] :: ReparseData -> ByteString -- | The collection of data which would indicate nothing has yet been -- parsed. emptyReparseData :: ReparseData -- | HTML: encoding sniffing algorithm -- -- Given a stream and related metadata, try to determine what encoding -- may have been used to write it. -- -- Will resolve and/or wait for the number of bytes requested by -- prescanDepth to be available in the stream (or, if it comes -- sooner, the end of the stream), if they have not yet been produced. sniff :: SnifferEnvironment -> ByteString -> Confidence -- | Various datapoints which may indicate a document's binary encoding, to -- be fed into the sniff algorithm. Values may be easily -- instantiated as updates to emptySnifferEnvironment. 
data SnifferEnvironment SnifferEnvironment :: Maybe Encoding -> Maybe Encoding -> Word -> Maybe Encoding -> Maybe Encoding -> Maybe Encoding -> Maybe Encoding -> SnifferEnvironment -- | The encoding the end user has specified should be used. Note that even -- this can still be overridden by the presence of a byte-order mark at -- the head of the stream. [userOverride] :: SnifferEnvironment -> Maybe Encoding -- | The encoding given by the transport layer (e.g. through an HTTP -- Content-Type header). [transportHeader] :: SnifferEnvironment -> Maybe Encoding -- | The number of bytes which should be skimmed for meta -- attributes specifying an encoding. [prescanDepth] :: SnifferEnvironment -> Word -- | The encoding used for the enclosing document (e.g., if this document -- is loaded via an <iframe>). [parentEncoding] :: SnifferEnvironment -> Maybe Encoding -- | The encoding from the last time this page was loaded, other pages on -- the site, or other cached data. [cachedInfo] :: SnifferEnvironment -> Maybe Encoding -- | The encoding the end user has specified as being their preferred -- default, if no better encoding can be determined. [userDefault] :: SnifferEnvironment -> Maybe Encoding -- | The encoding recommended as a reasonable guess based on the current -- language of the user's system. -- | Warning: The type of this argument will be changed in a future -- release [localeEncoding] :: SnifferEnvironment -> Maybe Encoding -- | A neutral set of parameters to pass to the sniff algorithm: no -- accessory data, and a prescanDepth limit of 1024 bytes. emptySnifferEnvironment :: SnifferEnvironment -- | Guess what encoding may be in use by the binary stream, and generate a -- collection of data based on that which results in the behaviour -- described by the decoding algorithm at the start of the stream. 
sniffDecoderState :: SnifferEnvironment -> ByteString -> DecoderState -- | The encoding scheme currently in use by the parser, along with how -- likely that scheme actually represents the binary stream. decoderConfidence :: DecoderState -> Confidence -- | Extract the underlying encoding scheme from the wrapping data. confidenceEncoding :: Confidence -> Encoding -- | HTML: algorithm for extracting a character encoding from -- a meta element -- -- Find the first occurrence of an ASCII-encoded string charset -- in the stream, and try to parse its attribute-style value into an -- Encoding. -- -- Returns Nothing if the stream does not contain charset -- followed by =, or if the value can not be successfully parsed -- as an encoding label. extractEncoding :: ByteString -> Maybe Encoding instance GHC.Read.Read Web.Willow.Common.Encoding.Sniffer.SnifferEnvironment instance GHC.Show.Show Web.Willow.Common.Encoding.Sniffer.SnifferEnvironment instance GHC.Classes.Eq Web.Willow.Common.Encoding.Sniffer.SnifferEnvironment -- | In lieu of a fully-featured DOM implementation ---and -- even, for that matter, a styled tree--- this module provides -- bare-bones data structures to temporarily contain the minimal data -- currently returned by tree parsing. Eventually this will be padded out -- into a fully-featured DOM implementation, but doing so now would be -- creating much more work than necessary. module Web.Willow.DOM -- | DOM: tree -- -- The core concept underlying HTML and related languages: a nested -- collection of data and metadata marked up according to several broad -- categories. Values may be easily instantiated as updates to -- emptyTree. data Tree Tree :: Node -> [Tree] -> Tree -- | The atomic portion of the tree at the current location. [node] :: Tree -> Node -- | All parts of the tree nested below the current location. [children] :: Tree -> [Tree] -- | A sane default collection for easy record initialization; namely, a -- Document without any children. 
emptyTree :: Tree -- | DOM: node -- -- The sum type of all different classes of behaviour a particular point -- of data may fill. data Node -- | DOM: Text -- -- A simple character string to be rendered to the output or to be -- processed further, according to which Elements enclose it. Text :: Text -> Node -- | DOM: Comment -- -- An author's aside, not intended to be shown to the end user. Comment :: Text -> Node -- | DOM: DocumentType -- -- Largely vestigial in HTML5, but used in previous versions and related -- languages to specify the semantics of Elements used in the -- document. DocumentType :: DocumentTypeParams -> Node -- | DOM: Element -- -- Markup instructions directing the behaviour or classifying a portion -- of the document's content. Element :: ElementParams -> Node -- | DOM: Attr -- -- Metadata allowing finer customization and description of the heavier -- Elements. Attribute :: AttributeParams -> Node -- | DOM: DocumentFragment -- -- Like Document, but requiring less precise structure in its -- children and generally only containing a small slice of a -- larger document. DocumentFragment :: Node -- | DOM: Document -- -- The root of a Tree, typically imposing a principled structure. Document :: QuirksMode -> Node -- | A simplified view of the Node constructors, for use in testing -- via nodeType.
data NodeType -- | DOM: ELEMENT_NODE -- -- Element ElementNode :: NodeType -- | DOM: ATTRIBUTE_NODE -- -- Attribute AttributeNode :: NodeType -- | DOM: TEXT_NODE -- -- Text TextNode :: NodeType -- | DOM: CDATA_SECTION_NODE CDataSectionNode :: NodeType -- | DOM: ENTITY_REFERENCE_NODE -- | Deprecated: historical EntityReferenceNode :: NodeType -- | DOM: ENTITY_NODE -- | Deprecated: historical EntityNode :: NodeType -- | DOM: PROCESSING_INSTRUCTION_NODE ProcessingInstructionNode :: NodeType -- | DOM: COMMENT_NODE -- -- Comment CommentNode :: NodeType -- | DOM: DOCUMENT_NODE -- -- Document DocumentNode :: NodeType -- | DOM: DOCUMENT_TYPE_NODE -- -- DocumentType DocumentTypeNode :: NodeType -- | DOM: DOCUMENT_FRAGMENT_NODE -- -- DocumentFragment DocumentFragmentNode :: NodeType -- | DOM: NOTATION_NODE -- | Deprecated: historical NotationNode :: NodeType -- | DOM: nodeType -- -- Simplify the algebraic data type to a one-dimensional Enum to -- allow equality testing rather than requiring pattern matching. nodeType :: Node -> Maybe NodeType -- | Through the long history of HTML browsers, many unique and/or buggy -- behaviours have become enshrined due to the simple fact that website -- authors used them. As the standards and the parse engines have -- continued to develop, three separate degrees of emulation have -- emerged for that backwards compatibility. data QuirksMode -- | DOM: no-quirks mode -- -- Fully compliant with the modern standard. NoQuirks :: QuirksMode -- | DOM: limited-quirks mode -- -- Largely compliant with the standard, except for a couple of height -- calculations. LimitedQuirks :: QuirksMode -- | DOM: quirks mode -- -- Backwards compatibility with 1990s-era technology. FullQuirks :: QuirksMode -- | DOM: Element -- -- The collection of metadata identifying and describing a markup tag -- used to associate text or other data with its broader role in the -- document, or to indicate a preferred rendering.
Values may be easily -- instantiated as updates to emptyElementParams. data ElementParams ElementParams :: Maybe ElementPrefix -> ElementName -> Maybe Namespace -> AttributeMap -> ElementParams -- | The variable fragment used to represent the elementNamespace in -- the original source. [elementPrefix] :: ElementParams -> Maybe ElementPrefix -- | The key defining what role the markup tag is meant to represent, as -- defined by the elementNamespace. [elementName] :: ElementParams -> ElementName -- | The scope defining the language by which the element participates -- in the document. [elementNamespace] :: ElementParams -> Maybe Namespace -- | The points of metadata further describing rendering behaviour or -- adding other information. [elementAttributes] :: ElementParams -> AttributeMap -- | A sane default collection for easy record initialization. emptyElementParams :: ElementParams -- | Type-level clarification for the name of a markup tag. type ElementName = Text -- | Type-level clarification for the short namespace reference classifying -- a markup tag. type ElementPrefix = Text -- | DOM: NamedNodeMap -- -- Type-level clarification for the collection of key-value points of -- supplemental metadata attached to an Element. Note that, while -- an Attribute's prefix is used to determine the associated -- namespace (and needs to be tracked for round-trip serialization), it -- doesn't factor into testing equality or in lookups. type AttributeMap = HashMap (Maybe Namespace, AttributeName) (Maybe AttributePrefix, AttributeValue) -- | Pack a list of key-value metadata pairs into a form better optimized -- for random lookup. fromAttrList :: [AttributeParams] -> AttributeMap -- | Extract the key-value metadata pairs from an indexed collection into an -- iterable form. The order of elements is unspecified. toAttrList :: AttributeMap -> [AttributeParams] -- | As insert, performing the required data reordering for the -- less-comfortable internal type representation.
insertAttribute :: AttributeParams -> AttributeMap -> AttributeMap -- | A simple key-value representation of an attribute on an HTML tag, -- before any namespace processing. type BasicAttribute = (AttributeName, AttributeValue) -- | DOM: Attr -- -- A more complete representation of an attribute, including extensions -- beyond the BasicAttribute to support more structured (XML-like) -- markup languages. Values may be easily instantiated as updates to -- emptyAttributeParams. data AttributeParams AttributeParams :: Maybe AttributePrefix -> AttributeName -> Maybe Namespace -> AttributeValue -> AttributeParams -- | The variable fragment used to represent the attrNamespace in -- the original source. [attrPrefix] :: AttributeParams -> Maybe AttributePrefix -- | The key defining what role the metadata value at -- attrValue is meant to represent, as defined by the -- attrNamespace. [attrName] :: AttributeParams -> AttributeName -- | The scope defining the language by which the attribute participates in -- the document. [attrNamespace] :: AttributeParams -> Maybe Namespace -- | A point of metadata further describing rendering behaviour or adding -- other information. [attrValue] :: AttributeParams -> AttributeValue -- | A sane default collection for easy record initialization; namely, -- Nothings and emptys. emptyAttributeParams :: AttributeParams -- | Type-level clarification for the key of a supplemental point of -- metadata. type AttributeName = Text -- | Type-level clarification for the value of a supplemental point of -- metadata. type AttributeValue = Text -- | Type-level clarification for the short namespace reference classifying -- a supplemental point of metadata. type AttributePrefix = Text -- | DOM: DocumentType -- -- The collection of metadata representing a document type declaration -- describing the markup language used in a document; of vestigial use in -- HTML, but important for related languages.
Values may be easily -- instantiated as updates to emptyDocumentTypeParams. data DocumentTypeParams DocumentTypeParams :: DoctypeName -> DoctypePublicId -> DoctypeSystemId -> DocumentTypeParams -- | The root element of the document, which may also identify the primary -- language used. [documentTypeName] :: DocumentTypeParams -> DoctypeName -- | A globally-unique reference to the definition of the language. [documentTypePublicId] :: DocumentTypeParams -> DoctypePublicId -- | A system-dependent (but perhaps easier to access) reference to the -- definition of the language. [documentTypeSystemId] :: DocumentTypeParams -> DoctypeSystemId -- | A sane default collection for easy record initialization; namely, -- emptys. emptyDocumentTypeParams :: DocumentTypeParams -- | Type-level clarification for the language used in the document or, -- equivalently, the name of the root node. type DoctypeName = Text -- | Type-level clarification for a registered or otherwise globally-unique -- reference to a description of the language used in the document. type DoctypePublicId = Text -- | Type-level clarification for a reference to the description of the -- language used in the document, dependent on the state of the system -- (and/or the internet). type DoctypeSystemId = Text -- | XML-NAMES: XML namespace -- -- An identifier (theoretically) pointing to a reference defining a -- particular element or attribute ---though not necessarily in -- machine-readable form--- and so providing a scope for differentiating -- multiple elements with the same local name but different semantics. type Namespace = Text -- | Infra: HTML namespace -- -- The canonical scope value for elements and attributes defined by the -- HTML standard when used in XML or XML-compatible documents. htmlNamespace :: Namespace -- | Infra: MathML namespace -- -- The canonical scope value for elements and attributes defined by the -- MathML standard.
mathMLNamespace :: Namespace -- | Infra: SVG namespace -- -- The canonical scope value for elements and attributes defined by the -- SVG standard. svgNamespace :: Namespace -- | Infra: XLink namespace -- -- The canonical scope value for elements and attributes defined by the -- XLink standard. xlinkNamespace :: Namespace -- | Infra: XML namespace -- -- The canonical scope value for elements and attributes defined by the -- XML standard. xmlNamespace :: Namespace -- | Infra: XMLNS namespace -- -- The canonical scope value for elements and attributes defined by the -- XMLNS standard. xmlnsNamespace :: Namespace instance GHC.Read.Read Web.Willow.DOM.QuirksMode instance GHC.Show.Show Web.Willow.DOM.QuirksMode instance GHC.Enum.Bounded Web.Willow.DOM.QuirksMode instance GHC.Enum.Enum Web.Willow.DOM.QuirksMode instance GHC.Classes.Ord Web.Willow.DOM.QuirksMode instance GHC.Classes.Eq Web.Willow.DOM.QuirksMode instance GHC.Read.Read Web.Willow.DOM.DocumentTypeParams instance GHC.Show.Show Web.Willow.DOM.DocumentTypeParams instance GHC.Classes.Eq Web.Willow.DOM.DocumentTypeParams instance GHC.Read.Read Web.Willow.DOM.NodeType instance GHC.Show.Show Web.Willow.DOM.NodeType instance GHC.Enum.Bounded Web.Willow.DOM.NodeType instance GHC.Classes.Ord Web.Willow.DOM.NodeType instance GHC.Classes.Eq Web.Willow.DOM.NodeType instance GHC.Read.Read Web.Willow.DOM.ElementParams instance GHC.Show.Show Web.Willow.DOM.ElementParams instance GHC.Classes.Eq Web.Willow.DOM.ElementParams instance GHC.Read.Read Web.Willow.DOM.AttributeParams instance GHC.Show.Show Web.Willow.DOM.AttributeParams instance GHC.Classes.Eq Web.Willow.DOM.AttributeParams instance GHC.Read.Read Web.Willow.DOM.Node instance GHC.Show.Show Web.Willow.DOM.Node instance GHC.Classes.Eq Web.Willow.DOM.Node instance GHC.Read.Read Web.Willow.DOM.Tree instance GHC.Show.Show Web.Willow.DOM.Tree instance GHC.Classes.Eq Web.Willow.DOM.Tree instance GHC.Enum.Enum Web.Willow.DOM.NodeType
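-- A minimal sketch of how the Tree and Node shapes listed for
-- Web.Willow.DOM might be built and walked. Nothing here is imported
-- from willow: the types are simplified local stand-ins (String in place
-- of Text, an association list in place of ElementParams), the NoQuirks
-- default inside emptyTree is an assumption, and the helpers textContent
-- and elementNames are hypothetical names, not part of the package.

```haskell
module Main where

-- Stand-in for the documented QuirksMode sum type.
data QuirksMode = NoQuirks | LimitedQuirks | FullQuirks
  deriving (Eq, Show)

-- Simplified stand-in for Node: only four of the documented constructors,
-- with an element reduced to its name and BasicAttribute-style pairs.
data Node
  = Text String
  | Comment String
  | Element String [(String, String)]
  | Document QuirksMode
  deriving (Eq, Show)

-- Same record shape as the documented Tree: a node plus nested subtrees.
data Tree = Tree { node :: Node, children :: [Tree] }
  deriving (Eq, Show)

-- Mirrors the documented default, "a Document without any children".
-- ASSUMPTION: the quirks mode of the real default is not stated; NoQuirks
-- is used here purely for illustration.
emptyTree :: Tree
emptyTree = Tree { node = Document NoQuirks, children = [] }

-- Hypothetical helper: concatenate every Text node in document order.
-- Comments and element names contribute nothing.
textContent :: Tree -> String
textContent (Tree (Text t) _) = t
textContent (Tree _ cs)       = concatMap textContent cs

-- Hypothetical helper: names of all elements, parent before children.
elementNames :: Tree -> [String]
elementNames (Tree n cs) = here ++ concatMap elementNames cs
  where
    here = case n of
      Element name _ -> [name]
      _              -> []

-- A small document built as a record update of emptyTree, following the
-- "instantiated as updates to emptyTree" convention in the listing.
sample :: Tree
sample = emptyTree
  { children =
      [ Tree (Element "html" [])
          [ Tree (Element "body" [("class", "main")])
              [ Tree (Comment "greeting") []
              , Tree (Element "p" [])
                  [ Tree (Text "Hello, ") []
                  , Tree (Element "b" []) [Tree (Text "world") []]
                  ]
              ]
          ]
      ]
  }

main :: IO ()
main = do
  print (elementNames sample)
  putStrLn (textContent sample)
```

-- Keeping the node payload separate from the child list, as the listed
-- Tree record does, lets traversals like these pattern match on the node
-- and recurse over children without caring which constructor holds them.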