hsemail-ns-1.7.7: Internet Message Parsers

Text.ParserCombinators.Parsec.Rfc2234NS

Description

This module provides parsers for the grammar defined in RFC2234, "Augmented BNF for Syntax Specifications: ABNF", http://www.faqs.org/rfcs/rfc2234.html. The terminal called char in the RFC is called character here to avoid conflicts with Parsec's char function.

This module currently deviates from the RFC by:

• Allowing non-standard line endings.

These allowances are subject to change. They should not be used when parsing incoming messages, only when parsing messages that have been stored on disk. The goal of these non-standard parsers is to increase the chance of successfully parsing common headers (rather than only those explicitly defined in the RFC), and to tolerate oddities or changes that may be introduced while an email message is in storage. These parsers have been rebranded so as not to conflict with the standard parsers available from the excellent hsemail package, upon which this package depends.

Patches to this package only (namely 'hsemail-ns') should be sent to phlummox2@gmail.com; patches to the standard parsers can be sent to the original maintainer.

# Parser Combinators

Case-insensitive variant of Parsec's char function.

caseString :: String -> CharParser st ()

Case-insensitive variant of Parsec's string function.
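
As a rough illustration of what "case-insensitive variant" means here, the following is a minimal sketch built only on Parsec's `satisfy`. The names `caseCharSketch` and `caseStringSketch` are hypothetical and are not this module's implementation; the sketch also uses Parsec's plain `Parser` type instead of `CharParser st`.

```haskell
import Data.Char (toUpper)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical stand-in for a case-insensitive `char`:
-- accepts the given character in either case and returns what was consumed.
caseCharSketch :: Char -> Parser Char
caseCharSketch c = satisfy (\x -> toUpper x == toUpper c)

-- Hypothetical stand-in for a case-insensitive `string`:
-- the matched text is discarded, mirroring the `()` result of caseString.
caseStringSketch :: String -> Parser ()
caseStringSketch cs = mapM_ caseCharSketch cs <?> cs
```

With these definitions, `parse (caseStringSketch "subject") "" "SuBjEcT: hi"` succeeds and leaves `": hi"` unconsumed.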

manyN :: Int -> GenParser a b c -> GenParser a b [c]

Match a parser at least n times.

manyNtoM :: Int -> Int -> GenParser a b c -> GenParser a b [c]

Match a parser at least n times, but no more than m times.
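
The bounded-repetition behaviour described for manyN and manyNtoM can be sketched with Parsec's `count`, `many` and `option`. The names `manyNSketch` and `manyNtoMSketch` are hypothetical, and the handling of degenerate bounds is an assumption of this sketch; the package's own definitions may differ.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- At least n matches: exactly n, then as many more as possible.
manyNSketch :: Int -> Parser a -> Parser [a]
manyNSketch n p = (++) <$> count n p <*> many p

-- Between n and m matches: exactly n, then up to (m - n) optional ones.
manyNtoMSketch :: Int -> Int -> Parser a -> Parser [a]
manyNtoMSketch n m p
  | n < 0 || m < n = pure []  -- assumption: degenerate bounds match nothing
  | otherwise      = (++) <$> count n p <*> upTo (m - n)
  where
    upTo 0 = pure []
    upTo k = option [] ((:) <$> p <*> upTo (k - 1))
```

For example, `parse (manyNtoMSketch 2 4 digit) "" "123456"` consumes only the first four digits, while the same parser fails on the input `"1"` because fewer than two digits are available.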

parsec2read :: Parser a -> String -> [(a, String)]

Helper function to generate Parser-based instances for the Read class.
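
To show how a function of this shape plugs into the Read class, here is a minimal sketch. `parsec2readSketch` and the `Bit` type are hypothetical and exist only for illustration; returning an empty list on a parse failure is a choice made for this sketch and is not confirmed to match the package's behaviour.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Run the parser and hand back the result plus the unconsumed input,
-- in the shape `readsPrec` expects; in this sketch a failure yields [].
parsec2readSketch :: Parser a -> String -> [(a, String)]
parsec2readSketch p input =
  case parse ((,) <$> p <*> getInput) "" input of
    Left _       -> []
    Right result -> [result]

-- Hypothetical example type, not part of the package.
newtype Bit = Bit Bool deriving (Eq)

-- A Read instance driven by a Parsec parser for "0" or "1".
instance Read Bit where
  readsPrec _ = parsec2readSketch (Bit . (== '1') <$> oneOf "01")
```

With this instance, `reads "1rest" :: [(Bit, String)]` yields a single result whose remaining input is `"rest"`.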

# Primitive Parsers

Match any character of the alphabet.

Match either "1" or "0".

Match any 7-bit US-ASCII character except NUL (ASCII value 0).

Match the carriage return character \r.

Match the linefeed character \n.

Match the Internet newline \r\n.

Non-standard newline: matches any of crlf, lfcr, cr, lf (tried in that order).
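
The order of alternatives matters: trying the two-character forms first keeps "\r\n" from being read as two separate line endings. A minimal sketch of such a lenient newline follows, assuming plain Parsec; the name `lenientNewline` is hypothetical and this is not the module's own definition.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical lenient newline: tries "\r\n", "\n\r", "\r", "\n" in that order.
-- `try` lets a partially matched two-character form back out
-- before the next alternative is attempted.
lenientNewline :: Parser String
lenientNewline =
      try (string "\r\n")
  <|> try (string "\n\r")
  <|> string "\r"
  <|> string "\n"
```

Applied repeatedly with `many`, the input `"\r\n\n\r\r\n"` is split into three line endings: `"\r\n"`, `"\n\r"` and `"\r\n"`.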

Match any US-ASCII control character. That is, any character with a decimal value in the range [0..31] or the value 127.

Match the double quote character (").

Match any character that is valid in a hexadecimal number; that is, ['0'..'9'] and ['A'..'F','a'..'f'].

Match the tab ("\t") character.

Match "linear white-space". That is any number of consecutive wsp, optionally followed by a crlf and (at least) one more wsp.
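
Linear white-space lets a header value continue on the next line, as long as that line starts with a space or tab. The sketch below follows the RFC 2234 rule `*(WSP / CRLF WSP)`, so unlike the description above it also accepts empty input; `wspSketch` and `lwspSketch` are hypothetical names and not the module's own code.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- A single space or horizontal tab.
wspSketch :: Parser Char
wspSketch = oneOf " \t"

-- Zero or more runs of white-space, where a run may also be a CRLF
-- immediately followed by at least one space or tab (a folded line).
lwspSketch :: Parser String
lwspSketch = concat <$> many (many1 wspSketch <|> try folded)
  where
    folded = (++) <$> string "\r\n" <*> many1 wspSketch
```

On the input `"  \r\n\tfoo"` this consumes `"  \r\n\t"` and stops before `foo`; a CRLF that is not followed by white-space is left untouched.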

Match any character.

Match the space character.

Match any printable ASCII character. (The "v" stands for "visible".) That is any character in the decimal range of [33..126].
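
The character classes in this section are defined by numeric ranges, which map directly onto Parsec's `satisfy`. The following is a minimal sketch using the ranges quoted in the descriptions; all four names are hypothetical and the module's own definitions may be written differently.

```haskell
import Data.Char (isHexDigit, ord)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Control characters: decimal 0..31, plus 127 (DEL).
ctlSketch :: Parser Char
ctlSketch = satisfy (\c -> ord c <= 31 || ord c == 127)

-- Visible (printable) characters: decimal 33..126.
vcharSketch :: Parser Char
vcharSketch = satisfy (\c -> ord c >= 33 && ord c <= 126)

-- Any 7-bit US-ASCII character except NUL: decimal 1..127.
characterSketch :: Parser Char
characterSketch = satisfy (\c -> ord c >= 1 && ord c <= 127)

-- Hexadecimal digits: '0'..'9', 'A'..'F', 'a'..'f'.
hexdigSketch :: Parser Char
hexdigSketch = satisfy isHexDigit
```

Note that the space character (decimal 32) falls into neither the control range nor the visible range, so `vcharSketch` rejects it.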

Match either sp or htab.

Match a quoted string. The specials \ and " must be escaped inside a quoted string; CR and LF are not allowed at all.
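
The escaping rule can be sketched as a parser that alternates between runs of ordinary text and backslash-escaped pairs. `quotedStringSketch` is a hypothetical name; returning the raw text with its quotes and backslashes intact is an assumption of this sketch, not a documented property of the package.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical quoted-string parser. The result keeps the surrounding
-- quotes and any backslash escapes exactly as they appeared in the input.
quotedStringSketch :: Parser String
quotedStringSketch = wrap <$> (char '"' *> many chunk <* char '"')
  where
    wrap chunks = "\"" ++ concat chunks ++ "\""
    -- Either a run of ordinary text or one escaped character.
    chunk       = many1 qtext <|> quotedPair
    -- Ordinary text: anything except backslash, double quote, CR and LF.
    qtext       = noneOf "\\\"\r\n"
    -- A backslash followed by any character other than CR or LF.
    quotedPair  = (\c -> ['\\', c]) <$> (char '\\' *> noneOf "\r\n")
```

An escaped double quote does not end the string, while a bare CR or LF inside the quotes makes the parse fail.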