Portability	portable
Stability	alpha
Maintainer	John MacFarlane <jgm@berkeley.edu>
Safe Haskell	None

Text.Pandoc.Parsing

Contents

Re-exports from Text.Pandoc.Parsec

Description

A utility library with parsers used in pandoc readers.

Synopsis

Documentation

(>>~) :: Monad m => m a -> m b -> m aSource

Like >>, but returns the operation on the left. (Suggested by Tillmann Rendel on Haskell-cafe list.)

anyLine :: Parser [Char] st [Char]Source

Parse any line of text

many1Till :: Parser [tok] st a -> Parser [tok] st end -> Parser [tok] st [a]Source

Like manyTill, but reads at least one item.

notFollowedBy' :: Show b => Parser [a] st b -> Parser [a] st ()Source

A more general form of notFollowedBy. This one allows any type of parser to be specified, and succeeds only if that parser fails. It does not consume any input.

oneOfStrings :: [String] -> Parser [Char] st String Source

Parses one of a list of strings. If the list contains two strings one of which is a prefix of the other, the longer string will be matched if possible.

oneOfStringsCI :: [String] -> Parser [Char] st String Source

Parses one of a list of strings (tried in order), case insensitive.

spaceChar :: Parser [Char] st Char Source

Parses a space or tab.

nonspaceChar :: Parser [Char] st Char Source

Parses a nonspace, nonnewline character.

skipSpaces :: Parser [Char] st ()Source

Skips zero or more spaces or tabs.

blankline :: Parser [Char] st Char Source

Skips zero or more spaces or tabs, then reads a newline.

blanklines :: Parser [Char] st [Char]Source

Parses one or more blank lines and returns a string of newlines.

enclosed Source

Arguments

:: Parser [Char] st t	start parser
-> Parser [Char] st end	end parser
-> Parser [Char] st a	content parser (to be used repeatedly)
-> Parser [Char] st [a]

Parses material enclosed between start and end parsers.

stringAnyCase :: [Char] -> Parser [Char] st String Source

Parse string, case insensitive.

parseFromString :: Parser [tok] st a -> [tok] -> Parser [tok] st aSource

Parse contents of str using parser and return result.

lineClump :: Parser [Char] st String Source

Parse raw line block up to and including blank lines.

charsInBalanced :: Char -> Char -> Parser [Char] st Char -> Parser [Char] st String Source

Parse a string of characters between an open character and a close character, including text between balanced pairs of open and close, which must be different. For example, charsInBalanced '(' ')' anyChar will parse (hello (there)) and return hello (there).

romanNumeral Source

Arguments

:: Bool	Uppercase if true
-> Parser [Char] st Int

Parses a roman numeral (uppercase or lowercase), returns number.

emailAddress :: Parser [Char] st (String, String)Source

Parses an email address; returns original and corresponding escaped mailto: URI.

uri :: Parser [Char] st (String, String)Source

Parses a URI. Returns pair of original and URI-escaped version.

withHorizDisplacement Source

Arguments

:: Parser [Char] st a	Parser to apply
-> Parser [Char] st (a, Int)	(result, displacement)

Applies a parser, returns tuple of its results and its horizontal displacement (the difference between the source column at the end and the source column at the beginning). Vertical displacement (source row) is ignored.

withRaw :: Parser [Char] st a -> Parser [Char] st (a, [Char])Source

Applies a parser and returns the raw string that was parsed, along with the value produced by the parser.

escaped Source

Arguments

:: Parser [Char] st Char	Parser for character to escape
-> Parser [Char] st Char

Parses backslash, then applies character parser.

characterReference :: Parser [Char] st Char Source

Parse character entity.

updateLastStrPos :: Parser [Char] ParserState ()Source

anyOrderedListMarker :: Parser [Char] ParserState ListAttributes Source

Parses an ordered list marker and returns list attributes.

orderedListMarker :: ListNumberStyle -> ListNumberDelim -> Parser [Char] ParserState Int Source

Parses an ordered list marker with a given style and delimiter, returns number.

charRef :: Parser [Char] st Inline Source

Parses a character reference and returns a Str element.

lineBlockLines :: Parser [Char] st [String]Source

Parses an RST-style line block and returns a list of strings.

tableWith :: Parser [Char] ParserState ([[Block]], [Alignment], [Int]) -> ([Int] -> Parser [Char] ParserState [[Block]]) -> Parser [Char] ParserState sep -> Parser [Char] ParserState end -> Parser [Char] ParserState Block Source

Parse a table using headerParser, rowParser, lineParser, and footerParser.

widthsFromIndices :: Int -> [Int] -> [Double]Source

gridTableWith Source

Arguments

:: Parser [Char] ParserState [Block]	Block list parser
-> Bool	Headerless table
-> Parser [Char] ParserState Block

readWith Source

Arguments

:: Parser [t] ParserState a	parser
-> ParserState	initial state
-> [t]	input
-> a

Parse a string with a given parser and state.

testStringWith :: Show a => Parser [Char] ParserState a -> String -> IO ()Source

Parse a string with parser (for testing).

getOption :: (ReaderOptions -> a) -> Parser s ParserState aSource

guardEnabled :: Extension -> Parser s ParserState ()Source

Succeed only if the extension is enabled.

guardDisabled :: Extension -> Parser s ParserState ()Source

Succeed only if the extension is disabled.

data ParserState Source

Parsing options.

Constructors

ParserState

Fields

stateOptions :: ReaderOptions: User options
stateParserContext :: ParserContext: Inside list?
stateQuoteContext :: QuoteContext: Inside quoted environment?
stateAllowLinks :: Bool: Allow parsing of links
stateMaxNestingLevel :: Int: Max # of nested Strong/Emph
stateLastStrPos :: Maybe SourcePos: Position after last str parsed
stateKeys :: KeyTable: List of reference keys (with fallbacks)
stateSubstitutions :: SubstTable: List of substitution references
stateNotes :: NoteTable: List of notes (raw bodies)
stateNotes' :: NoteTable': List of notes (parsed bodies)
stateTitle :: [Inline]: Title of document
stateAuthors :: [[Inline]]: Authors of document
stateDate :: [Inline]: Date of document
stateHeaderTable :: [HeaderType]: Ordered list of header types used
stateHeaders :: [[Inline]]: List of headers (used for implicit ref links)
stateIdentifiers :: [String]: List of header identifiers used
stateNextExample :: Int: Number of next example
stateExamples :: Map String Int: Map from example labels to numbers
stateHasChapters :: Bool: True if chapter encountered
stateMacros :: [Macro]: List of macros defined so far
stateRstDefaultRole :: String: Current rST default interpreted text role
stateWarnings :: [String]: Warnings generated by the parser

Instances

Default ParserState

defaultParserState :: ParserState Source

data HeaderType Source

Constructors

SingleHeader Char	Single line of characters underneath
DoubleHeader Char	Lines of characters above and below

Instances

Eq HeaderType
Show HeaderType

data ParserContext Source

Constructors

ListItemState	Used when running parser on list item contents
NullState	Default state

Instances

Eq ParserContext
Show ParserContext

data QuoteContext Source

Constructors

InSingleQuote	Used when parsing inside single quotes
InDoubleQuote	Used when parsing inside double quotes
NoQuote	Used when not parsing inside quotes

Instances

Eq QuoteContext
Show QuoteContext

type NoteTable = [(String, String)]Source

type NoteTable' = [(String, F Blocks)]Source

type KeyTable = Map Key Target Source

type SubstTable = Map Key Inlines Source

newtype Key Source

Constructors

Key String

Instances

Eq Key
Ord Key
Read Key
Show Key

toKey :: String -> Key Source

smartPunctuation :: Parser [Char] ParserState Inline -> Parser [Char] ParserState Inline Source

withQuoteContext :: QuoteContext -> Parser [tok] ParserState a -> Parser [tok] ParserState aSource

singleQuoteStart :: Parser [Char] ParserState ()Source

singleQuoteEnd :: Parser [Char] st ()Source

doubleQuoteStart :: Parser [Char] ParserState ()Source

doubleQuoteEnd :: Parser [Char] st ()Source

ellipses :: Parser [Char] st Inline Source

apostrophe :: Parser [Char] ParserState Inline Source

dash :: Parser [Char] ParserState Inline Source

nested :: Parser s ParserState a -> Parser s ParserState aSource

macro :: Parser [Char] ParserState Blocks Source

Parse a newcommand or renewcommand macro definition.

applyMacros' :: String -> Parser [Char] ParserState String Source

Apply current macros to string.

type Parser t s = Parsec t sSource

newtype F a Source

Constructors

F
Fields unF :: Reader ParserState a

Instances

Monad F
Functor F
Monoid a => Monoid (F a)

runF :: F a -> ParserState -> aSource

askF :: F ParserState Source

asksF :: (ParserState -> a) -> F aSource

Re-exports from Text.Pandoc.Parsec

runParser :: Stream s Identity t => Parsec s u a -> u -> SourceName -> s -> Either ParseError a

The most general way to run a parser over the Identity monad. runParser p state filePath input runs parser p on the input list of tokens input, obtained from source filePath with the initial user state st. The filePath is only used in error messages and may be the empty string. Returns either a ParseError (Left) or a value of type a (Right).

  parseFromFile p fname
    = do{ input <- readFile fname
        ; return (runParser p () fname input)
        }

parse :: Stream s Identity t => Parsec s () a -> SourceName -> s -> Either ParseError a

parse p filePath input runs a parser p over Identity without user state. The filePath is only used in error messages and may be the empty string. Returns either a ParseError (Left) or a value of type a (Right).

  main    = case (parse numbers "" "11, 2, 43") of
             Left err  -> print err
             Right xs  -> print (sum xs)

  numbers = commaSep integer

anyToken :: (Stream s m t, Show t) => ParsecT s u m t

The parser anyToken accepts any kind of token. It is for example used to implement eof. Returns the accepted token.

getInput :: Monad m => ParsecT s u m s

Returns the current input

setInput :: Monad m => s -> ParsecT s u m ()

setInput input continues parsing with input. The getInput and setInput functions can for example be used to deal with #include files.

unexpected :: Stream s m t => String -> ParsecT s u m a

The parser unexpected msg always fails with an unexpected error message msg without consuming any input.

The parsers fail, (<?>) and unexpected are the three parsers used to generate error messages. Of these, only (<?>) is commonly used. For an example of the use of unexpected, see the definition of notFollowedBy.

char :: Stream s m Char => Char -> ParsecT s u m Char

char c parses a single character c. Returns the parsed character (i.e. c).

  semiColon  = char ';'

letter :: Stream s m Char => ParsecT s u m Char

Parses a letter (an upper case or lower case character). Returns the parsed character.

digit :: Stream s m Char => ParsecT s u m Char

Parses a digit. Returns the parsed character.

alphaNum :: Stream s m Char => ParsecT s u m Char

Parses a letter or digit (a character between '0' and '9'). Returns the parsed character.

skipMany :: ParsecT s u m a -> ParsecT s u m ()

skipMany p applies the parser p zero or more times, skipping its result.

  spaces  = skipMany space

skipMany1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m ()

skipMany1 p applies the parser p one or more times, skipping its result.

spaces :: Stream s m Char => ParsecT s u m ()

Skips zero or more white space characters. See also skipMany.

space :: Stream s m Char => ParsecT s u m Char

Parses a white space character (any character which satisfies isSpace) Returns the parsed character.

anyChar :: Stream s m Char => ParsecT s u m Char

This parser succeeds for any character. Returns the parsed character.

satisfy :: Stream s m Char => (Char -> Bool) -> ParsecT s u m Char

The parser satisfy f succeeds for any character for which the supplied function f returns True. Returns the character that is actually parsed.

newline :: Stream s m Char => ParsecT s u m Char

Parses a newline character ('\n'). Returns a newline character.

string :: Stream s m Char => String -> ParsecT s u m String

string s parses a sequence of characters given by s. Returns the parsed string (i.e. s).

  divOrMod    =   string "div" 
              <|> string "mod"

count :: Stream s m t => Int -> ParsecT s u m a -> ParsecT s u m [a]

count n p parses n occurrences of p. If n is smaller or equal to zero, the parser equals to return []. Returns a list of n values returned by p.

eof :: (Stream s m t, Show t) => ParsecT s u m ()

This parser only succeeds at the end of the input. This is not a primitive parser but it is defined using notFollowedBy.

  eof  = notFollowedBy anyToken <?> "end of input"

noneOf :: Stream s m Char => [Char] -> ParsecT s u m Char

As the dual of oneOf, noneOf cs succeeds if the current character not in the supplied list of characters cs. Returns the parsed character.

  consonant = noneOf "aeiou"

oneOf :: Stream s m Char => [Char] -> ParsecT s u m Char

oneOf cs succeeds if the current character is in the supplied list of characters cs. Returns the parsed character. See also satisfy.

   vowel  = oneOf "aeiou"

lookAhead :: Stream s m t => ParsecT s u m a -> ParsecT s u m a

lookAhead p parses p without consuming any input.

If p fails and consumes some input, so does lookAhead. Combine with try if this is undesirable.

notFollowedBy :: (Stream s m t, Show a) => ParsecT s u m a -> ParsecT s u m ()

notFollowedBy p only succeeds when parser p fails. This parser does not consume any input. This parser can be used to implement the 'longest match' rule. For example, when recognizing keywords (for example let), we want to make sure that a keyword is not followed by a legal identifier character, in which case the keyword is actually an identifier (for example lets). We can program this behaviour as follows:

  keywordLet  = try (do{ string "let"
                       ; notFollowedBy alphaNum
                       })

many :: ParsecT s u m a -> ParsecT s u m [a]

many p applies the parser p zero or more times. Returns a list of the returned values of p.

  identifier  = do{ c  <- letter
                  ; cs <- many (alphaNum <|> char '_')
                  ; return (c:cs)
                  }

many1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m [a]

many1 p applies the parser p one or more times. Returns a list of the returned values of p.

  word  = many1 letter

manyTill :: Stream s m t => ParsecT s u m a -> ParsecT s u m end -> ParsecT s u m [a]

manyTill p end applies parser p zero or more times until parser end succeeds. Returns the list of values returned by p. This parser can be used to scan comments:

  simpleComment   = do{ string "<!--"
                      ; manyTill anyChar (try (string "-->"))
                      }

Note the overlapping parsers anyChar and string "-->", and therefore the use of the try combinator.

(<|>) :: ParsecT s u m a -> ParsecT s u m a -> ParsecT s u m a

This combinator implements choice. The parser p <|> q first applies p. If it succeeds, the value of p is returned. If p fails without consuming any input, parser q is tried. This combinator is defined equal to the mplus member of the MonadPlus class and the (<|>) member of Alternative.

The parser is called predictive since q is only tried when parser p didn't consume any input (i.e.. the look ahead is 1). This non-backtracking behaviour allows for both an efficient implementation of the parser combinators and the generation of good error messages.

(<?>) :: ParsecT s u m a -> String -> ParsecT s u m a

The parser p ? msg behaves as parser p, but whenever the parser p fails without consuming any input, it replaces expect error messages with the expect error message msg.

This is normally used at the end of a set alternatives where we want to return an error message in terms of a higher level construct rather than returning all possible characters. For example, if the expr parser from the try example would fail, the error message is: '...: expecting expression'. Without the (<?>) combinator, the message would be like '...: expecting "let" or letter', which is less friendly.

choice :: Stream s m t => [ParsecT s u m a] -> ParsecT s u m a

choice ps tries to apply the parsers in the list ps in order, until one of them succeeds. Returns the value of the succeeding parser.

try :: ParsecT s u m a -> ParsecT s u m a

The parser try p behaves like parser p, except that it pretends that it hasn't consumed any input when an error occurs.

This combinator is used whenever arbitrary look ahead is needed. Since it pretends that it hasn't consumed any input when p fails, the (<|>) combinator will try its second alternative even when the first parser failed while consuming input.

The try combinator can for example be used to distinguish identifiers and reserved words. Both reserved words and identifiers are a sequence of letters. Whenever we expect a certain reserved word where we can also expect an identifier we have to use the try combinator. Suppose we write:

  expr        = letExpr <|> identifier <?> "expression"

  letExpr     = do{ string "let"; ... }
  identifier  = many1 letter

If the user writes "lexical", the parser fails with: unexpected 'x', expecting 't' in "let". Indeed, since the (<|>) combinator only tries alternatives when the first alternative hasn't consumed input, the identifier parser is never tried (because the prefix "le" of the string "let" parser is already consumed). The right behaviour can be obtained by adding the try combinator:

  expr        = letExpr <|> identifier <?> "expression"

  letExpr     = do{ try (string "let"); ... }
  identifier  = many1 letter

sepBy :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]

sepBy p sep parses zero or more occurrences of p, separated by sep. Returns a list of values returned by p.

  commaSep p  = p `sepBy` (symbol ",")

sepBy1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]

sepBy1 p sep parses one or more occurrences of p, separated by sep. Returns a list of values returned by p.

sepEndBy :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]

sepEndBy p sep parses zero or more occurrences of p, separated and optionally ended by sep, ie. haskell style statements. Returns a list of values returned by p.

  haskellStatements  = haskellStatement `sepEndBy` semi

sepEndBy1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]

sepEndBy1 p sep parses one or more occurrences of p, separated and optionally ended by sep. Returns a list of values returned by p.

endBy :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]

endBy p sep parses zero or more occurrences of p, seperated and ended by sep. Returns a list of values returned by p.

   cStatements  = cStatement `endBy` semi

endBy1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]

endBy1 p sep parses one or more occurrences of p, seperated and ended by sep. Returns a list of values returned by p.

option :: Stream s m t => a -> ParsecT s u m a -> ParsecT s u m a

option x p tries to apply parser p. If p fails without consuming input, it returns the value x, otherwise the value returned by p.

  priority  = option 0 (do{ d <- digit
                          ; return (digitToInt d) 
                          })

optional :: Stream s m t => ParsecT s u m a -> ParsecT s u m ()

optional p tries to apply parser p. It will parse p or nothing. It only fails if p fails after consuming input. It discards the result of p.

optionMaybe :: Stream s m t => ParsecT s u m a -> ParsecT s u m (Maybe a)

optionMaybe p tries to apply parser p. If p fails without consuming input, it return Nothing, otherwise it returns Just the value returned by p.

getState :: Monad m => ParsecT s u m u

Returns the current user state.

setState :: Monad m => u -> ParsecT s u m ()

An alias for putState for backwards compatibility.

updateState :: Monad m => (u -> u) -> ParsecT s u m ()

An alias for modifyState for backwards compatibility.

data SourcePos

The abstract data type SourcePos represents source positions. It contains the name of the source (i.e. file name), a line number and a column number. SourcePos is an instance of the Show, Eq and Ord class.

Instances

Eq SourcePos
Data SourcePos
Ord SourcePos
Show SourcePos
Typeable SourcePos

getPosition :: Monad m => ParsecT s u m SourcePos

Returns the current source position. See also SourcePos.

setPosition :: Monad m => SourcePos -> ParsecT s u m ()

setPosition pos sets the current source position to pos.

sourceColumn :: SourcePos -> Column

Extracts the column number from a source position.

sourceLine :: SourcePos -> Line

Extracts the line number from a source position.

newPos :: SourceName -> Line -> Column -> SourcePos

Create a new SourcePos with the given source name, line number and column number.

token

Arguments

:: Stream s Identity t
=> (t -> String)	Token pretty-printing function.
-> (t -> SourcePos)	Computes the position of a token.
-> (t -> Maybe a)	Matching function for the token to parse.
-> Parsec s u a

The parser token showTok posFromTok testTok accepts a token t with result x when the function testTok t returns Just x. The source position of the t should be returned by posFromTok t and the token can be shown using showTok t.

This combinator is expressed in terms of tokenPrim. It is used to accept user defined token streams. For example, suppose that we have a stream of basic tokens tupled with source positions. We can than define a parser that accepts single tokens as:

  mytoken x
    = token showTok posFromTok testTok
    where
      showTok (pos,t)     = show t
      posFromTok (pos,t)  = pos
      testTok (pos,t)     = if x == t then Just t else Nothing