mmark-0.0.5.1: Strict markdown processor for writers

Copyright© 2017–2018 Mark Karpov
LicenseBSD 3 clause
MaintainerMark Karpov <markkarpov92@gmail.com>
Stabilityexperimental
Portabilityportable
Safe HaskellNone
LanguageHaskell2010

Text.MMark

Contents

Description

MMark (read “em-mark”) is a strict markdown processor for writers. “Strict” means that not every input is considered valid markdown document and parse errors are possible and even desirable, because they allow to spot markup issues without searching for them in rendered document. If a markdown document passes MMark parser, then it'll likely produce HTML without quirks. This feature makes it a good choice for writers and bloggers.

MMark and Common Mark

MMark mostly tries to follow the Common Mark specification as given here:

http://spec.commonmark.org/0.28/

However, due to the fact that we do not allow inputs that do not make sense, and also try to guard against common mistakes (like writing ##My header and having it rendered as a paragraph starting with hashes) MMark obviously can't follow the specification precisely. In particular, parsing of inlines differs considerably from Common Mark.

Another difference between Common Mark and MMark is that the latter supports more (pun alert) common markdown extensions out-of-the-box. In particular, MMark supports:

  • parsing of an optional YAML block
  • strikeout using ~~this~~ syntax
  • superscript using ^this^ syntax
  • subscript using ~this~ syntax
  • automatic assignment of ids to headers
  • pipe tables (as on GitHub)

One do not need to enable or tweak anything for these to work, they are built-in features.

The readme contains a more detailed description of differences between Common Mark and MMark.

How to use the library

The module is intended to be imported qualified:

import Text.MMark (MMark)
import qualified Text.MMark as MMark

Working with MMark happens in three stages:

  1. Parsing of markdown document.
  2. Applying extensions, which optionally may require scanning of previously parsed document (for example to build a table of contents).
  3. Rendering of HTML document.

The structure of the documentation below corresponds to these stages and should clarify the details.

“Getting started” example

Here is a complete example of a program that reads a markdown file named "input.md" and outputs an HTML file named "output.html":

{-# LANGUAGE OverloadedStrings #-}

module Main (main) where

import qualified Data.Text.IO      as T
import qualified Data.Text.Lazy.IO as TL
import qualified Lucid             as L
import qualified Text.MMark        as MMark

main :: IO ()
main = do
  let input = "input.md"
  txt <- T.readFile input -- (1)
  case MMark.parse input txt of -- (2)
    Left errs -> putStrLn (MMark.parseErrorsPretty txt errs) -- (3)
    Right r -> TL.writeFile "output.html" -- (6)
      . L.renderText -- (5)
      . MMark.render -- (4)
      $ r

Let's break it down:

  1. We read a source markdown file as strict Text.
  2. The source is fed into the parse function which does the parsing. It can either fail with a collection of parse errors or succeed returning a value of the opaque MMark type.
  3. If parsing fails, we pretty-print the parse errors with parseErrorsPretty.
  4. Then we just render the document with render first to Lucid's Html ().
  5. …and then to lazy Text with renderText.
  6. Finally we write the result as "output.html".

Other modules of interest

The Text.MMark module contains all the “core” functionality one may need. However, one of the main selling points of MMark is that it's possible to write your own extensions which stay highly composable (if done right), so proliferation of third-party extensions is to be expected and encouraged. To write an extension of your own import the Text.MMark.Extension module, which has some documentation focusing on extension writing.

Synopsis

Parsing

data MMark Source #

Representation of complete markdown document. You can't look inside of MMark on purpose. The only way to influence an MMark document you obtain as a result of parsing is via the extension mechanism.

Instances

Show MMark Source #

Dummy instance.

Since: 0.0.5.0

Methods

showsPrec :: Int -> MMark -> ShowS #

show :: MMark -> String #

showList :: [MMark] -> ShowS #

NFData MMark Source # 

Methods

rnf :: MMark -> () #

data MMarkErr Source #

MMark custom parse errors.

Constructors

YamlParseError String

YAML error that occurred during parsing of a YAML block

NonFlankingDelimiterRun (NonEmpty Char)

This delimiter run should be in left- or right- flanking position

ListStartIndexTooBig Word

Ordered list start numbers must be nine digits or less

Since: 0.0.2.0

ListIndexOutOfOrder Word Word

The index in an ordered list is out of order, first number is the actual index we ran into, the second number is the expected index

Since: 0.0.2.0

DuplicateReferenceDefinition Text

Duplicate reference definitions are not allowed

Since: 0.0.3.0

CouldNotFindReferenceDefinition Text [Text]

Could not find this reference definition, the second argument is the collection of close names (typo corrections)

Since: 0.0.3.0

InvalidNumericCharacter Int

This numeric character is invalid

Since: 0.0.3.0

UnknownHtmlEntityName Text

Unknown HTML5 entity name

Since: 0.0.3.0

Instances

Eq MMarkErr Source # 
Data MMarkErr Source # 

Methods

gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> MMarkErr -> c MMarkErr #

gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c MMarkErr #

toConstr :: MMarkErr -> Constr #

dataTypeOf :: MMarkErr -> DataType #

dataCast1 :: Typeable (* -> *) t => (forall d. Data d => c (t d)) -> Maybe (c MMarkErr) #

dataCast2 :: Typeable (* -> * -> *) t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c MMarkErr) #

gmapT :: (forall b. Data b => b -> b) -> MMarkErr -> MMarkErr #

gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> MMarkErr -> r #

gmapQr :: (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> MMarkErr -> r #

gmapQ :: (forall d. Data d => d -> u) -> MMarkErr -> [u] #

gmapQi :: Int -> (forall d. Data d => d -> u) -> MMarkErr -> u #

gmapM :: Monad m => (forall d. Data d => d -> m d) -> MMarkErr -> m MMarkErr #

gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> MMarkErr -> m MMarkErr #

gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> MMarkErr -> m MMarkErr #

Ord MMarkErr Source # 
Read MMarkErr Source # 
Show MMarkErr Source # 
Generic MMarkErr Source # 

Associated Types

type Rep MMarkErr :: * -> * #

Methods

from :: MMarkErr -> Rep MMarkErr x #

to :: Rep MMarkErr x -> MMarkErr #

NFData MMarkErr Source # 

Methods

rnf :: MMarkErr -> () #

ShowErrorComponent MMarkErr Source # 
type Rep MMarkErr Source # 
type Rep MMarkErr = D1 * (MetaData "MMarkErr" "Text.MMark.Parser.Internal.Type" "mmark-0.0.5.1-CKdUe5xSiho4aM5Kft9v0b" False) ((:+:) * ((:+:) * ((:+:) * (C1 * (MetaCons "YamlParseError" PrefixI False) (S1 * (MetaSel (Nothing Symbol) NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 * String))) (C1 * (MetaCons "NonFlankingDelimiterRun" PrefixI False) (S1 * (MetaSel (Nothing Symbol) NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 * (NonEmpty Char))))) ((:+:) * (C1 * (MetaCons "ListStartIndexTooBig" PrefixI False) (S1 * (MetaSel (Nothing Symbol) NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 * Word))) (C1 * (MetaCons "ListIndexOutOfOrder" PrefixI False) ((:*:) * (S1 * (MetaSel (Nothing Symbol) NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 * Word)) (S1 * (MetaSel (Nothing Symbol) NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 * Word)))))) ((:+:) * ((:+:) * (C1 * (MetaCons "DuplicateReferenceDefinition" PrefixI False) (S1 * (MetaSel (Nothing Symbol) NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 * Text))) (C1 * (MetaCons "CouldNotFindReferenceDefinition" PrefixI False) ((:*:) * (S1 * (MetaSel (Nothing Symbol) NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 * Text)) (S1 * (MetaSel (Nothing Symbol) NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 * [Text]))))) ((:+:) * (C1 * (MetaCons "InvalidNumericCharacter" PrefixI False) (S1 * (MetaSel (Nothing Symbol) NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 * Int))) (C1 * (MetaCons "UnknownHtmlEntityName" PrefixI False) (S1 * (MetaSel (Nothing Symbol) NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 * Text))))))

parse Source #

Arguments

:: FilePath

File name (only to be used in error messages), may be empty

-> Text

Input to parse

-> Either (NonEmpty (ParseError Char MMarkErr)) MMark

Parse errors or parsed document

Parse a markdown document in the form of a strict Text value and either report parse errors or return an MMark document. Note that the parser has the ability to report multiple parse errors at once.

parseErrorsPretty Source #

Arguments

:: Text

Original input for parser

-> NonEmpty (ParseError Char MMarkErr)

Collection of parse errors

-> String

Result of pretty-printing

Pretty-print a collection of parse errors returned from parse.

Pro tip: if you would like to pretty-print a single ParseError, use parseErrorPretty_ (mkPos 4), because Common Mark suggests that we should assume tab width 4, and that's what we do in the parser.

Extensions

data Extension Source #

An extension. You can apply extensions with useExtension and useExtensions functions. The Text.MMark.Extension module provides tools for writing your own extensions.

Note that Extension is an instance of Semigroup and Monoid, i.e. you can combine several extensions into one. Since the (<>) operator is right-associative and mconcat is a right fold under the hood, the expression

l <> r

means that the extension r will be applied before the extension l, similar to how Endo works. This may seem counter-intuitive, but only with this logic we get consistency of ordering with more complex expressions:

e2 <> e1 <> e0 == e2 <> (e1 <> e0)

Here, e0 will be applied first, then e1, then e2. The same applies to expressions involving mconcat—extensions closer to beginning of the list passed to mconcat will be applied later.

useExtension :: Extension -> MMark -> MMark Source #

Apply an Extension to an MMark document. The order in which you apply Extensions does matter. Extensions you apply first take effect first. The extension system is designed in such a way that in many cases the order doesn't matter, but sometimes the difference is important.

useExtensions :: [Extension] -> MMark -> MMark Source #

Apply several Extensions to an MMark document.

This is a simple shortcut:

useExtensions exts = useExtension (mconcat exts)

As mentioned in the docs for useExtension, the order in which you apply extensions matters. Extensions closer to beginning of the list are applied later, i.e. the last extension in the list is applied first.

Scanning

runScanner Source #

Arguments

:: MMark

Document to scan

-> Fold Bni a

Fold to use

-> a

Result of scanning

Scan an MMark document efficiently in one pass. This uses the excellent Fold type, which see.

Take a look at the Text.MMark.Extension module if you want to create scanners of your own.

runScannerM Source #

Arguments

:: Monad m 
=> MMark

Document to scan

-> FoldM m Bni a

FoldM to use

-> m a

Result of scanning

Like runScanner, but allows to run scanners with monadic context.

To bring Fold and FoldM types to the “least common denominator” use generalize and simplify.

Since: 0.0.2.0

projectYaml :: MMark -> Maybe Value Source #

Extract contents of an optional YAML block that may have been parsed.

Rendering

render :: MMark -> Html () Source #

Render a MMark markdown document. You can then render Html () to various things: