Safe Haskell | None |
---|---|
Language | Haskell2010 |
This library lets you parse Markdown into a hierarchical structure (delimited by headings). For instance, let's say your document looks like this:
This is the preface. First chapter ======================================== This chapter doesn't have sections. Second chapter ======================================== First section -------------------- Here's some text. Second section -------------------- And more text.
It can be represented as a tree:
preface
: "This is the preface."sections
: *heading
: "First chapter"content
: "This chapter doesn't have sections."sections
: [] *heading
: "Second chapter"sections
: *heading
: "First section"content
: "Here's some text."sections
: [] *heading
: "Second section"content
: "And more text."sections
: []
That's what this library does. Moreover, it lets you access the Markdown source of every node of the tree.
In most cases the only thing you need to do is something like this:
nodesToDocument
.commonmarkToNodesWithSource
[optSafe
,optNormalize
]
You can preprocess parsed Markdown after doing commonmarkToNodesWithSource
as long as you don't add or remove any top-level nodes.
Synopsis
- commonmarkToNodesWithSource :: [CMarkOption] -> Text -> WithSource [Node]
- nodesToDocument :: WithSource [Node] -> Document () ()
- data WithSource a = WithSource Text a
- getSource :: WithSource a -> Text
- stripSource :: WithSource a -> a
- data Section a b = Section {
- level :: Int
- heading :: WithSource [Node]
- headingAnn :: a
- content :: WithSource [Node]
- contentAnn :: b
- data Document a b = Document {
- preface :: WithSource [Node]
- prefaceAnn :: b
- sections :: Forest (Section a b)
- flattenDocument :: Document a b -> WithSource [Node]
- flattenSection :: Section a b -> WithSource [Node]
- flattenTree :: Tree (Section a b) -> WithSource [Node]
- flattenForest :: Forest (Section a b) -> WithSource [Node]
Parse Markdown to trees
commonmarkToNodesWithSource :: [CMarkOption] -> Text -> WithSource [Node] Source #
commonmarkToNodesWithSource
parses Markdown with the given options and
extracts nodes from the initial DOCUMENT
node.
nodesToDocument :: WithSource [Node] -> Document () () Source #
Turn a list of Markdown nodes into a tree.
data WithSource a Source #
A data type for annotating things with their source. In this library we
only use WithSource [Node]
, which stands for “some Markdown nodes + source”.
Instances
getSource :: WithSource a -> Text Source #
Extract source from WithSource
(it's stored there in a field).
stripSource :: WithSource a -> a Source #
Extract data from WithSource
.
A section in the Markdown tree.
Sections do not contain subsections; i.e. Section
isn't recursive and the
tree structure is provided by Data.Tree.
In a Section a b
, the heading is coupled with a value of type a
, and
content – with a value of type b
. This is occasionally useful.
Section | |
|
Instances
(Eq a, Eq b) => Eq (Section a b) Source # | |
(Data a, Data b) => Data (Section a b) Source # | |
Defined in CMark.Sections gfoldl :: (forall d b0. Data d => c (d -> b0) -> d -> c b0) -> (forall g. g -> c g) -> Section a b -> c (Section a b) # gunfold :: (forall b0 r. Data b0 => c (b0 -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c (Section a b) # toConstr :: Section a b -> Constr # dataTypeOf :: Section a b -> DataType # dataCast1 :: Typeable t => (forall d. Data d => c (t d)) -> Maybe (c (Section a b)) # dataCast2 :: Typeable t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c (Section a b)) # gmapT :: (forall b0. Data b0 => b0 -> b0) -> Section a b -> Section a b # gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> Section a b -> r # gmapQr :: (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> Section a b -> r # gmapQ :: (forall d. Data d => d -> u) -> Section a b -> [u] # gmapQi :: Int -> (forall d. Data d => d -> u) -> Section a b -> u # gmapM :: Monad m => (forall d. Data d => d -> m d) -> Section a b -> m (Section a b) # gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> Section a b -> m (Section a b) # gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> Section a b -> m (Section a b) # | |
(Show a, Show b) => Show (Section a b) Source # | |
Generic (Section a b) Source # | |
type Rep (Section a b) Source # | |
Defined in CMark.Sections type Rep (Section a b) = D1 (MetaData "Section" "CMark.Sections" "cmark-sections-0.3.0.1-Bbogd1UD79eL1RVOfgh7et" False) (C1 (MetaCons "Section" PrefixI True) ((S1 (MetaSel (Just "level") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 Int) :*: S1 (MetaSel (Just "heading") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 (WithSource [Node]))) :*: (S1 (MetaSel (Just "headingAnn") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 a) :*: (S1 (MetaSel (Just "content") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 (WithSource [Node])) :*: S1 (MetaSel (Just "contentAnn") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 b))))) |
The whole parsed Markdown tree. In a Document a b
, headings are
annotated with a
and content blocks – with b
.
Document | |
|
Instances
(Eq b, Eq a) => Eq (Document a b) Source # | |
(Data a, Data b) => Data (Document a b) Source # | |
Defined in CMark.Sections gfoldl :: (forall d b0. Data d => c (d -> b0) -> d -> c b0) -> (forall g. g -> c g) -> Document a b -> c (Document a b) # gunfold :: (forall b0 r. Data b0 => c (b0 -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c (Document a b) # toConstr :: Document a b -> Constr # dataTypeOf :: Document a b -> DataType # dataCast1 :: Typeable t => (forall d. Data d => c (t d)) -> Maybe (c (Document a b)) # dataCast2 :: Typeable t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c (Document a b)) # gmapT :: (forall b0. Data b0 => b0 -> b0) -> Document a b -> Document a b # gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> Document a b -> r # gmapQr :: (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> Document a b -> r # gmapQ :: (forall d. Data d => d -> u) -> Document a b -> [u] # gmapQi :: Int -> (forall d. Data d => d -> u) -> Document a b -> u # gmapM :: Monad m => (forall d. Data d => d -> m d) -> Document a b -> m (Document a b) # gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> Document a b -> m (Document a b) # gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> Document a b -> m (Document a b) # | |
(Show b, Show a) => Show (Document a b) Source # | |
Generic (Document a b) Source # | |
type Rep (Document a b) Source # | |
Defined in CMark.Sections type Rep (Document a b) = D1 (MetaData "Document" "CMark.Sections" "cmark-sections-0.3.0.1-Bbogd1UD79eL1RVOfgh7et" False) (C1 (MetaCons "Document" PrefixI True) (S1 (MetaSel (Just "preface") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 (WithSource [Node])) :*: (S1 (MetaSel (Just "prefaceAnn") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 b) :*: S1 (MetaSel (Just "sections") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 (Forest (Section a b)))))) |
Work with parsed trees
Note that you can use (<>
) to combine WithSource
nodes together. It will
concatenate sources and parsed Markdown.
I'm not sure how valid this operation is for Markdown, but probably more-or-less valid (when you exclude corner cases like missing newlines at the end and duplicate links). Maybe cmark doesn't even allow duplicate links, I don't know.
flattenDocument :: Document a b -> WithSource [Node] Source #
Turn the whole parsed-and-broken-down Document
into a list of nodes.
flattenSection :: Section a b -> WithSource [Node] Source #
Turn a section into a list of nodes.
flattenTree :: Tree (Section a b) -> WithSource [Node] Source #
flattenForest :: Forest (Section a b) -> WithSource [Node] Source #