taggy-0.1.1: Efficient and simple HTML/XML parsing library

Stability: experimental
Maintainer: alpmestan@gmail.com
Safe Haskell: None

Text.Taggy.DOM

Description

This module will help you represent an HTML or XML document as a tree and let you traverse it in whatever way you like.

This is especially useful when used in conjunction with taggy-lens, or with the Text.Taggy.Combinators module (which is used by taggy-lens).

Documentation

type AttrName = Text

An attribute name is just a Text value.

type AttrValue = Text

An attribute value is just a Text value.

data Element

An Element here refers to a tag name, the attributes specified within that tag, and all the child nodes of that element. An Element is basically anything but "raw" content.

Constructors

Element 

Fields

eltName :: !Text

name of the element, e.g. a for <a>

eltAttrs :: !(HashMap AttrName AttrValue)

a (hash)map from attribute names to attribute values

eltChildren :: [Node]

children Nodes
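
As a rough illustration of the three fields, the sketch below builds an Element by hand and looks up one of its attributes. The helper `attr` and the sample value `link` are hypothetical names introduced here, not part of taggy; the example also assumes the unordered-containers package for Data.HashMap.Strict.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.HashMap.Strict as HM
import Text.Taggy.DOM (AttrName, AttrValue, Element(..))

-- | Look up an attribute by name (hypothetical helper, not part of taggy).
attr :: AttrName -> Element -> Maybe AttrValue
attr name = HM.lookup name . eltAttrs

-- | A hand-built element standing for <a href="/home"></a> (sample data).
link :: Element
link = Element
  { eltName     = "a"
  , eltAttrs    = HM.fromList [("href", "/home")]
  , eltChildren = []
  }
```

Because eltAttrs is an ordinary HashMap, any function from Data.HashMap.Strict can be applied to it; `attr "href" link` is just a thin wrapper over `HM.lookup`.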

Instances

data Node

A Node is either an Element or some raw text.

Instances

Eq Node 
Show Node 
AsMarkup Node

A Node is convertible to Markup

nodeChildren :: Node -> [Node]

Get the children of a node.

If called on some raw text, this function returns [].
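
Since nodeChildren returns [] for raw text, it can drive a recursive traversal without pattern matching on the node itself. The following is a minimal sketch of that idea; `countNodes`, `depth` and `sampleNodes` are hypothetical names used only for illustration, and the markup is made-up sample input.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.Taggy.DOM (Node, nodeChildren, parseDOM)

-- | Count a node together with all of its descendants.
countNodes :: Node -> Int
countNodes n = 1 + sum (map countNodes (nodeChildren n))

-- | Height of the tree rooted at a node; a node without children has depth 1.
depth :: Node -> Int
depth n = 1 + maximum (0 : map depth (nodeChildren n))

-- | Sample tree to try the functions on (hypothetical input).
sampleNodes :: [Node]
sampleNodes = parseDOM True "<ul><li>one</li><li>two</li></ul>"
```

In GHCi, `map countNodes sampleNodes` and `map depth sampleNodes` give one result per top-level node.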

parseDOM :: Bool -> Text -> [Node]

Parse an HTML or XML document as a DOM tree.

The Bool argument lets you specify whether you want to convert HTML entities to their corresponding Unicode characters, just like in Text.Taggy.Parser.

 parseDOM convertEntities = domify . taggyWith convertEntities
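
To see what the Bool argument changes, one can parse the same input both ways and compare the results. This is only a sketch: `bothWays` is a hypothetical name and the markup is made-up sample input. The string literal is passed through OverloadedStrings, so the example does not spell out which Text type the function expects.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.Taggy.DOM (Node, parseDOM)

-- | The same document parsed with entity conversion (first component)
-- and without it (second component).
bothWays :: ([Node], [Node])
bothWays = (parseDOM True markup, parseDOM False markup)
  where
    -- Hypothetical sample input containing one HTML entity.
    markup = "<p>Fish &amp; chips</p>"
```

Node has a Show instance, so `print bothWays` in GHCi displays both trees side by side for inspection.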

domify :: [Tag] -> [Node]

Transform a list of tags (produced with taggyWith) into a list of top-level nodes. If the document you're working on is valid, there should only be one top-level node, but let's not assume we're living in an ideal world.
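
The sketch below runs the two stages separately on a fragment with two sibling roots, which is the kind of input the caveat above has in mind. `siblings` and `hasSingleRoot` are hypothetical names, and the markup is made-up sample input; taggyWith is assumed to be imported from Text.Taggy.Parser.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.Taggy.DOM (Node, domify)
import Text.Taggy.Parser (taggyWith)

-- | Tokenise a fragment that has two sibling roots, then build the tree.
siblings :: [Node]
siblings = domify (taggyWith True "<p>one</p><p>two</p>")

-- | True when the input produced exactly one top-level node.
hasSingleRoot :: [Node] -> Bool
hasSingleRoot [_] = True
hasSingleRoot _   = False
```

A check such as `hasSingleRoot` lets calling code decide how to treat fragments or malformed documents instead of assuming a single root.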

untilClosed :: Text -> ([Node], [Tag]) -> ([Node], [Tag])