tagsoup-ht-0.2: alternative parser for the tagsoup package

Text.HTML.TagSoup.HT.Tag

type definitions

type Attribute char = (String, [char])

An HTML attribute id="name" generates ("id","name")

data T char

An HTML element. A document is a list of these, [T char]. Open and Close tags are not required to match.

The type parameter char lets you choose between Char, for interpreted HTML entity references, and HTMLChar.T, for uninterpreted HTML entity references. Most often you will want plain Char, since HTMLChar.T is only necessary if you need to know whether a non-ASCII character was encoded as an HTML entity or as a non-ASCII Unicode character.

Constructors

Open String [Attribute char]

An open tag with Attributes in their original order.

Close String

A closing tag

Text [char]

A text node, guaranteed not to be the empty string

Comment String

A comment

Special String String

A tag like <!DOCTYPE ...>

Processing String (Processing char)

A tag like <?xml ...>

Warning String

Mark a syntax error in the input file

Instances

Eq char => Eq (T char) 
Ord char => Ord (T char) 
Show char => Show (T char) 
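
To make the flat-list representation concrete, here is a minimal sketch of a small document written out as a tag list. It does not import the package: the type below is a simplified local copy of the constructors listed above, and it leaves out Processing because that type's constructors are not shown on this page. The value name example is hypothetical.

```haskell
-- Simplified local stand-in for the documented type (an assumption for
-- illustration; not imported from tagsoup-ht). Processing is omitted.
type Attribute char = (String, [char])

data T char
   = Open String [Attribute char]
   | Close String
   | Text [char]
   | Comment String
   | Special String String
   | Warning String
   deriving (Eq, Ord, Show)

-- <p id="name">Hello<br>world</p> as a flat list of tags.
-- Note that the <br> Open has no matching Close, which is allowed.
example :: [T Char]
example =
   [ Open "p" [("id", "name")]
   , Text "Hello"
   , Open "br" []
   , Text "world"
   , Close "p"
   ]

main :: IO ()
main = mapM_ print example
```

Because the list is flat, any nesting structure has to be recovered by the consumer; nothing in the type prevents a Close without a preceding Open.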

data Processing char

Instances

Eq char => Eq (Processing char) 
Ord char => Ord (Processing char) 
Show char => Show (Processing char) 

check for certain tag types

isOpen :: T char -> Bool

Test if a T is an Open

maybeOpen :: T char -> Maybe (String, [Attribute char])

isClose :: T char -> Bool

Test if a T is a Close

isText :: T char -> Bool

Test if a T is a Text

maybeText :: T char -> Maybe [char]

Extract the string from within Text, otherwise Nothing

innerText :: [T char] -> [char]

Extract all text content from tags (similar to Verbatim found in HaXml)
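
The relationship between maybeText and innerText can be sketched as follows. This is a plausible reimplementation over a cut-down local type, not the package's actual source: the names maybeTextSketch and innerTextSketch are hypothetical, and the assumption that only Text nodes contribute (comments and tags are skipped) is inferred from the descriptions above.

```haskell
import Data.Maybe (mapMaybe)

-- Cut-down local stand-in for the documented type (assumption; the real
-- type has more constructors).
data T char
   = Open String [(String, [char])]
   | Close String
   | Text [char]
   | Comment String
   deriving (Eq, Show)

-- Return the contents of a Text node, Nothing for every other tag.
maybeTextSketch :: T char -> Maybe [char]
maybeTextSketch (Text s) = Just s
maybeTextSketch _        = Nothing

-- Concatenate the contents of all Text nodes, in document order.
innerTextSketch :: [T char] -> [char]
innerTextSketch = concat . mapMaybe maybeTextSketch

page :: [T Char]
page =
   [ Open "p" []
   , Text "Hello, "
   , Open "b" []
   , Text "world"
   , Close "b"
   , Comment " ignored "
   , Close "p"
   ]

main :: IO ()
main = putStrLn (innerTextSketch page)  -- prints "Hello, world"
```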

tag processing

canonicalizeSoup :: [T char] -> [T char]

canonicalize :: T char -> T char

Turns all tag names to lower case and converts DOCTYPE to upper case.
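
The described behaviour could look roughly like the sketch below. It is written against a cut-down local type and rests on several assumptions that this page does not confirm: that attribute names are left untouched, that every Special name (not only DOCTYPE) is upper-cased, and that canonicalizeSoup simply maps canonicalize over the list. The names canonicalizeSketch and canonicalizeSoupSketch are hypothetical.

```haskell
import Data.Char (toLower, toUpper)

-- Cut-down local stand-in for the documented type (assumption).
data T char
   = Open String [(String, [char])]
   | Close String
   | Text [char]
   | Special String String
   deriving (Eq, Show)

-- Lower-case Open/Close names and upper-case Special names.
-- Attributes and text are passed through unchanged (assumption).
canonicalizeSketch :: T char -> T char
canonicalizeSketch t =
   case t of
      Open name attrs   -> Open (map toLower name) attrs
      Close name        -> Close (map toLower name)
      Special name info -> Special (map toUpper name) info
      _                 -> t

-- Apply the per-tag normalisation to a whole document (assumption).
canonicalizeSoupSketch :: [T char] -> [T char]
canonicalizeSoupSketch = map canonicalizeSketch

soup :: [T Char]
soup =
   [ Special "doctype" "html"
   , Open "DIV" [("ID", "x")]
   , Text "Hi"
   , Close "Div"
   ]

main :: IO ()
main = mapM_ print (canonicalizeSoupSketch soup)
```

Normalising names once up front means later comparisons such as matching on Open "div" do not need to be case-insensitive.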

textFromCData :: T Char -> T Char

Replace CDATA sections by plain text.