tagsoup-0.6: Parsing and extracting information from (possibly malformed) HTML documents

Text.HTML.TagSoup.Type

Contents

Data structures and parsing
Tag identification
Extraction

Description

The central type in TagSoup

Data structures and parsing

data Tag

An HTML element. A document is represented as [Tag]. There is no requirement for TagOpen and TagClose to match.

Constructors

TagOpen String [Attribute]	An open tag with `Attribute`s in their original order.
TagClose String	A closing tag
TagText String	A text node, guaranteed not to be the empty string
TagComment String	A comment
TagWarning String	Meta: Mark a syntax error in the input file
TagPosition !Row !Column	Meta: The position of a parsed element

Instances

Eq Tag
Ord Tag
Show Tag
TagRep Tag

type Attribute = (String, String)

An HTML attribute as a (name, value) pair: id="name" is represented as ("id","name").

type Row = Int

type Column = Int
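
As an illustration of the types above, the following minimal sketch builds a document value by hand. It redeclares Tag locally (without the TagRep instance) so that it runs with only base installed. The name exampleDoc and the HTML fragment are assumptions made for this example, and the list is written by hand rather than produced by the parser.

```haskell
module Main where

-- Local copy of the documented types, so the sketch needs only base.
type Attribute = (String, String)
type Row = Int
type Column = Int

data Tag
    = TagOpen String [Attribute]
    | TagClose String
    | TagText String
    | TagComment String
    | TagWarning String
    | TagPosition !Row !Column
    deriving (Eq, Ord, Show)

-- Hand-written tags for the fragment: <p id="intro">Hello <b>world</p>
-- The <b> element is never closed, yet the list is still a valid [Tag].
exampleDoc :: [Tag]
exampleDoc =
    [ TagOpen "p" [("id", "intro")]
    , TagText "Hello "
    , TagOpen "b" []
    , TagText "world"
    , TagClose "p"
    ]

main :: IO ()
main = mapM_ print exampleDoc
```

Because nothing forces TagOpen and TagClose to pair up, code that consumes a [Tag] should not assume balanced nesting.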

Tag identification

isTagOpen :: Tag -> Bool

Test if a Tag is a TagOpen

isTagClose :: Tag -> Bool

Test if a Tag is a TagClose

isTagText :: Tag -> Bool

Test if a Tag is a TagText

isTagWarning :: Tag -> Bool

Test if a Tag is a TagWarning

isTagOpenName :: String -> Tag -> Bool

Returns True if the Tag is a TagOpen whose name matches the given name

isTagCloseName :: String -> Tag -> Bool

Returns True if the Tag is a TagClose whose name matches the given name
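
The name-based predicates combine naturally with filter. The sketch below is a local re-implementation for illustration, not the library's source: it assumes "matches" means exact string equality, and unclosedCount and sample are hypothetical names introduced here.

```haskell
module Main where

type Attribute = (String, String)

-- Local copy of the documented type, so the sketch needs only base.
data Tag
    = TagOpen String [Attribute]
    | TagClose String
    | TagText String
    | TagComment String
    | TagWarning String
    | TagPosition !Int !Int
    deriving (Eq, Ord, Show)

-- Assumption: a name "matches" when the strings are exactly equal.
isTagOpenName :: String -> Tag -> Bool
isTagOpenName name (TagOpen n _) = n == name
isTagOpenName _ _ = False

isTagCloseName :: String -> Tag -> Bool
isTagCloseName name (TagClose n) = n == name
isTagCloseName _ _ = False

-- Hypothetical helper: opening tags minus closing tags for one name.
unclosedCount :: String -> [Tag] -> Int
unclosedCount name tags =
    length (filter (isTagOpenName name) tags)
        - length (filter (isTagCloseName name) tags)

sample :: [Tag]
sample =
    [ TagOpen "b" [], TagText "x"
    , TagOpen "b" [], TagText "y"
    , TagClose "b"
    ]

main :: IO ()
main = print (unclosedCount "b" sample)  -- prints 1
```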

Extraction

fromTagText :: Tag -> String

Extract the string from within a TagText; crashes if the Tag is not a TagText

fromAttrib :: String -> Tag -> String

Extract an attribute; crashes if the Tag is not a TagOpen. Returns "" if the attribute is not present.
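
Since fromAttrib is partial, a common pattern is to guard it with a predicate first. The sketch below re-implements the documented behaviour locally and is not the library's source. It assumes the first matching attribute wins when a name is repeated, and hrefs and sample are hypothetical names introduced here.

```haskell
module Main where

import Data.Maybe (fromMaybe)

type Attribute = (String, String)

-- Local copy of the documented type, so the sketch needs only base.
data Tag
    = TagOpen String [Attribute]
    | TagClose String
    | TagText String
    | TagComment String
    | TagWarning String
    | TagPosition !Int !Int
    deriving (Eq, Ord, Show)

isTagOpenName :: String -> Tag -> Bool
isTagOpenName name (TagOpen n _) = n == name
isTagOpenName _ _ = False

-- Documented behaviour: "" when the attribute is absent, and a crash
-- when the Tag is not a TagOpen. Using lookup (first match wins) is an
-- assumption of this sketch.
fromAttrib :: String -> Tag -> String
fromAttrib att (TagOpen _ atts) = fromMaybe "" (lookup att atts)
fromAttrib _ tag = error ("fromAttrib: not a TagOpen: " ++ show tag)

-- Hypothetical helper: the predicate runs first, so fromAttrib only
-- ever sees TagOpen values and cannot crash here.
hrefs :: [Tag] -> [String]
hrefs tags = [fromAttrib "href" t | t <- tags, isTagOpenName "a" t]

sample :: [Tag]
sample =
    [ TagOpen "a" [("href", "/home")], TagText "Home", TagClose "a"
    , TagOpen "a" [("name", "top")]
    ]

main :: IO ()
main = print (hrefs sample)  -- prints ["/home",""]
```

The second anchor has no href, so it contributes the empty string. Callers that need to tell "absent" from "empty" would have to inspect the attribute list directly.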

maybeTagText :: Tag -> Maybe String

Extract the string from within TagText, otherwise Nothing

maybeTagWarning :: Tag -> Maybe String

Extract the string from within TagWarning, otherwise Nothing
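
The Maybe-returning extractors are total, so they pair well with mapMaybe from Data.Maybe when scanning a whole document. The following is a minimal local sketch of that usage and not the library's source. The warning message and the name sample are made up for illustration.

```haskell
module Main where

import Data.Maybe (mapMaybe)

type Attribute = (String, String)

-- Local copy of the documented type, so the sketch needs only base.
data Tag
    = TagOpen String [Attribute]
    | TagClose String
    | TagText String
    | TagComment String
    | TagWarning String
    | TagPosition !Int !Int
    deriving (Eq, Ord, Show)

maybeTagText :: Tag -> Maybe String
maybeTagText (TagText s) = Just s
maybeTagText _ = Nothing

maybeTagWarning :: Tag -> Maybe String
maybeTagWarning (TagWarning s) = Just s
maybeTagWarning _ = Nothing

-- The warning text below is a made-up message, not real parser output.
sample :: [Tag]
sample =
    [ TagOpen "p" []
    , TagText "ok"
    , TagWarning "hypothetical warning message"
    ]

main :: IO ()
main = do
    print (mapMaybe maybeTagText sample)     -- prints ["ok"]
    print (mapMaybe maybeTagWarning sample)  -- prints ["hypothetical warning message"]
```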

innerText :: [Tag] -> String

Extract all text content from tags (similar to Verbatim found in HaXml)
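
One plausible reading of innerText is "concatenate every TagText in order". The sketch below implements that reading locally and is not the library's source. It assumes that comments and warnings do not count as text content, and sample is a hypothetical value introduced here.

```haskell
module Main where

type Attribute = (String, String)

-- Local copy of the documented type, so the sketch needs only base.
data Tag
    = TagOpen String [Attribute]
    | TagClose String
    | TagText String
    | TagComment String
    | TagWarning String
    | TagPosition !Int !Int
    deriving (Eq, Ord, Show)

-- Assumption: only TagText contributes. Tags that fail the pattern in
-- the generator are skipped silently by the list comprehension.
innerText :: [Tag] -> String
innerText tags = concat [s | TagText s <- tags]

sample :: [Tag]
sample =
    [ TagOpen "p" []
    , TagText "Hello "
    , TagOpen "b" []
    , TagText "world"
    , TagClose "b"
    , TagComment "note"
    , TagClose "p"
    ]

main :: IO ()
main = putStrLn (innerText sample)  -- prints Hello world
```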