tagsoup-0.12.6: Parsing and extracting information from (possibly malformed) HTML/XML documents

Safe Haskell: Safe-Inferred

Text.HTML.TagSoup.Entity

Description

This module converts between HTML/XML entities (e.g. &amp;) and the characters they represent.

Documentation

lookupEntity :: String -> Maybe Char

Look up an entity, using lookupNumericEntity if it starts with # and lookupNamedEntity otherwise.
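
The dispatch rule above can be illustrated with base alone. This is a minimal sketch, not the tagsoup source: dispatchEntity, toyNamed and toyNumeric are hypothetical names, and the two toy lookups only stand in for lookupNamedEntity and lookupNumericEntity. The sketch assumes the leading '#' is stripped before the numeric lookup is called, which matches the documented contract of lookupNumericEntity.

```haskell
-- Hypothetical sketch of the dispatch rule; not the tagsoup implementation.
dispatchEntity
  :: (String -> Maybe Char)  -- numeric lookup, receives the text after '#'
  -> (String -> Maybe Char)  -- named lookup, receives the whole name
  -> String
  -> Maybe Char
dispatchEntity numeric _ ('#' : rest) = numeric rest
dispatchEntity _ named name = named name

-- Toy stand-ins (assumptions for illustration only).
toyNamed :: String -> Maybe Char
toyNamed name = lookup name [("amp", '&'), ("lt", '<'), ("gt", '>')]

toyNumeric :: String -> Maybe Char
toyNumeric "65" = Just 'A'
toyNumeric _ = Nothing

main :: IO ()
main = do
  print (dispatchEntity toyNumeric toyNamed "amp")      -- named branch
  print (dispatchEntity toyNumeric toyNamed "#65")      -- numeric branch
  print (dispatchEntity toyNumeric toyNamed "haskell")  -- unknown name
```

Passing the two lookups as arguments keeps the sketch independent of any particular entity table.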

lookupNamedEntity :: String -> Maybe Char

Look up a named entity, using htmlEntities.

 lookupNamedEntity "amp" == Just '&'
 lookupNamedEntity "haskell" == Nothing

lookupNumericEntity :: String -> Maybe Char

Look up a numeric entity; the leading '#' must already have been removed.

 lookupNumericEntity "65" == Just 'A'
 lookupNumericEntity "x41" == Just 'A'
 lookupNumericEntity "x4E" == Just 'N'
 lookupNumericEntity "x4e" == Just 'N'
 lookupNumericEntity "Haskell" == Nothing
 lookupNumericEntity "" == Nothing
 lookupNumericEntity "89439085908539082" == Nothing
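
One way to obtain the behaviour shown in these examples is sketched below using only base. It is an assumption about how such a lookup might be written, not the library's code: numericSketch and fromReads are hypothetical names. Parsing into Integer means an oversized input is rejected by the range check rather than overflowing.

```haskell
import Data.Char (chr, ord)
import Numeric (readDec, readHex)

-- Hypothetical re-creation of the documented behaviour; not the tagsoup source.
-- A leading 'x' selects hexadecimal, anything else is read as decimal.
numericSketch :: String -> Maybe Char
numericSketch ('x' : hex) = fromReads (readHex hex)
numericSketch dec = fromReads (readDec dec)

-- Accept only a complete parse whose value is a valid Char code point.
fromReads :: [(Integer, String)] -> Maybe Char
fromReads [(n, "")]
  | n <= toInteger (ord maxBound) = Just (chr (fromInteger n))
fromReads _ = Nothing

main :: IO ()
main = do
  print (numericSketch "65")
  print (numericSketch "x4e")
  print (numericSketch "Haskell")
  print (numericSketch "89439085908539082")
```

Requiring the unparsed remainder to be empty is what rejects inputs such as "Haskell" and "".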

escapeXMLChar :: Char -> Maybe String

Escape a character before writing it out to XML.

 escapeXMLChar 'a' == Nothing
 escapeXMLChar '&' == Just "amp"
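
Because the result is the bare entity name ("amp", not "&amp;"), a caller has to add the surrounding '&' and ';' itself. The sketch below shows one way to escape a whole string. It is self-contained and does not import tagsoup: escapeCharSketch, escapeText and toyXmlEntities are hypothetical names, and the four-entry table is an assumption about what an XML entity table without apos might contain.

```haskell
import Data.Char (chr)

-- Assumed table for illustration; the real xmlEntities may differ.
toyXmlEntities :: [(String, Int)]
toyXmlEntities = [("quot", 34), ("amp", 38), ("lt", 60), ("gt", 62)]

-- Stand-in for escapeXMLChar: returns the entity name without '&' or ';'.
escapeCharSketch :: Char -> Maybe String
escapeCharSketch c =
  lookup c [(chr code, name) | (name, code) <- toyXmlEntities]

-- Escape every character that has an entity; leave the rest untouched.
escapeText :: String -> String
escapeText = concatMap escapeOne
  where
    escapeOne c = maybe [c] (\name -> "&" ++ name ++ ";") (escapeCharSketch c)

main :: IO ()
main = putStrLn (escapeText "a < b && c")
```

With the real module, escapeCharSketch would presumably be replaced by escapeXMLChar and the rest left unchanged.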

xmlEntities :: [(String, Int)]

A table mapping XML entity names to code points. It does not include apos, because Internet Explorer does not recognise it.

htmlEntities :: [(String, Int)]

A table mapping HTML entity names to code points.