tagchup-0.4.0.2: alternative package for processing of tag soups

Text.HTML.Tagchup.Parser

Description

Parse a string into our custom tag soup data structure.

The parser works only on properly decoded Unicode text. That is, you must decode the input beforehand, e.g. using the decoding functions from the hxt or encoding packages. Text.HTML.Tagchup.Process.findMetaEncoding can help you retrieve the character set encoding from the meta information of the document at hand.
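The decoding step described above might look like the following minimal sketch. It is an illustration only: it uses the text and bytestring packages rather than hxt or encoding, it assumes the document is UTF-8 encoded, and the helper name decodeUtf8Document is hypothetical, not part of tagchup. The resulting String is the kind of decoded input that could then be handed to the parser.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Hypothetical helper (not part of tagchup): decode raw UTF-8 bytes
-- into a Unicode String. Returns Left with an error message if the
-- bytes are not valid UTF-8. Assumes the document is UTF-8 encoded.
decodeUtf8Document :: B.ByteString -> Either String String
decodeUtf8Document bytes =
  case TE.decodeUtf8' bytes of
    Left err   -> Left (show err)
    Right text -> Right (T.unpack text)

main :: IO ()
main = do
  -- The bytes of "<p>é</p>" in UTF-8; 'é' occupies two bytes (0xC3 0xA9).
  let raw = B.pack [0x3C, 0x70, 0x3E, 0xC3, 0xA9, 0x3C, 0x2F, 0x70, 0x3E]
  case decodeUtf8Document raw of
    Left err  -> putStrLn ("decoding failed: " ++ err)
    -- The decoded String has one Char per code point, so it is
    -- shorter than the 9-byte input.
    Right str -> print (length str)
```

For documents in other encodings, the character set would have to be determined first, for instance from the document's meta information, and a matching decoder chosen accordingly.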

Documentation

class C char => CharType char

runSoup :: (C source, StringType sink, Attribute name, Tag name) => source -> [T name sink]

Like runSoupWithPositions but hides source file positions.

runSoupWithPositions :: (C source, StringType sink, Attribute name, Tag name) => source -> [T name sink]

Parse an HTML document to a list of T. Automatically expands out escape characters.

runSoupWithPositionsName :: (C source, StringType sink, Attribute name, Tag name) => FilePath -> source -> [T name sink]

runTag :: (C source, StringType sink, Show sink, Attribute name, Tag name, Show name) => source -> T name sink

Parse a single tag; throws an error if there is a syntax error. This is useful for parsing a match pattern.

runInnerOfTag :: (StringType sink, Show sink, Attribute name, Tag name, Show name) => String -> T name sink

Parse the inner part of a single tag. That is, runTag "<bla>" is the same as runInnerOfTag "bla".