hxt-7.4: A collection of tools for processing XML with Haskell.
Text.XML.HXT.DOM.EditFilters
Description

XML editing filters

Version : $Id: EditFilters.hs,v 1.5 2006/11/12 14:52:59 hxml Exp $

Synopsis
canonicalizeTree :: XmlFilter -> XmlFilter
canonicalizeAllNodes :: XmlFilter
canonicalizeForXPath :: XmlFilter
collapseXText :: XmlFilter
collapseAllXText :: XmlFilter
indentDoc :: XmlFilter
removeWhiteSpace :: XmlFilter
removeAllWhiteSpace :: XmlFilter
removeDocWhiteSpace :: XmlFilter
removeComment :: XmlFilter
removeAllComment :: XmlFilter
transfCdata :: XmlFilter
transfAllCdata :: XmlFilter
transfCdataEscaped :: XmlFilter
transfAllCdataEscaped :: XmlFilter
transfCharRef :: XmlFilter
transfAllCharRef :: XmlFilter
escapeXmlDoc :: XmlFilter
escapeXmlText :: XmlFilter
escapeXmlAttrValue :: XmlFilter
unparseXmlDoc :: XmlFilter
numberLinesInXmlDoc :: XmlFilter
numberLines :: String -> String
treeRepOfXmlDoc :: XmlFilter
haskellRepOfXmlDoc :: XmlFilter
addHeadlineToXmlDoc :: XmlFilter
addXmlPiToDoc :: XmlFilter
Documentation
canonicalizeTree :: XmlFilter -> XmlFilter

Applies some Canonical XML rules to the nodes of a tree.

The rules differ slightly for Canonical XML and XPath in the handling of comments.

Note: This is not the whole canonicalization as it is specified by the W3C Recommendation. Adding attribute defaults or sorting attributes in lexicographic order is done by the transform function of module Text.XML.HXT.Validator.Validation. Replacing entities or line feed normalization is done by the parser.

Not implemented yet:

  • Whitespace within start and end tags is normalized
  • Special characters in attribute values and character content are replaced by character references

see canonicalizeAllNodes and canonicalizeForXPath

canonicalizeAllNodes :: XmlFilter

Canonicalizes a tree and removes comments and <?xml ... ?> declarations.

see canonicalizeTree

canonicalizeForXPath :: XmlFilter

Canonicalizes a tree for XPath. Comment nodes are not removed.

see canonicalizeTree

collapseXText :: XmlFilter
Collects sequences of child XText nodes into one XText node.
collapseAllXText :: XmlFilter

Applies collapseXText recursively.

see also : collapseXText
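
The difference between the single-level and the recursive variant can be sketched on a toy tree type. The Node and Filter types and the names collapseText and collapseAllText below are assumptions made for illustration; they are not the HXT definitions of XText or XmlFilter.

```haskell
-- Toy model of a document tree; these types are illustrative
-- assumptions, not the HXT definitions.
data Node
  = Text String
  | Elem String [Node]
  deriving (Eq, Show)

-- A filter maps one node to a list of result nodes.
type Filter = Node -> [Node]

-- Merge each run of adjacent Text children into a single Text node.
-- Only the direct children of the given node are touched.
collapseText :: Filter
collapseText (Elem name children) = [Elem name (merge children)]
  where
    merge (Text a : Text b : rest) = merge (Text (a ++ b) : rest)
    merge (n : rest)               = n : merge rest
    merge []                       = []
collapseText node = [node]

-- Recursive variant: collapse the children of this node, then
-- descend into every child and do the same there.
collapseAllText :: Filter
collapseAllText node =
  case collapseText node of
    [Elem name children] -> [Elem name (concatMap collapseAllText children)]
    other                -> other
```

In this sketch, collapseText leaves nested elements untouched, while collapseAllText also merges text runs further down the tree.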

indentDoc :: XmlFilter

Filter for indenting a document tree for pretty printing.

The tree is traversed to insert whitespace for tag indentation.

Whitespace is only inserted or changed at places where it is not significant; it is not inserted between tags and text containing non-whitespace characters. Preserving whitespace may be controlled in a document tree by the tag attribute xml:space.

Allowed values for this attribute are default | preserve.

Input is a complete document tree. The result is the semantically equivalent formatted tree.

see also : removeDocWhiteSpace

removeWhiteSpace :: XmlFilter

Simple filter for removing whitespace.

No check for significant whitespace is done.

see also : removeAllWhiteSpace, removeDocWhiteSpace

removeAllWhiteSpace :: XmlFilter

Simple recursive filter for removing all whitespace.

Removes all text nodes in a tree that consist only of whitespace.

see also : removeWhiteSpace, removeDocWhiteSpace
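
As a rough illustration of "removes all text nodes that consist only of whitespace", here is a minimal sketch on a toy tree type. The Node and Filter types and the names dropBlankText and dropAllBlankText are hypothetical, and the bottom-up traversal order is an assumption, not a statement about how HXT implements these filters.

```haskell
import Data.Char (isSpace)

-- Toy model of a document tree (an assumption, not the HXT types).
data Node
  = Text String
  | Elem String [Node]
  deriving (Eq, Show)

type Filter = Node -> [Node]

-- Single-node filter: a text node made up only of whitespace yields no
-- result; every other node is passed through unchanged.
-- Note that the empty string also counts as whitespace-only here.
dropBlankText :: Filter
dropBlankText (Text s) | all isSpace s = []
dropBlankText node = [node]

-- Recursive variant: apply dropBlankText to every node of the tree,
-- children before parents. No check for significant whitespace is made.
dropAllBlankText :: Filter
dropAllBlankText (Elem name children) =
  dropBlankText (Elem name (concatMap dropAllBlankText children))
dropAllBlankText node = dropBlankText node
```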

removeDocWhiteSpace :: XmlFilter

Filter for removing all insignificant whitespace.

The tree is traversed to remove whitespace between tags that was inserted for indentation and readability. Whitespace is only removed at places where it is not significant. Preserving whitespace may be controlled in a document tree by the tag attribute xml:space.

Allowed values for this attribute are default | preserve.

Input is the root node of the document to be cleaned up. Output is the semantically equivalent simplified tree.

see also : indentDoc, removeAllWhiteSpace
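
The role of xml:space can be sketched as follows. This is a simplified model under stated assumptions: the Node type and the name stripInsignificant are hypothetical, attribute inheritance follows the usual XML rule that xml:space applies to an element and its descendants until overridden, and the additional significance checks a real implementation needs (for example, whitespace next to non-whitespace text) are left out.

```haskell
import Data.Char (isSpace)

-- Toy tree with attributes (an assumption, not the HXT types).
data Node
  = Text String
  | Elem String [(String, String)] [Node]
  deriving (Eq, Show)

type Filter = Node -> [Node]

-- Remove whitespace-only text nodes, except inside elements for which
-- xml:space="preserve" is in effect. The flag is inherited by
-- descendants and can be reset with xml:space="default".
stripInsignificant :: Filter
stripInsignificant = go False
  where
    go preserve (Text s)
      | not preserve && all isSpace s = []
      | otherwise                     = [Text s]
    go preserve (Elem name attrs children) =
      let preserve' = case lookup "xml:space" attrs of
            Just "preserve" -> True
            Just "default"  -> False
            _               -> preserve
      in [Elem name attrs (concatMap (go preserve') children)]
```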

removeComment :: XmlFilter
Removes comments.
removeAllComment :: XmlFilter
Removes all comments recursively.
transfCdata :: XmlFilter
Converts a CDATA section into normal text sections.
transfAllCdata :: XmlFilter
Converts the CDATA sections in a whole document tree.
transfCdataEscaped :: XmlFilter
Converts a CDATA section into normal text nodes.
transfAllCdataEscaped :: XmlFilter
Converts the CDATA sections in a whole document tree into normal text nodes.
transfCharRef :: XmlFilter
Converts character references to normal text.
transfAllCharRef :: XmlFilter
Recursively converts all character references to normal text.
escapeXmlDoc :: XmlFilter

Converts the special XML chars ", <, >, & and ' in a document to character references. Attribute values are converted with escapeXmlAttrValue.

see also: escapeXmlText, escapeXmlAttrValue

escapeXmlText :: XmlFilter

Converts the special XML chars in a text or comment node into character references.

see also escapeXmlDoc

escapeXmlAttrValue :: XmlFilter

Converts the special XML chars in an attribute value into character references. Not only the XML specials but also \n, \r and \t are converted.

see also: escapeXmlDoc, escapeXmlText
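
The two character sets described above can be compared in a small string-level sketch. It is an illustration only: the function names are hypothetical, the real filters operate on tree nodes rather than on strings, and the use of decimal numeric references (such as &#60;) as the output form is an assumption.

```haskell
import Data.Char (ord)

-- Decimal numeric character reference for a single character.
charRef :: Char -> String
charRef c = "&#" ++ show (ord c) ++ ";"

-- Replace every character from the given set by its character reference.
escapeWith :: [Char] -> String -> String
escapeWith specials = concatMap esc
  where
    esc c
      | c `elem` specials = charRef c
      | otherwise         = [c]

-- The five special XML chars: ", <, >, & and '.
xmlSpecials :: [Char]
xmlSpecials = "\"<>&'"

-- Text and comment content: only the XML specials are escaped.
escapeTextSketch :: String -> String
escapeTextSketch = escapeWith xmlSpecials

-- Attribute values: newline, carriage return and tab are escaped too.
escapeAttrSketch :: String -> String
escapeAttrSketch = escapeWith (xmlSpecials ++ "\n\r\t")
```

Escaping \n, \r and \t in attribute values matters because an XML parser normalizes literal whitespace characters in attribute values to spaces, whereas character references survive that normalization.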

unparseXmlDoc :: XmlFilter

Converts a document tree into an output string representation with respect to the output encoding.

The children of the document root are substituted by a single text node holding the text representation of the document.

The document is encoded according to the output-encoding attribute in the root node or, if that is not present, according to the encoding attribute, which holds the original input encoding. If the encoding is not specified or not supported, UTF-8 is used.
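
The fallback order for the encoding can be sketched as a lookup over the root node's attributes. The attribute names output-encoding and encoding come from the description above; the Attrs type, the function name chooseEncoding and the list of supported encodings are assumptions made for this example.

```haskell
import Control.Applicative ((<|>))

-- Root-node attributes as name/value pairs (an assumption).
type Attrs = [(String, String)]

-- Hypothetical list of encodings the serializer can produce.
supported :: [String]
supported = ["UTF-8", "ISO-8859-1", "US-ASCII"]

-- Prefer output-encoding, fall back to encoding, and use UTF-8 when
-- neither attribute is present or the chosen value is not supported.
chooseEncoding :: Attrs -> String
chooseEncoding attrs =
  case lookup "output-encoding" attrs <|> lookup "encoding" attrs of
    Just enc | enc `elem` supported -> enc
    _                               -> "UTF-8"
```

Note that in this sketch an unsupported output-encoding falls straight through to UTF-8 rather than to the input encoding, which is one reading of the description; the comparison is also case-sensitive for simplicity.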

numberLinesInXmlDoc :: XmlFilter

Converts a document into text and adds line numbers to the text representation.

The result is a root node with a single text node as child. Useful for debugging and trace output.

see also : haskellRepOfXmlDoc, treeRepOfXmlDoc

numberLines :: String -> String
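
numberLines carries no description here. Judging only from its name and type, it presumably prefixes each line of a string with its line number. A minimal sketch of that idea follows; the name numberLinesSketch, the four-column right-aligned number and the two-space separator are all assumptions, so the output format of the real function may differ.

```haskell
-- Hypothetical stand-in for numberLines: the padding width and the
-- separator are assumptions, not the library's actual format.
numberLinesSketch :: String -> String
numberLinesSketch = unlines . zipWith prefix [1 :: Int ..] . lines
  where
    prefix n l = pad 4 (show n) ++ "  " ++ l
    -- Right-align s in a field of width w (no truncation if s is longer).
    pad w s = replicate (w - length s) ' ' ++ s
```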
treeRepOfXmlDoc :: XmlFilter

Converts a document into a text representation in tree form.

Useful for debugging and trace output.

see also : haskellRepOfXmlDoc, numberLinesInXmlDoc

haskellRepOfXmlDoc :: XmlFilter

Converts a document into a Haskell representation (with show).

Useful for debugging and trace output.

see also : treeRepOfXmlDoc, numberLinesInXmlDoc

addHeadlineToXmlDoc :: XmlFilter
addXmlPiToDoc :: XmlFilter
Produced by Haddock version 2.3.0