hxt-filter-8.4.2: A collection of tools for processing XML with Haskell (Filter variant).

Text.XML.HXT.DOM.EditFilters

Description

XML editing filters

Documentation

canonicalizeTree :: XmlFilter -> XmlFilter

Applies some Canonical XML rules to the nodes of a tree.

The rules differ slightly between Canonical XML and XPath in the handling of comments.

Note: This is not the whole canonicalization as it is specified by the W3C Recommendation. Adding attribute defaults or sorting attributes in lexicographic order is done by the transform function of module Text.XML.HXT.Validator.Validation. Replacing entities or line feed normalization is done by the parser.

Not implemented yet:

  • Whitespace within start and end tags is normalized
  • Special characters in attribute values and character content are replaced by character references

see canonicalizeAllNodes and canonicalizeForXPath

canonicalizeAllNodes :: XmlFilter

Canonicalizes a tree and removes comments and <?xml ... ?> declarations.

see canonicalizeTree

canonicalizeForXPath :: XmlFilter

Canonicalizes a tree for XPath. Comment nodes are not removed.

see canonicalizeTree

collapseXText :: XmlFilter

Collects sequences of child XText nodes into one XText node.

collapseAllXText :: XmlFilter

Applies collapseXText recursively.

see also : collapseXText
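The merging of adjacent text nodes described for collapseXText and collapseAllXText can be illustrated with a minimal sketch. The Node type and the names collapseText and collapseAllText are hypothetical stand-ins invented for this example; they are not the HXT XmlTree type or the library's implementation.

```haskell
-- Toy document tree for illustration only; NOT the HXT XmlTree type.
data Node
  = Elem String [Node]   -- element name and children
  | Text String          -- character data
  deriving (Eq, Show)

-- Merge each run of adjacent Text children into a single Text node.
-- Only the direct children of the given element are touched.
collapseText :: Node -> Node
collapseText (Elem name cs) = Elem name (merge cs)
  where
    merge (Text a : Text b : rest) = merge (Text (a ++ b) : rest)
    merge (c : rest)               = c : merge rest
    merge []                       = []
collapseText n = n

-- Recursive variant: collapse the children first, then this level.
collapseAllText :: Node -> Node
collapseAllText (Elem name cs) = collapseText (Elem name (map collapseAllText cs))
collapseAllText n              = n
```

Under these assumptions, an element with children Text "x", Text "y", an empty element and Text "z" ends up with the two leading text nodes merged, while the text node after the element stays separate.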

indentDoc :: XmlFilter

Filter for indenting a document tree for pretty printing.

The tree is traversed and whitespace is inserted for tag indentation.

Whitespace is only inserted or changed at places where it isn't significant; it is not inserted between tags and text containing non-whitespace characters.

Preserving whitespace may be controlled in a document tree by the tag attribute xml:space.

Allowed values for this attribute are default | preserve.

Input is a complete document tree. Result is the semantically equivalent formatted tree.

see also : removeDocWhiteSpace
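The indentation rule described for indentDoc (insert whitespace only where no non-whitespace text is adjacent) can be sketched roughly as follows. The Node type, the indent function and the two-space step are assumptions made for this example, and xml:space handling is left out entirely; this is not the library's implementation.

```haskell
import Data.Char (isSpace)

-- Toy document tree for illustration only; NOT the HXT XmlTree type.
data Node
  = Elem String [Node]
  | Text String
  deriving (Eq, Show)

isText :: Node -> Bool
isText (Text _) = True
isText _        = False

isWhiteText :: Node -> Bool
isWhiteText (Text s) = all isSpace s
isWhiteText _        = False

-- Indent an element at the given nesting level (two spaces per level, an assumption).
-- Elements with real text content, or with no non-whitespace children, are left
-- untouched so that significant whitespace is never changed.
indent :: Int -> Node -> Node
indent level (Elem name cs)
  | null kept || any isText kept = Elem name cs
  | otherwise = Elem name (concatMap child kept ++ [Text (nl level)])
  where
    kept    = filter (not . isWhiteText) cs          -- drop old formatting whitespace
    child c = [Text (nl (level + 1)), indent (level + 1) c]
    nl n    = '\n' : replicate (2 * n) ' '
indent _ n = n
```

In this sketch an element containing only child elements gets a newline plus indentation before each child and a newline before its end tag, while an element such as `<b>x</b>` is returned unchanged.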

removeWhiteSpace :: XmlFilter

Simple filter for removing whitespace.

No check for significant whitespace is done.

see also : removeAllWhiteSpace, removeDocWhiteSpace

removeAllWhiteSpace :: XmlFilter

Simple recursive filter for removing all whitespace.

Removes all text nodes in a tree that consist only of whitespace.

see also : removeWhiteSpace, removeDocWhiteSpace
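As a rough illustration of removing whitespace-only text nodes, the sketch below models a filter as a function from a node to a list of result nodes (an empty list means the node is dropped). The Node type, the Filter synonym and the names removeWS and removeAllWS are hypothetical and only mirror the behaviour described above; they are not the library's definitions.

```haskell
import Data.Char (isSpace)

-- Toy document tree for illustration only; NOT the HXT XmlTree type.
data Node
  = Elem String [Node]
  | Text String
  deriving (Eq, Show)

-- A filter maps one node to zero or more result nodes.
type Filter = Node -> [Node]

-- Drop the node if it is a text node made up solely of whitespace
-- (this includes the empty text node); keep everything else.
removeWS :: Filter
removeWS (Text s) | all isSpace s = []
removeWS n                        = [n]

-- Recursive variant: apply the same rule at every level of the tree.
removeAllWS :: Filter
removeAllWS (Elem name cs) = [Elem name (concatMap removeAllWS cs)]
removeAllWS n              = removeWS n
```

Note that neither function looks at context, so whitespace that is significant for the application is removed as well.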

removeDocWhiteSpace :: XmlFilter

Filter for removing all insignificant whitespace.

The tree is traversed to remove whitespace between tags that was inserted for indentation and readability. Whitespace is only removed at places where it's not significant. Preserving whitespace may be controlled in a document tree by the tag attribute xml:space.

Allowed values for this attribute are default | preserve.

Input is the root node of the document to be cleaned up. Output is the semantically equivalent simplified tree.

see also : indentDoc, removeAllWhiteSpace
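The role of xml:space in whitespace removal can be sketched as follows. The Node type and stripInsignificant are hypothetical, and the sketch makes a simplifying assumption: every whitespace-only text node outside a preserve region is treated as insignificant, which may be cruder than what the library does.

```haskell
import Data.Char (isSpace)

-- Toy document tree with attributes; NOT the HXT XmlTree type.
data Node
  = Elem String [(String, String)] [Node]
  | Text String
  deriving (Eq, Show)

-- Remove whitespace-only text nodes unless xml:space="preserve" is in effect.
-- The Bool argument carries the inherited "preserve" state down the tree.
stripInsignificant :: Bool -> Node -> [Node]
stripInsignificant preserve (Text s)
  | not preserve && all isSpace s = []
  | otherwise                     = [Text s]
stripInsignificant preserve (Elem name attrs cs) =
  [Elem name attrs (concatMap (stripInsignificant preserve') cs)]
  where
    preserve' = case lookup "xml:space" attrs of
      Just "preserve" -> True
      Just "default"  -> False
      _               -> preserve   -- attribute absent: inherit from parent
```

Calling `stripInsignificant False` on the root keeps whitespace inside any element marked xml:space="preserve" and lets a nested xml:space="default" switch removal back on.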

removeComment :: XmlFilter

Removes comments.

removeAllComment :: XmlFilter

Removes all comments recursively.

transfCdata :: XmlFilter

Converts a CDATA section into a normal text section.

transfAllCdata :: XmlFilter

Converts CDATA sections in the whole document tree.

transfCdataEscaped :: XmlFilter

Converts a CDATA section into a normal text node.

transfAllCdataEscaped :: XmlFilter

Converts CDATA sections in the whole document tree into normal text nodes.

transfCharRef :: XmlFilter

Converts character references to normal text.

transfAllCharRef :: XmlFilter

Recursively converts all character references to normal text.
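The conversion of CDATA sections and character references into plain text nodes can be pictured with a small sketch. The Node type with CData and CharRef constructors and the names toText and toTextAll are assumptions made for illustration; the escaping behaviour of the *Escaped variants is not modelled.

```haskell
import Data.Char (chr)

-- Toy document tree for illustration only; NOT the HXT XmlTree type.
data Node
  = Elem String [Node]
  | Text String
  | CData String      -- content of a <![CDATA[ ... ]]> section
  | CharRef Int       -- numeric character reference, e.g. &#65;
  deriving (Eq, Show)

-- Convert a single CDATA or character-reference node into a text node.
-- chr raises an error for values outside the Unicode range; a real
-- implementation would have to validate the code point first.
toText :: Node -> Node
toText (CData s)   = Text s
toText (CharRef i) = Text [chr i]
toText n           = n

-- Recursive variant: convert every such node in the whole tree.
toTextAll :: Node -> Node
toTextAll (Elem name cs) = Elem name (map toTextAll cs)
toTextAll n              = toText n
```

After such a conversion a tree typically contains runs of adjacent text nodes, which is where a collapsing step such as collapseAllXText becomes useful.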

escapeXmlDoc :: XmlFilter

Converts the special XML chars ", <, >, & and ' in a document to character references. Attribute values are converted with escapeXmlAttrValue.

see also: escapeXmlText, escapeXmlAttrValue

escapeXmlText :: XmlFilter

Converts the special XML chars in a text or comment node into character references.

see also: escapeXmlDoc

escapeXmlAttrValue :: XmlFilter

Converts the special XML chars in an attribute value into character references. Not only the XML specials but also \n, \r and \t are converted.

see also: escapeXmlDoc, escapeXmlText
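The difference between escaping text content and escaping attribute values can be sketched on plain strings. The functions charRef, escapeText and escapeAttr are hypothetical, and the decimal `&#N;` form is an assumption; the documentation above only states that character references are produced.

```haskell
import Data.Char (ord)

-- Render one character as a decimal numeric character reference (assumed format).
charRef :: Char -> String
charRef c = "&#" ++ show (ord c) ++ ";"

-- Replace every character from the given set by its character reference.
escapeWith :: String -> String -> String
escapeWith specials = concatMap esc
  where
    esc c | c `elem` specials = charRef c
          | otherwise         = [c]

-- Text and comment content: only the five XML special characters.
escapeText :: String -> String
escapeText = escapeWith "\"<>&'"

-- Attribute values: the XML specials plus newline, carriage return and tab.
escapeAttr :: String -> String
escapeAttr = escapeWith "\"<>&'\n\r\t"
```

For example, `escapeText "a<b"` yields `a&#60;b`, while a tab character is left alone by escapeText and turned into `&#9;` by escapeAttr.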

unparseXmlDoc :: XmlFilter

Converts a document tree into an output string representation with respect to the output encoding.

The children of the document root are substituted by a single text node holding the text representation of the document.

Encoding of the document is performed with respect to the output-encoding attribute in the root node or, if that is not present, to the encoding attribute for the original input encoding. If the encoding is not specified or not supported, UTF-8 is used.
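The encoding fallback described for unparseXmlDoc can be read as a small decision procedure. The sketch below works on a plain attribute list; chooseEncoding and the list of supported encodings are hypothetical, and the case-insensitive comparison is an assumption.

```haskell
import Control.Applicative ((<|>))
import Data.Char (toUpper)

-- Encodings this sketch pretends to support (hypothetical list).
supported :: [String]
supported = ["UTF-8", "ISO-8859-1", "US-ASCII"]

-- Pick the output encoding from the root node's attributes:
-- prefer "output-encoding", otherwise use "encoding"; fall back to UTF-8
-- when neither is present or the chosen one is not supported.
chooseEncoding :: [(String, String)] -> String
chooseEncoding attrs =
  case lookup "output-encoding" attrs <|> lookup "encoding" attrs of
    Just enc | map toUpper enc `elem` supported -> map toUpper enc
    _                                           -> "UTF-8"
```

In this reading an unsupported output-encoding leads directly to UTF-8 rather than to a second attempt with the input encoding; the original text leaves that detail open.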

numberLinesInXmlDoc :: XmlFilter

Converts a document into text and adds line numbers to the text representation.

Result is a root node with a single text node as child. Useful for debugging and trace output.

see also : haskellRepOfXmlDoc, treeRepOfXmlDoc
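The line-numbering step itself is easy to picture on a plain string. The function numberLines and the layout (number right-aligned in four columns, followed by two spaces) are assumptions for illustration; the actual output format of numberLinesInXmlDoc is not specified here.

```haskell
-- Prefix every line of the input with its 1-based line number.
-- Numbers are right-aligned in a four-character column (assumed layout).
numberLines :: String -> String
numberLines = unlines . zipWith addNo [1 :: Int ..] . lines
  where
    addNo n l = pad (show n) ++ "  " ++ l
    pad s     = replicate (4 - length s) ' ' ++ s
```

Applied to the two-line string "a\nb\n", this sketch produces the lines "   1  a" and "   2  b".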

treeRepOfXmlDoc :: XmlFilter

Converts a document into a text representation in tree form.

Useful for debugging and trace output.

see also : haskellRepOfXmlDoc, numberLinesInXmlDoc

haskellRepOfXmlDoc :: XmlFilter

Converts a document into a Haskell representation (with show).

Useful for debugging and trace output.

see also : treeRepOfXmlDoc, numberLinesInXmlDoc