This module provides for simple DOM traversal. It is inspired by XPath. There are two central concepts here:
- A Cursor represents a node in the DOM. It also contains information on the node's location. While the Node datatype will only know of its children, a Cursor knows about its parent and siblings as well. (The underlying mechanism allowing this is called a zipper; see http://www.haskell.org/haskellwiki/Zipper and http://www.haskell.org/haskellwiki/Tying_the_Knot.)
- An Axis, in its simplest form, takes a Cursor and returns a list of Cursors. It is used for selections, such as finding children, ancestors, etc. Axes can be chained together to express complex rules, such as all children named foo.
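As a rough illustration of the "all children named foo" rule, the sketch below chains axes over a small made-up document. It assumes the xml-conduit package (Text.XML for parsing); the sample XML and the names root and fooTexts are hypothetical and not part of this module.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.XML (def, parseText_)
import Text.XML.Cursor (Cursor, child, content, element, fromDocument, (>=>))

-- Hypothetical sample document; fromDocument gives a Cursor on the root element.
root :: Cursor
root = fromDocument
  (parseText_ def "<root><foo>one</foo><bar>two</bar><foo>three</foo></root>")

-- All children named foo, then the text nodes inside each of them.
fooTexts :: [Text]
fooTexts = (child >=> element "foo" >=> child >=> content) root

main :: IO ()
main = print fooTexts
```

Each step is an ordinary function from a Cursor to a list, so the chain can be extended or shortened without touching the rest.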
The terminology used in this module is taken directly from the XPath specification: http://www.w3.org/TR/xpath/. For those familiar with XPath, the one major difference is that attributes are not considered nodes in this module.
- type Cursor = Cursor Node
- type Axis = Cursor -> [Cursor]
- fromDocument :: Document -> Cursor
- fromNode :: Node -> Cursor
- cut :: Cursor -> Cursor
- parent :: Axis node
- precedingSibling :: Axis node
- followingSibling :: Axis node
- child :: Cursor node -> [Cursor node]
- node :: Cursor node -> node
- preceding :: Axis node
- following :: Axis node
- ancestor :: Axis node
- descendant :: Axis node
- orSelf :: Axis node -> Axis node
- check :: Boolean b => (Cursor -> b) -> Axis
- checkNode :: Boolean b => (Node -> b) -> Axis
- checkElement :: Boolean b => (Element -> b) -> Axis
- checkName :: Boolean b => (Name -> b) -> Axis
- anyElement :: Axis
- element :: Name -> Axis
- laxElement :: Text -> Axis
- content :: Cursor -> [Text]
- attribute :: Name -> Cursor -> [Text]
- laxAttribute :: Text -> Cursor -> [Text]
- hasAttribute :: Name -> Axis
- attributeIs :: Name -> Text -> Axis
- (&|) :: (Cursor node -> [a]) -> (a -> b) -> Cursor node -> [b]
- (&/) :: Axis node -> (Cursor node -> [a]) -> Cursor node -> [a]
- (&//) :: Axis node -> (Cursor node -> [a]) -> Cursor node -> [a]
- (&.//) :: Axis node -> (Cursor node -> [a]) -> Cursor node -> [a]
- ($|) :: Cursor node -> (Cursor node -> a) -> a
- ($/) :: Cursor node -> (Cursor node -> [a]) -> [a]
- ($//) :: Cursor node -> (Cursor node -> [a]) -> [a]
- ($.//) :: Cursor node -> (Cursor node -> [a]) -> [a]
- (>=>) :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c
- class Boolean a where
- force :: Failure e f => e -> [a] -> f a
- forceM :: Failure e f => e -> [f a] -> f a
Data types
type Cursor = Cursor Node
A cursor: contains an XML Node and pointers to its children, ancestors and siblings.
type Axis = Cursor -> [Cursor]
The type of an Axis that returns a list of Cursors. They are roughly modeled after http://www.w3.org/TR/xpath/#axes.
Axes can be composed with >=>, where e.g. f >=> g means that the g axis is applied to every result of the f axis, and all results are joined together.
Because Axis is just a type synonym for Cursor -> [Cursor], it is possible to use other standard functions like >>= or concatMap similarly.
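The equivalence between >=>, >>= and concatMap can be checked on a small example. This is a minimal sketch assuming the xml-conduit package; the document and the three result names are made up for illustration. Results are compared through node, because Node has an Eq instance.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.XML (Node, def, parseText_)
import Text.XML.Cursor (Cursor, child, fromDocument, node, (>=>))

-- Hypothetical document with three grandchildren of the root: c, d and e.
root :: Cursor
root = fromDocument (parseText_ def "<a><b><c/></b><b><d/><e/></b></a>")

-- Three ways to write "children of children" in the list monad.
viaKleisli, viaBind, viaConcatMap :: [Node]
viaKleisli   = map node ((child >=> child) root)
viaBind      = map node (child root >>= child)
viaConcatMap = map node (concatMap child (child root))

main :: IO ()
main = print (viaKleisli == viaBind && viaBind == viaConcatMap, length viaKleisli)
```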
The operators &|, &/, &// and &.// combine a first axis with a second operand. The second operand works, respectively, on the results of the first axis themselves, on their children, on their descendants, or on the results together with their descendants.
The operators $|, $/, $// and $.// apply an axis (right-hand side) to a cursor (left-hand side). The axis is applied, respectively, to the cursor itself, to its children, to its descendants, or to the cursor together with its descendants.
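The four application operators differ only in where the right-hand axis starts. The following sketch, assuming the xml-conduit package and a made-up document, shows this by counting matches for the root's own name and by collecting item texts at different depths.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.XML (def, parseText_)
import Text.XML.Cursor
  (Cursor, content, element, fromDocument, ($.//), ($/), ($//), ($|), (&/))

-- Hypothetical document: one item directly under the root, one nested deeper.
root :: Cursor
root = fromDocument
  (parseText_ def "<list><item>a</item><group><item>b</item></group></list>")

-- Text of item elements that are children of the root.
itemsBelowRoot :: [Text]
itemsBelowRoot = root $/ element "item" &/ content

-- Text of item elements anywhere below the root.
itemsAnywhere :: [Text]
itemsAnywhere = root $// element "item" &/ content

-- How often "list" matches on the cursor itself, on its descendants,
-- and on both taken together.
listOnSelf, listInDescendants, listInSelfOrDescendants :: Int
listOnSelf              = length (root $| element "list")
listInDescendants       = length (root $// element "list")
listInSelfOrDescendants = length (root $.// element "list")

main :: IO ()
main = do
  print itemsBelowRoot
  print itemsAnywhere
  print (listOnSelf, listInDescendants, listInSelfOrDescendants)
```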
Note that many of these operators also work on generalised Axes that can return lists of something other than Cursors, for example Content elements.
Production
fromDocument :: Document -> Cursor
cut :: Cursor -> Cursor
Cut a cursor off from its parent. The idea is to allow restricting the scope of queries on it.
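The effect of cut can be observed through the ancestor axis: a cut cursor no longer sees anything above it, but keeps its own subtree. This is a small sketch assuming the xml-conduit package; the document and value names are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.XML (def, parseText_)
import Text.XML.Cursor
  (Cursor, ancestor, child, cut, element, fromDocument, ($/))

-- Hypothetical document: a contains b, b contains c.
root :: Cursor
root = fromDocument (parseText_ def "<a><b><c/></b></a>")

-- Cursors on the b element, reached from the root.
bCursors :: [Cursor]
bCursors = root $/ element "b"

-- Number of ancestors seen from b before and after cutting.
ancestorsBefore, ancestorsAfter :: [Int]
ancestorsBefore = map (length . ancestor) bCursors
ancestorsAfter  = map (length . ancestor . cut) bCursors

-- The subtree below b is still reachable after the cut.
childrenAfterCut :: [Int]
childrenAfterCut = map (length . child . cut) bCursors

main :: IO ()
main = print (ancestorsBefore, ancestorsAfter, childrenAfterCut)
```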
Axes
parent :: Axis node
The parent axis. As described in XPath: the parent axis contains the parent of the context node, if there is one.
Every node but the root element of the document has a parent. Parent nodes will always be NodeElements.
precedingSibling :: Axis node
The preceding-sibling axis. XPath: the preceding-sibling axis contains all the preceding siblings of the context node [...].
followingSibling :: Axis node
The following-sibling axis. XPath: the following-sibling axis contains all the following siblings of the context node [...].
child :: Cursor node -> [Cursor node]
The child axis. XPath: the child axis contains the children of the context node.
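To make the parent, sibling and child axes concrete, here is a minimal sketch that starts from the middle child of a made-up document. It assumes the xml-conduit package; localName is a hypothetical helper written for this example (it is not part of the module) that returns the local tag name of an element cursor and nothing for other nodes.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.XML (Element (..), Name (..), Node (..), def, parseText_)
import Text.XML.Cursor
  (Cursor, element, followingSibling, fromDocument, node, parent,
   precedingSibling, ($/))

-- Hypothetical helper: local tag name of an element, empty for other nodes.
localName :: Cursor -> [Text]
localName c = case node c of
  NodeElement e -> [nameLocalName (elementName e)]
  _             -> []

-- Hypothetical document with three sibling elements.
root :: Cursor
root = fromDocument (parseText_ def "<r><a/><b/><c/></r>")

-- Cursor(s) on the middle element b.
middle :: [Cursor]
middle = root $/ element "b"

namesBefore, namesAfter, namesAbove, namesOfChildren :: [Text]
namesBefore     = middle >>= precedingSibling >>= localName
namesAfter      = middle >>= followingSibling >>= localName
namesAbove      = middle >>= parent >>= localName
namesOfChildren = root $/ localName

main :: IO ()
main = mapM_ print [namesBefore, namesAfter, namesAbove, namesOfChildren]
```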
preceding :: Axis node
The preceding axis. XPath: the preceding axis contains all nodes in the same document as the context node that are before the context node in document order, excluding any ancestors and excluding attribute nodes and namespace nodes.
following :: Axis node
The following axis. XPath: the following axis contains all nodes in the same document as the context node that are after the context node in document order, excluding any descendants and excluding attribute nodes and namespace nodes.
ancestor :: Axis node
The ancestor axis. XPath: the ancestor axis contains the ancestors of the context node; the ancestors of the context node consist of the parent of context node and the parent's parent and so on; thus, the ancestor axis will always include the root node, unless the context node is the root node.
descendant :: Axis node
The descendant axis. XPath: the descendant axis contains the descendants of the context node; a descendant is a child or a child of a child and so on; thus the descendant axis never contains attribute or namespace nodes.
orSelf :: Axis node -> Axis node
Modify an axis by adding the context node itself as the first element of the result list.
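The difference between an axis and its orSelf variant can be seen on a three-level document. This sketch assumes the xml-conduit package; localName is again a hypothetical helper defined only for the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.XML (Element (..), Name (..), Node (..), def, parseText_)
import Text.XML.Cursor
  (Cursor, ancestor, descendant, element, fromDocument, node, orSelf, ($//))

-- Hypothetical helper: local tag name of an element, empty for other nodes.
localName :: Cursor -> [Text]
localName c = case node c of
  NodeElement e -> [nameLocalName (elementName e)]
  _             -> []

-- Hypothetical document: a contains b, b contains c.
root :: Cursor
root = fromDocument (parseText_ def "<a><b><c/></b></a>")

-- Descendants of the root, without and with the root itself.
belowRoot, rootAndBelow :: [Text]
belowRoot    = descendant root >>= localName
rootAndBelow = orSelf descendant root >>= localName

-- Seen from c: the number of ancestors, without and with c itself.
ancestorCount, ancestorOrSelfCount :: [Int]
ancestorCount       = map (length . ancestor) (root $// element "c")
ancestorOrSelfCount = map (length . orSelf ancestor) (root $// element "c")

main :: IO ()
main = do
  print belowRoot
  print rootAndBelow
  print (ancestorCount, ancestorOrSelfCount)
```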
Filters
checkElement :: Boolean b => (Element -> b) -> Axis
Filter elements that don't pass a check, and remove all non-elements.
checkName :: Boolean b => (Name -> b) -> Axis
Filter elements that don't pass a name check, and remove all non-elements.
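A short sketch of both filters with plain Bool predicates follows. It assumes the xml-conduit package; the document, the predicates and the localName helper are made up for illustration. Note that the text node x in the sample disappears in both cases, since non-elements are removed.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Data.Text as T
import Text.XML (Element (..), Name (..), Node (..), def, parseText_)
import Text.XML.Cursor
  (Cursor, checkElement, checkName, fromDocument, node, ($/), (>=>))

-- Hypothetical helper: local tag name of an element, empty for other nodes.
localName :: Cursor -> [Text]
localName c = case node c of
  NodeElement e -> [nameLocalName (elementName e)]
  _             -> []

-- Hypothetical document with mixed content under the root.
root :: Cursor
root = fromDocument (parseText_ def "<r><a>t</a>x<bb/><c/></r>")

-- Children whose local name is exactly one character long.
oneLetterNames :: [Text]
oneLetterNames =
  root $/ checkName ((== 1) . T.length . nameLocalName) >=> localName

-- Children that are elements without any child nodes.
emptyElements :: [Text]
emptyElements = root $/ checkElement (null . elementNodes) >=> localName

main :: IO ()
main = print oneLetterNames >> print emptyElements
```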
anyElement :: Axis
Remove all non-elements. Compare roughly to XPath: A node test * is true for any node of the principal node type. For example, child::* will select all element children of the context node [...].
element :: Name -> Axis
Select only those elements with a matching tag name. XPath: A node test that is a QName is true if and only if the type of the node (see [5 Data Model]) is the principal node type and has an expanded-name equal to the expanded-name specified by the QName.
laxElement :: Text -> Axis
Select only those elements with a loosely matching tag name. Namespace and case are ignored. XPath: A node test that is a QName is true if and only if the type of the node (see [5 Data Model]) is the principal node type and has an expanded-name equal to the expanded-name specified by the QName.
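The contrast between element and laxElement shows up once namespaces and letter case are involved. The sketch below assumes the xml-conduit package and uses a made-up document and namespace URI; the "{urn:x}Item" literal relies on the Clark-notation IsString instance of Name.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.XML (def, parseText_)
import Text.XML.Cursor
  (Cursor, content, element, fromDocument, laxElement, ($/), (&/))

-- Hypothetical document: one namespaced Item and one plain item.
root :: Cursor
root = fromDocument
  (parseText_ def "<Root xmlns:x='urn:x'><x:Item>1</x:Item><item>2</item></Root>")

-- Exact match on an un-namespaced name.
strictMatch :: [Text]
strictMatch = root $/ element "item" &/ content

-- Exact match on a namespaced name, written in Clark notation.
qualifiedMatch :: [Text]
qualifiedMatch = root $/ element "{urn:x}Item" &/ content

-- Loose match: namespace and case are ignored.
laxMatch :: [Text]
laxMatch = root $/ laxElement "item" &/ content

main :: IO ()
main = mapM_ print [strictMatch, qualifiedMatch, laxMatch]
```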
content :: Cursor -> [Text]
Select only text nodes, and directly give the Content values. XPath: The node test text() is true for any text node.
Note that this is not strictly an Axis, but will work with most combinators.
attribute :: Name -> Cursor -> [Text]
Select attributes on the current element (or nothing if it is not an element). XPath: the attribute axis contains the attributes of the context node; the axis will be empty unless the context node is an element.
Note that this is not strictly an Axis, but will work with most combinators.
The return list of the generalised axis contains as elements lists of Content elements, each full list representing an attribute value.
laxAttribute :: Text -> Cursor -> [Text]
Select attributes on the current element (or nothing if it is not an element). Namespace and case are ignored. XPath: the attribute axis contains the attributes of the context node; the axis will be empty unless the context node is an element.
Note that this is not strictly an Axis, but will work with most combinators.
The return list of the generalised axis contains as elements lists of Content elements, each full list representing an attribute value.
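Since attribute and laxAttribute return values rather than cursors, they usually sit at the end of a chain. A minimal sketch, assuming the xml-conduit package and a made-up document:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.XML (def, parseText_)
import Text.XML.Cursor
  (Cursor, attribute, element, fromDocument, laxAttribute, ($/), ($//), (>=>))

-- Hypothetical document: only the first p carries attributes.
root :: Cursor
root = fromDocument
  (parseText_ def "<r><p id='x' Class='big'>hi</p><p>bye</p></r>")

-- Exact attribute lookup; elements without the attribute contribute nothing.
exactId, exactClass :: [Text]
exactId    = root $/ element "p" >=> attribute "id"
exactClass = root $/ element "p" >=> attribute "class"

-- Loose lookup: "class" also finds the attribute spelled Class.
laxClass :: [Text]
laxClass = root $/ element "p" >=> laxAttribute "class"

-- Applied to every descendant; text nodes yield an empty list.
idsAnywhere :: [Text]
idsAnywhere = root $// attribute "id"

main :: IO ()
main = mapM_ print [exactId, exactClass, laxClass, idsAnywhere]
```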
hasAttribute :: Name -> Axis
Select only those element nodes with the given attribute.
attributeIs :: Name -> Text -> Axis
Select only those element nodes containing the given attribute key/value pair.
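Unlike attribute, these two are true axes: they return cursors, so further steps can follow. The sketch below assumes the xml-conduit package; the document and attribute values are invented for the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.XML (def, parseText_)
import Text.XML.Cursor
  (Cursor, attribute, attributeIs, content, fromDocument, hasAttribute,
   ($/), (&/), (>=>))

-- Hypothetical document with three links, two of which have an href.
root :: Cursor
root = fromDocument (parseText_ def
  "<r><a href='u1' rel='nofollow'>x</a><a href='u2'>y</a><a>z</a></r>")

-- Text of the children that carry an href attribute.
linkedTexts :: [Text]
linkedTexts = root $/ hasAttribute "href" &/ content

-- href values of the children whose rel attribute equals "nofollow".
nofollowTargets :: [Text]
nofollowTargets = root $/ attributeIs "rel" "nofollow" >=> attribute "href"

main :: IO ()
main = print linkedTexts >> print nofollowTargets
```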
Operators
(&|) :: (Cursor node -> [a]) -> (a -> b) -> Cursor node -> [b]
Apply a function to the result of an axis.
(&/) :: Axis node -> (Cursor node -> [a]) -> Cursor node -> [a]
Combine two axes so that the second works on the children of the results of the first.
(&//) :: Axis node -> (Cursor node -> [a]) -> Cursor node -> [a]
Combine two axes so that the second works on the descendants of the results of the first.
(&.//) :: Axis node -> (Cursor node -> [a]) -> Cursor node -> [a]
Combine two axes so that the second works on both the result nodes, and their descendants.
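The combining operators can be compared side by side by starting each query from the same first axis. This is a sketch under the assumption that xml-conduit is available and that all of these operators share the same right-associative fixity, so the chains below group to the right; the document and value names are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Data.Text as T
import Text.XML (def, parseText_)
import Text.XML.Cursor
  (Cursor, anyElement, content, element, fromDocument,
   ($|), (&.//), (&/), (&//), (&|))

-- Hypothetical document: one item directly under list, one nested in group.
root :: Cursor
root = fromDocument
  (parseText_ def "<list><item>a</item><group><item>b</item></group></list>")

-- (&/) steps to children; (&|) maps a plain function over the results.
directItemsUpper :: [Text]
directItemsUpper =
  root $| element "list" &/ element "item" &/ content &| T.toUpper

-- (&//) applies the second operand to all descendants of the first results.
allText :: [Text]
allText = root $| element "list" &// content

-- (&.//) additionally includes the first results themselves.
elementsBelow, elementsWithSelf :: Int
elementsBelow    = length (root $| element "list" &// anyElement)
elementsWithSelf = length (root $| element "list" &.// anyElement)

main :: IO ()
main = do
  print directItemsUpper
  print allText
  print (elementsBelow, elementsWithSelf)
```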
($/) :: Cursor node -> (Cursor node -> [a]) -> [a]
Apply an axis to the children of a Cursor node.
($//) :: Cursor node -> (Cursor node -> [a]) -> [a]
Apply an axis to the descendants of a Cursor node.
($.//) :: Cursor node -> (Cursor node -> [a]) -> [a]
Apply an axis to a Cursor node as well as its descendants.
(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c
Left-to-right Kleisli composition of monads.
Type classes
class Boolean a where
Something that can be used in a predicate check as a boolean.
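One way this class is useful: a predicate passed to check does not have to return Bool. The sketch below assumes the xml-conduit package and, as an assumption not stated on this page, that lists are an instance of Boolean with the empty list counting as false (as in current xml-conduit releases). The document and names are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.XML (def, parseText_)
import Text.XML.Cursor
  (Cursor, attribute, check, content, fromDocument, ($/), (&/))

-- Hypothetical document: only the first p has an id attribute.
root :: Cursor
root = fromDocument (parseText_ def "<r><p id='x'>1</p><p>2</p></r>")

-- Predicate returning a plain Bool.
withIdViaBool :: [Text]
withIdViaBool = root $/ check (not . null . attribute "id") &/ content

-- Predicate returning a list; assumes the list instance of Boolean,
-- where an empty result means "reject this cursor".
withIdViaList :: [Text]
withIdViaList = root $/ check (attribute "id") &/ content

main :: IO ()
main = print withIdViaBool >> print withIdViaList
```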