dom-selector-0.2.0.1: DOM traversal by CSS selectors for xml-conduit package

Safe Haskell: None

Text.XML.Scraping

Description

Scraping (innerHTML/innerText) and modification (node removal) functions.

InnerHTML / InnerText

class GetInner elem where

Type class for getting a lazy text representation of HTML element(s). It can be used with Node, Cursor, [Node], and [Cursor].

Methods

innerHtml :: elem -> Text

"innerHtml" of the element(s).

innerText :: elem -> Text

"innerText" of the element(s).

toHtml :: elem -> Text

"toHtml" of the element(s).
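
As a rough usage sketch for the three methods above, the following program prints each representation of the same node so they can be compared side by side. The sample markup is hypothetical, and the node is built with parseText_ from xml-conduit's Text.XML, which is an assumption about how the input is obtained rather than part of this module.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Text.XML (Document (..), Node (..), def, parseText_)
import Text.XML.Scraping (innerHtml, innerText, toHtml)

-- Hypothetical sample markup, used only for illustration.
sample :: Node
sample = NodeElement . documentRoot $
  parseText_ def "<div id=\"main\"><p>Hello, <b>world</b></p></div>"

main :: IO ()
main = do
  print (innerHtml sample)           -- via the Node instance
  print (innerText sample)
  print (toHtml sample)
  print (innerText [sample, sample]) -- via the [Node] instance
```

The last line relies on the list instance mentioned in the class description; a Cursor or [Cursor] should work the same way.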

Attributes

class GetAttribute elem where

Methods

ename :: elem -> Maybe Text

Tag name of an element node. Returns Nothing if the node is not an element.

eid :: elem -> Maybe Text

Returns the element's id. Returns Nothing if the node is not an element or has no id.

eclass :: elem -> [Text]

Returns the element's classes. Returns an empty list if the node is not an element or has no class.

getMeta :: Text -> elem -> [Text]

Searches for a meta element with the specified name under a cursor and gets its "content" field.
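
The accessors above can be tried on a single node as in the sketch below. The markup and the names box and plainText are hypothetical, and parseText_ from xml-conduit's Text.XML is assumed as the way to obtain a node. Only the results stated in the descriptions above are noted in comments.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Text.XML (Document (..), Node (..), def, parseText_)
import Text.XML.Scraping (eclass, eid, ename)

-- Hypothetical element with an id and a class attribute.
box :: Node
box = NodeElement . documentRoot $
  parseText_ def "<div id=\"main\" class=\"box wide\">text</div>"

-- A text node, i.e. not an element.
plainText :: Node
plainText = NodeContent "just text"

main :: IO ()
main = do
  print (ename box)
  print (eid box)
  print (eclass box)
  print (ename plainText)  -- documented to be Nothing for non-elements
  print (eclass plainText) -- documented to be an empty list for non-elements
```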

Removing descendant nodes

These functions work on Node or [Node].

remove :: (Node -> Bool) -> Node -> Node

Removes descendant nodes that satisfy the predicate and returns a new, updated Node. This is a general function, used internally by the other remove* functions in this module.
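
A minimal sketch of remove with a custom predicate follows. The predicate isScript and the sample markup are hypothetical, and the input node is assumed to come from xml-conduit's parseText_.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Text.XML (Document (..), Node (..), def, parseText_)
import Text.XML.Scraping (ename, innerText, remove)

-- Hypothetical predicate: true for <script> element nodes.
isScript :: Node -> Bool
isScript n = ename n == Just "script"

-- Hypothetical sample markup.
page :: Node
page = NodeElement . documentRoot $
  parseText_ def "<div><p>keep</p><script>drop()</script></div>"

main :: IO ()
main = do
  print (innerText page)                   -- text before removal
  print (innerText (remove isScript page)) -- text after removing <script> descendants
```

Because remove takes an arbitrary Node -> Bool, the same pattern covers cases that the more specific remove* helpers do not.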

removeDepth :: (Node -> Bool) -> Int -> Node -> Node

Similar to remove, but with a limit on the depth.

removeTags :: [String] -> [Node] -> [Node]

Removes all descendant nodes with the specified tag names.

removeQueries :: [String] -> [Node] -> [Node]

Removes all descendant nodes that match any of the query strings. "removeQuery" from ver 0.1 was merged into this function.
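
The two list-based helpers above might be used as in the sketch below. The markup is hypothetical, and the query strings "div.ad" and "#footer" assume that the package's selector parser accepts simple class and id selectors.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Text.XML (Document (..), Node (..), def, parseText_)
import Text.XML.Scraping (innerText, removeQueries, removeTags)

-- Hypothetical sample markup.
page :: Node
page = NodeElement . documentRoot $ parseText_ def
  "<body><div class=\"ad\">buy</div><p>text</p><div id=\"footer\">foot</div><style>p{}</style></body>"

main :: IO ()
main = do
  -- Drop <script> and <style> descendants by tag name.
  let noStyle = removeTags ["script", "style"] [page]
  -- Then drop descendants matching either selector (assumed syntax).
  let cleaned = removeQueries ["div.ad", "#footer"] noStyle
  print (innerText cleaned)
```

Both functions take and return [Node], so they compose directly.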

rmElem :: String -> String -> [String] -> [Node] -> [Node]

Removes descendant nodes that match the specified tag, id, and classes (similar to remove, but more specific). Passing an empty string for the tag or id disables filtering on that field (read the source code for details).

 rmElem "div" "div-id" ["div-class", "div-class2"] nodes = newnodes

Other

nodeHaving :: (Node -> Bool) -> Node -> Bool

Checks whether the node itself or any of its descendants satisfies the predicate. To return False, this function has to traverse all descendants, so it is not efficient.
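
As an illustration, the sketch below uses nodeHaving to test whether a tree contains a link. The helper hasLink and both sample nodes are hypothetical, and parseText_ from xml-conduit is assumed for building them.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Text.XML (Document (..), Node (..), def, parseText_)
import Text.XML.Scraping (ename, nodeHaving)

-- Hypothetical helper: does the node or any descendant have tag name "a"?
hasLink :: Node -> Bool
hasLink = nodeHaving (\n -> ename n == Just "a")

withLink, withoutLink :: Node
withLink = NodeElement . documentRoot $
  parseText_ def "<div><p>see <a href=\"x\">here</a></p></div>"
withoutLink = NodeElement . documentRoot $
  parseText_ def "<div><p>nothing to see</p></div>"

main :: IO ()
main = do
  print (hasLink withLink)
  print (hasLink withoutLink) -- requires a full traversal before answering
```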

Deprecated

removeQuery :: String -> [Node] -> [Node]

Deprecated: Use removeQueries instead.

Removes all descendant nodes that match a query string.