hexpat-0.16: XML parser/formatter based on expat
Text.XML.Expat.Tree
Contents
Tree structure
Generic node manipulation
Qualified nodes
Namespaced nodes
Parse to tree
Variant that throws exceptions
SAX-style parse
Abstraction of string types
Deprecated
Description

This module provides functions to parse an XML document to a tree structure, either strictly or lazily.

The GenericXMLString type class allows you to use any string type. Three string types are supported out of the box: String, ByteString and Text.

Here is a complete example to get you started:

 -- | A "hello world" example of hexpat that lazily parses a document, printing
 -- it to standard out.

 import Text.XML.Expat.Tree
 import Text.XML.Expat.Format
 import System.Environment
 import System.Exit
 import System.IO
 import qualified Data.ByteString.Lazy as L

 main = do
     args <- getArgs
     case args of
         [filename] -> process filename
         _          -> do
             hPutStrLn stderr "Usage: helloworld <file.xml>"
             exitWith $ ExitFailure 1

 process :: String -> IO ()
 process filename = do
     inputText <- L.readFile filename
     -- Note: Because we're not using the tree, Haskell can't infer the type of
     -- strings we're using so we need to tell it explicitly with a type signature.
     let (xml, mErr) = parse defaultParserOptions inputText :: (UNode String, Maybe XMLParseError)
     -- Process document before handling error, so we get lazy processing.
     L.hPutStr stdout $ format xml
     putStrLn ""
     case mErr of
         Nothing -> return ()
         Just err -> do
             hPutStrLn stderr $ "XML parse failed: "++show err
             exitWith $ ExitFailure 2

Error handling in strict parses is straightforward: just check the Either return value. Lazy parses are not so simple. Here are two working examples that illustrate the ways to handle errors:

Way no. 1 - Using a Maybe value

 import Text.XML.Expat.Tree
 import qualified Data.ByteString.Lazy as L
 import Data.ByteString.Internal (c2w)

 -- This is the recommended way to handle errors in lazy parses
 main = do
     let (tree, mError) = parse defaultParserOptions
                    (L.pack $ map c2w $ "<top><banana></apple></top>")
     print (tree :: UNode String)

     -- Note: We check the error _after_ we have finished our processing
     -- on the tree.
     case mError of
         Just err -> putStrLn $ "It failed : "++show err
         Nothing -> putStrLn "Success!"

Way no. 2 - Using exceptions

parseThrowing can throw an exception from pure code, which is generally a bad way to handle errors, because Haskell's lazy evaluation means it's hard to predict where it will be thrown from. However, it may be acceptable in situations where it's not expected during normal operation, depending on the design of your program.

 ...
 import Control.Exception.Extensible as E

 -- This is not the recommended way to handle errors.
 main = do
     do
         let tree = parseThrowing defaultParserOptions
                        (L.pack $ map c2w $ "<top><banana></apple></top>")
         print (tree :: UNode String)
         -- Because of lazy evaluation, you should not process the tree outside
         -- the 'do' block, or exceptions could be thrown that won't get caught.
     `E.catch` (\exc ->
         case E.fromException exc of
             Just (XMLParseException err) -> putStrLn $ "It failed : "++show err
             Nothing -> E.throwIO exc)
Synopsis
type Node tag text = NodeG [] tag text
data NodeG c tag text
= Element {
eName :: !tag
eAttributes :: ![(tag, text)]
eChildren :: c (NodeG c tag text)
}
| Text !text
type UNode text = Node text text
module Text.XML.Expat.Internal.NodeClass
type QNode text = Node (QName text) text
module Text.XML.Expat.Internal.Qualified
type NNode text = Node (NName text) text
module Text.XML.Expat.Internal.Namespaced
data ParserOptions tag text = ParserOptions {
parserEncoding :: Maybe Encoding
entityDecoder :: Maybe (tag -> Maybe text)
}
defaultParserOptions :: ParserOptions tag text
data Encoding
= ASCII
| UTF8
| UTF16
| ISO88591
parse :: (GenericXMLString tag, GenericXMLString text) => ParserOptions tag text -> ByteString -> (Node tag text, Maybe XMLParseError)
parse' :: (GenericXMLString tag, GenericXMLString text) => ParserOptions tag text -> ByteString -> Either XMLParseError (Node tag text)
data XMLParseError = XMLParseError String XMLParseLocation
data XMLParseLocation = XMLParseLocation {
xmlLineNumber :: Int64
xmlColumnNumber :: Int64
xmlByteIndex :: Int64
xmlByteCount :: Int64
}
parseThrowing :: (GenericXMLString tag, GenericXMLString text) => ParserOptions tag text -> ByteString -> Node tag text
data XMLParseException = XMLParseException XMLParseError
data SAXEvent tag text
= StartElement tag [(tag, text)]
| EndElement tag
| CharacterData text
| FailDocument XMLParseError
saxToTree :: GenericXMLString tag => [SAXEvent tag text] -> (Node tag text, Maybe XMLParseError)
class (Monoid s, Eq s) => GenericXMLString s where
gxNullString :: s -> Bool
gxToString :: s -> String
gxFromString :: String -> s
gxFromChar :: Char -> s
gxHead :: s -> Char
gxTail :: s -> s
gxBreakOn :: Char -> s -> (s, s)
gxFromCStringLen :: CStringLen -> IO s
gxToByteString :: s -> ByteString
eAttrs :: Node tag text -> [(tag, text)]
type Nodes tag text = [Node tag text]
type UNodes text = Nodes text text
type QNodes text = [Node (QName text) text]
type NNodes text = [Node (NName text) text]
parseTree :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> (Node tag text, Maybe XMLParseError)
parseTree' :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> Either XMLParseError (Node tag text)
parseSAX :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> [SAXEvent tag text]
parseSAXLocations :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> [(SAXEvent tag text, XMLParseLocation)]
parseTreeThrowing :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> Node tag text
parseSAXThrowing :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> [SAXEvent tag text]
parseSAXLocationsThrowing :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> [(SAXEvent tag text, XMLParseLocation)]
Tree structure
type Node tag text = NodeG [] tag text

A pure tree representation that uses a list as its container type.

In the hexpat package, a list of nodes has the type [Node tag text], but note that you can also use the more general type function ListOf to give a list of any node type, using that node's associated list type, e.g. ListOf (UNode Text).

data NodeG c tag text

The tree representation of the XML document.

c is the container type for the element's children, which is [] in the hexpat package, and a monadic list type for hexpat-iteratee.

tag is the tag type, which can either be one of several string types, or a special type from the Text.XML.Expat.Namespaced or Text.XML.Expat.Qualified modules.

text is the string type for text content.

Constructors
Element
eName :: !tag
eAttributes :: ![(tag, text)]
eChildren :: c (NodeG c tag text)
Text !text
Instances
(Functor c, List c) => MkElementClass NodeG c
(Functor c, List c) => NodeClass NodeG c
(Eq tag, Eq text) => Eq (NodeG [] tag text)
(Show tag, Show text) => Show (NodeG [] tag text)
(NFData tag, NFData text) => NFData (NodeG [] tag text)
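Because a node is either an Element or a Text, most tree processing reduces to a two-case recursive function. The following is a minimal sketch of that pattern rather than part of the library; the helper names gatherText and gatherElementNames are hypothetical, and the import list is kept explicit so they cannot clash with anything re-exported from the node manipulation modules.

```haskell
import Text.XML.Expat.Tree (NodeG(..), UNode)

-- | Hypothetical helper: concatenate all text content below a node,
-- in document order.
gatherText :: UNode String -> String
gatherText (Text t)              = t
gatherText (Element _ _ children) = concatMap gatherText children

-- | Hypothetical helper: list the names of every element in the tree,
-- parents before their children.
gatherElementNames :: UNode String -> [String]
gatherElementNames (Text _)                  = []
gatherElementNames (Element name _ children) =
    name : concatMap gatherElementNames children

-- | A small hand-built tree, equivalent to
-- <top id="1">Hello, <b>world</b></top>
sampleTree :: UNode String
sampleTree =
    Element "top" [("id", "1")]
        [ Text "Hello, "
        , Element "b" [] [Text "world"]
        ]
```

Since eChildren is an ordinary list here, the usual list functions (map, filter, concatMap) apply directly to an element's children.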
type UNode text = Node text text
Type alias for a single node with unqualified tag names where tag and text are the same string type.
Generic node manipulation
module Text.XML.Expat.Internal.NodeClass
Qualified nodes
type QNode text = Node (QName text) text
Type alias for a single node where qualified names are used for tags
module Text.XML.Expat.Internal.Qualified
Namespaced nodes
type NNode text = Node (NName text) text
Type alias for a single node where namespaced names are used for tags
module Text.XML.Expat.Internal.Namespaced
Parse to tree
data ParserOptions tag text
Constructors
ParserOptions
parserEncoding :: Maybe Encoding  -- ^ The encoding parameter, if provided, overrides the document's encoding declaration.
entityDecoder :: Maybe (tag -> Maybe text)  -- ^ If provided, entity references (e.g. &nbsp; and friends) are decoded into text using the supplied lookup function.
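As an illustration of the entityDecoder field, the sketch below builds a ParserOptions value by record update on defaultParserOptions. It targets the API as documented on this page (hexpat 0.16). The names decodeEntity, entityOptions and parseWithEntities are hypothetical, and the sketch assumes the decoder is handed the bare entity name (nbsp rather than &nbsp;), which this page does not state explicitly.

```haskell
import qualified Data.ByteString.Char8 as B
import Text.XML.Expat.Tree
    ( ParserOptions(..), UNode, XMLParseError
    , defaultParserOptions, parse' )

-- | Hypothetical lookup table for a couple of HTML-style entities.
-- Assumption: the parser passes the entity name without '&' and ';'.
decodeEntity :: String -> Maybe String
decodeEntity "nbsp" = Just "\160"   -- non-breaking space
decodeEntity "copy" = Just "\169"   -- copyright sign
decodeEntity _      = Nothing       -- leave anything else undecoded

-- | Default options with only the entity decoder overridden.
entityOptions :: ParserOptions String String
entityOptions = defaultParserOptions { entityDecoder = Just decodeEntity }

-- | Strict parse using the options above.
parseWithEntities :: B.ByteString -> Either XMLParseError (UNode String)
parseWithEntities = parse' entityOptions
```

Returning Nothing from the lookup function is how the decoder signals that it does not recognise a name.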
defaultParserOptions :: ParserOptions tag text
data Encoding
Encoding types available for the document encoding.
Constructors
ASCII
UTF8
UTF16
ISO88591
parse
:: (GenericXMLString tag, GenericXMLString text)
=> ParserOptions tag text  -- ^ Parser options
-> ByteString  -- ^ Input text (a lazy ByteString)
-> (Node tag text, Maybe XMLParseError)
Lazily parse XML to tree. Note that forcing the XMLParseError return value will force the entire parse. Therefore, to ensure lazy operation, don't check the error status until you have processed the tree.
parse'
:: (GenericXMLString tag, GenericXMLString text)
=> ParserOptions tag text  -- ^ Parser options
-> ByteString  -- ^ Input text (a strict ByteString)
-> Either XMLParseError (Node tag text)
Strictly parse XML to tree. Returns error message or valid parsed tree.
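The module description says that error handling for strict parses amounts to checking the Either result. A minimal sketch of that check follows, written against the API as documented on this page (hexpat 0.16); describeParse is a hypothetical helper name, and the type annotation plays the same role as in the lazy examples, telling the compiler which string type to use.

```haskell
import qualified Data.ByteString.Char8 as B
import Text.XML.Expat.Tree
    ( NodeG(..), UNode, XMLParseError
    , defaultParserOptions, parse' )

-- | Hypothetical helper: strictly parse a document and summarise the result.
describeParse :: B.ByteString -> String
describeParse input =
    case parse' defaultParserOptions input
             :: Either XMLParseError (UNode String) of
        Left err   -> "XML parse failed: " ++ show err
        Right tree -> "Parsed OK, root element: " ++ rootName tree
  where
    -- The root of a successfully parsed document is normally an Element;
    -- the Text case is covered only to keep the function total.
    rootName (Element name _ _) = name
    rootName (Text _)           = "(text)"
```

Unlike the lazy parse, nothing needs to be processed before looking at the error: the whole document has been parsed by the time the Either is inspected.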
data XMLParseError
Parse error, consisting of message text and error location
Constructors
XMLParseError String XMLParseLocation
data XMLParseLocation
Specifies a location of an event within the input text
Constructors
XMLParseLocation
xmlLineNumber :: Int64  -- ^ Line number of the event
xmlColumnNumber :: Int64  -- ^ Column number of the event
xmlByteIndex :: Int64  -- ^ Byte index of event from start of document
xmlByteCount :: Int64  -- ^ The number of bytes in the event
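The location fields can be combined with the message text to produce a compiler-style diagnostic. This is only a sketch: formatParseError is a hypothetical helper, and it prints the line and column values exactly as stored, since this page does not say whether they count from zero or one.

```haskell
import Text.XML.Expat.Tree (XMLParseError(..), XMLParseLocation(..))

-- | Hypothetical helper: render an error as "file:line:column: message",
-- followed by the byte offset. Field values are printed unmodified.
formatParseError :: FilePath -> XMLParseError -> String
formatParseError file (XMLParseError msg loc) =
    file ++ ":" ++ show (xmlLineNumber loc)
         ++ ":" ++ show (xmlColumnNumber loc)
         ++ ": " ++ msg
         ++ " (byte offset " ++ show (xmlByteIndex loc) ++ ")"
```

For example, an error value built by hand as XMLParseError "mismatched tag" (XMLParseLocation 1 15 15 0) renders as in.xml:1:15: mismatched tag (byte offset 15) when the file name is in.xml.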
Variant that throws exceptions
parseThrowing
:: (GenericXMLString tag, GenericXMLString text)
=> ParserOptions tag text  -- ^ Parser options
-> ByteString  -- ^ Input text (a lazy ByteString)
-> Node tag text

Lazily parse XML to tree. In the event of an error, throw XMLParseException.

parseThrowing can throw an exception from pure code, which is generally a bad way to handle errors, because Haskell's lazy evaluation means it's hard to predict where it will be thrown from. However, it may be acceptable in situations where it's not expected during normal operation, depending on the design of your program.

data XMLParseException
An exception indicating an XML parse error, used by the ...Throwing variants.
Constructors
XMLParseException XMLParseError
SAX-style parse
data SAXEvent tag text
Constructors
StartElement tag [(tag, text)]
EndElement tag
CharacterData text
FailDocument XMLParseError
Instances
(Eq tag, Eq text) => Eq (SAXEvent tag text)
(Show tag, Show text) => Show (SAXEvent tag text)
(NFData tag, NFData text) => NFData (SAXEvent tag text)
saxToTree :: GenericXMLString tag => [SAXEvent tag text] -> (Node tag text, Maybe XMLParseError)
A lower level function that lazily converts a SAX stream into a tree structure.
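To make the relationship between the event stream and the tree concrete, here is a small sketch that feeds a hand-written event list to saxToTree. It is written against the exports listed on this page (hexpat 0.16); the names sampleEvents, rebuilt and expectedTree are hypothetical, and expectedTree records the tree this event list is expected to produce.

```haskell
import Text.XML.Expat.Tree
    ( NodeG(..), SAXEvent(..), UNode, XMLParseError, saxToTree )

-- | The events a parser would emit for <greeting lang="en">hello</greeting>.
sampleEvents :: [SAXEvent String String]
sampleEvents =
    [ StartElement "greeting" [("lang", "en")]
    , CharacterData "hello"
    , EndElement "greeting"
    ]

-- | Tree and error status rebuilt from the event list.
rebuilt :: (UNode String, Maybe XMLParseError)
rebuilt = saxToTree sampleEvents

-- | The tree the events above should correspond to.
expectedTree :: UNode String
expectedTree = Element "greeting" [("lang", "en")] [Text "hello"]
```

A FailDocument event in the input is what populates the Maybe XMLParseError component; the list above contains none, so the second component is expected to be Nothing.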
Abstraction of string types
class (Monoid s, Eq s) => GenericXMLString s where
An abstraction for any string type you want to use as XML text (that is, attribute values or element text content). If you want to use a new string type with hexpat, you must make it an instance of GenericXMLString.
Methods
gxNullString :: s -> Bool
gxToString :: s -> String
gxFromString :: String -> s
gxFromChar :: Char -> s
gxHead :: s -> Char
gxTail :: s -> s
gxBreakOn :: Char -> s -> (s, s)
gxFromCStringLen :: CStringLen -> IO s
gxToByteString :: s -> ByteString
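The class methods make it possible to write helpers that work for every supported string type. The sketch below splits a name at its first colon using only GenericXMLString methods. The function name splitAtColon is hypothetical, and the code assumes gxBreakOn behaves like break from Data.List, leaving the separator at the head of the second component; this page does not document that behaviour.

```haskell
import Text.XML.Expat.Tree (GenericXMLString(..))

-- | Hypothetical helper: split "prefix:local" into (Just prefix, local),
-- or return (Nothing, name) when there is no colon.
-- Assumption: gxBreakOn c splits at the first c and keeps c in the
-- second component, as Data.List.break does.
splitAtColon :: GenericXMLString s => s -> (Maybe s, s)
splitAtColon name =
    case gxBreakOn ':' name of
        (before, rest)
            | gxNullString rest -> (Nothing, before)           -- no colon found
            | otherwise         -> (Just before, gxTail rest)  -- drop the ':'
```

Because the function is polymorphic in s, the same definition can be applied to String, ByteString or Text names without change.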
Deprecated
eAttrs :: Node tag text -> [(tag, text)]
type Nodes tag text = [Node tag text]

DEPRECATED: Use [Node tag text] instead.

Type alias for nodes.

type UNodes text = Nodes text text

DEPRECATED: Use [UNode text] instead.

Type alias for nodes with unqualified tag names where tag and text are the same string type.

type QNodes text = [Node (QName text) text]

DEPRECATED: Use [QNode text] instead.

Type alias for nodes where qualified names are used for tags.

type NNodes text = [Node (NName text) text]

DEPRECATED: Use [NNode text] instead.

Type alias for nodes where namespaced names are used for tags.

parseTree
:: (GenericXMLString tag, GenericXMLString text)
=> Maybe Encoding  -- ^ Optional encoding override
-> ByteString  -- ^ Input text (a lazy ByteString)
-> (Node tag text, Maybe XMLParseError)

DEPRECATED: Use parse instead.

Lazily parse XML to tree. Note that forcing the XMLParseError return value will force the entire parse. Therefore, to ensure lazy operation, don't check the error status until you have processed the tree.
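Migrating from parseTree to parse mostly means moving the encoding argument into a ParserOptions record. The sketch below follows the record fields documented on this page (hexpat 0.16); parseUtf8 is a hypothetical wrapper name, and treating the two calls as equivalent is an inference from their documented parameters rather than something this page states.

```haskell
import qualified Data.ByteString.Lazy.Char8 as L
import Text.XML.Expat.Tree
    ( Encoding(..), ParserOptions(..), UNode, XMLParseError
    , defaultParserOptions, parse )

-- Old style (deprecated):   parseTree (Just UTF8) input
-- New style: the encoding override travels inside ParserOptions.
parseUtf8 :: L.ByteString -> (UNode String, Maybe XMLParseError)
parseUtf8 = parse utf8Options
  where
    -- Start from the defaults and override only the encoding.
    utf8Options = defaultParserOptions { parserEncoding = Just UTF8 }
```

The same advice about laziness applies to the result: process the tree first and inspect the Maybe XMLParseError afterwards.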

parseTree'
:: (GenericXMLString tag, GenericXMLString text)
=> Maybe Encoding  -- ^ Optional encoding override
-> ByteString  -- ^ Input text (a strict ByteString)
-> Either XMLParseError (Node tag text)

DEPRECATED: Use parse' instead.

Strictly parse XML to tree. Returns error message or valid parsed tree.

parseSAX
:: (GenericXMLString tag, GenericXMLString text)
=> Maybe Encoding  -- ^ Optional encoding override
-> ByteString  -- ^ Input text (a lazy ByteString)
-> [SAXEvent tag text]

DEPRECATED: Use parse instead.

Lazily parse XML to SAX events. In the event of an error, FailDocument is the last element of the output list.

parseSAXLocations
:: (GenericXMLString tag, GenericXMLString text)
=> Maybe Encoding  -- ^ Optional encoding override
-> ByteString  -- ^ Input text (a lazy ByteString)
-> [(SAXEvent tag text, XMLParseLocation)]

DEPRECATED: Use parseLocations instead.

A variant of parseSAX that gives a document location with each SAX event.

parseTreeThrowing
:: (GenericXMLString tag, GenericXMLString text)
=> Maybe Encoding  -- ^ Optional encoding override
-> ByteString  -- ^ Input text (a lazy ByteString)
-> Node tag text

DEPRECATED: Use parseThrowing instead.

Lazily parse XML to tree. In the event of an error, throw XMLParseException.

parseSAXThrowing
:: (GenericXMLString tag, GenericXMLString text)
=> Maybe Encoding  -- ^ Optional encoding override
-> ByteString  -- ^ Input text (a lazy ByteString)
-> [SAXEvent tag text]

DEPRECATED: Use parseThrowing instead.

Lazily parse XML to SAX events. In the event of an error, throw XMLParseException.

parseSAXLocationsThrowing
:: (GenericXMLString tag, GenericXMLString text)
=> Maybe Encoding  -- ^ Optional encoding override
-> ByteString  -- ^ Input text (a lazy ByteString)
-> [(SAXEvent tag text, XMLParseLocation)]

DEPRECATED: Use parseLocationsThrowing instead.

A variant of parseSAX that gives a document location with each SAX event. In the event of an error, throw XMLParseException.

Produced by Haddock version 2.6.1