Description
This module provides functions to parse an XML document into a tree structure,
either strictly or lazily.
The GenericXMLString type class allows you to use any string type. Instances
are provided for three string types: String, ByteString and Text.
Here is a complete example to get you started:
-- | A "hello world" example of hexpat that lazily parses a document, printing
-- it to standard out.
import Text.XML.Expat.Tree
import Text.XML.Expat.Format
import System.Environment
import System.Exit
import System.IO
import qualified Data.ByteString.Lazy as L

main = do
    args <- getArgs
    case args of
        [filename] -> process filename
        _          -> do
            hPutStrLn stderr "Usage: helloworld <file.xml>"
            exitWith $ ExitFailure 1

process :: String -> IO ()
process filename = do
    inputText <- L.readFile filename
    -- Note: Because we're not using the tree, Haskell can't infer the type of
    -- strings we're using so we need to tell it explicitly with a type signature.
    let (xml, mErr) = parse defaultParseOptions inputText :: (UNode String, Maybe XMLParseError)
    -- Process document before handling error, so we get lazy processing.
    L.hPutStr stdout $ format xml
    putStrLn ""
    case mErr of
        Nothing -> return ()
        Just err -> do
            hPutStrLn stderr $ "XML parse failed: "++show err
            exitWith $ ExitFailure 2
Error handling in strict parses is straightforward: just check the
Either return value. Lazy parses are not so simple. The two working
examples below illustrate the ways to handle errors:
Way no. 1 - Using a Maybe value
import Text.XML.Expat.Tree
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Internal (c2w)

-- This is the recommended way to handle errors in lazy parses
main = do
    let (tree, mError) = parse defaultParseOptions
                   (L.pack $ map c2w $ "<top><banana></apple></top>")
    print (tree :: UNode String)
    -- Note: We check the error _after_ we have finished our processing
    -- on the tree.
    case mError of
        Just err -> putStrLn $ "It failed : "++show err
        Nothing -> putStrLn "Success!"
Way no. 2 - Using exceptions
parseThrowing can throw an exception from pure code, which is generally a bad
way to handle errors, because Haskell's lazy evaluation means it's hard to
predict where it will be thrown from. However, it may be acceptable in
situations where it's not expected during normal operation, depending on the
design of your program.
...
import Control.Exception.Extensible as E

-- This is not the recommended way to handle errors.
main = do
    do
        let tree = parseThrowing defaultParseOptions
                       (L.pack $ map c2w $ "<top><banana></apple></top>")
        print (tree :: UNode String)
        -- Because of lazy evaluation, you should not process the tree outside
        -- the 'do' block, or exceptions could be thrown that won't get caught.
     `E.catch` (\exc ->
        case E.fromException exc of
            Just (XMLParseException err) -> putStrLn $ "It failed : "++show err
            Nothing -> E.throwIO exc)
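
For comparison with the two lazy examples, the strict case mentioned earlier can be sketched as follows. This is only an illustration: the helper name parseDoc is hypothetical, and it assumes that parse' takes a strict ByteString (the synopsis does not say which ByteString flavour it expects).

```haskell
import Text.XML.Expat.Tree
import qualified Data.ByteString.Char8 as BC

-- Hypothetical helper: strictly parse a String into a tree.
-- Assumption: parse' expects a strict ByteString.
parseDoc :: String -> Either XMLParseError (UNode String)
parseDoc = parse' defaultParseOptions . BC.pack

main :: IO ()
main =
    -- The whole document is parsed before either branch runs, so the
    -- Either value tells us up front whether the tree is usable.
    case parseDoc "<top><banana></apple></top>" of
        Left err   -> putStrLn $ "It failed : " ++ show err
        Right tree -> print tree
```

Because the result is an Either, there is no partially built tree to worry about: either the parse succeeded and the tree is complete, or the error is returned and no tree is available.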

Synopsis
type Node tag text = NodeG [] tag text
type UNode text = Node text text
module Text.XML.Expat.Internal.NodeClass
type QNode text = Node (QName text) text
module Text.XML.Expat.Internal.Qualified
type NNode text = Node (NName text) text
module Text.XML.Expat.Internal.Namespaced
data ParseOptions tag text = ParseOptions {}
defaultParseOptions :: ParseOptions tag text
parse :: (GenericXMLString tag, GenericXMLString text) => ParseOptions tag text -> ByteString -> (Node tag text, Maybe XMLParseError)
parse' :: (GenericXMLString tag, GenericXMLString text) => ParseOptions tag text -> ByteString -> Either XMLParseError (Node tag text)
data XMLParseError = XMLParseError String XMLParseLocation
data XMLParseLocation = XMLParseLocation {}
parseThrowing :: (GenericXMLString tag, GenericXMLString text) => ParseOptions tag text -> ByteString -> Node tag text
data XMLParseException = XMLParseException XMLParseError
saxToTree :: GenericXMLString tag => [SAXEvent tag text] -> (Node tag text, Maybe XMLParseError)
class (Monoid s, Eq s) => GenericXMLString s where
eAttrs :: Node tag text -> [(tag, text)]
type Nodes tag text = [Node tag text]
type UNodes text = Nodes text text
type QNodes text = [Node (QName text) text]
type NNodes text = [Node (NName text) text]
parseTree :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> (Node tag text, Maybe XMLParseError)
parseTree' :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> Either XMLParseError (Node tag text)
parseSAX :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> [SAXEvent tag text]
parseSAXLocations :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> [(SAXEvent tag text, XMLParseLocation)]
parseTreeThrowing :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> Node tag text
parseSAXThrowing :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> [SAXEvent tag text]
parseSAXLocationsThrowing :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> [(SAXEvent tag text, XMLParseLocation)]
type ParserOptions tag text = ParseOptions tag text
defaultParserOptions :: ParseOptions tag text

Tree structure
type Node tag text = NodeG [] tag text
A pure tree representation that uses a list as its container type.
In the hexpat package, a list of nodes has the type [Node tag text], but note
that you can also use the more general type function ListOf to give a list of
any node type, using that node's associated list type, e.g.
ListOf (UNode Text).

data NodeG c tag text
The tree representation of the XML document.
c is the container type for the element's children, which is [] in the
hexpat package, and a monadic list type for hexpat-iteratee.
tag is the tag type, which can either be one of several string types,
or a special type from the Text.XML.Expat.Namespaced or
Text.XML.Expat.Qualified modules.
text is the string type for text content.
Constructors
Element
    eName :: !tag
    eAttributes :: ![(tag, text)]
    eChildren :: c (NodeG c tag text)
Text !text

type UNode text = Node text text
Type alias for a single node with unqualified tag names where tag and
text are the same string type.

Generic node manipulation
module Text.XML.Expat.Internal.NodeClass

Qualified nodes
type QNode text = Node (QName text) text
Type alias for a single node where qualified names are used for tags.
module Text.XML.Expat.Internal.Qualified

Namespaced nodes
type NNode text = Node (NName text) text
Type alias for a single node where namespaced names are used for tags.
module Text.XML.Expat.Internal.Namespaced

Parse to tree
data ParseOptions tag text
Constructors
ParseOptions
    overrideEncoding :: Maybe Encoding
        The encoding parameter, if provided, overrides the document's
        encoding declaration.
    entityDecoder :: Maybe (tag -> Maybe text)
        If provided, entity references (e.g. &nbsp; and friends) will
        be decoded into text using the supplied lookup function.
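
As a rough illustration of the two fields above, the sketch below builds a ParseOptions value with record update syntax. Several details are assumptions rather than facts taken from this page: the lookup function lookupEntity and its table are hypothetical, UTF8 is assumed to be one of the Encoding constructors, and the decoder is assumed to receive the bare entity name (for example "nbsp", without the ampersand and semicolon).

```haskell
import Text.XML.Expat.Tree
import qualified Data.ByteString.Lazy.Char8 as LC

-- Hypothetical lookup table for entities the parser does not know itself.
-- Assumption: the decoder is handed the bare entity name.
lookupEntity :: String -> Maybe String
lookupEntity "nbsp" = Just "\160"
lookupEntity "copy" = Just "\169"
lookupEntity _      = Nothing

-- String is used for both tag and text, so the decoder type is
-- String -> Maybe String.
options :: ParseOptions String String
options = defaultParseOptions
    { overrideEncoding = Just UTF8          -- assumed constructor name
    , entityDecoder    = Just lookupEntity
    }

main :: IO ()
main = do
    let (tree, mErr) = parse options (LC.pack "<p>a&nbsp;b &copy; c</p>")
    print (tree :: UNode String)
    -- As in the lazy examples, check the error after using the tree.
    case mErr of
        Nothing  -> return ()
        Just err -> putStrLn $ "XML parse failed: " ++ show err
```

Returning Nothing from the lookup function leaves an entity undecoded, so the table only needs to list the entities a given application cares about.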

defaultParseOptions :: ParseOptions tag text

data Encoding
Encoding types available for the document encoding.
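
parse :: (GenericXMLString tag, GenericXMLString text) => ParseOptions tag text -> ByteString -> (Node tag text, Maybe XMLParseError)

parse' :: (GenericXMLString tag, GenericXMLString text) => ParseOptions tag text -> ByteString -> Either XMLParseError (Node tag text)

data XMLParseError = XMLParseError String XMLParseLocation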
Parse error, consisting of message text and error location

data XMLParseLocation
Specifies a location of an event within the input text
Constructors
XMLParseLocation
    xmlLineNumber :: Int64
        Line number of the event
    xmlColumnNumber :: Int64
        Column number of the event
    xmlByteIndex :: Int64
        Byte index of event from start of document
    xmlByteCount :: Int64
        The number of bytes in the event

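The location fields can be used to produce compiler-style error messages. The following is a minimal sketch; the helper name describeError and the "line:column: message" layout are illustrative choices, not part of hexpat.

```haskell
import Text.XML.Expat.Tree
import qualified Data.ByteString.Lazy.Char8 as LC

-- Hypothetical helper: render a parse error as "line:column: message".
describeError :: XMLParseError -> String
describeError (XMLParseError msg loc) =
    show (xmlLineNumber loc) ++ ":" ++ show (xmlColumnNumber loc) ++ ": " ++ msg

main :: IO ()
main = do
    let (tree, mErr) = parse defaultParseOptions
                           (LC.pack "<top><banana></apple></top>")
    print (tree :: UNode String)
    -- Check the error only after the tree has been processed.
    case mErr of
        Nothing  -> putStrLn "Success!"
        Just err -> putStrLn $ "It failed at " ++ describeError err
```
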
Variant that throws exceptions
parseThrowing
    :: (GenericXMLString tag, GenericXMLString text)
    => ParseOptions tag text    Parser options
    -> ByteString               Input text (a lazy ByteString)
    -> Node tag text
Lazily parse XML to tree. In the event of an error, throw XMLParseException.
parseThrowing can throw an exception from pure code, which is generally a bad
way to handle errors, because Haskell's lazy evaluation means it's hard to
predict where it will be thrown from. However, it may be acceptable in
situations where it's not expected during normal operation, depending on the
design of your program.

data XMLParseException = XMLParseException XMLParseError
An exception indicating an XML parse error, used by the ..Throwing variants.

SAX-style parse
data SAXEvent tag text
Constructors
StartElement tag [(tag, text)]
EndElement tag
CharacterData text
FailDocument XMLParseError

saxToTree :: GenericXMLString tag => [SAXEvent tag text] -> (Node tag text, Maybe XMLParseError)
A lower level function that lazily converts a SAX stream into a tree structure.
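
To make the conversion concrete, the sketch below feeds saxToTree a hand-written event list instead of parser output. The event list itself is made up for illustration, and the example assumes the SAXEvent constructors are exported from this module, as the constructor listing above suggests.

```haskell
import Text.XML.Expat.Tree

-- A hand-written event stream standing in for what a SAX parse would emit.
events :: [SAXEvent String String]
events =
    [ StartElement "top" [("id", "1")]
    , StartElement "banana" []
    , CharacterData "yellow"
    , EndElement "banana"
    , EndElement "top"
    ]

main :: IO ()
main = do
    let (tree, mErr) = saxToTree events
    print (tree :: UNode String)
    -- The root is an Element here, so its record fields are safe to use.
    print (eName tree, eAttributes tree, length (eChildren tree))
    case mErr of
        Nothing  -> putStrLn "No FailDocument event seen"
        Just err -> putStrLn $ "Stream reported: " ++ show err
```

A FailDocument event in the list is what populates the Maybe XMLParseError half of the result, which is why the error is again inspected last.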

Abstraction of string types
class (Monoid s, Eq s) => GenericXMLString s where
An abstraction for any string type you want to use as XML text (that is,
attribute values or element text content). If you want to use a
new string type with hexpat, you must make it an instance of
GenericXMLString.

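In practice the string type is usually chosen with a type annotation on the parse result. The sketch below parses the same input once as String and once as Text. The helper names rootAsString and rootAsText are hypothetical, and the example assumes the text package is installed and that the Text instance is for strict Data.Text.

```haskell
import Text.XML.Expat.Tree
import qualified Data.ByteString.Lazy.Char8 as LC
import Data.Text (Text)
import qualified Data.Text as T

input :: LC.ByteString
input = LC.pack "<greeting lang=\"en\">hello</greeting>"

-- Hypothetical helpers: the annotation alone selects the string type.
rootAsString :: UNode String
rootAsString = fst (parse defaultParseOptions input)

rootAsText :: UNode Text
rootAsText = fst (parse defaultParseOptions input)

main :: IO ()
main = do
    print rootAsString
    print rootAsText
    -- Tag names come back in whichever string type was requested.
    putStrLn (eName rootAsString)
    putStrLn (T.unpack (eName rootAsText))
```
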
Deprecated
eAttrs :: Node tag text -> [(tag, text)]

type Nodes tag text = [Node tag text]
DEPRECATED: Use [Node tag text] instead.
Type alias for nodes.

type UNodes text = Nodes text text
DEPRECATED: Use [UNode text] instead.
Type alias for nodes with unqualified tag names where tag and
text are the same string type.

type QNodes text = [Node (QName text) text]
DEPRECATED: Use [QNode text] instead.
Type alias for nodes where qualified names are used for tags.

type NNodes text = [Node (NName text) text]
DEPRECATED: Use [NNode text] instead.
Type alias for nodes where namespaced names are used for tags.

parseTree
    :: (GenericXMLString tag, GenericXMLString text)
    => Maybe Encoding                          Optional encoding override
    -> ByteString                              Input text (a lazy ByteString)
    -> (Node tag text, Maybe XMLParseError)
DEPRECATED: Use parse instead.
Lazily parse XML to tree. Note that forcing the XMLParseError return value
will force the entire parse. Therefore, to ensure lazy operation, don't
check the error status until you have processed the tree.
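
parseTree' :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> Either XMLParseError (Node tag text)

parseSAX :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> [SAXEvent tag text]

parseSAXLocations :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> [(SAXEvent tag text, XMLParseLocation)]

parseTreeThrowing :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> Node tag text

parseSAXThrowing :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> [SAXEvent tag text]

parseSAXLocationsThrowing :: (GenericXMLString tag, GenericXMLString text) => Maybe Encoding -> ByteString -> [(SAXEvent tag text, XMLParseLocation)]

type ParserOptions tag text = ParseOptions tag text

defaultParserOptions :: ParseOptions tag text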
DEPRECATED. Renamed to defaultParseOptions.