The hexpat package
This package provides a general-purpose Haskell XML library that uses Expat (http://expat.sourceforge.net/), a fast stream-oriented XML parser written in C, to do its parsing. It is extensible to any string type, with String, ByteString and Text provided out of the box.
Basic usage: Parsing a tree (Tree), formatting a tree (Format). Other features: Helpers for processing XML trees (Proc), trees annotated with XML source location (Annotated), XML cursors (Cursor), SAX-style parse (SAX), and access to the low-level interface in case speed is paramount (Internal.IO).
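As a rough illustration of the basic usage named above, the following sketch parses a strict ByteString into a tree and formats it back out. It assumes the API of recent hexpat releases (parse', defaultParseOptions and UNode from Text.XML.Expat.Tree, format' from Text.XML.Expat.Format); some names differ in older versions. The helper roundTrip and the sample document are made up for this example.

```haskell
import qualified Data.ByteString.Char8 as B
import Text.XML.Expat.Format (format')
import Text.XML.Expat.Tree (UNode, XMLParseError, defaultParseOptions, parse')

-- Hypothetical helper: strictly parse a document into a tree whose tags and
-- text are Strings, then serialize that tree back to a strict ByteString.
roundTrip :: B.ByteString -> Either XMLParseError B.ByteString
roundTrip input = do
    -- The annotation picks the string type; hexpat cannot infer it here.
    tree <- parse' defaultParseOptions input :: Either XMLParseError (UNode String)
    return (format' tree)

main :: IO ()
main =
    -- Sample input invented for illustration.
    case roundTrip (B.pack "<greeting lang=\"en\">hello</greeting>") of
        Left err  -> putStrLn ("XML parse failed: " ++ show err)
        Right out -> B.putStrLn out
```

Swapping UNode String for UNode ByteString or UNode Text in the annotation should be enough to change the string type, since the parser is generic over it.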
The design goals are speed, speed, speed, interface simplicity and modularity.
For an introduction and examples, see the Text.XML.Expat.Tree module. For benchmarks, see http://haskell.org/haskellwiki/Hexpat/
If you want to do interactive I/O, an obvious option is to use lazy parsing with one of the lazy I/O functions such as hGetContents. However, this can be problematic in some applications because it doesn't handle I/O errors properly and can give no guarantee of timely resource cleanup. In these cases, chunked I/O is a better approach: Take a look at the hexpat-iteratee package.
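The lazy approach described above might look like the sketch below, modelled on the hello-world example in the Text.XML.Expat.Tree documentation. It assumes the lazy parse function returns the tree together with a Maybe XMLParseError, as in recent releases; reading from standard input is an arbitrary choice for the example.

```haskell
import qualified Data.ByteString.Lazy as L
import System.Exit (ExitCode (..), exitWith)
import System.IO (hPutStrLn, stderr, stdout)
import Text.XML.Expat.Format (format)
import Text.XML.Expat.Tree (UNode, XMLParseError, defaultParseOptions, parse)

main :: IO ()
main = do
    -- Lazy I/O: the input is read on demand as the parser consumes it.
    input <- L.getContents
    let (tree, mErr) = parse defaultParseOptions input :: (UNode String, Maybe XMLParseError)
    -- Consume the tree before inspecting the error so that processing stays lazy.
    L.hPutStr stdout (format tree)
    case mErr of
        Nothing  -> return ()
        Just err -> do
            hPutStrLn stderr ("XML parse failed: " ++ show err)
            exitWith (ExitFailure 2)
```

Note that only XML parse errors surface through mErr. I/O failures during the lazy read are not reported this way, which is the weakness the paragraph above points out.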
IO is filed under Internal because it's low-level and most users won't want it. The other Internal modules are re-exported by Annotated and Tree, so you won't need to import them directly.
Credits to Iavor Diatchki and the xml (XML.Light) package for Proc and Cursor.
INSTALLATION: Unix install requires an OS package called something like libexpat-dev. On MacOSX, expat comes with Apple's optional X11 package, or you can install it from source. To install on Windows, first install the Windows binary that's available from http://expat.sourceforge.net/, then type (assuming you're using v2.0.1):
cabal install hexpat --extra-lib-dirs="C:\Program Files\Expat 2.0.1\Bin" --extra-include-dirs="C:\Program Files\Expat 2.0.1\Source\Lib"
Ensure libexpat.dll can be found in your system PATH (or copy it into your executable's directory).
ChangeLog: 0.15 changes intended to fix a (rare) "error: a C finalizer called back into Haskell." that seemed to happen only on ghc6.12.X; 0.15.1 fix broken Annotated parse; 0.16 switch from mtl to transformers; 0.17 fix mapNodeContainer & rename some things; 0.18 rename defaultEncoding to overrideEncoding; 0.18.3 formatG and indent were demanding list items more than once (inefficient in chunked processing).
|Versions||0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.15.1, 0.16, 0.17, 0.18, 0.18.1, 0.18.2, 0.18.3, 0.19, 0.19.1, 0.19.2, 0.19.3, 0.19.4, 0.19.5, 0.19.6, 0.19.7, 0.19.8, 0.19.9, 0.19.10, 0.20.1, 0.20.2, 0.20.3, 0.20.4, 0.20.5, 0.20.6, 0.20.7, 0.20.8, 0.20.9|
|Dependencies||base (>=3 && <5), bytestring, containers, deepseq (==1.1.*), extensible-exceptions (==0.1.*), List (==0.4.*), text (>=0.5 && <0.8), transformers, utf8-string (==0.3.*)|
|Copyright||(c) 2009 Doug Beardsley <firstname.lastname@example.org>, (c) 2009-2010 Stephen Blackheath <http://blacksapphire.com/antispam/>, (c) 2009 Gregory Collins, (c) 2008 Evan Martin <email@example.com>, (c) 2009 Matthew Pocock <firstname.lastname@example.org>, (c) 2007-2009 Galois Inc.|
|Author||Stephen Blackheath [blackh] (the primary author), Doug Beardsley, Gregory Collins, Evan Martin (who started the project), Matthew Pocock [drdozer]|
|Source repository||head: darcs get http://code.haskell.org/hexpat/|
|Uploaded||Tue Jul 27 01:37:23 UTC 2010 by StephenBlackheath|
|Distributions||FreeBSD:0.20.9, LTSHaskell:0.20.9, NixOS:0.20.9, Tumbleweed:0.20.9|
|Downloads||28347 total (82 in the last 30 days)|
|Status||Docs uploaded by user; build status unknown [no reports yet]|