xml-extractors-0.2.1.0: Simple wrapper over xml to extract data from parsed xml

Safe Haskell: None
Language: Haskell2010

Text.XML.Light.Extractors

Description

A library for making extraction of information from parsed XML easier.

Example

Suppose you have an XML file of books like this:

<?xml version="1.0"?>
<library>
  <book id="1" isbn="23234-1">
    <author>John Doe</author>
    <title>Some book</title>
  </book>
  <book id="2">
    <author>You</author>
    <title>The Great Event</title>
  </book>
  ...
</library>

And a data type for a book:

data Book = Book { bookId        :: Int
                 , isbn          :: Maybe String
                 , author, title :: String
                 }

You can parse the XML file into a generic tree structure using parseXMLDoc from the xml package.

With this library you can then define extractors that pull the data out of that generic tree:

   library = element "library" $ children $ only $ many book

   book = element "book" $ do
            i <- attribAs "id" integer
            s <- optional (attrib "isbn")
            children $ do
              a <- element "author" $ contents $ text
              t <- element "title" $ contents $ text
              return $ Book { bookId = i, author = a, title = t, isbn = s }

   extractLibrary :: Element -> Either ExtractionErr [Book]
   extractLibrary = extractDocContents library
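
A minimal sketch of how the definitions above might be wired into a complete program. The sample document is inlined as a string, and readInt is a hypothetical helper (not part of the library) that stands in for the integer converter, so the sketch does not depend on that converter's exact type.

```haskell
import Control.Applicative (many, optional)
import Text.Read (readMaybe)
import Text.XML.Light (parseXMLDoc)
import Text.XML.Light.Extractors

data Book = Book
  { bookId        :: Int
  , isbn          :: Maybe String
  , author, title :: String
  }

-- Hypothetical helper: convert an attribute value to an Int,
-- reporting failure through the ErrMsg constructor.
readInt :: String -> Either Err Int
readInt s = maybe (Left (ErrMsg ("not an integer: " ++ s))) Right (readMaybe s)

library :: ContentsExtractor [Book]
library = element "library" $ children $ only $ many book

book :: ContentsExtractor Book
book = element "book" $ do
  i <- attribAs "id" readInt
  s <- optional (attrib "isbn")
  children $ do
    a <- element "author" $ contents text
    t <- element "title" $ contents text
    return Book { bookId = i, isbn = s, author = a, title = t }

-- The sample library from the example, without the elided entries.
sampleXml :: String
sampleXml = concat
  [ "<library>"
  , "<book id=\"1\" isbn=\"23234-1\"><author>John Doe</author><title>Some book</title></book>"
  , "<book id=\"2\"><author>You</author><title>The Great Event</title></book>"
  , "</library>"
  ]

main :: IO ()
main = case parseXMLDoc sampleXml of
  Nothing   -> putStrLn "input is not well-formed XML"
  Just root -> case extractDocContents library root of
    Left e      -> putStrLn ("extraction failed, context: " ++ show (context e))
    Right books -> mapM_ (putStrLn . title) books
```

parseXMLDoc returns Nothing for input it cannot parse, so the sketch handles that case separately from extraction failures.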

Notes

Errors

type Path = [String]

Location for some content.

data Err

Extraction errors.

Constructors

ErrExpect

Some expected content is missing

Fields

expected :: String

expected content

found :: Content

found content

ErrAttr

An expected attribute is missing

Fields

expected :: String

expected attribute

atElement :: Element

element with missing attribute

ErrEnd

Expected end of contents

Fields

found :: Content

found content

ErrNull

Unexpected end of contents

Fields

expected :: String

expected content

ErrMsg String 

data ExtractionErr

Error with a context.

Constructors

ExtractionErr 

Fields

err :: Err
 
context :: Path
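
The constructors and fields listed above are enough to write a custom error renderer. The following is a sketch only: describeErr and describeExtractionErr are hypothetical names, the message wording is invented for illustration, and the path is joined in whatever order the library stores it.

```haskell
import Data.List (intercalate)
import Text.XML.Light (elName, qName)
import Text.XML.Light.Extractors (Err (..), ExtractionErr (..))

-- Hypothetical helper: one human-readable line per Err constructor.
describeErr :: Err -> String
describeErr e = case e of
  ErrExpect { expected = x } ->
    "expected " ++ x ++ " but found other content"
  ErrAttr { expected = a, atElement = el } ->
    "missing attribute " ++ show a ++ " on <" ++ qName (elName el) ++ ">"
  ErrEnd {} ->
    "expected end of contents but found more content"
  ErrNull { expected = x } ->
    "expected " ++ x ++ " but reached end of contents"
  ErrMsg msg ->
    msg

-- Hypothetical helper: append the context path to the message.
describeExtractionErr :: ExtractionErr -> String
describeExtractionErr ExtractionErr { err = e, context = path } =
  describeErr e ++ " (context: " ++ intercalate "/" path ++ ")"

main :: IO ()
main =
  -- An error value built by hand, just to exercise the renderer.
  putStrLn (describeExtractionErr (ExtractionErr (ErrMsg "bad id") ["library", "book"]))
  -- prints: bad id (context: library/book)
```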
 

Element extraction

extractElement :: ElementExtractor a -> Element -> Either ExtractionErr a

extractElement p element extracts element with p.

attrib :: String -> ElementExtractor String

attrib name extracts the value of attribute name.

attribAs :: String -> (String -> Either Err a) -> ElementExtractor a

attribAs name f extracts the value of attribute name and runs it through a conversion/validation function.
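
As an illustration of a function that both converts and validates, the sketch below accepts only positive integers. positiveInt and bookIdOf are hypothetical names introduced for this example, and rejections are reported through the ErrMsg constructor.

```haskell
import Text.Read (readMaybe)
import Text.XML.Light (parseXMLDoc)
import Text.XML.Light.Extractors

-- Hypothetical converter: parse the attribute value, then validate it.
positiveInt :: String -> Either Err Int
positiveInt s = case readMaybe s of
  Just n | n > 0 -> Right n
  Just _         -> Left (ErrMsg ("expected a positive number, got " ++ s))
  Nothing        -> Left (ErrMsg ("not a number: " ++ s))

-- Extract the "id" attribute of the current element as a positive Int.
bookIdOf :: ElementExtractor Int
bookIdOf = attribAs "id" positiveInt

main :: IO ()
main = case parseXMLDoc "<book id=\"7\"/>" of
  Nothing -> putStrLn "input is not well-formed XML"
  Just el -> case extractElement bookIdOf el of
    Right n -> print n
    Left e  -> putStrLn ("extraction failed, context: " ++ show (context e))
```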

children :: ContentsExtractor a -> ElementExtractor a

children p extracts only the child elements with p.

contents :: ContentsExtractor a -> ElementExtractor a

contents p extracts the contents with p.
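
The difference between children and contents matters for mixed content. In the sketch below, which assumes the behaviour described above, the contents-based extractor has to account for the leading text node, while the children-based one sees only the b element. The names mixed and elemsOnly are hypothetical, and text is used as in the example in the description.

```haskell
import Text.XML.Light (parseXMLDoc)
import Text.XML.Light.Extractors

-- contents: every content node is visible, so the leading text
-- must be consumed before the <b> element.
mixed :: ContentsExtractor (String, String)
mixed = element "p" $ contents $ do
  t <- text
  b <- element "b" (contents text)
  return (t, b)

-- children: only child elements are visible, so the extractor
-- starts directly at <b>.
elemsOnly :: ContentsExtractor String
elemsOnly = element "p" $ children $ element "b" (contents text)

main :: IO ()
main = case parseXMLDoc "<p>Hello <b>world</b></p>" of
  Nothing   -> putStrLn "input is not well-formed XML"
  Just root -> do
    putStrLn (either (const "mixed: failed") show (extractDocContents mixed root))
    putStrLn (either (const "elemsOnly: failed") show (extractDocContents elemsOnly root))
```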

Contents extraction

extractContents :: ContentsExtractor a -> [Content] -> Either ExtractionErr a

extractContents p contents extracts the contents with p.

extractDocContents :: ContentsExtractor a -> Element -> Either ExtractionErr a

parseXMLDoc produces a single Element. Use this function to run a contents extractor on such an element.

element :: String -> ElementExtractor a -> ContentsExtractor a

element name p extracts a name element with p.

textAs :: (String -> Either Err a) -> ContentsExtractor a

textAs f extracts text and passes it through the conversion function f.
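
A small sketch of a conversion function for text content, here mapping the strings "yes" and "no" to a Bool. The names yesNo and inStock, and the inStock element itself, are made up for illustration.

```haskell
import Text.XML.Light (parseXMLDoc)
import Text.XML.Light.Extractors

-- Hypothetical converter: accept exactly "yes" or "no".
yesNo :: String -> Either Err Bool
yesNo s = case s of
  "yes" -> Right True
  "no"  -> Right False
  _     -> Left (ErrMsg ("expected yes or no, got " ++ s))

-- Extract the text of an <inStock> element as a Bool.
inStock :: ContentsExtractor Bool
inStock = element "inStock" $ contents $ textAs yesNo

main :: IO ()
main = case parseXMLDoc "<inStock>yes</inStock>" of
  Nothing   -> putStrLn "input is not well-formed XML"
  Just root -> case extractDocContents inStock root of
    Right b -> print b
    Left e  -> putStrLn ("extraction failed, context: " ++ show (context e))
```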

eoc :: ContentsExtractor ()

Succeeds only when there is no more content.

only :: ContentsExtractor a -> ContentsExtractor a

only p fails if there is more content than p extracted.
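
The sketch below contrasts an extractor with and without only on a document that has one child element more than the extractor consumes. Going by the descriptions above, the strict variants should fail, and firstA <* eoc should behave like only firstA; that equivalence is an assumption, not something the documentation states. All extractor names are hypothetical.

```haskell
import Text.XML.Light (parseXMLDoc)
import Text.XML.Light.Extractors

-- Extract the text of a single <a> element.
firstA :: ContentsExtractor String
firstA = element "a" (contents text)

lenient, strict, strictEoc :: ContentsExtractor String
lenient   = element "r" $ children firstA           -- leftover children not checked
strict    = element "r" $ children $ only firstA    -- leftover children rejected
strictEoc = element "r" $ children $ firstA <* eoc  -- assumed equivalent to strict

succeeded :: Either ExtractionErr a -> Bool
succeeded = either (const False) (const True)

main :: IO ()
main = case parseXMLDoc "<r><a>1</a><b>2</b></r>" of
  Nothing   -> putStrLn "input is not well-formed XML"
  Just root -> do
    putStrLn ("lenient succeeded:   " ++ show (succeeded (extractDocContents lenient root)))
    putStrLn ("strict succeeded:    " ++ show (succeeded (extractDocContents strict root)))
    putStrLn ("strictEoc succeeded: " ++ show (succeeded (extractDocContents strictEoc root)))
```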