xml-extractors-0.3.0.0: Wrapper over xml to extract data from parsed xml

Safe Haskell: None
Language: Haskell2010

Text.XML.Light.Extractors


Description

Functions to extract data from parsed XML.

Example

Suppose you have an XML file of books like this:

<?xml version="1.0"?>
<library>
  <book id="1" isbn="23234-1">
    <author>John Doe</author>
    <title>Some book</title>
  </book>
  <book id="2">
    <author>You</author>
    <title>The Great Event</title>
  </book>
  ...
</library>

And a data type for a book:

data Book = Book { bookId        :: Int
                 , isbn          :: Maybe String
                 , author, title :: String
                 }

You can parse the XML file into a generic tree structure using parseXMLDoc from the xml package.

With this library you can then define extractors that pull data out of the generic tree:

   library = element "library" $ children $ only $ many book

   book = element "book" $ do
            i <- attribAs "id" integer
            s <- optional (attrib "isbn")
            children $ do
              a <- element "author" $ contents $ text
              t <- element "title" $ contents $ text
              return $ Book { bookId = i, author = a, title = t, isbn = s }

   extractLibrary :: Element -> Either ExtractionErr [Book]
   extractLibrary = extractDocContents library
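
The following is a self-contained sketch of the example above, wired up to parseXMLDoc. It assumes that ElementExtractor and ContentsExtractor are exported and are Monad and Alternative instances, as the use of do, optional and many suggests. Because text and integer are not documented in this section, the sketch substitutes textAs Right and a hypothetical readInt helper; sample and main are illustrative only.

```haskell
import Control.Applicative (many, optional)
import Text.XML.Light (parseXMLDoc)
import Text.XML.Light.Extractors

data Book = Book
  { bookId        :: Int
  , isbn          :: Maybe String
  , author, title :: String
  } deriving Show

-- Hypothetical helper (not part of the library): parse a whole string as an Int.
readInt :: String -> Either Err Int
readInt s = case reads s of
  [(n, "")] -> Right n
  _         -> Left (ErrMsg ("not an integer: " ++ show s))

-- Stand-in for the `text` extractor used above: accept any text unchanged.
plainText :: ContentsExtractor String
plainText = textAs Right

book :: ContentsExtractor Book
book = element "book" $ do
  i <- attribAs "id" readInt
  s <- optional (attrib "isbn")
  children $ do
    a <- element "author" $ contents plainText
    t <- element "title" $ contents plainText
    return Book { bookId = i, isbn = s, author = a, title = t }

library :: ContentsExtractor [Book]
library = element "library" $ children $ only $ many book

-- Illustrative input, shaped like the XML file shown earlier.
sample :: String
sample = unlines
  [ "<library>"
  , "  <book id=\"1\" isbn=\"23234-1\">"
  , "    <author>John Doe</author>"
  , "    <title>Some book</title>"
  , "  </book>"
  , "</library>"
  ]

main :: IO ()
main = case parseXMLDoc sample of
  Nothing   -> putStrLn "input is not well-formed XML"
  Just root -> case extractDocContents library root of
    Left e      -> putStrLn ("extraction failed, context: " ++ show (context e))
    Right books -> mapM_ print books
```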

Notes


Errors

type Path = [String]

Location for some content.

For now it is a reversed list of content indices and element names. This may change to something less "stringly typed".
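
Because the list is reversed, the innermost location comes first. Here is a minimal sketch of rendering a path outermost-first, using only base. The renderPath helper and the sample entries are hypothetical, and the exact strings the library stores in a path are an assumption.

```haskell
import Data.List (intercalate)

-- Local mirror of the library's alias, so this sketch needs only base.
type Path = [String]

-- Hypothetical helper: undo the reversal and join the steps with slashes.
renderPath :: Path -> String
renderPath = intercalate "/" . reverse

main :: IO ()
main = putStrLn (renderPath ["title", "1", "book", "0", "library"])
-- prints: library/0/book/1/title
```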

data Err

Extraction errors.

Constructors

ErrExpect

Some expected content is missing

Fields

expected :: String

expected content

found :: Content

found content

ErrAttr

An expected attribute is missing

Fields

expected :: String

expected attribute

atElement :: Element

element with missing attribute

ErrEnd

Expected end of contents

Fields

found :: Content

found content

ErrNull

Unexpected end of contents

Fields

expected :: String

expected content

ErrMsg String 


data ExtractionErr

Error with a context.

Constructors

ExtractionErr 

Fields

err :: Err
 
context :: Path
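
One way to report these errors is to pattern match on the constructors listed above. The sketch below is not part of the library: describeErr and describeExtractionErr are hypothetical helpers, and it assumes the record fields are exported as documented. showContent, elName and qName come from the xml package.

```haskell
import Text.XML.Light (elName, qName, showContent)
import Text.XML.Light.Extractors

-- Hypothetical helper: turn each Err constructor into a one-line message.
describeErr :: Err -> String
describeErr (ErrExpect { expected = e, found = f }) =
  "expected " ++ e ++ ", found " ++ showContent f
describeErr (ErrAttr { expected = a, atElement = el }) =
  "missing attribute " ++ a ++ " on <" ++ qName (elName el) ++ ">"
describeErr (ErrEnd { found = f }) =
  "expected end of contents, found " ++ showContent f
describeErr (ErrNull { expected = e }) =
  "expected " ++ e ++ ", but the contents ended"
describeErr (ErrMsg m) = m

-- Hypothetical helper: append the context, outermost location first.
describeExtractionErr :: ExtractionErr -> String
describeExtractionErr e =
  describeErr (err e) ++ " (context: " ++ show (reverse (context e)) ++ ")"

main :: IO ()
main = putStrLn (describeExtractionErr (ExtractionErr (ErrNull "book") ["library"]))
```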
 

Element extraction

extractElement :: ElementExtractor a -> Element -> Either ExtractionErr a

extractElement p element extracts element with p.

attrib :: String -> ElementExtractor String

attrib name extracts the value of attribute name.

attribAs :: String -> (String -> Either Err a) -> ElementExtractor a

attribAs name f extracts the value of attribute name and runs it through a conversion/validation function.
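
As an illustration, a conversion function can reject values by returning Left with an ErrMsg. In this sketch yesNo, available and the available attribute are all hypothetical. It also assumes that extractElement does not check the element's name.

```haskell
import Text.XML.Light (parseXMLDoc)
import Text.XML.Light.Extractors

-- Hypothetical validator: accept only the strings "yes" and "no".
yesNo :: String -> Either Err Bool
yesNo "yes" = Right True
yesNo "no"  = Right False
yesNo other = Left (ErrMsg ("expected yes or no, got " ++ show other))

-- Reads a made-up "available" attribute and validates it with yesNo.
available :: ElementExtractor Bool
available = attribAs "available" yesNo

main :: IO ()
main = case parseXMLDoc "<book available=\"yes\"/>" of
  Nothing -> putStrLn "input is not well-formed XML"
  Just el -> case extractElement available el of
    Left _  -> putStrLn "extraction failed"
    Right b -> print b
```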

children :: ContentsExtractor a -> ElementExtractor a

children p extracts only the child elements with p.

contents :: ContentsExtractor a -> ElementExtractor a

contents p extracts the contents with p.
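
The sketch below uses both functions side by side. It assumes, following the descriptions above, that children skips non-element content such as the whitespace between tags, while contents passes text nodes through to the extractor. The names item and items and the sample input are hypothetical.

```haskell
import Control.Applicative (many)
import Text.XML.Light (parseXMLDoc)
import Text.XML.Light.Extractors

-- Reads the text inside an <item> element; `contents` exposes the text node.
item :: ContentsExtractor String
item = element "item" $ contents $ textAs Right

-- Collects every <item> child; `children` is assumed to ignore the
-- whitespace text nodes that sit between the child elements.
items :: ElementExtractor [String]
items = children $ only $ many item

main :: IO ()
main = case parseXMLDoc "<list>\n  <item>a</item>\n  <item>b</item>\n</list>" of
  Nothing -> putStrLn "input is not well-formed XML"
  Just el -> case extractElement items el of
    Left e   -> putStrLn ("extraction failed, context: " ++ show (context e))
    Right xs -> print xs
```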

Contents extraction

extractContents :: ContentsExtractor a -> [Content] -> Either ExtractionErr a

extractContents p contents extracts the contents with p.
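
A list of Content can be obtained with parseXML from the xml package. This minimal sketch runs a contents extractor over such a list. The extractor pair and the input are hypothetical, and the input has no whitespace between elements so that no stray text nodes are involved.

```haskell
import Text.XML.Light (parseXML)
import Text.XML.Light.Extractors

-- Hypothetical extractor: a <key> element followed by a <value> element.
pair :: ContentsExtractor (String, String)
pair = do
  k <- element "key"   $ contents $ textAs Right
  v <- element "value" $ contents $ textAs Right
  return (k, v)

main :: IO ()
main =
  case extractContents pair (parseXML "<key>lang</key><value>Haskell</value>") of
    Left e   -> putStrLn ("extraction failed, context: " ++ show (context e))
    Right kv -> print kv
```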

extractDocContents :: ContentsExtractor a -> Element -> Either ExtractionErr a

parseXMLDoc produces a single Element; this function runs a contents extractor on such an element.

element :: String -> ElementExtractor a -> ContentsExtractor a

element name p extracts a name element with p.

textAs :: (String -> Either Err a) -> ContentsExtractor a

Extracts text and applies a conversion function to it.
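
For instance, the conversion function can both parse and validate the text. In this sketch positive, pages and the pages element are hypothetical; readMaybe comes from base.

```haskell
import Text.Read (readMaybe)
import Text.XML.Light (parseXML)
import Text.XML.Light.Extractors

-- Hypothetical conversion: the text must be a strictly positive integer.
positive :: String -> Either Err Int
positive s = case readMaybe s of
  Just n | n > 0 -> Right n
  _              -> Left (ErrMsg ("expected a positive integer, got " ++ show s))

-- Reads the text of a made-up <pages> element through the conversion.
pages :: ContentsExtractor Int
pages = element "pages" $ contents $ textAs positive

main :: IO ()
main = case extractContents pages (parseXML "<pages>312</pages>") of
  Left _  -> putStrLn "extraction failed"
  Right n -> print n
```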

choice :: [ContentsExtractor a] -> ContentsExtractor a

Tries the extractors in order and extracts with the first one that matches.
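
A typical use is content that may be one of several element kinds. The sketch below is hypothetical throughout (the Item type, the element names and the input) and assumes that an alternative which fails consumes no content.

```haskell
import Text.XML.Light (parseXML)
import Text.XML.Light.Extractors

-- Hypothetical sum type for two kinds of library items.
data Item = Book String | Magazine String
  deriving (Eq, Show)

-- Accepts either a <book> or a <magazine> element, tried in that order.
item :: ContentsExtractor Item
item = choice
  [ Book     <$> element "book"     (contents (textAs Right))
  , Magazine <$> element "magazine" (contents (textAs Right))
  ]

main :: IO ()
main = case extractContents item (parseXML "<magazine>Weekly</magazine>") of
  Left e  -> putStrLn ("extraction failed, context: " ++ show (context e))
  Right i -> print i
```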

eoc :: ContentsExtractor ()

Succeeds only when there is no more content.

only :: ContentsExtractor a -> ContentsExtractor a

only p fails if any content remains after p has finished extracting.

only p = p <* eoc
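
To see the difference, the sketch below runs the same extractor with and without only on input that has a trailing element. Going by the descriptions above, the first run is expected to succeed and the second to fail with ErrEnd, though that is inferred from the documentation and not a verified result. The names firstA and report are hypothetical.

```haskell
import Text.XML.Light (parseXML)
import Text.XML.Light.Extractors

-- Extracts the text of a leading <a> element and nothing else.
firstA :: ContentsExtractor String
firstA = element "a" $ contents $ textAs Right

-- Hypothetical helper: summarise a result without needing a Show instance.
report :: Either ExtractionErr String -> String
report (Right s) = "ok: " ++ s
report (Left e)  = case err e of
  ErrEnd {} -> "failed: trailing content"
  _         -> "failed"

main :: IO ()
main = do
  let input = parseXML "<a>x</a><b>y</b>"
  putStrLn (report (extractContents firstA input))         -- trailing <b> left unread
  putStrLn (report (extractContents (only firstA) input))  -- trailing <b> is checked
```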