xml-extractors-0.4.0.2: Extension to the xml package to extract data from parsed xml

Safe Haskell: None
Language: Haskell2010

Text.XML.Light.Extractors

Description

Functions to extract data from parsed XML.

Example

Suppose you have an XML file of books like this:

<?xml version="1.0"?>
<library>
  <book id="1" isbn="23234-1">
    <author>John Doe</author>
    <title>Some book</title>
  </book>
  <book id="2">
    <author>You</author>
    <title>The Great Event</title>
  </book>
  ...
</library>

And a data type for a book:

data Book = Book { bookId        :: Int
                 , isbn          :: Maybe String
                 , author, title :: String
                 }

You can parse the XML file into a generic tree structure using parseXMLDoc from the xml package.

With this library you can then define extractors that pull Book values out of that generic tree:

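   import Control.Applicative (many, optional)
   import Text.XML.Light (Element)
   import Text.XML.Light.Extractors

   -- Extracts one <book> element: attributes first, then child elements.
   book = element "book" $ do
            i <- attribAs "id" integer
            s <- optional (attrib "isbn")
            children $ do
              a <- element "author" $ contents $ text
              t <- element "title" $ contents $ text
              return Book { bookId = i, author = a, title = t, isbn = s }

   -- only makes extraction fail if anything other than books remains.
   library = element "library" $ children $ only $ many book

   extractLibrary :: Element -> Either ExtractionErr [Book]
   extractLibrary = extractDocContents library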

Notes

  • The only combinator can be used to extract contents exhaustively: it fails if any content is left over.

Errors

type Path = [String]

Location for some content.

For now it is a reversed list of content indices (starting at 1) and element names. This may change to something less "stringly typed".

data Err

Extraction errors.

Constructors

ErrExpectContent

Some expected content is missing

ErrExpectAttrib

An expected attribute is missing

ErrAttribValue

An attribute value was bad

ErrEnd

Expected end of contents

ErrNull

Unexpected end of contents

ErrMsg String 

Instances

Show Err

Methods

showsPrec :: Int -> Err -> ShowS

show :: Err -> String

showList :: [Err] -> ShowS

Element extraction

extractElement :: ElementExtractor a -> Element -> Either ExtractionErr a

extractElement p element extracts element with p.

attrib :: String -> ElementExtractor String

attrib name extracts the value of attribute name.

attribAs :: String -> (String -> Either String a) -> ElementExtractor a

attribAs name f extracts the value of attribute name and runs it through a conversion/validation function.

The conversion function takes a string with the value and returns either a description of the expected format of the value or the converted value.
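As a minimal sketch of such a conversion function, assume a hypothetical percentage converter (not part of this library) that accepts whole numbers from 0 to 100. Note that the Left value describes the expected format rather than the failure itself.

```haskell
import Text.Read (readMaybe)

-- Hypothetical converter, not part of the library: accepts a whole
-- number between 0 and 100. On failure, Left carries a description
-- of the expected format, which is what attribAs asks for.
percentage :: String -> Either String Int
percentage s =
  case readMaybe s of
    Just n | n >= 0 && n <= 100 -> Right n
    _                           -> Left "integer between 0 and 100"
```

It could then be passed as attribAs "progress" percentage, where the attribute name is likewise made up for illustration.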

children :: ContentsExtractor a -> ElementExtractor a

children p extracts only the child elements with p.

contents :: ContentsExtractor a -> ElementExtractor a

contents p extracts the contents with p.

Contents extraction

extractContents :: ContentsExtractor a -> [Content] -> Either ExtractionErr a

extractContents p contents extracts the contents with p.

extractDocContents :: ContentsExtractor a -> Element -> Either ExtractionErr a

parseXMLDoc produces a single Element. Use this function to run a contents extractor on such an element.

element :: String -> ElementExtractor a -> ContentsExtractor a

element name p extracts a name element with p.

textAs :: (String -> Either Err a) -> ContentsExtractor a

Extracts text and applies a conversion function to it.
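A small sketch of a textAs converter, under the assumption of a made-up <year> element. Unlike the converter given to attribAs, this one reports failure as an Err, built here with the ErrMsg constructor. The names year, yearElement and extractYearFrom are illustrative only.

```haskell
import Text.Read (readMaybe)
import Text.XML.Light (parseXMLDoc)
import Text.XML.Light.Extractors

-- Hypothetical converter: failure is reported as an Err value.
year :: String -> Either Err Int
year s =
  maybe (Left (ErrMsg ("expected a year, got " ++ show s))) Right (readMaybe s)

-- Extracts the text of a <year> element (element name assumed) as an Int.
yearElement :: ContentsExtractor Int
yearElement = element "year" $ contents $ textAs year

-- Nothing means the input could not be parsed as an XML document.
extractYearFrom :: String -> Maybe (Either ExtractionErr Int)
extractYearFrom src = extractDocContents yearElement <$> parseXMLDoc src
```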

choice :: [ContentsExtractor a] -> ContentsExtractor a

Tries the extractors in order and extracts with the first one that matches.
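For illustration, a sketch of choice applied to a document whose root may be either of two elements. The Item type and the <book>/<magazine> element names are assumptions made for this example, not part of the library.

```haskell
import Text.XML.Light (parseXMLDoc)
import Text.XML.Light.Extractors

-- Hypothetical result type for a root that is either a <book> or a
-- <magazine> element.
data Item = BookTitle String | MagazineTitle String
  deriving (Eq, Show)

-- The alternatives are tried in list order.
item :: ContentsExtractor Item
item = choice
  [ BookTitle     <$> element "book"     (contents text)
  , MagazineTitle <$> element "magazine" (contents text)
  ]

-- Nothing means the input could not be parsed as an XML document.
extractItem :: String -> Maybe (Either ExtractionErr Item)
extractItem src = extractDocContents item <$> parseXMLDoc src
```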

eoc :: ContentsExtractor ()

Succeeds only when there is no more content.

only :: ContentsExtractor a -> ContentsExtractor a

only p fails if any content remains after p has finished extracting.

only p = p <* eoc
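The difference only makes can be sketched by running the same extractor with and without it. The element names and the helper names lenient, strict, b and run are made up for this example.

```haskell
import Control.Applicative (many)
import Text.XML.Light (parseXMLDoc)
import Text.XML.Light.Extractors

-- Extracts the text of one <b> element.
b :: ContentsExtractor String
b = element "b" $ contents text

-- Collects the leading <b> children of <a> and ignores whatever follows.
lenient :: ContentsExtractor [String]
lenient = element "a" $ children $ many b

-- Same, but fails if any child is left over after the <b> elements.
strict :: ContentsExtractor [String]
strict = element "a" $ children $ only $ many b

-- Nothing means the input could not be parsed as an XML document.
run :: ContentsExtractor a -> String -> Maybe (Either ExtractionErr a)
run p src = extractDocContents p <$> parseXMLDoc src
```

Given the input <a><b>x</b><c/></a>, lenient is expected to succeed with the single <b> value, while strict should fail because <c/> is left unconsumed.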

Utils

showExtractionErr :: ExtractionErr -> String

Converts an extraction error to a multi-line string message.

Paths are shown according to showPath.

eitherMessageOrValue :: Either ExtractionErr a -> Either String a

Convenience function to convert extraction errors to string messages using showExtractionErr.

eitherMessageOrValue = either (Left . showExtractionErr) Right
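One way to combine parsing, extraction and error reporting is sketched below. The helper parseAndExtract and its parse-failure message are assumptions for this example and are not provided by the library.

```haskell
import Text.XML.Light (parseXMLDoc)
import Text.XML.Light.Extractors

-- Hypothetical helper: parses a document from a String and runs an
-- extractor on its root element, reporting every failure as a String.
parseAndExtract :: ContentsExtractor a -> String -> Either String a
parseAndExtract extractor src =
  case parseXMLDoc src of
    -- parseXMLDoc returns Nothing when no root element can be found.
    Nothing   -> Left "not a well-formed XML document"
    Just root -> eitherMessageOrValue (extractDocContents extractor root)
```

With the library extractor from the example at the top of this page, parseAndExtract library would have type String -> Either String [Book].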

integer :: (Integral a, Read a) => String -> Either String a

Reads an integer value, or returns Left "integer" if the read fails.

float :: (Floating a, Read a) => String -> Either String a

Reads a floating point value, or returns Left "float" if the read fails.