microformats2-parser: A Microformats 2 parser.

[ library, public-domain, web ]

A parser for Microformats 2 (http://microformats.org/wiki/microformats2), a simple way to describe structured information in HTML.


Downloads

Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.

Versions [RSS] 0.1.0, 0.1.1, 1.0.0, 1.0.1, 1.0.1.1, 1.0.1.2, 1.0.1.3, 1.0.1.4, 1.0.1.5, 1.0.1.6, 1.0.1.7, 1.0.1.8, 1.0.1.9, 1.0.2.0, 1.0.2.1, 1.0.2.2
Dependencies aeson, aeson-pretty, aeson-qq, attoparsec, base (>=4.7.0.0 && <5), base-compat (>=0.8.0), blaze-html, blaze-markup, bytestring, containers, data-default, either, errors, html-conduit, lens-aeson, microformats2-parser, network, network-uri, options, pcre-heavy, safe, scotty, tagsoup, text, time, transformers, unordered-containers, vector, wai-cli, wai-extra, xml-lens, xss-sanitize [details]
License LicenseRef-PublicDomain
Copyright 2015-2022 Val Packett <val@packett.cool>
Author Val Packett
Maintainer val@packett.cool
Revised Revision 1 made by myfreeweb at 2022-10-16T18:40:46Z
Category Web
Home page https://codeberg.org/valpackett/microformats2-parser
Bug tracker https://codeberg.org/valpackett/microformats2-parser/issues
Source repo head: git clone https://codeberg.org/valpackett/microformats2-parser.git
Uploaded by myfreeweb at 2018-06-25T15:20:55Z
Reverse Dependencies 1 direct, 1 indirect [details]
Executables microformats2-parser
Downloads 8696 total (36 in the last 30 days)
Rating 2.0 (votes: 1) [estimated by Bayesian average]
Status Docs available [build log]
Last success reported on 2018-06-25 [all 1 reports]

Readme for microformats2-parser-1.0.1.8

microformats2-parser

Microformats 2 parser for Haskell! #IndieWeb

Originally created for sweetroll.

  • parses items, rels, rel-urls
  • resolves relative URLs (with support for the <base> tag), including inside of html for e-* properties
  • parses the value-class-pattern, including date and time normalization
  • handles malformed HTML (the actual HTML parser is tagstream-conduit)
  • high performance
  • extensively tested

Also check out http-link-header because you often need to read links from the Link header!

DEMO PAGE

Usage

See the API docs on Hackage for more information. Here's a quick overview:

{-# LANGUAGE OverloadedStrings #-}

import Data.Microformats2.Parser
import Data.Default
import Network.URI

parseMf2 def $ documentRoot $ parseLBS "<body><p class=h-entry><h1 class=p-name>Yay!</h1></p></body>"

parseMf2 (def { baseUri = parseURI "https://where.i.got/that/page/from/" }) $ documentRoot $ parseLBS "<body><base href=\"base/\"><link rel=micropub href='micropub'><p class=h-entry><h1 class=p-name>Yay!</h1></p></body>"

The def is the default configuration.

The configuration includes:

  • htmlMode, an HTML parsing mode (Unsafe | Escape | Sanitize)
  • baseUri, the Maybe URI that represents the address you retrieved the HTML from, used for resolving relative addresses -- you should set it
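As a rough sketch of setting both fields at once, the snippet below overrides htmlMode and baseUri on top of def. It assumes, as the overview snippet suggests, that Data.Microformats2.Parser re-exports parseLBS and documentRoot; the example.com URL and the sample markup are placeholders, not taken from the package.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (Value, encode)
import qualified Data.ByteString.Lazy.Char8 as BL
import Data.Default (def)
import Data.Microformats2.Parser
import Network.URI (parseURI)

-- Hypothetical input: an h-entry whose e-content holds a relative link.
page :: BL.ByteString
page = "<body><div class=\"h-entry\"><div class=\"e-content\"><a href=\"next\">Next</a></div></div></body>"

-- Start from def and override the two documented fields.
parsed :: Value
parsed = parseMf2 settings (documentRoot (parseLBS page))
  where
    settings = def
      { htmlMode = Sanitize                                -- one of Unsafe | Escape | Sanitize
      , baseUri  = parseURI "https://example.com/posts/"   -- placeholder address
      }

-- Print the resulting JSON on one line.
main :: IO ()
main = BL.putStrLn (encode parsed)
```

Because parseURI already returns a Maybe URI, its result can be assigned to baseUri directly; a string that fails to parse simply leaves the base address unset.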

parseMf2 will return an Aeson Value structured like canonical microformats2 JSON. lens-aeson is a good way to navigate it.
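To illustrate navigating such a value, here is a minimal sketch using lens-aeson together with the lens package. The JSON literal is a hand-written stand-in shaped like canonical microformats2 JSON, not real parser output, and the names sample, firstName and itemTypes are made up for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Lens ((^?), (^..))
import Data.Aeson (Value, decode)
import Data.Aeson.Lens (key, nth, values, _String)
import Data.Text (Text)

-- Hand-written stand-in for a parseMf2 result (assumed shape: items, rels, rel-urls).
sample :: Maybe Value
sample = decode
  "{\"items\":[{\"type\":[\"h-entry\"],\"properties\":{\"name\":[\"Yay!\"]}}],\"rels\":{},\"rel-urls\":{}}"

-- First value of the "name" property on the first item, if there is one.
firstName :: Value -> Maybe Text
firstName v = v ^? key "items" . nth 0 . key "properties" . key "name" . nth 0 . _String

-- Every type string of every top-level item.
itemTypes :: Value -> [Text]
itemTypes v = v ^.. key "items" . values . key "type" . values . _String

main :: IO ()
main = do
  print (firstName =<< sample)        -- Just "Yay!"
  print (maybe [] itemTypes sample)   -- ["h-entry"]
```

Since every microformats2 property is an array, each lookup ends in nth or values. Using ^? keeps a missing key or an empty array as Nothing rather than an error.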

Development

Use stack to build.
Use ghci to run tests quickly with :test (see the .ghci file).

$ stack build

$ stack test

$ stack ghci

Contributing

Please feel free to submit pull requests!

By participating in this project you agree to follow the Contributor Code of Conduct and to release your contributions under the Unlicense.

The list of contributors is available on GitHub.

License

This is free and unencumbered software released into the public domain.
For more information, please refer to the UNLICENSE file or unlicense.org.