tagsoup: Parsing and extracting information from (possibly malformed) HTML documents

[ bsd3, library, xml ]

TagSoup is a library for extracting information out of unstructured HTML code, sometimes known as tag soup. The HTML does not have to be well-formed or render properly within any particular framework. This library is for situations where the author of the HTML is not cooperating with the person trying to extract the information, but is also not trying to hide the information.
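
As a rough illustration of the kind of extraction described above, the sketch below collects link targets from a fragment whose tags are never closed. It assumes a recent release of the library (the 0.14.x series), where `parseTags`, `isTagOpenName` and `fromAttrib` are exported from `Text.HTML.TagSoup`; older releases may expose a different interface. The function name `extractLinks` and the sample markup are made up for this example.

```haskell
module Links (extractLinks) where

import Text.HTML.TagSoup (fromAttrib, isTagOpenName, parseTags)

-- | Collect the href attribute of every <a> tag in the input.
-- The input is treated as a flat stream of tags, so unclosed or
-- badly nested elements do not prevent extraction.
-- (Hypothetical helper for illustration, not part of the library.)
extractLinks :: String -> [String]
extractLinks html =
  [ href
  | tag <- parseTags html            -- tokenize without validating structure
  , isTagOpenName "a" tag            -- keep only opening <a ...> tags
  , let href = fromAttrib "href" tag -- yields "" when the attribute is absent
  , not (null href)
  ]
```

Loaded in GHCi, `extractLinks "<p>See <a href=\"/docs\">docs<p>and <a href=\"/faq\">FAQ"` should return the two link targets even though neither the paragraphs nor the anchors are closed. Working on the flat tag list rather than a tree is what makes this tolerant of malformed input.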

Downloads

Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.

Versions: 0.1, 0.4, 0.6, 0.8, 0.9, 0.10, 0.10.1, 0.11, 0.11.1, 0.12, 0.12.1, 0.12.2, 0.12.3, 0.12.4, 0.12.5, 0.12.6, 0.12.7, 0.12.8, 0.13, 0.13.1, 0.13.2, 0.13.3, 0.13.4, 0.13.5, 0.13.6, 0.13.7, 0.13.8, 0.13.9, 0.13.10, 0.14, 0.14.1, 0.14.2, 0.14.3, 0.14.4, 0.14.5, 0.14.6, 0.14.7, 0.14.8
Dependencies: base (<4.8), mtl, network
License: BSD-3-Clause
Copyright: 2006-8, Neil Mitchell
Author: Neil Mitchell
Maintainer: ndmitchell@gmail.com
Revised: Revision 1 made by AdamBergmark at 2015-04-02T15:43:03Z
Category: XML
Home page: http://www-users.cs.york.ac.uk/~ndm/tagsoup/
Uploaded: by NeilMitchell at 2008-01-14T17:57:13Z
Distributions: Arch:0.14.8, Debian:0.14.8, Fedora:0.14.8, FreeBSD:0.13.3, LTSHaskell:0.14.8, NixOS:0.14.8, Stackage:0.14.8, openSUSE:0.14.8
Reverse Dependencies: 94 direct, 733 indirect
Executables: tagsoup
Downloads: 191003 total (244 in the last 30 days)
Rating: 2.5 (votes: 4) [estimated by Bayesian average]
Status: Docs uploaded by user; build status unknown [no reports yet]