tagsoup: Parsing and extracting information from (possibly malformed) HTML documents

[ bsd3, library, xml ]

TagSoup is a library for extracting information from unstructured HTML, sometimes known as tag soup. The HTML does not have to be well-formed or render properly in any particular framework. The library is intended for situations where the author of the HTML is not cooperating with the person extracting the information, but is not trying to hide it either.
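
As a rough illustration of this kind of extraction, the sketch below collects link targets from a fragment with unclosed tags. It assumes a recent tagsoup release whose `Text.HTML.TagSoup` module exports `parseTags`, `isTagOpenName`, and `fromAttrib`; older releases may expose a different API. The helper `extractLinks` and the sample markup are hypothetical names chosen for this example, not part of the package.

```haskell
import Text.HTML.TagSoup (fromAttrib, isTagOpenName, parseTags)

-- | Hypothetical helper: collect the href of every <a> tag in a document.
-- 'parseTags' returns a flat list of tags and does not require the input
-- to be well-formed, so unclosed or misnested elements are tolerated.
extractLinks :: String -> [String]
extractLinks html =
  [ href
  | tag <- parseTags html
  , isTagOpenName "a" tag               -- keep only opening <a> tags
  , let href = fromAttrib "href" tag    -- "" when the attribute is absent
  , not (null href)
  ]

-- Sample input (an assumption for illustration): the first <a> is never
-- closed, </p> is misnested, and the last <a> has no href at all.
sampleHtml :: String
sampleHtml =
  "<p>Unclosed <a href=\"/one\">first<a href=\"/two\">second</p><a>no href</a>"
```

Under these assumptions, `extractLinks sampleHtml` should evaluate to `["/one","/two"]`. Working on the flat tag list, rather than building a tree, is what lets the sketch ignore the missing and misnested closing tags.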

Versions: 0.1, 0.4, 0.6, 0.8, 0.9, 0.10, 0.10.1, 0.11, 0.11.1, 0.12, 0.12.1, 0.12.2, 0.12.3, 0.12.4, 0.12.5, 0.12.6, 0.12.7, 0.12.8, 0.13, 0.13.1, 0.13.2, 0.13.3, 0.13.4, 0.13.5, 0.13.6, 0.13.7, 0.13.8, 0.13.9, 0.13.10, 0.14, 0.14.1, 0.14.2, 0.14.3, 0.14.4, 0.14.5, 0.14.6, 0.14.7
Dependencies: base (<4.8), containers, mtl, network
License: BSD-3-Clause
Copyright: 2006-8, Neil Mitchell
Author: Neil Mitchell
Maintainer: ndmitchell@gmail.com
Revised: Revision 1 made by AdamBergmark at Thu Apr 2 15:41:50 UTC 2015
Category: XML
Home page: http://www-users.cs.york.ac.uk/~ndm/tagsoup/
Uploaded: by NeilMitchell at Wed Apr 23 11:03:49 UTC 2008
Distributions: Arch:0.14.7, Debian:0.14.6, Fedora:0.14.6, FreeBSD:0.13.3, LTSHaskell:0.14.7, NixOS:0.14.7, Stackage:0.14.7, openSUSE:0.14.7
Executables: tagsoup
Downloads: 147828 total (460 in the last 30 days)
Rating: 2.5 (votes: 3) [estimated by rule of succession]
Status: Docs uploaded by user
Build status: unknown [no reports yet]

Flags

Name       Description                                     Default  Type
splitbase  Choose the new smaller, split-up base package.  Enabled  Automatic

Use -f <flag> to enable a flag, or -f -<flag> to disable that flag.

Downloads

Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.