tagsoup: Parsing and extracting information from (possibly malformed) HTML documents

[ bsd3, library, xml ]

TagSoup is a library for extracting information from unstructured HTML, sometimes known as tag soup. The HTML does not have to be well-formed or render properly within any particular framework. The library is intended for situations where the author of the HTML is not cooperating with the person trying to extract the information, but is also not trying to hide it.
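
As a rough illustration of this kind of extraction, the sketch below pulls link targets and plain text out of HTML with unclosed tags. It assumes a recent tagsoup release (0.14.x) and its `parseTags`, `isTagOpenName`, `fromAttrib` and `innerText` functions; the 0.4 API described on this page may differ. The helper names `extractLinks` and `extractText` are hypothetical and exist only for this example.

```haskell
import Text.HTML.TagSoup (parseTags, innerText, fromAttrib, isTagOpenName)

-- Hypothetical helper: collect the href value of every opening <a> tag.
-- parseTags yields a flat list of tags, so missing close tags do no harm.
-- fromAttrib returns the empty string when the attribute is absent.
extractLinks :: String -> [String]
extractLinks html =
  [ fromAttrib "href" t | t <- parseTags html, isTagOpenName "a" t ]

-- Hypothetical helper: concatenate all text nodes, dropping the markup.
extractText :: String -> String
extractText = innerText . parseTags

main :: IO ()
main = do
  -- Deliberately malformed input: <p>, <b> and <a> are never closed.
  let html = "<p>Hello <b>world<p>unclosed <a href='x.html'>link"
  mapM_ putStrLn (extractLinks html)
  putStrLn (extractText html)
```

Working on the flat tag list rather than a tree is what lets this approach tolerate malformed input: nothing in the code depends on tags being balanced.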

Modules

Text
- HTML
  - Text.HTML.Download
  - Text.HTML.TagSoup

Downloads

tagsoup-0.4.tar.gz (Cabal source package)
Package description (revised from the package)

Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.

Maintainer's Corner

Package maintainers

NeilMitchell

Versions	0.1, 0.4, 0.6, 0.8, 0.9, 0.10, 0.10.1, 0.11, 0.11.1, 0.12, 0.12.1, 0.12.2, 0.12.3, 0.12.4, 0.12.5, 0.12.6, 0.12.7, 0.12.8, 0.13, 0.13.1, 0.13.2, 0.13.3, 0.13.4, 0.13.5, 0.13.6, 0.13.7, 0.13.8, 0.13.9, 0.13.10, 0.14, 0.14.1, 0.14.2, 0.14.3, 0.14.4, 0.14.5, 0.14.6, 0.14.7, 0.14.8
Dependencies	base (<4.8), mtl, network
License	BSD-3-Clause
Copyright	2006-2008, Neil Mitchell
Author	Neil Mitchell
Maintainer	ndmitchell@gmail.com
Revised	Revision 1 made by AdamBergmark at 2015-04-02T15:43:03Z
Category	XML
Home page	http://www-users.cs.york.ac.uk/~ndm/tagsoup/
Uploaded	by NeilMitchell at 2008-01-14T17:57:13Z
Distributions	Arch:0.14.8, Debian:0.14.8, Fedora:0.14.8, FreeBSD:0.13.3, LTSHaskell:0.14.8, NixOS:0.14.8, Stackage:0.14.8, openSUSE:0.14.8
Reverse Dependencies	94 direct, 735 indirect
Executables	tagsoup
Downloads	191359 total (270 in the last 30 days)
Rating	2.5 (votes: 4) [estimated by Bayesian average]
Status	Docs uploaded by user; build status unknown (no reports yet)