The html-parse package
This package provides a fast and reasonably robust HTML5 tokenizer built upon the attoparsec library. The parsing strategy is based upon the HTML5 parsing specification with few deviations.
The package targets similar use-cases to the venerable tagsoup library, but is significantly more efficient, achieving parsing speeds of over 50 megabytes per second on modern hardware with typical web documents.
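As a rough illustration of the tokenizer in use, the following minimal sketch lists the names of the opening tags in a small document. It assumes the package exposes a Text.HTML.Parser module with a parseTokens function and a Token type whose TagOpen constructor carries a tag name and an attribute list; check the installed version's documentation before relying on these names. The helper openTagNames is hypothetical and exists only for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import           Data.Text (Text)
import qualified Data.Text.IO as T
-- Assumed API: Text.HTML.Parser exporting Token(..) and parseTokens.
import           Text.HTML.Parser (Token (..), parseTokens)

-- | Hypothetical helper: collect the names of all opening tags,
-- in document order. Tokens that are not opening tags are skipped
-- by the pattern match in the list comprehension.
openTagNames :: Text -> [Text]
openTagNames html = [ name | TagOpen name _ <- parseTokens html ]

main :: IO ()
main = mapM_ T.putStrLn (openTagNames "<p>Hello <b>world</b></p>")
```

Because the result is a flat token stream rather than a tree, matching opening and closing tags is left to the caller, much as with tagsoup.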
|Dependencies||attoparsec (==0.13.*), base (>=4.8 && <4.10), deepseq (==1.4.*), text (==1.2.*)|
|Copyright||(c) 2016 Ben Gamari|
|Source repository||head: git clone git://github.com/bgamari/html-parse|
|Uploaded||Wed Nov 23 17:02:09 UTC 2016 by BenGamari|
|Downloads||109 total (18 in the last 30 days)|