The HandsomeSoup package

[Tags: bsd3, library]

See examples and full readme on the Github page:


Versions: 0.1, 0.2, 0.3, 0.3.1, 0.3.2, 0.3.3, 0.3.4, 0.3.5, 0.4, 0.4.2
Change log: None available
Dependencies: base (<5), containers, HTTP, hxt, hxt-http, MaybeT, mtl, network (<2.6), parsec, transformers
Author: Aditya Bhargava
Home page
Uploaded: Sat Aug 17 02:51:50 UTC 2013 by AdityaBhargava
Updated: Sun May 10 12:34:13 UTC 2015 by AdamBergmark to revision 1
Distributions: LTSHaskell:0.4.2, NixOS:0.4.2, Stackage:0.4.2
Downloads: 3811 total (49 in last 30 days)
Status: Docs uploaded by user
Build status: unknown [no reports yet]




Readme for HandsomeSoup-0.3.2


Current Status: Usable. Please file bugs!

HandsomeSoup is the library I wish I had when I started parsing HTML in Haskell.

It is built on top of HXT and adds a few functions that make it easier to work with HTML.

Most importantly, it adds CSS selectors to HXT. The goal of HandsomeSoup is to be a complete CSS2 selector parser for HXT.


cabal install HandsomeSoup


Nokogiri, the HTML parser for Ruby, has an example showing how to scrape Google search results. This is easy in HandsomeSoup:

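import Text.XML.HXT.Core
import Text.HandsomeSoup

main = do
    -- fromUrl builds an HXT arrow, so bind it with let and execute it with runX
    let doc = fromUrl ""
    links <- runX $ doc >>> css "h3.r a" ! "href"
    mapM_ putStrLn links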

What can HandsomeSoup do for you?

Easily parse an online page using fromUrl

let doc = fromUrl ""

Or a local page using parseHtml

contents <- readFile [filename]
let doc = parseHtml contents
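
As a rough sketch of how parseHtml fits together with css and (!), the program below parses an in-memory string instead of a file. The sample markup and the helper name extractHrefs are made up for illustration; only parseHtml, css, (!) and HXT's runX and >>> come from the libraries.

```haskell
import Text.XML.HXT.Core
import Text.HandsomeSoup

-- Hypothetical sample markup, used only for illustration.
sampleHtml :: String
sampleHtml =
    "<html><body><a href=\"/one\">One</a><a href=\"/two\">Two</a></body></html>"

-- Hypothetical helper: parse an HTML string and collect every link target.
extractHrefs :: String -> IO [String]
extractHrefs html = runX $ parseHtml html >>> css "a" ! "href"

main :: IO ()
main = extractHrefs sampleHtml >>= mapM_ putStrLn
```

Nothing is parsed until runX executes the arrow, which is why doc is bound with let rather than <-.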

Easily extract elements using css

Here are some valid selectors:

doc >>> css "a"
doc >>> css "*"
doc >>> css "a#link1"
doc >>> css ""
doc >>> css "p > a"
doc >>> css "p strong"
doc >>> css "#container h1"
doc >>> css "img[width]"
doc >>> css "img[width=400]"
doc >>> css "a[class~=bar]"
doc >>> css "a:first-child"
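
To see a few of these selectors in action, here is a minimal sketch that runs them against a small document and prints the text found directly inside each match. The markup and the helper selectText are assumptions made for this example; /> and getText are standard HXT combinators.

```haskell
import Text.XML.HXT.Core
import Text.HandsomeSoup

-- Hypothetical sample markup, used only for illustration.
sampleHtml :: String
sampleHtml = concat
    [ "<html><body><div id=\"container\"><h1>Title</h1>"
    , "<p>Some <strong>bold</strong> text and "
    , "<a id=\"link1\" href=\"/one\">a link</a></p>"
    , "</div></body></html>"
    ]

-- Hypothetical helper: run a selector over an HTML string and collect
-- the text nodes that are direct children of each matched element.
selectText :: String -> String -> IO [String]
selectText selector html = runX $ parseHtml html >>> css selector /> getText

main :: IO ()
main = do
    selectText "#container h1" sampleHtml >>= print
    selectText "p strong" sampleHtml >>= print
    selectText "p > a" sampleHtml >>= print
```

Because css returns an ordinary HXT arrow, it composes with any other HXT filter, so /> getText could be swapped for whatever extraction step you need.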

Easily get attributes using (!)

doc >>> css "img" ! "src"
doc >>> css "a" ! "href"
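
When you need more than one value per element, one option is to drop down to plain HXT after the css step. The sketch below pairs each link's href with its text using getAttrValue and &&&; the sample markup and the helper linkPairs are hypothetical, and this is only one of several ways to combine the two libraries.

```haskell
import Text.XML.HXT.Core
import Text.HandsomeSoup

-- Hypothetical sample markup, used only for illustration.
sampleHtml :: String
sampleHtml =
    "<html><body><a href=\"/one\">One</a><a href=\"/two\">Two</a></body></html>"

-- Hypothetical helper: pair each link's href attribute with its text.
linkPairs :: String -> IO [(String, String)]
linkPairs html =
    runX $ parseHtml html
        >>> css "a"
        >>> (getAttrValue "href" &&& (getChildren >>> getText))

main :: IO ()
main = linkPairs sampleHtml >>= mapM_ print
```

Note that HXT's getAttrValue yields an empty string when the attribute is missing, so add a filter such as hasAttr "href" before it if you want to skip those elements.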


Find Haddock docs on Hackage.

I also wrote The Complete Guide To Parsing HXT With Haskell.


Made by Adit.