Description |
This module exposes the main functionality of shpider.
It lets you write crawlers quickly; in simple cases you do not even need to read the page source, e.g.
  runShpider $ do
    download "http://hackage.haskell.org/packages/archive/pkg-list.html"
    l : _ <- getLinksByText "shpider"
    download $ linkAddress l
|
|
Synopsis |
|
|
|
Documentation |
|
module Network.Shpider.Code |
|
module Network.Shpider.State |
|
module Network.Shpider.URL |
|
module Network.Shpider.Options |
|
module Network.Shpider.Forms |
|
module Network.Shpider.Links |
|
|
Fetch whatever is at this address, and attempt to parse the content into a Page.
Return the status code with the parsed content.
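As a rough illustration of "status code with the parsed content", the sketch below fetches one page and prints only the status. It assumes that download yields a (status code, page) pair, that runShpider returns the result of the action it runs in IO, and that the status code type has a Show instance; none of these are confirmed by the text above.

```haskell
import Network.Shpider

main :: IO ()
main = do
  -- Assumption: download yields a (status code, page) pair.
  (code, _page) <- runShpider $
    download "http://hackage.haskell.org/packages/archive/pkg-list.html"
  -- Assumption: the status code type has a Show instance.
  print code
```

Checking the status before using the page is probably wise, since the wording "attempt to parse" suggests that a page value may come back even when the fetch or the parse did not fully succeed.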
|
|
|
Send a form to the URL specified in its action attribute.
|
|
|
Get all links which match this text.
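The following is a pure model of text-based link selection, not the library's code. It assumes that "match" means exact equality on the link text. The Link record is a hypothetical stand-in: linkAddress appears in the description example, while linkText, linksByText and sampleLinks are names invented for this sketch.

```haskell
-- Hypothetical stand-in for the library's link type; linkText is an
-- assumed field name.
data Link = Link
  { linkAddress :: String
  , linkText    :: String
  } deriving (Show, Eq)

-- Keep the links whose visible text equals the query exactly
-- (assumption: the library may match differently).
linksByText :: String -> [Link] -> [Link]
linksByText query = filter ((== query) . linkText)

-- Hypothetical sample data standing in for the links of a parsed page.
sampleLinks :: [Link]
sampleLinks =
  [ Link "/package/shpider" "shpider"
  , Link "/package/curl"    "curl"
  ]
```

Under this model, linksByText "shpider" sampleLinks keeps only the first entry. The result is a list because several links on a page can share the same text, which is why the description example takes the head with l : _.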
|
|
|
Get all links whose text matches this regex.
|
|
|
Get all links whose address matches this regex.
|
|
|
Get all forms whose action matches the given action.
|
|
|
Return the links on the current page.
|
|
|
Return the forms on the current page.
|
|
|
Parse a given URL and source HTML into the Page datatype.
This will set the current page.
|
|
|
If stayOnDomain has been set to True, then isAuthorizedDomain returns True if the given URL is on the domain and False otherwise. If stayOnDomain has not been set to True, then it always returns True.
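The rule above can be modelled as a small pure function. This is a sketch of the decision only, not the library's code: hostOf and isAuthorized are hypothetical names, and comparing host names is an assumed reading of "on the domain".

```haskell
-- Hypothetical helper: the host part of a URL, without scheme, port or path.
hostOf :: String -> String
hostOf url = takeWhile (`notElem` "/:?#") (dropScheme url)
  where
    dropScheme u = case break (== ':') u of
      (_, ':' : '/' : '/' : rest) -> rest
      _                           -> u

-- Pure model of the rule: arguments are the stayOnDomain flag, the URL
-- the crawl started from, and the URL being checked.
isAuthorized :: Bool -> String -> String -> Bool
isAuthorized stayOnDomainSet startUrl url
  | stayOnDomainSet = hostOf url == hostOf startUrl  -- assumed meaning of "on the domain"
  | otherwise       = True                           -- no restriction requested
```

With the flag set, a URL on another host is rejected; with the flag unset, every URL is accepted regardless of host.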
|
|
|
withAuthorizedDomain executes the given function only if the given URL is on an authorized domain.
See isAuthorizedDomain.
|
|
|
If keepTrack has been set, then haveVisited returns True if the given URL has been visited.
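Visit tracking can be pictured with the pure model below. It is a sketch under assumptions, not the library's implementation: Visited, haveVisitedModel and markVisited are hypothetical names, and returning False when tracking is off is an assumption, since the text above only describes the case where keepTrack has been set.

```haskell
-- Nothing means visit tracking is off; Just holds the URLs seen so far.
type Visited = Maybe [String]

-- Has this URL been recorded as visited?
haveVisitedModel :: Visited -> String -> Bool
haveVisitedModel Nothing     _   = False  -- assumption: nothing is recorded when tracking is off
haveVisitedModel (Just seen) url = url `elem` seen

-- Record a visit; leaves the state unchanged when tracking is off.
markVisited :: String -> Visited -> Visited
markVisited url = fmap (url :)
```

A crawler loop would typically consult such a check before each fetch so that cycles between pages do not cause repeated downloads.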
|
|
Produced by Haddock version 2.4.2 |