shpider - Web automation library in Haskell.



This module provides all the settable options in shpider.



stayOnDomain :: Bool -> Shpider ()

When set to True, download and sendForm are forbidden for any URL that is not on the same domain as the URL given to setStartPage.
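The same-domain rule can be pictured with a small, self-contained sketch. This is not shpider's actual implementation: `hostOf` and `sameDomain` are hypothetical helpers written for illustration, they only understand `http://` and `https://` URLs, and they ignore details such as user-info prefixes.

```haskell
import Data.Char (toLower)
import Data.List (stripPrefix)

-- | Hypothetical helper: extract the lower-cased host of an absolute
-- http(s) URL. Returns Nothing for relative or otherwise unsupported URLs.
hostOf :: String -> Maybe String
hostOf url =
  case stripScheme (map toLower url) of
    Nothing   -> Nothing
    Just rest ->
      -- The host ends at the first path, port, query or fragment delimiter.
      let host = takeWhile (`notElem` "/:?#") rest
      in if null host then Nothing else Just host
  where
    stripScheme u = case stripPrefix "http://" u of
      Just r  -> Just r
      Nothing -> stripPrefix "https://" u

-- | Hypothetical check: is the target URL on the same host as the start page?
-- Anything that cannot be parsed is treated as off-domain.
sameDomain :: String -> String -> Bool
sameDomain start target =
  case (hostOf start, hostOf target) of
    (Just a, Just b) -> a == b
    _                -> False
```

Under this sketch, a crawler with stayOnDomain enabled would refuse any target for which `sameDomain startPage target` is False.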

setTimeOut :: Long -> Shpider ()

Set the CurlTimeout option. Requests will time out after this number of seconds.

setStartPage :: String -> Shpider ()

Set the start page of your shpidering antics. The start page must be an absolute URL; otherwise this will raise an error.

getStartPage :: Shpider String

Return the starting URL, as set by setStartPage.
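As a rough usage sketch, the options above might be combined as follows. The signatures of setStartPage, stayOnDomain, setTimeOut and getStartPage are taken from this page; the module name `Network.Shpider` and the runner `runShpider :: Shpider a -> IO a` are not documented in this section and are assumed here. The URL is a placeholder.

```haskell
-- Assumption: Network.Shpider exports the options on this page together
-- with a runner, runShpider :: Shpider a -> IO a.
import Network.Shpider

main :: IO ()
main = do
  start <- runShpider $ do
    setStartPage "http://example.com/"  -- must be an absolute URL
    stayOnDomain True                   -- refuse URLs on other domains
    setTimeOut 30                       -- requests time out after 30 seconds
    getStartPage                        -- read back the configured start URL
  putStrLn start
```

This sketch only configures the session and reads one option back; no page is downloaded.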

onlyDownloadHtml :: Bool -> Shpider ()

If onlyDownloadHtml is True, then during download shpider first makes a HEAD request to check whether the content type is text/html or application/xhtml+xml, and makes the GET request only if it is.
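The content-type test described above could look roughly like the following self-contained sketch. `isHtmlContentType` is a hypothetical helper, not part of shpider; it assumes the value of the Content-Type header is available as a String and ignores parameters such as charset.

```haskell
import Data.Char (isSpace, toLower)
import Data.List (dropWhileEnd)

-- | Hypothetical helper: does a Content-Type header value denote HTML or XHTML?
-- Parameters after ';' (for example "charset=UTF-8") are ignored, and the
-- comparison is case-insensitive.
isHtmlContentType :: String -> Bool
isHtmlContentType header =
  mediaType `elem` ["text/html", "application/xhtml+xml"]
  where
    mediaType = map toLower . trim . takeWhile (/= ';') $ header
    trim      = dropWhileEnd isSpace . dropWhile isSpace
```

In this picture the GET request would be issued only when the HEAD response's content type passes the check.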

setCurrentPage :: Page -> Shpider ()

Set the given page as the currentPage.

getCurrentPage :: Shpider Page

Return the current page.

keepTrack :: Shpider ()

When keepTrack is set, shpider will remember the pages which have been visited.

addCurlOpts :: [CurlOption] -> Shpider ()

Add CURL options to Shpider.

setCurlOpts :: [CurlOption] -> Shpider ()

Set Shpider's CURL options from scratch.
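A possible way to use the two functions together is sketched below. `CurlFollowLocation` and `CurlUserAgent` are constructors of `CurlOption` from the curl package's `Network.Curl`; the module name `Network.Shpider`, the name `configureCurl` and the user-agent string are assumptions made for illustration. Because setCurlOpts starts from scratch, any options shpider would otherwise have set are presumably discarded, so it is used first here.

```haskell
import Network.Curl (CurlOption (..))
-- Assumption: Network.Shpider exports setCurlOpts and addCurlOpts.
import Network.Shpider

-- | Hypothetical configuration action: replace the option list,
-- then add one more option to it.
configureCurl :: Shpider ()
configureCurl = do
  setCurlOpts [CurlFollowLocation True]          -- start from scratch
  addCurlOpts [CurlUserAgent "my-crawler/0.1"]   -- add to the current options
```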

setThrottle :: Maybe Int -> Shpider ()

Set download throttling, so that subsequent calls to download or sendForm block until at least N microseconds have passed between requests. Passing Nothing disables throttling.
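The throttling rule can be illustrated with a small, self-contained sketch. It is not shpider's implementation: `remainingDelay` and `waitIfNeeded` are hypothetical helpers, and the caller is assumed to know how many microseconds have elapsed since the previous request.

```haskell
import Control.Concurrent (threadDelay)

-- | Hypothetical helper: microseconds still to wait before the next request,
-- given the throttle setting and the time elapsed since the last request.
remainingDelay :: Maybe Int -> Int -> Int
remainingDelay Nothing  _       = 0                    -- throttling disabled
remainingDelay (Just n) elapsed = max 0 (n - elapsed)  -- never negative

-- | Hypothetical helper: block for whatever part of the interval is left.
-- threadDelay takes its argument in microseconds, the same unit as the throttle.
waitIfNeeded :: Maybe Int -> Int -> IO ()
waitIfNeeded throttle elapsed =
  let delay = remainingDelay throttle elapsed
  in if delay > 0 then threadDelay delay else return ()
```

With this unit, `setThrottle (Just 500000)` would correspond to at least half a second between requests, and `setThrottle Nothing` to no waiting at all.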