shpider-0.2

Network.Shpider.Pairs

PairsWriter
    The abstract type describing the monadic state of a list of pairs.

pairs
    Take a monadic PairsWriter and return a list of pairs.

(=:)
    Make a list of pairs of pairs like

        pairs $ do
            3 =: (" is my favourite number or ", 5)
            10 =: (" pints have I drunk or was it ", 11)

Network.Shpider.Code

ShpiderCode
    Constructors: UnsupportedCurlStatus, TimeOut, UnsupportedProtocol, NoHost, WrongData, OffSite, HttpError, InvalidURL, Ok
    ShpiderCode describes the various contingencies which may occur during a shpider transaction.

ccToSh
    Converts a CurlCode to a ShpiderCode.

Network.Shpider.TextUtils

attrLookup
    A case-insensitive lookup for HTML attributes.

trim
    Drops whitespace from the beginning and end of strings.

lowercase
    Turns a String lowercase.
    rant: In my humble opinion, and considering that a few different packages implement this meager code, this should be in the Prelude. /rant

escapeSpaces
    Encode spaces in a URL.

Network.Shpider.Forms

Form
    Fields: method, action, inputs
    Plain old form: Method, action and inputs.

Method
    Constructors: POST, GET
    Either GET or POST.

fillOutForm
    Takes a form and fills out the inputs with the given [(String, String)]. It is convenient to use the pairs syntax here.

        f : _ <- getFormsByAction "http://whatever.com"
        sendForm $ fillOutForm f $ pairs $ do
            "author" =: "Johnny"
            "message" =: "Nice syntax dewd."

mkForm
    The first argument is the action attribute of the form, the second is the method attribute, and the third is the inputs.

gatherForms
    Gets all forms from a list of tags.

allForms
    The TagParser which parses all forms.

toForm

Network.Shpider.Links

Link
    Fields: linkAddress, linkText
    Links have an address, corresponding to the href attribute, and some inner text.

gatherLinks
    Parse all links from a list of tags.

allLinks
    The parser responsible for getting all the links.

Network.Shpider.State

Page
    Fields: links, forms, tags, source, addr
    The Page datatype. Holds Links, Forms, the parsed [Tag], the page source, and the page's absolute URL.

Shpider
    The type of Shpider computations. A state transformer over ShpiderState and IO.

ShpiderState
    Constructor: SS
    Fields: htmlOnlyDownloads, startPage, dontLeaveDomain, curlOpts, currentPage, visited
    The shpider state holds all the options for shpider transactions, the current page and all the CurlOptions used when calling curl.

runShpiderSt
    Run a Shpider computation, returning the result with the state.

runShpider
    Run a Shpider computation, returning the result.

initialSt
    The initial shpider state. Currently, CurlTimeout is hard-wired to 3, and cookies are saved in a file called cookies.

emptyPage
    An empty page, containing no information.

Network.Shpider.URL

isSameDomain
    Is the second URL on the same domain as the first? Note: this will return False if either URL is invalid.

mkAbsoluteUrl
    Assumes the given URL is relative to the currentPage.

isAbsoluteUrl
    True if the URL is absolute.

isMailto
    Is the given string of the form "mailto:person.com"?

isHttp
    Is the URL an HTTP URL?

getDomain
    Get the protocol and domain from a URL, e.g.

        getDomain "widdle://owqueer.co.uk/strangeanticsofsailors/jimmy"
        -- "widdle://owqueer.co.uk"

getFolder
    Get the whole URL up to and including the current folder of the present document.

        getFolder "widdle://owqueer.co.uk/strangeanticsofsailors/jimmy"
        -- "widdle://owqueer.co.uk/strangeanticsofsailors/"

Network.Shpider.Options

stayOnDomain
    Setting this to True will forbid you to download and sendForm to any site which isn't on the domain shared by the URL given in setStartPage.

setTimeOut
    Set the CurlTimeout option. Requests will TimeOut after this number of seconds.

setStartPage
    Set the start page of your shpidering antics. The start page must be an absolute URL; if not, this will raise an error.

getStartPage
    Return the starting URL, as set by setStartPage.

onlyDownloadHtml
    If onlyDownloadHtml is True, then during download, shpider will make a HEAD request to see if the content type is text/html or application/xhtml+xml, and only if it is, then it will make a GET request.

setCurrentPage
    Set the given page as the currentPage.

getCurrentPage
    Return the current page.

keepTrack
    When keepTrack is set, shpider will remember the pages which have been visited.

Network.Shpider

haveVisited
    If keepTrack has been set, then haveVisited will return True if the given URL has been visited.

parsePage
    Parse a given URL and source HTML into the Page datatype. This will set the current page.

download
    Fetch whatever is at this address, and attempt to parse the content into a Page. Return the status code with the parsed content.

withAuthorizedDomain
    withAuthorizedDomain will execute the function if the URL given is an authorized domain. See isAuthorizedDomain.

sendForm
    Send a form to the URL specified in its action attribute.

currentLinks
    Return the links on the current page.

currentForms
    Return the forms on the current page.

getLinksByText
    Get all links which match this text.

isAuthorizedDomain
    If stayOnDomain has been set to True, then isAuthorizedDomain returns True if the given URL is on the domain and False otherwise. If stayOnDomain has not been set to True, then it returns True.

getLinksByTextRegex
    Get all links whose text matches this regex.

getFormsByAction
    Get all forms whose action matches the given action.

getFormsHasAction

getLinksByAddressRegex
    Get all links whose address matches this regex.
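The pairs syntax used with fillOutForm above can be pictured as a small writer monad that accumulates key/value tuples. The following is a minimal sketch under that assumption, not shpider's actual implementation: the definitions of PairsWriter, (=:) and pairs below are illustrative stand-ins built on Control.Monad.Writer from mtl, and the real library may define them differently.

```haskell
import Control.Monad.Writer (Writer, execWriter, tell)

-- Assumed representation: a Writer that accumulates a list of pairs.
type PairsWriter a b = Writer [(a, b)] ()

-- Record a single key/value pair.
(=:) :: a -> b -> PairsWriter a b
key =: value = tell [(key, value)]

-- Run the writer and return the accumulated pairs in order.
pairs :: PairsWriter a b -> [(a, b)]
pairs = execWriter

main :: IO ()
main = print . pairs $ do
  "author" =: "Johnny"
  "message" =: "Nice syntax dewd."
```

A list built this way has the [(String, String)] shape that fillOutForm is documented to take as its input values.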
Other names referenced

Further shpider-0.2 names: attrLookup, trim, lowercase, escapeSpaces, initialSt

Packages: curl-1.3.7, tagsoup-parsec-0.0.7, tagsoup-0.12.2, ghc-prim, base, transformers-0.2.2.0, mtl-2.0.1.0, url-2.1.2

Modules: Network.Curl.Code, Text.HTML.TagSoup.Parsec, Text.HTML.TagSoup.Type, GHC.Types, Network.Curl.Opts, GHC.Base, Control.Monad.Fix, Control.Monad, Data.Function, Control.Monad.Trans.Class, Control.Monad.IO.Class, Control.Monad.State.Class, Control.Monad.Trans.State.Lazy, Network.URL, GHC.Bool

Names, in the order they appear:

    CurlCode, toCode, CurlOK, CurlUnspportedProtocol, CurlFailedInit, CurlUrlMalformat, CurlUrlMalformatUser, CurlCouldntResolveProxy, CurlCouldntResolveHost, CurlCouldntConnect, CurlFtpWeirdServerReply, CurlFtpAccessDenied, CurlFtpUserPasswordIncorrect, CurlFtpWeirdPassReply, CurlFtpWeirdUserReply, CurlFtpWeirdPASVReply, CurlFtpWeird227Format, CurlFtpCantGetHost, CurlFtpCantReconnect, CurlFtpCouldnSetBinary, CurlPartialFile, CurlFtpCouldntRetrFile, CurlFtpWriteError, CurlFtpQuoteError, CurlHttpReturnedError, CurlWriteError, CurlMalformatError, CurlFtpCouldnStorFile, CurlReadError, CurlOutOfMemory, CurlOperationTimeout, CurlFtpCouldntSetAscii, CurlFtpPortFailed, CurlFtpCouldntUseRest, CurlFtpCouldntGetSize, CurlHttpRangeError, CurlHttpPostError, CurlSSLConnectError, CurlBadDownloadResume, CurlFileCouldntReadFile, CurlLDAPCannotBind, CurlLDPAPSearchFailed, CurlLibraryNotFound, CurlFunctionNotFound, CurlAbortedByCallback, CurlBadFunctionArgument, CurlBadCallingOrder, CurlInterfaceFailed, CurlBadPasswordEntered, CurlTooManyRedirects, CurlUnknownTelnetOption, CurlTelnetOptionSyntax, CurlObsolete, CurlSSLPeerCertificate, CurlGotNothing, CurlSSLEngineNotFound, CurlSSLEngineSetFailed, CurlSendError, CurlRecvError, CurlShareInUse, CurlSSLCertProblem, CurlSSLCipher, CurlSSLCACert, CurlBadContentEncoding, CurlLDAPInvalidUrl, CurlFilesizeExceeded, CurlFtpSSLFailed, CurlSendFailRewind, CurlSSLEngineInitFailed, CurlLoginDenied, CurlTFtpNotFound, CurlTFtpPerm, CurlTFtpDiskFull, CurlTFtpIllegal, CurlTFtpUnknownId, CurlTFtpExists, CurlTFtpNoSuchUser, CurlConvFailed, CurlConvReqd, CurlSSLCACertBadFile, CurlRemoveFileNotFound, CurlSSH, CurlSSLShutdownFailed, CurlAgain, CurlSSLCRLBadFile, CurlSSLIssuerError

    TagParser, Tag, IO, CurlOption

    fail, >>=, >>, return, mfix, Monad, Functor, MonadFix, MonadPlus, fix, mfilter, ap, liftM5, liftM4, liftM3, liftM2, liftM, unless, when, replicateM_, replicateM, foldM_, foldM, zipWithM_, zipWithM, mapAndUnzipM, join, void, forever, <=<, >=>, msum, forM_, forM, filterM, guard, mapM_, mapM, sequence_, sequence, =<<, mplus, mzero, fmap

    MonadTrans, lift, MonadIO, liftIO, gets, modify, put, get, MonadState, StateT, runStateT, State, state, runState, evalState, execState, mapState, withState, evalStateT, execStateT, mapStateT, withStateT

    ok_url, ok_path, ok_param, ok_host, decString, encString, exportParams, exportURL, exportHost, importParams, importURL, add_param, secure, secure_prot, port, host, protocol, Host, HTTP, FTP, RawProt, Protocol, Absolute, HostRelative, PathRelative, URLType, url_params, url_path, url_type, URL

    True
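Shpider computations are described above as a state transformer, and StateT, gets and modify appear among the referenced names. As a rough illustration of how keepTrack and haveVisited could work on top of StateT over IO, here is a minimal sketch. CrawlState, Crawl, markVisited and runCrawl are hypothetical names invented for this example, and the real ShpiderState holds more fields (options, current page, curl options) than shown here.

```haskell
import Control.Monad.State (StateT, evalStateT, gets, modify)

-- Hypothetical, simplified stand-in for ShpiderState.
data CrawlState = CrawlState
  { keepTrackOn :: Bool      -- has keepTrack been set?
  , visitedUrls :: [String]  -- URLs remembered so far
  }

-- Assumed shape of the computation type: state over IO.
type Crawl = StateT CrawlState IO

-- Turn on tracking of visited pages.
keepTrack :: Crawl ()
keepTrack = modify (\st -> st { keepTrackOn = True })

-- Remember a URL, but only when tracking is enabled.
markVisited :: String -> Crawl ()
markVisited url = modify $ \st ->
  if keepTrackOn st
    then st { visitedUrls = url : visitedUrls st }
    else st

-- True if the URL has been remembered as visited.
haveVisited :: String -> Crawl Bool
haveVisited url = gets (elem url . visitedUrls)

-- Run a computation from an empty initial state.
runCrawl :: Crawl a -> IO a
runCrawl c = evalStateT c (CrawlState False [])

main :: IO ()
main = do
  seen <- runCrawl $ do
    keepTrack
    markVisited "http://example.com/"
    a <- haveVisited "http://example.com/"
    b <- haveVisited "http://example.com/other"
    return (a, b)
  print seen
```

In this sketch, a page visited before keepTrack is called is never recorded, which mirrors the documented condition that haveVisited only reports visits when keepTrack has been set.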