úÎ3Ž,ĞR      !"#$%&'()*+,-./0123456789: ; < = > ? @ A B C D E F G H I J K L M N O P Q Safe SafeRSRSRSSafe SafeTUVWXYTUVWXTUVWXYSafe BAdds a prefix to a relative crawl action to get an absolute one. |      Safe   Safe ˙A crawl directive takes a content of a web page and produces crawl actions for links/forms to follow. The general idea is to specify a list of operations that in theory produces a dynamically collected tree of requests which leaves are either dead ends or end results.òAdditional, logical branching/combination of Directives is possible with: * Alternatives - evaluate both Directives in order. * Restart - evaluate completely new initial action & chain if the previous combo does not produce end results..access content to find absolute follow-up urls0as simple, but found relative urls are completed -as simple, but with access to complete result!(wait additional seconds before executing"5if given directive yields no results use add. retries#6fallback to second argument if first yields no results$=the possibility to start a new chain (when using alternative)%1not crawling anything, just a blacklisting option&chaining of directives  !"#$%&  !"#$%&  !"#$%&  !"#$%&NoneZ=Processes one step of a crawl chain: does the actual loading.[tUsed for preparation of integration tests: additionally stores the crawl result using the given file name strategy.\]Z[^_\]Z[\]Z[^_None`tMake a unique name for a crawl action - prefix is used to specify the target folder including a specific test prefix abc'()*`+d,'()*+'(()*+abc'()*`+d,Safe/name of crawl run0starting point17list of operations sequentially on all previous results2Kstore the content of a single result (the first) of the last operation step3Hstore the url of a single result (the first) of the last operation step-./0123-./0123-./0123-./0123Safe456789457689456789456789NoneeLazily evaluate each action in the sequence from left to right, and collect the results. PS: also playing around with an additional concat before returning :fgheijklmnop !"#$%&4678:mp :fgheijklmnopNoneqrqrqr None=XReturns only the first result of a completely matching branch of the crawling directive.>iReturns all possible results of the craling directive - meant to be used with lazyness in mind as needed.;<=>stu:;<=>=>;<:;<=>stu Safe?@ABCvDEFGHIJKLwMxyNOz{|}~€‚?A@BCDEFGHIJKLMNOHIJKLGNMO?@ADCEBF?@ABCvDEFGHIJKLwMxyNOz{|}~€‚ SafePQƒ„…†PQPQPQƒ„…†‡ !"#$%&&'()*+,-./0123456789:;<==>?@ABCCDEFGH I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b b c d e fghijklmnnopqrstuvwxyz{|}~  €  ‚ ƒ „ … † ‡ ˆ ‰ Š ‹ Œ  Ž   ‘ ’“)crawlchain-0.2.0.0-LoJ1GllmJM52r82MtXSminNetwork.URI.UtilNetwork.CrawlChain.StoringNetwork.CrawlChain.CrawlActionNetwork.CrawlChain.CrawlResult!Network.CrawlChain.CrawlDirective"Network.CrawlChain.CrawlingContext%Network.CrawlChain.CrawlingParameters'Network.CrawlChain.DirectiveChainResultNetwork.CrawlChain.CrawlChain"Text.HTML.CrawlChain.HtmlFiltering!Network.CrawlChain.BasicTemplatesNetwork.CrawlChain.UtilNetwork.CrawlChain.ReportNetwork.CrawlChain.CrawlingNetwork.CrawlChain.CrawlChainsNetwork.CrawlChain.DownloadingtoURI buildCurlCmdbuildAndCreateTargetDir CrawlAction GetRequest PostRequest PostParamsPostType UndefinedPostFormPostAJAXcrawlUrl addUrlPrefix$fShowPostType $fEqPostType$fShowCrawlAction$fEqCrawlActionCrawlingResultStatus CrawlingOkCrawlingRedirectCrawlingFailed CrawlResultcrawlingActioncrawlingContentcrawlingResultStatus$fShowCrawlingResultStatus$fEqCrawlingResultStatus$fShowCrawlResultCrawlDirectiveSimpleDirectiveRelativeDirectiveFollowUpDirectiveDelayDirectiveRetryDirectiveAlternativeDirectiveRestartChainDirectiveGuardDirectiveDirectiveSequenceCrawlingContextcrawlerdefaultContextstoringContextreadingContext'$fCrawlingContextDefaultCrawlingContextCrawlingParameters paramNameparamInitialActionparamCrawlDirectiveparamDoDownload paramDoStoreDirectiveChainResult resultHistory lastResultshowResultPathextractFirstResultexecuteCrawlChainexecuteActions crawlForUrl crawlChain crawlChainsMethodPOSTGETContainedTextFilter AttrFilter noUrlFilter noAttrFilter noTextFilter unevaluated extractLinksextractLinksMatchingextractLinksWithAttributesextractLinksFilteringUrlAttrsextractLinksFilteringAllfindFirstLinkAfterfindAllUrlsEndingWithextractFirstFormsearchWebTemplatesearchWebTemplateAndProcessHitslogMsg delaySecondsReport reportMsg reportDetailsshowFullReport $fShowReportcrawl crawlAndStoreCrawlActionDescriberCrawler crawlInternal ajaxRequestbufferingFilenameDefaultCrawlingContextcrawlImplementation readFromFileslazyIOsequencefollowDirectivefollowDirectiveSequence wrapResults errReportokReportreportcrawlWasNoSuccess>>+makeAbsoluteLogicMappercombineAbsoluteUrlscombineAbsoluteUrl downloadTostoreDownloadAction downloadSteplogAndReturnFirstOkputDetailsOnFailureTagSextractLinksFilteringgetSrc getTagAttrs isFormStart isFormClose isFormTag isFormStartOf tagAttributes extractFormextractFormParamfindExtraParamssplitOneOfRetainingNonEmptysearchWebActionfilterToUrlsContainingAllOffilterToUrlsContainingTextretainActionsContaining