http-conduit-downloader-1.0.16: HTTP downloader tailored for web-crawler needs.

Safe Haskell: None

Network.HTTP.Conduit.Downloader

Download operations

urlGetContents :: String -> IO ByteString

Download a single URL with the default DownloaderSettings. Fails if the result is not DROK.

urlGetContentsPost :: String -> ByteString -> IO ByteString

Post data and download a single URL with the default DownloaderSettings. Fails if the result is not DROK.
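A minimal usage sketch for the two one-shot functions. It assumes that "fails" means an exception raised in IO and that ByteString here is the strict variant; `safely`, `summarize` and the example.com URLs are hypothetical and not part of this module.

```haskell
import Control.Exception (SomeException, try)
import qualified Data.ByteString.Char8 as B
import Network.HTTP.Conduit.Downloader (urlGetContents, urlGetContentsPost)

-- | Run a download action, converting any exception into a Left.
-- (Hypothetical helper; assumes failure is reported as an exception.)
safely :: IO B.ByteString -> IO (Either String B.ByteString)
safely act = do
    r <- try act :: IO (Either SomeException B.ByteString)
    return (either (Left . show) Right r)

-- | Render the outcome for logging (hypothetical helper).
summarize :: Either String B.ByteString -> String
summarize (Left err)   = "download failed: " ++ err
summarize (Right body) = "got " ++ show (B.length body) ++ " bytes"

main :: IO ()
main = do
    -- Plain GET with the default settings.
    safely (urlGetContents "http://example.com/") >>= putStrLn . summarize
    -- POST a small form body, then read the response.
    safely (urlGetContentsPost "http://example.com/form" (B.pack "q=haskell"))
        >>= putStrLn . summarize
```

Catching SomeException keeps the sketch short; a real crawler would probably want to catch something narrower.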

download

Arguments

:: Downloader
-> String             -- URL
-> Maybe HostAddress  -- Optional resolved HostAddress
-> DownloadOptions
-> IO DownloadResult

Perform a download.
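A short sketch of calling download and handling every DownloadResult constructor. The `describe` helper and the URL are hypothetical; passing Nothing and [] simply means no pre-resolved address and no conditional headers are supplied.

```haskell
import qualified Data.ByteString.Char8 as B
import Network.HTTP.Conduit.Downloader

-- | Turn a result into a log line (hypothetical helper).
describe :: DownloadResult -> String
describe (DROK body _)    = "ok, " ++ show (B.length body) ++ " bytes"
describe (DRRedirect url) = "redirect to " ++ url
describe (DRError err)    = "error: " ++ err
describe DRNotModified    = "not modified"

main :: IO ()
main = withDownloader $ \d -> do
    -- Nothing: no pre-resolved HostAddress; []: no DownloadOptions yet.
    r <- download d "http://example.com/" Nothing []
    putStrLn (describe r)
```

Unlike urlGetContents, this form leaves the decision about redirects and errors to the caller.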

downloadG

Arguments

:: (Request -> ResourceT IO Request)  -- Function to modify Request (e.g. sign or make postRequest)
-> Downloader
-> String                             -- URL
-> Maybe HostAddress                  -- Optional resolved HostAddress
-> DownloadOptions
-> IO DownloadResult

Generic version of download with the ability to modify the http-conduit Request.
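One way the Request modifier might be used is to turn the request into a POST with postRequest. This is a sketch under assumptions: the body is a strict ByteString, and `redirectTarget` and the URL are hypothetical.

```haskell
import qualified Data.ByteString.Char8 as B
import Network.HTTP.Conduit.Downloader

-- | Extract the target of a redirect, if any (hypothetical helper).
redirectTarget :: DownloadResult -> Maybe String
redirectTarget (DRRedirect url) = Just url
redirectTarget _                = Nothing

main :: IO ()
main = withDownloader $ \d -> do
    let body = B.pack "q=haskell"
    -- postRequest is pure, so it is lifted into the modifier's monad
    -- with return.
    r <- downloadG (return . postRequest body) d
                   "http://example.com/search" Nothing []
    case redirectTarget r of
        Just url -> putStrLn ("server redirected to " ++ url)
        Nothing  -> putStrLn "no redirect"
```

Because the modifier runs in ResourceT IO, it could also perform effects such as computing a request signature.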

data DownloadResult

Result of download operation.

Constructors

DROK ByteString DownloadOptions

Successful download with data and options for next download.

DRRedirect String

Redirect URL

DRError String

Download failed; the String describes the error.

DRNotModified

HTTP 304 Not Modified

type DownloadOptions = [String]

If-None-Match and/or If-Modified-Since headers.
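The options returned in DROK can be fed into the next download of the same URL, so that an unchanged resource comes back as DRNotModified. A sketch of that round trip follows; `nextOptions` and the URL are hypothetical, and the strings inside DownloadOptions are treated as opaque.

```haskell
import qualified Data.ByteString.Char8 as B
import Network.HTTP.Conduit.Downloader

-- | Options to send on the next request: take the fresh ones from a
-- successful download, otherwise keep what we already had.
-- (Hypothetical helper.)
nextOptions :: DownloadOptions -> DownloadResult -> DownloadOptions
nextOptions _   (DROK _ opts) = opts
nextOptions old _             = old

main :: IO ()
main = withDownloader $ \d -> do
    let url = "http://example.com/feed.xml"
    -- First fetch: nothing cached, so no conditional headers.
    first  <- download d url Nothing []
    -- Second fetch: reuse whatever the first response told us to send.
    second <- download d url Nothing (nextOptions [] first)
    case second of
        DRNotModified -> putStrLn "unchanged since the first fetch"
        DROK body _   -> putStrLn ("changed, " ++ show (B.length body) ++ " bytes")
        _             -> putStrLn "redirect or error"
```

A crawler would normally persist the options next to the URL between runs instead of holding them in memory.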

Downloader

data DownloaderSettings

Settings used in downloader.

Constructors

DownloaderSettings 

Fields

dsUserAgent :: ByteString

User agent string. Default: "Mozilla/5.0 (compatible; HttpConduitDownloader/1.0; +http://hackage.haskell.org/package/http-conduit-downloader)".

Be a good crawler. Provide your User-Agent please.

dsTimeout :: Int

Download timeout. Default: 30 seconds.

dsManagerSettings :: ManagerSettings

Conduit Manager settings. Default: ManagerSettings with SSL certificate checks removed.

dsMaxDownloadSize :: Int

Download size limit. Default: 10MB.

data Downloader

Keeps http-conduit Manager and DownloaderSettings.

withDownloader :: (Downloader -> IO a) -> IO a

Create a new Downloader, use it in the provided function, and then release it.

withDownloaderSettings :: DownloaderSettings -> (Downloader -> IO a) -> IO a

Create a new Downloader with provided settings, use it in the provided function, and then release it.
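A sketch of overriding a few settings. It assumes that DownloaderSettings has a Default instance (from the data-default package) supplying the defaults listed above, which this page does not show, and that dsMaxDownloadSize is measured in bytes. The user agent string, URL and `crawlerSettings` name are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Default (def)
import Network.HTTP.Conduit.Downloader

-- | Defaults with a custom user agent, a shorter timeout and a smaller
-- size limit. Assumes a Default instance for DownloaderSettings.
crawlerSettings :: DownloaderSettings
crawlerSettings = def
    { dsUserAgent       = "ExampleBot/0.1 (+http://example.com/bot)"
    , dsTimeout         = 10            -- seconds
    , dsMaxDownloadSize = 1024 * 1024   -- assumed to be bytes (1 MB)
    }

main :: IO ()
main = withDownloaderSettings crawlerSettings $ \d -> do
    r <- download d "http://example.com/" Nothing []
    case r of
        DROK _ _ -> putStrLn "downloaded"
        _        -> putStrLn "not downloaded"
```

If no such instance is available, the record would have to be built in full with the DownloaderSettings constructor.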

Utils

postRequest :: ByteString -> Request -> Request

Make HTTP POST request.

sinkByteString :: MonadIO m => Int -> Sink ByteString m (Maybe ByteString)

Sinks data using 32k buffers to reduce memory fragmentation. Returns Nothing if too much data was downloaded.
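The size-capped behaviour can be modelled with a pure function over a list of chunks. This is only an illustrative model, not the library's implementation: it ignores the 32k buffering, and it assumes that Nothing is returned once the running total strictly exceeds the limit. The name `collectLimited` is hypothetical.

```haskell
import qualified Data.ByteString.Char8 as B

-- | Pure model of a size-capped sink: concatenate chunks, giving up with
-- Nothing as soon as the running total exceeds the limit.
collectLimited :: Int -> [B.ByteString] -> Maybe B.ByteString
collectLimited limit = go 0 []
  where
    -- All chunks consumed within the limit: join them in arrival order.
    go _ acc [] = Just (B.concat (reverse acc))
    go n acc (c:cs)
        | n' > limit = Nothing               -- too much data, stop early
        | otherwise  = go n' (c : acc) cs
      where
        n' = n + B.length c

main :: IO ()
main = do
    let chunks = [B.pack "abc", B.pack "def"]
    print (collectLimited 5 chunks)
    print (collectLimited 6 chunks)
```

Stopping at the first chunk that crosses the limit means an oversized response is never fully held in memory.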