http-streams-0.7.2.2: An HTTP client using io-streams

MaintainerAndrew Cowie
StabilityExperimental
Safe HaskellNone
LanguageHaskell2010

Network.Http.Client

Contents

Description

Overview

A simple HTTP client library, using the Snap Framework's io-streams library to handle the streaming I/O. The http-streams API is designed for ease of use when querying web services and dealing with the result.

Given:

{-# LANGUAGE OverloadedStrings #-}

import System.IO.Streams (InputStream, OutputStream, stdout)
import qualified System.IO.Streams as Streams
import qualified Data.ByteString as S

and this library:

import Network.Http.Client

the underlying API is straight-forward. In particular, constructing the Request to send is quick and to the point:

main :: IO ()
main = do
    c <- openConnection "www.example.com" 80

     q <- buildRequest $ do
        http GET "/"
        setAccept "text/html"

     sendRequest c q emptyBody

     receiveResponse c (\p i -> do
        xm <- Streams.read i
        case xm of
            Just x    -> S.putStr x
            Nothing   -> "")

     closeConnection c

which would print the first chunk of the response back from the server. Obviously in real usage you'll do something more interesting with the Response in the handler function, and consume the entire response body from the InputStream ByteString.

Because this is all happening in IO (the defining feature of io-streams!), you can ensure resource cleanup on normal or abnormal termination by using Control.Exception's standard bracket function; see closeConnection for an example. For the common case we have a utility function which wraps bracket for you:

foo :: IO ByteString
foo = withConnection (openConnection "www.example.com" 80) doStuff

 doStuff :: Connection -> IO ByteString

There are also a set of convenience APIs that do just that, along with the tedious bits like parsing URLs. For example, to do an HTTP GET and stream the response body to stdout, you can simply do:

    get "http://www.example.com/file.txt" (\p i -> Streams.connect i stdout)

which on the one hand is "easy" while on the other exposes the the Response and InputStream for you to read from. Of course, messing around with URLs is all a bit inefficient, so if you already have e.g. hostname and path, or if you need more control over the request being created, then the underlying http-streams API is simple enough to use directly.

Synopsis

Connecting to server

type Port = Word16

data Connection Source

A connection to a web server.

Instances

openConnection :: Hostname -> Port -> IO Connection Source

In order to make a request you first establish the TCP connection to the server over which to send it.

Ordinarily you would supply the host part of the URL here and it will be used as the value of the HTTP 1.1 Host: field. However, you can specify any server name or IP addresss and set the Host: value later with setHostname when building the request.

Usage is as follows:

    c <- openConnection "localhost" 80
    ...
    closeConnection c

More likely, you'll use withConnection to wrap the call in order to ensure finalization.

HTTP pipelining is supported; you can reuse the connection to a web server, but it's up to you to ensure you match the number of requests sent to the number of responses read, and to process those responses in order. This is all assuming that the server supports pipelining; be warned that not all do. Web browsers go to extraordinary lengths to probe this; you probably only want to do pipelining under controlled conditions. Otherwise just open a new connection for subsequent requests.

Building Requests

You setup a request using the RequestBuilder monad, and get the resultant Request object by running buildRequest. The first call doesn't have to be to http, but it looks better when it is, don't you think?

Sending HTTP request

data Response :: *

Instances

Show Response 
HttpType Response 

getHostname :: Connection -> Request -> ByteString Source

Get the virtual hostname that will be used as the Host: header in the HTTP 1.1 request. Per RFC 2616 § 14.23, this will be of the form hostname:port if the port number is other than the default, ie 80 for HTTP.

sendRequest :: Connection -> Request -> (OutputStream Builder -> IO α) -> IO α Source

Having composed a Request object with the headers and metadata for this connection, you can now send the request to the server, along with the entity body, if there is one. For the rather common case of HTTP requests like GET that don't send data, use emptyBody as the output stream:

    sendRequest c q emptyBody

For PUT and POST requests, you can use fileBody or inputStreamBody to send content to the server, or you can work with the io-streams API directly:

    sendRequest c q (\o ->
        Streams.write (Just (Builder.fromString "Hello World\n")) o)

emptyBody :: OutputStream Builder -> IO () Source

Use this for the common case of the HTTP methods that only send headers and which have no entity body, i.e. GET requests.

fileBody :: FilePath -> OutputStream Builder -> IO () Source

Specify a local file to be sent to the server as the body of the request.

You use this partially applied:

    sendRequest c q (fileBody "/etc/passwd")

Note that the type of (fileBody "/path/to/file") is just what you need for the third argument to sendRequest, namely

>>> :t filePath "hello.txt"
:: OutputStream Builder -> IO ()

inputStreamBody :: InputStream ByteString -> OutputStream Builder -> IO () Source

Read from a pre-existing InputStream and pipe that through to the connection to the server. This is useful in the general case where something else has handed you stream to read from and you want to use it as the entity body for the request.

You use this partially applied:

    i <- getStreamFromVault                    -- magic, clearly
    sendRequest c q (inputStreamBody i)

This function maps "Builder.fromByteString" over the input, which will be efficient if the ByteString chunks are large.

encodedFormBody :: [(ByteString, ByteString)] -> OutputStream Builder -> IO () Source

Specify name/value pairs to be sent to the server in the manner used by web browsers when submitting a form via a POST request. Parameters will be URL encoded per RFC 2396 and combined into a single string which will be sent as the body of your request.

You use this partially applied:

    let nvs = [("name","Kermit"),
               ("type","frog")]
               ("role","stagehand")]

    sendRequest c q (encodedFormBody nvs)

Note that it's going to be up to you to call setContentType with a value of "application/x-www-form-urlencoded" when building the Request object; the postForm convenience (which uses this encodedFormBody function) takes care of this for you, obviously.

Processing HTTP response

receiveResponse :: Connection -> (Response -> InputStream ByteString -> IO β) -> IO β Source

Handle the response coming back from the server. This function hands control to a handler function you supply, passing you the Response object with the response headers and an InputStream containing the entity body.

For example, if you just wanted to print the first chunk of the content from the server:

    receiveResponse c (\p i -> do
        m <- Streams.read i
        case m of
            Just bytes -> putStr bytes
            Nothing    -> return ())

Obviously, you can do more sophisticated things with the InputStream, which is the whole point of having an io-streams based HTTP client library.

The final value from the handler function is the return value of receiveResponse, if you need it.

Throws UnexpectedCompression if it doesn't know how to handle the compression format used in the response.

receiveResponseRaw :: Connection -> (Response -> InputStream ByteString -> IO β) -> IO β Source

This is a specialized variant of receiveResponse that explicitly does not handle the content encoding of the response body stream (it will not decompress anything). Unless you really want the raw gzipped content coming down from the server, use receiveResponse.

debugHandler :: Response -> InputStream ByteString -> IO () Source

Print the response headers and response body to stdout. You can use this with receiveResponse or one of the convenience functions when testing. For example, doing:

    c <- openConnection "kernel.operationaldynamics.com" 58080

    q <- buildRequest $ do
        http GET "/time"

    sendRequest c q emptyBody

    receiveResponse c debugHandler

would print out:

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: text/plain
Vary: Accept-Encoding
Server: Snap/0.9.2.4
Content-Encoding: gzip
Date: Mon, 21 Jan 2013 06:13:37 GMT

Mon 21 Jan 13, 06:13:37.303Z

or thereabouts.

concatHandler :: Response -> InputStream ByteString -> IO ByteString Source

Sometimes you just want the entire response body as a single blob. This function concatonates all the bytes from the response into a ByteString. If using the main http-streams API, you would use it as follows:

   ...
   x' <- receiveResponse c concatHandler
   ...

The methods in the convenience API all take a function to handle the response; this function is passed directly to the receiveResponse call underlying the request. Thus this utility function can be used for get as well:

   x' <- get "http://www.example.com/document.txt" concatHandler

Either way, the usual caveats about allocating a single object from streaming I/O apply: do not use this if you are not absolutely certain that the response body will fit in a reasonable amount of memory.

Note that this function makes no discrimination based on the response's HTTP status code. You're almost certainly better off writing your own handler function.

concatHandler' :: Response -> InputStream ByteString -> IO ByteString Source

A special case of concatHandler, this function will return the entire response body as a single ByteString, but will throw HttpClientError if the response status code was other than 2xx.

jsonHandler :: FromJSON α => Response -> InputStream ByteString -> IO α Source

If you're working with a data stream that is in application/json, then chances are you're using aeson to handle the JSON to Haskell decoding. If so, then this helper function might be of use.

    v <- get "http://api.example.com/v1/" jsonHandler

This function feeds the input body to the json' attoparsec Parser in order to get the aeson Value type. This is then marshalled to your type represeting the source data, via the FromJSON typeclass.

The above example was actually insufficient; when working with aeson you need to fix the type so it knows what FromJSON instance to use. Let's say you're getting Person objects, then it would be

    v <- get "http://api.example.com/v1/person/461" jsonHandler :: IO Person

assuming your Person type had a FromJSON instance, of course.

Note

This function parses a single top level JSON object or array, which is all you're supposed to get if it's a valid document. People do all kinds of crazy things though, so beware. Also, this function (like the "concatHander" convenience) loads the entire response into memory; it's not streaming; if you're receiving a document which is (say) a very long array of objects then you may want to implement your own handler function, perhaps using "Streams.parserToInputStream" and the Parser combinators directly — with a result type of InputStream Value, perhaps — by which you could then iterate over the Values one at a time in constant space.

Resource cleanup

closeConnection :: Connection -> IO () Source

Shutdown the connection. You need to call this release the underlying socket file descriptor and related network resources. To do so reliably, use this in conjunction with openConnection in a call to bracket:

--
-- Make connection, cleaning up afterward
--

foo :: IO ByteString
foo = bracket
   (openConnection "localhost" 80)
   (closeConnection)
   (doStuff)

--
-- Actually use Connection to send Request and receive Response
--

doStuff :: Connection -> IO ByteString

or, just use withConnection.

While returning a ByteString is probably the most common use case, you could conceivably do more processing of the response in doStuff and have it and foo return a different type.

withConnection :: IO Connection -> (Connection -> IO γ) -> IO γ Source

Given an IO action producing a Connection, and a computation that needs one, runs the computation, cleaning up the Connection afterwards.

    x <- withConnection (openConnection "s3.example.com" 80) $ (\c -> do
        q <- buildRequest $ do
            http GET "/bucket42/object/149"
        sendRequest c q emptyBody
        ...
        return "blah")

which can make the code making an HTTP request a lot more straight-forward.

Wraps Control.Exception's bracket.

Convenience APIs

Some simple functions for making requests with useful defaults. There's no head function for the usual reason of needing to avoid collision with Prelude.

These convenience functions work with http and https, but note that if you retrieve an https URL, you must wrap your main function with withOpenSSL to initialize the native openssl library code.

get Source

Arguments

:: URL

Resource to GET from.

-> (Response -> InputStream ByteString -> IO β)

Handler function to receive the response from the server.

-> IO β 

Issue an HTTP GET request and pass the resultant response to the supplied handler function. This code will silently follow redirects, to a maximum depth of 5 hops.

The handler function is as for receiveResponse, so you can use one of the supplied convenience handlers if you're in a hurry:

    x' <- get "http://www.bbc.co.uk/news/" concatHandler

But as ever the disadvantage of doing this is that you're not doing anything intelligent with the HTTP response status code. If you want an exception raised in the event of a non 2xx response, you can use:

    x' <- get "http://www.bbc.co.uk/news/" concatHandler'

but for anything more refined you'll find it easy to simply write your own handler function.

Throws TooManyRedirects if more than 5 redirects are thrown.

post Source

Arguments

:: URL

Resource to POST to.

-> ContentType

MIME type of the request body being sent.

-> (OutputStream Builder -> IO α)

Handler function to write content to server.

-> (Response -> InputStream ByteString -> IO β)

Handler function to receive the response from the server.

-> IO β 

Send content to a server via an HTTP POST request. Use this function if you have an OutputStream with the body content.

postForm Source

Arguments

:: URL

Resource to POST to.

-> [(ByteString, ByteString)]

List of name=value pairs. Will be sent URL-encoded.

-> (Response -> InputStream ByteString -> IO β)

Handler function to receive the response from the server.

-> IO β 

Send form data to a server via an HTTP POST request. This is the usual use case; most services expect the body to be MIME type application/x-www-form-urlencoded as this is what conventional web browsers send on form submission. If you want to POST to a URL with an arbitrary Content-Type, use post.

put Source

Arguments

:: URL

Resource to PUT to.

-> ContentType

MIME type of the request body being sent.

-> (OutputStream Builder -> IO α)

Handler function to write content to server.

-> (Response -> InputStream ByteString -> IO β)

Handler function to receive the response from the server.

-> IO β 

Place content on the server at the given URL via an HTTP PUT request, specifying the content type and a function to write the content to the supplied OutputStream. You might see:

    put "http://s3.example.com/bucket42/object149" "text/plain"
        (fileBody "hello.txt") (\p i -> do
            putStr $ show p
            Streams.connect i stdout)

Secure connections

openConnectionSSL :: SSLContext -> Hostname -> Port -> IO Connection Source

Open a secure connection to a web server.

import OpenSSL (withOpenSSL)

main :: IO ()
main = do
    ctx <- baselineContextSSL
    c <- openConnectionSSL ctx "api.github.com" 443
    ...
    closeConnection c

If you want to tune the parameters used in making SSL connections, manually specify certificates, etc, then setup your own context:

import OpenSSL.Session (SSLContext)
import qualified OpenSSL.Session as SSL

    ...
    ctx <- SSL.context
    ...

See OpenSSL.Session.

Crypto is as provided by the system openssl library, as wrapped by the HsOpenSSL package and openssl-streams.

/There is no longer a need to call withOpenSSL explicitly; the initialization is invoked once per process for you/

baselineContextSSL :: IO SSLContext Source

Creates a basic SSL context. This is the SSL context used if you make an "https://" request using one of the convenience functions. It configures OpenSSL to use the default set of ciphers.

On Linux systems, this function also configures OpenSSL to verify certificates using the system certificates stored in /etc/ssl/certs.

On other systems, no certificate validation is performed by the generated SSLContext because there is no canonical place to find the set of system certificates. When using this library on a non-Linux system, you are encouraged to install the system certificates somewhere and create your own SSLContext.

modifyContextSSL :: (SSLContext -> IO SSLContext) -> IO () Source

Modify the context being used to configure the SSL tunnel used by the convenience API functions to make https:// connections. The default is that setup by baselineContextSSL.

establishConnection :: URL -> IO Connection Source

Given a URL, work out whether it is normal or secure, and then open the connection to the webserver including setting the appropriate default port if one was not specified in the URL. This is what powers the convenience API, but you may find it useful in composing your own similar functions.

For example (on the assumption that your server behaves when given an absolute URI as the request path), this will open a connection to server www.example.com port 443 and request /photo.jpg:

    let url = "https://www.example.com/photo.jpg"

    c <- establishConnection url
    q <- buildRequest $ do
        http GET url
    ...

Testing support

makeConnection Source

Arguments

:: ByteString

will be used as the Host: header in the HTTP request.

-> IO ()

an action to be called when the connection is terminated.

-> OutputStream ByteString

write end of the HTTP client-server connection.

-> InputStream ByteString

read end of the HTTP client-server connection.

-> IO Connection 

Create a raw Connection object from the given parts. This is primarily of use when teseting, for example:

fakeConnection :: IO Connection
fakeConnection = do
    o  <- Streams.nullOutput
    i  <- Streams.nullInput
    c  <- makeConnection "www.example.com" (return()) o i
    return c

is an idiom we use frequently in testing and benchmarking, usually replacing the InputStream with something like:

    x' <- S.readFile "properly-formatted-response.txt"
    i  <- Streams.fromByteString x'

If you're going to do that, keep in mind that you must have CR-LF pairs after each header line and between the header and body to be compliant with the HTTP protocol; otherwise, parsers will reject your message.

data Headers :: *

Instances

getHeaders :: HttpType τ => τ -> Headers

getHeadersFull :: Connection -> Request -> Headers Source

Get the headers that will be sent with this request. You likely won't need this but there are some corner cases where people need to make calculations based on all the headers before they go out over the wire.

If you'd like the request headers as an association list, import the header functions:

import Network.Http.Types

then use retreiveHeaders as follows:

>>> let kvs = retreiveHeaders $ getHeadersFull c q
>>> :t kvs
:: [(ByteString, ByteString)]

Deprecated

getRequestHeaders :: Connection -> Request -> [(ByteString, ByteString)] Source

Deprecated: use retrieveHeaders . getHeadersFull instead