http-streams-0.7.1.1: An HTTP client using io-streams

StabilityExperimental
MaintainerAndrew Cowie
Safe HaskellNone

Network.Http.Client

Contents

Description

Overview

A simple HTTP client library, using the Snap Framework's io-streams library to handle the streaming I/O. The http-streams API is designed for ease of use when querying web services and dealing with the result.

Given:

 {-# LANGUAGE OverloadedStrings #-}

 import System.IO.Streams (InputStream, OutputStream, stdout)
 import qualified System.IO.Streams as Streams
 import qualified Data.ByteString as S

and this library:

 import Network.Http.Client

the underlying API is straight-forward. In particular, constructing the Request to send is quick and to the point:

 main :: IO ()
 main = do
     c <- openConnection "www.example.com" 80

     q <- buildRequest $ do
         http GET "/"
         setAccept "text/html"

     sendRequest c q emptyBody

     receiveResponse c (\p i -> do
         xm <- Streams.read i
         case xm of
             Just x    -> S.putStr x
             Nothing   -> "")

     closeConnection c

which would print the first chunk of the response back from the server. Obviously in real usage you'll do something more interesting with the Response in the handler function, and consume the entire response body from the InputStream ByteString.

Because this is all happening in IO (the defining feature of io-streams!), you can ensure resource cleanup on normal or abnormal termination by using Control.Exception's standard bracket function; see closeConnection for an example. For the common case we have a utility function which wraps bracket for you:

 foo :: IO ByteString
 foo = withConnection (openConnection "www.example.com" 80) doStuff

 doStuff :: Connection -> IO ByteString

There are also a set of convenience APIs that do just that, along with the tedious bits like parsing URLs. For example, to do an HTTP GET and stream the response body to stdout, you can simply do:

     get "http://www.example.com/file.txt" (\p i -> Streams.connect i stdout)

which on the one hand is "easy" while on the other exposes the the Response and InputStream for you to read from. Of course, messing around with URLs is all a bit inefficient, so if you already have e.g. hostname and path, or if you need more control over the request being created, then the underlying http-streams API is simple enough to use directly.

Synopsis

Connecting to server

type Port = Word16

data Connection Source

A connection to a web server.

Instances

openConnection :: Hostname -> Port -> IO ConnectionSource

In order to make a request you first establish the TCP connection to the server over which to send it.

Ordinarily you would supply the host part of the URL here and it will be used as the value of the HTTP 1.1 Host: field. However, you can specify any server name or IP addresss and set the Host: value later with setHostname when building the request.

Usage is as follows:

     c <- openConnection "localhost" 80
     ...
     closeConnection c

More likely, you'll use withConnection to wrap the call in order to ensure finalization.

HTTP pipelining is supported; you can reuse the connection to a web server, but it's up to you to ensure you match the number of requests sent to the number of responses read, and to process those responses in order. This is all assuming that the server supports pipelining; be warned that not all do. Web browsers go to extraordinary lengths to probe this; you probably only want to do pipelining under controlled conditions. Otherwise just open a new connection for subsequent requests.

Building Requests

You setup a request using the RequestBuilder monad, and get the resultant Request object by running buildRequest. The first call doesn't have to be to http, but it looks better when it is, don't you think?

data Method

HTTP Methods, as per RFC 2616

data RequestBuilder α

The RequestBuilder monad allows you to abuse do-notation to conveniently setup a Request object.

buildRequest :: RequestBuilder α -> IO Request

Run a RequestBuilder, yielding a Request object you can use on the given connection.

     q <- buildRequest $ do
         http POST "/api/v1/messages"
         setContentType "application/json"
         setHostname "clue.example.com" 80
         setAccept "text/html"
         setHeader "X-WhoDoneIt" "The Butler"

Obviously it's up to you to later actually send JSON data.

http :: Method -> ByteString -> RequestBuilder ()

Begin constructing a Request, starting with the request line.

setHostname :: Hostname -> Port -> RequestBuilder ()

Set the [virtual] hostname for the request. In ordinary conditions you won't need to call this, as the Host: header is a required header in HTTP 1.1 and is set directly from the name of the server you connected to when calling openConnection.

setAccept :: ByteString -> RequestBuilder ()

Indicate the content type you are willing to receive in a reply from the server. For more complex Accept: headers, use setAccept'.

setAccept' :: [(ByteString, Float)] -> RequestBuilder ()

Indicate the content types you are willing to receive in a reply from the server in order of preference. A call of the form:

         setAccept' [("text/html", 1.0),
                     ("application/xml", 0.8),
                     ("*/*", 0)]

will result in an Accept: header value of text/html; q=1.0, application/xml; q=0.8, */*; q=0.0 as you would expect.

setAuthorizationBasic :: ByteString -> ByteString -> RequestBuilder ()

Set username and password credentials per the HTTP basic authentication method.

         setAuthorizationBasic "Aladdin" "open sesame"

will result in an Authorization: header value of Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==.

Basic authentication does not use a message digest function to encipher the password; the above string is only base-64 encoded and is thus plain-text visible to any observer on the wire and all caches and servers at the other end, making basic authentication completely insecure. A number of web services, however, use SSL to encrypt the connection that then use HTTP basic authentication to validate requests. Keep in mind in these cases the secret is still sent to the servers on the other side and passes in clear through all layers after the SSL termination. Do not use basic authentication to protect secure or user-originated privacy-sensitve information.

setContentType :: ContentType -> RequestBuilder ()

Set the MIME type corresponding to the body of the request you are sending. Defaults to "text/plain", so usually you need to set this if PUTting.

setContentLength :: Int64 -> RequestBuilder ()

Specify the length of the request body, in bytes.

RFC 2616 requires that we either send a Content-Length header or use Transfer-Encoding: chunked. If you know the exact size ahead of time, then call this function; the body content will still be streamed out by io-streams in more-or-less constant space.

This function is special: in a PUT or POST request, http-streams will assume chunked transfer-encoding unless you specify a content length here, in which case you need to ensure your body function writes precisely that many bytes.

setExpectContinue :: RequestBuilder ()

Specify that this request should set the expectation that the server needs to approve the request before you send it.

This function is special: in a PUT or POST request, http-streams will wait for the server to reply with an HTTP/1.1 100 Continue status before sending the entity body. This is handled internally; you will get the real response (be it successful 2xx, client error, 4xx, or server error 5xx) in receiveResponse. In theory, it should be 417 if the expectation failed.

Only bother with this if you know the service you're talking to requires clients to send an Expect: 100-continue header and will handle it properly. Most servers don't do any precondition checking, automatically send an intermediate 100 response, and then just read the body regardless, making this a bit of a no-op in most cases.

setTransferEncoding :: RequestBuilder ()

Override the default setting about how the entity body will be sent.

This function is special: this explicitly sets the Transfer-Encoding: header to chunked and will instruct the library to actually tranfer the body as a stream (chunked transfer encoding). See setContentLength for forcing the opposite. You really won't need this in normal operation, but some people are control freaks.

setHeader :: ByteString -> ByteString -> RequestBuilder ()

Set a generic header to be sent in the HTTP request. The other methods in the RequestBuilder API are expressed in terms of this function, but we recommend you use them where offered for their stronger types.

Sending HTTP request

data Request

A description of the request that will be sent to the server. Note unlike other HTTP libraries, the request body is not a part of this object; that will be streamed out by you when actually sending the request with sendRequest.

Request has a useful Show instance that will output the request line and headers (as it will be sent over the wire but with the \r characters stripped) which can be handy for debugging.

Note that the actual Host: header is not set until the request is sent, so you will not see it in the Show instance (unless you call setHostname to override the value inherited from the Connection).

data Response

A description of the response received from the server. Note unlike other HTTP libraries, the response body is not a part of this object; that will be streamed in by you when calling receiveResponse.

Like Request, Response has a Show instance that will output the status line and response headers as they were received from the server.

getHostname :: Connection -> Request -> ByteStringSource

Get the virtual hostname that will be used as the Host: header in the HTTP 1.1 request. Per RFC 2616 § 14.23, this will be of the form hostname:port if the port number is other than the default, ie 80 for HTTP.

sendRequest :: Connection -> Request -> (OutputStream Builder -> IO α) -> IO αSource

Having composed a Request object with the headers and metadata for this connection, you can now send the request to the server, along with the entity body, if there is one. For the rather common case of HTTP requests like GET that don't send data, use emptyBody as the output stream:

     sendRequest c q emptyBody

For PUT and POST requests, you can use fileBody or inputStreamBody to send content to the server, or you can work with the io-streams API directly:

     sendRequest c q (\o ->
         Streams.write (Just (Builder.fromString "Hello World\n")) o)

emptyBody :: OutputStream Builder -> IO ()Source

Use this for the common case of the HTTP methods that only send headers and which have no entity body, i.e. GET requests.

fileBody :: FilePath -> OutputStream Builder -> IO ()Source

Specify a local file to be sent to the server as the body of the request.

You use this partially applied:

     sendRequest c q (fileBody "/etc/passwd")

Note that the type of (fileBody "/path/to/file") is just what you need for the third argument to sendRequest, namely

>>> :t filePath "hello.txt"
:: OutputStream Builder -> IO ()

inputStreamBody :: InputStream ByteString -> OutputStream Builder -> IO ()Source

Read from a pre-existing InputStream and pipe that through to the connection to the server. This is useful in the general case where something else has handed you stream to read from and you want to use it as the entity body for the request.

You use this partially applied:

     i <- getStreamFromVault                    -- magic, clearly
     sendRequest c q (inputStreamBody i)

This function maps Builder.fromByteString over the input, which will be efficient if the ByteString chunks are large.

encodedFormBody :: [(ByteString, ByteString)] -> OutputStream Builder -> IO ()Source

Specify name/value pairs to be sent to the server in the manner used by web browsers when submitting a form via a POST request. Parameters will be URL encoded per RFC 2396 and combined into a single string which will be sent as the body of your request.

You use this partially applied:

     let nvs = [("name","Kermit"),
                ("type","frog")]
                ("role","stagehand")]

     sendRequest c q (encodedFormBody nvs)

Note that it's going to be up to you to call setContentType with a value of "application/x-www-form-urlencoded" when building the Request object; the postForm convenience (which uses this encodedFormBody function) takes care of this for you, obviously.

Processing HTTP response

receiveResponse :: Connection -> (Response -> InputStream ByteString -> IO β) -> IO βSource

Handle the response coming back from the server. This function hands control to a handler function you supply, passing you the Response object with the response headers and an InputStream containing the entity body.

For example, if you just wanted to print the first chunk of the content from the server:

     receiveResponse c (\p i -> do
         m <- Streams.read i
         case m of
             Just bytes -> putStr bytes
             Nothing    -> return ())

Obviously, you can do more sophisticated things with the InputStream, which is the whole point of having an io-streams based HTTP client library.

The final value from the handler function is the return value of receiveResponse, if you need it.

Throws UnexpectedCompression if it doesn't know how to handle the compression format used in the response.

receiveResponseRaw :: Connection -> (Response -> InputStream ByteString -> IO β) -> IO βSource

This is a specialized variant of receiveResponse that explicitly does not handle the content encoding of the response body stream (it will not decompress anything). Unless you really want the raw gzipped content coming down from the server, use receiveResponse.

getStatusCode :: Response -> StatusCode

Get the HTTP response status code.

getStatusMessage :: Response -> ByteString

Get the HTTP response status message. Keep in mind that this is not normative; whereas getStatusCode values are authoritative.

getHeader :: Response -> ByteString -> Maybe ByteString

Lookup a header in the response. HTTP header field names are case-insensitive, so you can specify the name to lookup however you like. If the header is not present Nothing will be returned.

     let n = case getHeader p "Content-Length" of
                Just x' -> read x' :: Int
                Nothing -> 0

which of course is essentially what goes on inside the client library when it receives a response from the server and has to figure out how many bytes to read.

There is a fair bit of complexity in some of the other HTTP response fields, so there are a number of specialized functions for reading those values where we've found them useful.

debugHandler :: Response -> InputStream ByteString -> IO ()Source

Print the response headers and response body to stdout. You can use this with receiveResponse or one of the convenience functions when testing. For example, doing:

     c <- openConnection "kernel.operationaldynamics.com" 58080

     q <- buildRequest $ do
         http GET "/time"

     sendRequest c q emptyBody

     receiveResponse c debugHandler

would print out:

 HTTP/1.1 200 OK
 Transfer-Encoding: chunked
 Content-Type: text/plain
 Vary: Accept-Encoding
 Server: Snap/0.9.2.4
 Content-Encoding: gzip
 Date: Mon, 21 Jan 2013 06:13:37 GMT

 Mon 21 Jan 13, 06:13:37.303Z

or thereabouts.

concatHandler :: Response -> InputStream ByteString -> IO ByteStringSource

Sometimes you just want the entire response body as a single blob. This function concatonates all the bytes from the response into a ByteString. If using the main http-streams API, you would use it as follows:

    ...
    x' <- receiveResponse c concatHandler
    ...

The methods in the convenience API all take a function to handle the response; this function is passed directly to the receiveResponse call underlying the request. Thus this utility function can be used for get as well:

    x' <- get "http://www.example.com/document.txt" concatHandler

Either way, the usual caveats about allocating a single object from streaming I/O apply: do not use this if you are not absolutely certain that the response body will fit in a reasonable amount of memory.

Note that this function makes no discrimination based on the response's HTTP status code. You're almost certainly better off writing your own handler function.

concatHandler' :: Response -> InputStream ByteString -> IO ByteStringSource

A special case of concatHandler, this function will return the entire response body as a single ByteString, but will throw HttpClientError if the response status code was other than 2xx.

jsonHandler :: FromJSON α => Response -> InputStream ByteString -> IO αSource

If you're working with a data stream that is in application/json, then chances are you're using aeson to handle the JSON to Haskell decoding. If so, then this helper function might be of use.

     v <- get "http://api.example.com/v1/" jsonHandler

This function feeds the input body to the json' attoparsec Parser in order to get the aeson Value type. This is then marshalled to your type represeting the source data, via the FromJSON typeclass.

The above example was actually insufficient; when working with aeson you need to fix the type so it knows what FromJSON instance to use. Let's say you're getting Person objects, then it would be

     v <- get "http://api.example.com/v1/person/461" jsonHandler :: IO Person

assuming your Person type had a FromJSON instance, of course.

Note

This function parses a single top level JSON object or array, which is all you're supposed to get if it's a valid document. People do all kinds of crazy things though, so beware. Also, this function (like the concatHander convenience) loads the entire response into memory; it's not streaming; if you're receiving a document which is (say) a very long array of objects then you may want to implement your own handler function, perhaps using Streams.parserToInputStream and the Parser combinators directly — with a result type of InputStream Value, perhaps — by which you could then iterate over the Values one at a time in constant space.

Resource cleanup

closeConnection :: Connection -> IO ()Source

Shutdown the connection. You need to call this release the underlying socket file descriptor and related network resources. To do so reliably, use this in conjunction with openConnection in a call to bracket:

 --
 -- Make connection, cleaning up afterward
 --

 foo :: IO ByteString
 foo = bracket
    (openConnection "localhost" 80)
    (closeConnection)
    (doStuff)

 --
 -- Actually use Connection to send Request and receive Response
 --

 doStuff :: Connection -> IO ByteString

or, just use withConnection.

While returning a ByteString is probably the most common use case, you could conceivably do more processing of the response in doStuff and have it and foo return a different type.

withConnection :: IO Connection -> (Connection -> IO γ) -> IO γSource

Given an IO action producing a Connection, and a computation that needs one, runs the computation, cleaning up the Connection afterwards.

     x <- withConnection (openConnection "s3.example.com" 80) $ (\c -> do
         q <- buildRequest $ do
             http GET "/bucket42/object/149"
         sendRequest c q emptyBody
         ...
         return "blah")

which can make the code making an HTTP request a lot more straight-forward.

Wraps Control.Exception's bracket.

Convenience APIs

Some simple functions for making requests with useful defaults. There's no head function for the usual reason of needing to avoid collision with Prelude.

These convenience functions work with http and https, but note that if you retrieve an https URL, you must wrap your main function with withOpenSSL to initialize the native openssl library code.

getSource

Arguments

:: URL

Resource to GET from.

-> (Response -> InputStream ByteString -> IO β)

Handler function to receive the response from the server.

-> IO β 

Issue an HTTP GET request and pass the resultant response to the supplied handler function. This code will silently follow redirects, to a maximum depth of 5 hops.

The handler function is as for receiveResponse, so you can use one of the supplied convenience handlers if you're in a hurry:

     x' <- get "http://www.bbc.co.uk/news/" concatHandler

But as ever the disadvantage of doing this is that you're not doing anything intelligent with the HTTP response status code. If you want an exception raised in the event of a non 2xx response, you can use:

     x' <- get "http://www.bbc.co.uk/news/" concatHandler'

but for anything more refined you'll find it easy to simply write your own handler function.

Throws TooManyRedirects if more than 5 redirects are thrown.

postSource

Arguments

:: URL

Resource to POST to.

-> ContentType

MIME type of the request body being sent.

-> (OutputStream Builder -> IO α)

Handler function to write content to server.

-> (Response -> InputStream ByteString -> IO β)

Handler function to receive the response from the server.

-> IO β 

Send content to a server via an HTTP POST request. Use this function if you have an OutputStream with the body content.

postFormSource

Arguments

:: URL

Resource to POST to.

-> [(ByteString, ByteString)]

List of name=value pairs. Will be sent URL-encoded.

-> (Response -> InputStream ByteString -> IO β)

Handler function to receive the response from the server.

-> IO β 

Send form data to a server via an HTTP POST request. This is the usual use case; most services expect the body to be MIME type application/x-www-form-urlencoded as this is what conventional web browsers send on form submission. If you want to POST to a URL with an arbitrary Content-Type, use post.

putSource

Arguments

:: URL

Resource to PUT to.

-> ContentType

MIME type of the request body being sent.

-> (OutputStream Builder -> IO α)

Handler function to write content to server.

-> (Response -> InputStream ByteString -> IO β)

Handler function to receive the response from the server.

-> IO β 

Place content on the server at the given URL via an HTTP PUT request, specifying the content type and a function to write the content to the supplied OutputStream. You might see:

     put "http://s3.example.com/bucket42/object149" "text/plain"
         (fileBody "hello.txt") (\p i -> do
             putStr $ show p
             Streams.connect i stdout)

Secure connections

openConnectionSSL :: SSLContext -> Hostname -> Port -> IO ConnectionSource

Open a secure connection to a web server.

You need to wrap this (and subsequent code using this connection) within a call to withOpenSSL:

 import OpenSSL (withOpenSSL)

 main :: IO ()
 main = withOpenSSL $ do
     ctx <- baselineContextSSL
     c <- openConnectionSSL ctx "api.github.com" 443
     ...
     closeConnection c

If you want to tune the parameters used in making SSL connections, manually specify certificates, etc, then setup your own context:

 import OpenSSL.Session (SSLContext)
 import qualified OpenSSL.Session as SSL

     ...
     ctx <- SSL.context
     ...

See OpenSSL.Session.

Crypto is as provided by the system openssl library, as wrapped by the HsOpenSSL package and openssl-streams.

baselineContextSSL :: IO SSLContextSource

Creates a basic SSL context. This is the SSL context used if you make an "https://" request using one of the convenience functions. It configures OpenSSL to use the default set of ciphers.

On Linux systems, this function also configures OpenSSL to verify certificates using the system certificates stored in /etc/ssl/certs.

On other systems, no certificate validation is performed by the generated SSLContext because there is no canonical place to find the set of system certificates. When using this library on a non-Linux system, you are encouraged to install the system certificates somewhere and create your own SSLContext.

modifyContextSSL :: (SSLContext -> IO SSLContext) -> IO ()Source

Modify the context being used to configure the SSL tunnel used by the convenience API functions to make https: connections. The default is that setup by baselineContextSSL.

establishConnection :: URL -> IO ConnectionSource

Given a URL, work out whether it is normal or secure, and then open the connection to the webserver including setting the appropriate default port if one was not specified in the URL. This is what powers the convenience API, but you may find it useful in composing your own similar functions.

For example (on the assumption that your server behaves when given an absolute URI as the request path), this will open a connection to server www.example.com port 443 and request /photo.jpg:

     let url = "https://www.example.com/photo.jpg"

     c <- establishConnection url
     q <- buildRequest $ do
         http GET url
     ...

Testing support

makeConnectionSource

Arguments

:: ByteString

will be used as the Host: header in the HTTP request.

-> IO ()

an action to be called when the connection is terminated.

-> OutputStream ByteString

write end of the HTTP client-server connection.

-> InputStream ByteString

read end of the HTTP client-server connection.

-> IO Connection 

Create a raw Connection object from the given parts. This is primarily of use when teseting, for example:

 fakeConnection :: IO Connection
 fakeConnection = do
     o  <- Streams.nullOutput
     i  <- Streams.nullInput
     c  <- makeConnection "www.example.com" (return()) o i
     return c

is an idiom we use frequently in testing and benchmarking, usually replacing the InputStream with something like:

     x' <- S.readFile "properly-formatted-response.txt"
     i  <- Streams.fromByteString x'

If you're going to do that, keep in mind that you must have CR-LF pairs after each header line and between the header and body to be compliant with the HTTP protocol; otherwise, parsers will reject your message.

data Headers

The map of headers in a Request or Response. Note that HTTP header field names are case insensitive, so if you call setHeader on a field that's already defined but with a different capitalization you will replace the existing value.

Instances

getHeaders :: HttpType τ => τ -> Headers

Get the Headers from a Request or Response. Most people do not need this; for most cases you just need to get a header or two from the response, for which you can use getHeader. On the other hand, if you do need to poke around in the raw headers,

 import Network.Http.Types

will give you functions like lookupHeader and updateHeader to to work with.

getHeadersFull :: Connection -> Request -> HeadersSource

Get the headers that will be sent with this request. You likely won't need this but there are some corner cases where people need to make calculations based on all the headers before they go out over the wire.

If you'd like the request headers as an association list, import the header functions:

 import Network.Http.Types

then use retreiveHeaders as follows:

>>> let kvs = retreiveHeaders $ getHeadersFull c q
>>> :t kvs
:: [(ByteString, ByteString)]

Deprecated

getRequestHeaders :: Connection -> Request -> [(ByteString, ByteString)]Source

Deprecated: use retrieveHeaders . getHeadersFull instead