purebred-email-0.4.1: types and parser for email messages (including MIME)

Safe HaskellNone
LanguageHaskell2010

Data.MIME

Contents

Description

MIME messages (RFC 2045, RFC 2046, RFC 2183 and friends).

This module extends Data.RFC5322 with types for handling MIME messages. It provides the mime parsing helper function for use with message.

Synopsis

Overview / HOWTO

Creating and serialising mail

Create an inline, plain text message and render it:

λ> import Data.MIME
λ> msg = createTextPlainMessage "Hello, world!"
λ> s = renderMessage msg
λ> L.putStrLn s
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Content-Type: text/plain; charset=us-ascii

Hello, world!

Set the From and To headers:

λ> alice = Mailbox Nothing (AddrSpec "alice" (DomainDotAtom ("example" :| ["com"])))
λ> bob = Mailbox Nothing (AddrSpec "bob" (DomainDotAtom ("example" :| ["net"])))
λ> msgFromAliceToBob = set (headerFrom defaultCharsets [alice] . set (headerTo defaultCharsets) [Single bob] $ msg
λ> L.putStrLn (renderMessage msgFromAliceToBob)
MIME-Version: 1.0
From: aliceexample.com
To: bobexample.net
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Content-Type: text/plain; charset=us-ascii

Hello, world!

The headerFrom, headerTo, headerCC and headerBCC lenses are the most convenient interface for reading and setting the sender and recipient addresses. Note that you would usually not manually construct email addresses manually as was done above. Instead you would usually read it from another email or configuration, or parse addresses from user input.

The Subject header is set via headerSubject. Other single-valued headers can be set via headerText.

λ> :{
| L.putStrLn . renderMessage $
|   set (headerText defaultCharsets Comments) (Just "와")
|   . set (headerSubject defaultCharsets) (Just "Hi from Alice")
|   $ msgFromAliceToBob
| :}

MIME-Version: 1.0
Comments: =?utf-8?B?7JmA?=
Subject: Hi from Alice
From: aliceexample.com
To: bobexample.net
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Content-Type: text/plain; charset=us-ascii

Hello, world!

Create a multipart message with attachment:

λ> attachment = createAttachment "application/json" (Just "data.json") "{"foo":42}"
λ> msg2 = createMultipartMixedMessage "boundary" [msg, attachment]
λ> s2 = renderMessage msg2
λ> L.putStrLn s2
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=boundary

--boundary
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Content-Type: text/plain; charset=us-ascii

Hello, world!
--boundary
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename=data.json
Content-Type: application/json

{"foo":42}
--boundary--

NOTE: if you only need to write a serialised Message to an IO handle, buildMessage is more efficient than renderMessage.

Parsing mail

Most often you will parse a message like this:

λ> parsedMessage = parse (message mime) s2
λ> :t parsedMessage
parsedMessage :: Either String MIMEMessage
λ> parsedMessage == Right msg2
True

The message function builds a parser for a message. It is abstracted over the body type; the argument is a function that can inspect headers and return a parser for the body. If you are parsing MIME messages (or plain RFC 5322 messages), the mime function is the right one to use.

Inspecting messages

Parsing an email is nice, but your normally want to get at the content inside. One of the most important tasks is finding entities of interest, e.g. attachments, plain text or HTML bodies. The entities optic is a fold over all leaf entities in the message. That is, all the non-multipart bodies. You can use filtered to refine the query.

For example, let's say you want to find the first text/plain entity in a message. Define a predicate with the help of the matchContentType function:

λ> isTextPlain = matchContentType "text" (Just "plain") . view contentType
λ> :t isTextPlain
isTextPlain :: HasHeaders s => s -> Bool
λ> isTextPlain msg
True
λ> isTextPlain msg2
False

Now we can use the predicate to construct a fold and retrieve the body. If there is no matching entity the result would be Nothing.

λ> firstOf (entities . filtered isTextPlain . body) msg2
Just "Hello, world!"

For attachments you are normally interested in the binary data and possibly the filename (if specified). In the following example we retrieve all attachments, and their filenames, as a list of tuples (although there is only one in the message). Note that

Get the (optional) filenames and (decoded) body of all attachments, as a list of tuples. The attachments optic selects non-multipart entities with Content-Disposition: attachment. The attachments fold targets all entities with Content-Disposition: attachment. The transferDecoded' optic undoes the Content-Transfer-Encoding of the entity.

λ> getFilename = preview (contentDisposition . _Just . filename defaultCharsets)
λ> getBody = preview (transferDecoded' . _Right . body)
λ> getAttachment = liftA2 (,) getFilename getBody
λ> toListOf (attachments . to getAttachment) msg2
[(Just "data.json",Just "{"foo":42}")]

Finally, note that the filename optic takes an argument: it is a function for looking up a character set. Supporting every possible character encoding is a bit tricky so we let the user supply a map of supported charsets, and provide defaultCharsets which supports ASCII, UTF-8 and ISO-8859-1.

λ> :t filename
filename
  :: (HasParameters a, Applicative f) =>
     CharsetLookup -> (T.Text -> f T.Text) -> a -> f a
λ> :t defaultCharsets
defaultCharsets :: CharsetLookup
λ> :i CharsetLookup
type CharsetLookup = CI Char8.ByteString -> Maybe Data.MIME.Charset.Charset

Unicode support

In Australia we say "Hello world" upside down:

λ> msg3 = createTextPlainMessage "ɥǝןןo ʍoɹןp"
λ> L.putStrLn $ renderMessage msg3
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Content-Disposition: inline
Content-Type: text/plain; charset=utf-8

yaXHndef159vIMqNb8m5159w

Charset set and transfer encoding are handled automatically. If the message only includes characters representable in ASCII, the charset will be us-ascii, otherwise utf-8.

To read the message as Text you must perform transfer decoding and charset decoding. The transferDecoded optic performs transfer decoding, as does its sibling transferDecoded' which is monomorphic in the error type. Similarly, charsetText and charsetText' perform text decoding according to the character set.

If you don't mind throwing away decoding errors, the simplest way to get the text of a message is:

λ> Just ent = firstOf (entities . filtered isTextPlain) msg3
λ> :t ent
ent :: WireEntity
λ> text = preview (transferDecoded' . _Right . charsetText' defaultCharsets . _Right) ent
λ> :t text
text :: Maybe T.Text
λ> traverse_ T.putStrLn text
ɥǝןןo ʍoɹןp

As mentioned earlier, functions that perform text decoding take a CharsetLookup parameter, and we provide defaultCharsets for convenience.

API

MIME data type

data MIME Source #

MIME message body. Either a single Part, or Multipart. Only the body is represented; preamble and epilogue are not.

Instances
Eq MIME Source # 
Instance details

Defined in Data.MIME

Methods

(==) :: MIME -> MIME -> Bool #

(/=) :: MIME -> MIME -> Bool #

Show MIME Source # 
Instance details

Defined in Data.MIME

Methods

showsPrec :: Int -> MIME -> ShowS #

show :: MIME -> String #

showList :: [MIME] -> ShowS #

RenderMessage MIME Source #

Sets the MIME-Version: 1.0 header.

Instance details

Defined in Data.MIME

EqMessage MIME Source #

Ignores the presence/absense of MIME-Version header

Instance details

Defined in Data.MIME

type MessageContext MIME Source # 
Instance details

Defined in Data.MIME

mime :: Headers -> BodyHandler MIME Source #

Top-level MIME body parser that uses headers to decide how to parse the body.

Do not use this parser for parsing a nested message. This parser should only be used when the message you want to parse is the whole input. If you use it to parse a nested message it will treat the remainder of the outer message(s) as part of the epilogue.

Preambles and epilogues are discarded.

This parser accepts non-MIME messages, and treats them as a single part.

data EncStateByte Source #

Entity requires content-transfer-encoding to send, and may require charset decoding to read.

Instances
HasCharset ByteEntity Source #

RFC 2046 §4.1.2. defines the default character set to be US-ASCII.

Instance details

Defined in Data.MIME

Associated Types

type Decoded ByteEntity :: Type Source #

type Decoded ByteEntity Source # 
Instance details

Defined in Data.MIME

Accessing and processing entities

entities :: Traversal' MIMEMessage WireEntity Source #

Get all leaf entities from the MIME message. Entities that failed to parse are skipped.

attachments :: Traversal' MIMEMessage WireEntity Source #

Leaf entities with Content-Disposition: attachment

isAttachment :: HasHeaders a => a -> Bool Source #

MIMEMessage content disposition is an Attachment

transferDecoded :: (HasTransferEncoding a, AsTransferEncodingError e, Profunctor p, Contravariant f) => Optic' p f a (Either e (TransferDecoded a)) Source #

Perform content transfer decoding.

transferDecoded' :: (HasTransferEncoding a, Profunctor p, Contravariant f) => Optic' p f a (Either TransferEncodingError (TransferDecoded a)) Source #

Perform content transfer decoding (monomorphic error type).

charsetDecoded :: (HasCharset a, AsCharsetError e) => CharsetLookup -> forall p f. (Profunctor p, Contravariant f) => Optic' p f a (Either e (Decoded a)) Source #

Structure with the encoded data replaced with Text

Header processing

decodeEncodedWords :: CharsetLookup -> ByteString -> Text Source #

RFC 2047 and RFC 2231 define the encoded-words mechanism for embedding non-ASCII data in headers. This function locates encoded-words and decodes them.

λ> T.putStrLn $ decodeEncodedWords defaultCharsets "hello =?utf-8?B?5LiW55WM?=!"
hello 世界!

If parsing fails or the encoding is unrecognised the encoded-word is left unchanged in the result.

λ> T.putStrLn $ decodeEncodedWords defaultCharsets "=?utf-8?B?bogus?="
=?utf-8?B?bogus?=

λ> T.putStrLn $ decodeEncodedWords defaultCharsets "=?utf-8?X?unrecognised_encoding?="
=?utf-8?X?unrecognised_encoding?=

Language specification is supported (the datum is discarded).

λ> T.putStrLn $ decodeEncodedWords defaultCharsets "=?utf-8*es?Q?hola_mundo!?="
hola mundo!

Content-Type header

contentType :: HasHeaders a => Lens' a ContentType Source #

Lens to the content-type header. Probably not a lawful lens.

If the header is not specified or is syntactically invalid, defaultContentType is used. For more info see https://tools.ietf.org/html/rfc2045#section-5.2.

If the Content-Transfer-Encoding is unrecognised, the actual Content-Type value is ignored and application/octet-stream is returned, as required by https://tools.ietf.org/html/rfc2049#section-2.

When setting, if the header already exists it is replaced, otherwise it is added. Unrecognised Content-Transfer-Encoding is ignored when setting.

data ContentType Source #

Content-Type header (RFC 2183). Use parameters to access the parameters. Example:

ContentType "text" "plain" (Parameters [("charset", "utf-8")])

You can also use -XOverloadedStrings but be aware the conversion is non-total (throws an error if it cannot parse the string).

Instances
Eq ContentType Source #

Equality of Content-Type. Type and subtype are compared case-insensitively and parameters are also compared. Use matchContentType if you just want to match on the media type while ignoring parameters.

Instance details

Defined in Data.MIME

Show ContentType Source # 
Instance details

Defined in Data.MIME

IsString ContentType Source #

NON-TOTAL parses the Content-Type (including parameters) and throws an error if the parse fails

Instance details

Defined in Data.MIME

Generic ContentType Source # 
Instance details

Defined in Data.MIME

Associated Types

type Rep ContentType :: Type -> Type #

NFData ContentType Source # 
Instance details

Defined in Data.MIME

Methods

rnf :: ContentType -> () #

HasParameters ContentType Source # 
Instance details

Defined in Data.MIME

type Rep ContentType Source # 
Instance details

Defined in Data.MIME

matchContentType Source #

Arguments

:: CI ByteString

type

-> Maybe (CI ByteString)

optional subtype

-> ContentType 
-> Bool 

Match content type. If Nothing is given for subtype, any subtype is accepted.

ctEq :: ContentType -> ContentType -> Bool Source #

Deprecated: Use matchContentType instead

Are the type and subtype the same? (parameters are ignored)

parseContentType :: Parser ContentType Source #

Parser for Content-Type header

showContentType :: ContentType -> Text Source #

Rendered content type field value for displaying

mimeBoundary :: Traversal' ContentType ByteString Source #

Get the boundary, if specified

Content-Type values

contentTypeMultipartMixed :: ByteString -> ContentType Source #

multipart/mixed; boundary=asdf

defaultContentType :: ContentType Source #

text/plain; charset=us-ascii

Content-Disposition header

contentDisposition :: HasHeaders a => Lens' a (Maybe ContentDisposition) Source #

Access Content-Disposition header.

Unrecognised disposition types are coerced to Attachment in accordance with RFC 2183 §2.8 which states: Unrecognized disposition types should be treated as attachment.

This optic does not distinguish between missing header or malformed value.

data ContentDisposition Source #

Content-Disposition header (RFC 2183).

Use parameters to access the parameters.

data DispositionType Source #

Constructors

Inline 
Attachment 
Instances
Eq DispositionType Source # 
Instance details

Defined in Data.MIME

Show DispositionType Source # 
Instance details

Defined in Data.MIME

Generic DispositionType Source # 
Instance details

Defined in Data.MIME

Associated Types

type Rep DispositionType :: Type -> Type #

NFData DispositionType Source # 
Instance details

Defined in Data.MIME

Methods

rnf :: DispositionType -> () #

type Rep DispositionType Source # 
Instance details

Defined in Data.MIME

type Rep DispositionType = D1 (MetaData "DispositionType" "Data.MIME" "purebred-email-0.4.1-LKGlLUD09Vd8pTVewOembz" False) (C1 (MetaCons "Inline" PrefixI False) (U1 :: Type -> Type) :+: C1 (MetaCons "Attachment" PrefixI False) (U1 :: Type -> Type))

filename :: HasParameters a => CharsetLookup -> Traversal' a Text Source #

Traverse the value of the filename parameter (if present).

filenameParameter :: HasParameters a => Lens' a (Maybe EncodedParameterValue) Source #

Access the filename parameter as a Maybe (ParameterValue B.ByteString).

This can be used to read or set the filename parameter (see also the newParameter convenience function):

λ> let hdrs = Headers [("Content-Disposition", "attachment")]
λ> set (contentDisposition . filenameParameter) (Just (newParameter "foo.txt")) hdrs
Headers [("Content-Disposition","attachment; filename=foo.txt")]

Mail creation

Common use cases

createTextPlainMessage :: Text -> MIMEMessage Source #

Create an inline, text/plain, utf-8 encoded message

createAttachment :: ContentType -> Maybe FilePath -> ByteString -> MIMEMessage Source #

Create an attachment from the given file contents. Optionally set the filename parameter to the given file path.

createAttachmentFromFile :: ContentType -> FilePath -> IO MIMEMessage Source #

Create an attachment from a given file path. Note: The filename content disposition is set to the given FilePath. For privacy reasons, you can unset/change it. See filename for examples.

createMultipartMixedMessage Source #

Arguments

:: ByteString

Boundary

-> NonEmpty MIMEMessage

parts

-> MIMEMessage 

Create a mixed MIMEMessage with an inline text/plain part and multiple attachments

encapsulate :: MIMEMessage -> MIMEMessage Source #

Encapsulate a message as a message/rfc822 message. You can use this in creating forwarded or bounce messages.

Setting headers

headerSubject :: HasHeaders a => CharsetLookup -> Lens' a (Maybe Text) Source #

Subject header. See headerText for details of conversion to Text.

headerText :: HasHeaders a => CharsetLookup -> CI ByteString -> Lens' a (Maybe Text) Source #

Single-valued header with Text value via encoded-words. The conversion to/from Text is total (encoded-words that failed to be decoded are passed through unchanged). Therefore Nothing means that the header was not present.

This function is suitable for the Subject header.

replyHeaderReferences :: HasHeaders a => Getter a (Maybe ByteString) Source #

Returns a space delimited ByteString with values from identification fields from the parents message Headers. Rules to gather the values are in accordance to RFC5322 - 3.6.4 as follows sorted by priority (first has precedence):

  • Values from References and Message-ID (if any)
  • Values from In-Reply-To and Message-ID (if any)
  • Value from Message-ID (in case it's the first reply to a parent mail)
  • Otherwise Nothing is returned indicating that the replying mail should not have a References field.

Re-exports

defaultCharsets :: CharsetLookup Source #

Supports US-ASCII, UTF-8 and ISO-8859-1.