Copyright	(c) 2010 Jasper Van der Jeugt (c) 2010-2011 Simon Meier
License	BSD3-style (see LICENSE)
Maintainer	Simon Meier <iridcode@gmail.com>
Portability	GHC
Safe Haskell	Trustworthy
Language	Haskell98

Data.ByteString.Builder.Extra

Contents

Execution strategies
Controlling chunk boundaries
Low level execution
Host-specific binary encodings

Description

Extra functions for creating and executing Builders. They are intended for application-specific fine-tuning of the performance of Builders.

Synopsis

toLazyByteStringWith :: AllocationStrategy -> ByteString -> Builder -> ByteString
data AllocationStrategy
safeStrategy :: Int -> Int -> AllocationStrategy
untrimmedStrategy :: Int -> Int -> AllocationStrategy
smallChunkSize :: Int
defaultChunkSize :: Int
byteStringCopy :: ByteString -> Builder
byteStringInsert :: ByteString -> Builder
byteStringThreshold :: Int -> ByteString -> Builder
lazyByteStringCopy :: ByteString -> Builder
lazyByteStringInsert :: ByteString -> Builder
lazyByteStringThreshold :: Int -> ByteString -> Builder
flush :: Builder
type BufferWriter = Ptr Word8 -> Int -> IO (Int, Next)
data Next
    = Done
    | More !Int BufferWriter
    | Chunk !ByteString BufferWriter
runBuilder :: Builder -> BufferWriter
intHost :: Int -> Builder
int16Host :: Int16 -> Builder
int32Host :: Int32 -> Builder
int64Host :: Int64 -> Builder
wordHost :: Word -> Builder
word16Host :: Word16 -> Builder
word32Host :: Word32 -> Builder
word64Host :: Word64 -> Builder
floatHost :: Float -> Builder
doubleHost :: Double -> Builder

Execution strategies

toLazyByteStringWith

Arguments

:: AllocationStrategy	Buffer allocation strategy to use
-> ByteString	Lazy `ByteString` to use as the tail of the generated lazy `ByteString`
-> Builder	`Builder` to execute
-> ByteString	Resulting lazy `ByteString`

Heavy inlining. Execute a Builder with custom execution parameters.

This function is inlined despite its large code size so that it can fuse with the allocation strategy. For example, the default Builder execution function toLazyByteString is defined as follows.

{-# NOINLINE toLazyByteString #-}
toLazyByteString =
  toLazyByteStringWith (safeStrategy smallChunkSize defaultChunkSize) L.empty

where L.empty is the zero-length lazy ByteString.

In most cases, the parameters used by toLazyByteString give good performance. One case where toLazyByteString performs poorly is executing short (<128 bytes) Builders. Here, the allocation overhead for the first 4kb buffer and the trimming cost dominate the cost of executing the Builder. You can avoid this problem using

toLazyByteStringWith (safeStrategy 128 smallChunkSize) L.empty

This reduces the allocation and trimming overhead: all generated ByteStrings fit into the first buffer, and no trimming is required if more than 64 bytes and fewer than 128 bytes are written.
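As a minimal sketch of the tuning described above, the following wraps the suggested strategy in a helper. The name toShortLazyByteString is hypothetical and not part of the library; the example only assumes that most of the Builders passed to it produce fewer than 128 bytes.

```haskell
import qualified Data.ByteString.Builder as B
import Data.ByteString.Builder.Extra
  (safeStrategy, smallChunkSize, toLazyByteStringWith)
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as LC

-- | Hypothetical helper (not part of the library): execute Builders that
-- are expected to be short. The first buffer holds only 128 bytes; longer
-- output still works, because successive buffers use smallChunkSize.
toShortLazyByteString :: B.Builder -> L.ByteString
toShortLazyByteString =
  toLazyByteStringWith (safeStrategy 128 smallChunkSize) L.empty

main :: IO ()
main = LC.putStrLn (toShortLazyByteString (B.string7 "id=" <> B.intDec 42))
```

Whether this actually beats toLazyByteString depends on the workload, so it is worth benchmarking before adopting it.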

data AllocationStrategy

A buffer allocation strategy for executing Builders.

safeStrategy

Arguments

:: Int	Size of first buffer
-> Int	Size of successive buffers
-> AllocationStrategy	An allocation strategy that guarantees that at least half of the allocated memory is used for live data

Use this strategy for generating lazy ByteStrings whose chunks are likely to survive one garbage collection. This strategy trims buffers that are less than half full in order to avoid wasting too much memory.

untrimmedStrategy

Arguments

:: Int	Size of the first buffer
-> Int	Size of successive buffers
-> AllocationStrategy	An allocation strategy that does not trim any of the filled buffers before converting it to a chunk

Use this strategy for generating lazy ByteStrings whose chunks are discarded right after they are generated, for example when you generate them only to write them to a network socket.
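The following sketch illustrates the intended use of untrimmedStrategy: the chunks are handed to an output Handle and never retained. The helper name hPutUntrimmed and the choice of buffer sizes are assumptions made for this example, and stdout stands in for a socket handle.

```haskell
import qualified Data.ByteString.Builder as B
import Data.ByteString.Builder.Extra
  (defaultChunkSize, smallChunkSize, toLazyByteStringWith, untrimmedStrategy)
import qualified Data.ByteString.Lazy as L
import System.IO (Handle, stdout)

-- | Hypothetical helper (not part of the library): write a Builder's output
-- to a Handle. The chunks become garbage as soon as they are written, so
-- trimming under-filled buffers would only add a copy.
hPutUntrimmed :: Handle -> B.Builder -> IO ()
hPutUntrimmed h =
  L.hPut h
    . toLazyByteStringWith
        (untrimmedStrategy smallChunkSize defaultChunkSize)
        L.empty

main :: IO ()
main =
  hPutUntrimmed stdout (B.string7 "status: " <> B.intDec 200 <> B.char7 '\n')
```

The bytes produced are the same as with safeStrategy; only the handling of partially filled buffers differs.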

smallChunkSize :: Int

The recommended chunk size. Currently set to 4k, less the memory management overhead.

defaultChunkSize :: Int

The chunk size used for I/O. Currently set to 32k, less the memory management overhead.
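Since the exact overhead is not stated here, a quick way to see the values on a given platform is to print them. This small sketch assumes "4k" and "32k" mean 4096 and 32768 bytes, and reports how far each constant falls below that nominal size.

```haskell
import Data.ByteString.Builder.Extra (defaultChunkSize, smallChunkSize)

-- | Render a chunk size together with its distance from the nominal size.
describe :: String -> Int -> Int -> String
describe name nominal actual =
  name ++ " = " ++ show actual
    ++ " (nominal " ++ show nominal
    ++ ", overhead " ++ show (nominal - actual) ++ ")"

main :: IO ()
main = do
  putStrLn (describe "smallChunkSize" 4096 smallChunkSize)
  putStrLn (describe "defaultChunkSize" 32768 defaultChunkSize)
```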

Controlling chunk boundaries

byteStringCopy :: ByteString -> Builder

Construct a Builder that copies the strict ByteString.

Use this function to create Builders from smallish (<= 4kb) ByteStrings or if you need to guarantee that the ByteString is not shared with the chunks generated by the Builder.

byteStringInsert :: ByteString -> Builder

Construct a Builder that always inserts the strict ByteString directly as a chunk.

This implies flushing the output buffer, even if it contains just a single byte. You should therefore use byteStringInsert only for large (> 8kb) ByteStrings. Otherwise, the generated chunks are too fragmented to be processed efficiently afterwards.
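To make the fragmentation concrete, the sketch below compares the chunk structure produced by byteStringCopy and byteStringInsert. The payload is deliberately tiny, which is exactly the case the text advises against for byteStringInsert; chunkLengths and payload are names introduced only for this illustration.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Builder as B
import Data.ByteString.Builder.Extra (byteStringCopy, byteStringInsert)
import qualified Data.ByteString.Lazy as L

-- | Lengths of the chunks produced by running a Builder with toLazyByteString.
chunkLengths :: B.Builder -> [Int]
chunkLengths = map S.length . L.toChunks . B.toLazyByteString

-- | A 10-byte strict ByteString, far below the sizes suited to insertion.
payload :: S.ByteString
payload = S.replicate 10 0x2a

copied, inserted :: B.Builder
copied   = B.word8 0 <> byteStringCopy payload    -- payload copied into the buffer
inserted = B.word8 0 <> byteStringInsert payload  -- buffer flushed, payload inserted as is

main :: IO ()
main = do
  print (chunkLengths copied)
  print (chunkLengths inserted)
```

With the copy, the leading byte and the payload should share one chunk; with the insert, the single leading byte is flushed as its own chunk before the payload.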

byteStringThreshold :: Int -> ByteString -> Builder

Construct a Builder that copies the strict ByteString if its size is less than or equal to the threshold, and inserts it directly otherwise.

For example, byteStringThreshold 1024 copies strict ByteStrings whose size is less than or equal to 1kb, and inserts them directly otherwise. This implies that the average chunk size of the generated lazy ByteString may be as low as 513 bytes, as there could always be just a single byte between the directly inserted 1025-byte strict ByteStrings.
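The worst case described above can be reproduced with a short sketch. It alternates a single byte with a payload passed through byteStringThreshold 1024, once with a payload at the threshold and once with a payload one byte over it. The names chunkLengths and interleaved are hypothetical helpers for this example.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Builder as B
import Data.ByteString.Builder.Extra (byteStringThreshold)
import qualified Data.ByteString.Lazy as L

-- | Lengths of the chunks produced by running a Builder with toLazyByteString.
chunkLengths :: B.Builder -> [Int]
chunkLengths = map S.length . L.toChunks . B.toLazyByteString

-- | Three repetitions of: one byte, then an n-byte payload sent through
-- byteStringThreshold with a threshold of 1024.
interleaved :: Int -> B.Builder
interleaved n =
  mconcat
    (replicate 3 (B.word8 0 <> byteStringThreshold 1024 (S.replicate n 0x2a)))

main :: IO ()
main = do
  print (chunkLengths (interleaved 1024))  -- payloads at the threshold are copied
  print (chunkLengths (interleaved 1025))  -- payloads above it are inserted
```

In the second case each single byte ends up in a chunk of its own next to a 1025-byte chunk, which gives the average of (1 + 1025) / 2 = 513 bytes per chunk.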

lazyByteStringCopy :: ByteString -> Builder

Construct a Builder that copies the lazy ByteString.

lazyByteStringInsert :: ByteString -> Builder

Construct a Builder that inserts all chunks of the lazy ByteString directly.

lazyByteStringThreshold :: Int -> ByteString -> Builder

Construct a Builder that uses the thresholding strategy of byteStringThreshold for each chunk of the lazy ByteString.
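The three lazy variants can be compared on one input. The sketch below builds a lazy ByteString from three small chunks and prints the chunk lengths each variant produces; the chunk sizes and the threshold of 4 are arbitrary values chosen for illustration.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Builder as B
import Data.ByteString.Builder.Extra
  (lazyByteStringCopy, lazyByteStringInsert, lazyByteStringThreshold)
import qualified Data.ByteString.Lazy as L

-- | Lengths of the chunks produced by running a Builder with toLazyByteString.
chunkLengths :: B.Builder -> [Int]
chunkLengths = map S.length . L.toChunks . B.toLazyByteString

-- | A lazy ByteString made of chunks of 3, 5 and 2 bytes.
input :: L.ByteString
input =
  L.fromChunks [S.replicate 3 0x61, S.replicate 5 0x62, S.replicate 2 0x63]

main :: IO ()
main = do
  print (chunkLengths (lazyByteStringCopy input))         -- chunks copied into the buffer
  print (chunkLengths (lazyByteStringInsert input))       -- original chunks kept
  print (chunkLengths (lazyByteStringThreshold 4 input))  -- decided per chunk
```

All three Builders produce the same bytes; only the chunk boundaries of the result differ.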

flush :: Builder

Flush the current buffer. This introduces a chunk boundary.
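A small sketch of the effect: the same two strings are built once without and once with a flush between them, and the resulting chunks are printed. The helper chunksOf is a name introduced for this example.

```haskell
import qualified Data.ByteString.Builder as B
import Data.ByteString.Builder.Extra (flush)
import qualified Data.ByteString.Char8 as SC
import qualified Data.ByteString.Lazy as L

-- | The strict chunks produced by running a Builder with toLazyByteString.
chunksOf :: B.Builder -> [SC.ByteString]
chunksOf = L.toChunks . B.toLazyByteString

main :: IO ()
main = do
  -- Without flush, both strings are written into the same buffer.
  print (chunksOf (B.string7 "header" <> B.string7 "body"))
  -- With flush, the buffer is emitted as a chunk before "body" is written.
  print (chunksOf (B.string7 "header" <> flush <> B.string7 "body"))
```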

Low level execution

type BufferWriter = Ptr Word8 -> Int -> IO (Int, Next)

A BufferWriter represents the result of running a Builder. It unfolds as a sequence of chunks of data. These chunks come in two forms:

an IO action for writing the Builder's data into a user-supplied memory buffer.
a pre-existing chunk of data represented by a strict ByteString.

While this is rather low level, it provides you with full flexibility in how the data is written out.

The BufferWriter itself is an IO action: you supply it with a buffer (as a pointer and length) and it will write data into the buffer. It returns a number indicating how many bytes were actually written (which can be 0). It also returns a Next which describes what comes next.

data Next

After running a BufferWriter action there are three possibilities for what comes next:

Constructors

Done	This means we're all done. All the builder data has now been written.
More !Int BufferWriter	This indicates that there may be more data to write. It gives you the next `BufferWriter` action. You should call that action with an appropriate buffer. The `Int` indicates the minimum buffer size required by the next `BufferWriter` action. That is, if you call the next action you must supply it with a buffer of at least this size.
Chunk !ByteString BufferWriter	In addition to the data that has just been written into your buffer by the `BufferWriter` action, it gives you a pre-existing chunk of data as a `ByteString`. It also gives you the following `BufferWriter` action. It is safe to run this following action using a buffer with as much free space as was left by the previous run action.

runBuilder :: Builder -> BufferWriter

Turn a Builder into its initial BufferWriter action.
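The protocol described in this section can be sketched as a small driver loop. This is one possible driver under stated assumptions, not the way the library itself consumes a BufferWriter: it allocates a temporary buffer for every step, copies the written bytes out, and collects everything in a list. The name collectChunks is hypothetical, and bufSize is assumed to be positive.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Builder as B
import Data.ByteString.Builder.Extra (BufferWriter, Next (..), runBuilder)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (castPtr)

-- | Hypothetical driver: run a Builder step by step and return every piece
-- of output, in order, as non-empty strict ByteStrings.
collectChunks :: Int -> B.Builder -> IO [S.ByteString]
collectChunks bufSize builder =
  filter (not . S.null) <$> go bufSize (runBuilder builder)
  where
    go :: Int -> BufferWriter -> IO [S.ByteString]
    go size writer = do
      (written, next) <- allocaBytes size $ \buf -> do
        (n, next') <- writer buf size
        -- Copy the n written bytes out before the temporary buffer is freed.
        bytes <- S.packCStringLen (castPtr buf, n)
        return (bytes, next')
      case next of
        -- All builder data has been written.
        Done -> return [written]
        -- More data may follow; respect the requested minimum buffer size.
        More minSize writer' ->
          (written :) <$> go (max bufSize minSize) writer'
        -- A pre-existing chunk follows the bytes just written. Reusing the
        -- current size offers at least as much free space as was left over.
        Chunk bs writer' ->
          ([written, bs] ++) <$> go size writer'

main :: IO ()
main = do
  pieces <- collectChunks 16 (B.string7 "hello, " <> B.intDec 12345)
  print (map S.length pieces)
```

A production driver would more likely reuse one buffer and write each piece to its destination as it is produced, rather than accumulating a list.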

Host-specific binary encodings

intHost :: Int -> Builder

Encode a single native machine Int. The Int is encoded in the host byte order of the machine you're on. On a 64-bit machine the Int is an 8-byte value; on a 32-bit machine, 4 bytes. Values encoded this way are not portable to machines with a different endianness or Int size without conversion.
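The platform dependence of the encoded width can be checked directly. The following sketch measures how many bytes intHost produces and compares that with the storage size of Int on the current machine; encodedIntWidth is a name introduced for this example.

```haskell
import qualified Data.ByteString.Builder as B
import Data.ByteString.Builder.Extra (intHost)
import qualified Data.ByteString.Lazy as L
import Foreign.Storable (sizeOf)

-- | Number of bytes that intHost emits on the current machine.
encodedIntWidth :: Int
encodedIntWidth = fromIntegral (L.length (B.toLazyByteString (intHost 1)))

main :: IO ()
main = do
  putStrLn ("intHost writes " ++ show encodedIntWidth ++ " bytes")
  putStrLn ("sizeOf Int is  " ++ show (sizeOf (0 :: Int)) ++ " bytes")
```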

int16Host :: Int16 -> Builder

Encode an Int16 in native host order and host endianness.

int32Host :: Int32 -> Builder

Encode an Int32 in native host order and host endianness.

int64Host :: Int64 -> Builder

Encode an Int64 in native host order and host endianness.

wordHost :: Word -> Builder

Encode a single native machine Word. The Word is encoded in the host byte order of the machine you're on. On a 64-bit machine the Word is an 8-byte value; on a 32-bit machine, 4 bytes. Values encoded this way are not portable to machines with a different endianness or Word size without conversion.

word16Host :: Word16 -> Builder

Encode a Word16 in native host order and host endianness.

word32Host :: Word32 -> Builder

Encode a Word32 in native host order and host endianness.

word64Host :: Word64 -> Builder

Encode a Word64 in native host order and host endianness.
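Because the host encodings follow the machine's byte order, their output matches either the little-endian or the big-endian encoders from Data.ByteString.Builder. The sketch below uses that to report the host byte order; the Endianness type and hostEndianness are names invented for this example, and the Other case only covers byte orders that match neither encoder.

```haskell
import qualified Data.ByteString.Builder as B
import Data.ByteString.Builder.Extra (word32Host)
import qualified Data.ByteString.Lazy as L

-- | Hypothetical result type for this example.
data Endianness = LittleEndian | BigEndian | Other
  deriving (Eq, Show)

-- | Compare the host encoding of a probe value against both fixed orders.
hostEndianness :: Endianness
hostEndianness
  | host == run (B.word32LE 0x01020304) = LittleEndian
  | host == run (B.word32BE 0x01020304) = BigEndian
  | otherwise                           = Other
  where
    run :: B.Builder -> L.ByteString
    run = B.toLazyByteString
    host = run (word32Host 0x01020304)

main :: IO ()
main = print hostEndianness
```

For data that leaves the machine, the fixed-order encoders such as word32LE and word32BE are the portable choice.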

floatHost :: Float -> Builder

Encode a Float in native host order. Values encoded this way are not portable to machines with a different endianness without conversion.

doubleHost :: Double -> Builder

Encode a Double in native host order.
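One way to picture the floating-point host encodings is as the host encoding of the value's bit pattern. The sketch below checks this for one Float and one Double using castFloatToWord32 and castDoubleToWord64 from GHC.Float. It assumes a platform where floating-point values and same-sized words share a byte order, which holds on common hardware but is an assumption of this example rather than a documented guarantee.

```haskell
import qualified Data.ByteString.Builder as B
import Data.ByteString.Builder.Extra
  (doubleHost, floatHost, word32Host, word64Host)
import qualified Data.ByteString.Lazy as L
import GHC.Float (castDoubleToWord64, castFloatToWord32)

run :: B.Builder -> L.ByteString
run = B.toLazyByteString

-- | Does floatHost agree with word32Host applied to the bit pattern?
floatMatches :: Float -> Bool
floatMatches x = run (floatHost x) == run (word32Host (castFloatToWord32 x))

-- | Does doubleHost agree with word64Host applied to the bit pattern?
doubleMatches :: Double -> Bool
doubleMatches x = run (doubleHost x) == run (word64Host (castDoubleToWord64 x))

main :: IO ()
main = do
  print (floatMatches 1.5)
  print (doubleMatches 1.5)
```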