fast-builder-0.1.1.0: Fast ByteString Builder

Safe Haskell: None
Language: Haskell2010

Data.ByteString.FastBuilder

Description

An efficient implementation of a ByteString builder.

In many cases, this module works as a drop-in replacement for Data.ByteString.Builder, and should improve speed.

Performance tips

fast-builder should be faster than the standard builder in most situations. However, by following certain code patterns, you can often achieve even more efficient code, with almost no memory allocation aside from the resulting ByteString itself. Below is a list of hints for writing efficient builder-constructing code.

Return builders directly from your function

Once you construct a builder, it's usually a good idea to just return it from your function. Avoid storing it in a data structure or passing it to another function, unless the compiler is going to eliminate that data structure or function call. Schematically, prefer this:

good :: YourDataStructure -> Builder
good d = serializeThis (this d) <> serializeThat (that d)

over:

bad0 :: YourDataStructure -> (Int, Builder)
bad0 d
   = (compute d, serializeThis (this d) <> serializeThat (that d))

or:

bad1 :: YourDataStructure -> Builder
bad1 d = serializeMore d (serializeThis (this d))

An important special case of this general rule is to prefer foldr over foldl' when serializing a list, and to prefer structural recursion over tail recursion in general.
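As a minimal sketch of the foldr pattern, the following serializes a list of Ints as comma-terminated decimals. The helper name serializeInts is hypothetical, and the sketch assumes this module exports intDec and char7 with the same meaning as in Data.ByteString.Builder.

```haskell
import Data.ByteString.FastBuilder (Builder, char7, intDec, toLazyByteString)
import qualified Data.ByteString.Lazy.Char8 as L

-- Structural recursion via foldr: each step returns its builder directly,
-- followed by the builder for the rest of the list.
-- (serializeInts is a hypothetical helper, not part of the library.)
serializeInts :: [Int] -> Builder
serializeInts = foldr (\x rest -> intDec x <> char7 ',' <> rest) mempty

-- Render to a String for inspection.
renderInts :: [Int] -> String
renderInts = L.unpack . toLazyByteString . serializeInts
```

A foldl' version would instead carry the builder in an accumulator, which is the "passing it to another function" pattern that the rule above advises against.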

Use rebuild

When your function returns a different builder depending on the input, it's usually a good idea to use rebuild to wrap the whole body of your function. See the documentation for rebuild for details.

Background

Why is it good to return builders directly? It is because they are implemented as functions. When storing a function in a data structure or passing it around, you need to first allocate a closure for it. However, if you are just returning it, the returned function can be merged with your function, creating a function with a larger arity. For example, GHC can compile the good function above into a 5-ary function, which requires no runtime allocation (the exact arity depends on the library version).

Watch out for lazy ByteString generation

When using toLazyByteString, if you consume the result in a bound thread, performance degrades significantly. See the documentation for toLazyByteString for details.

The type

data Builder

Builder is an auxiliary type for efficiently generating a long ByteString. It is isomorphic to a lazy ByteString, but offers constant-time concatenation via <>.

Use toLazyByteString to turn a Builder into a lazy ByteString.

Running a builder

toLazyByteString :: Builder -> ByteString

Turn a Builder into a lazy ByteString.

Performance hint: when the resulting ByteString does not fit in one chunk, this function forks a thread. Due to this, the performance degrades sharply if you use this function from a bound thread. Note in particular that the main thread is a bound thread when you use ghc -threaded.

To avoid this problem, do one of these:

  • Make sure the resulting ByteString is consumed in an unbound thread. Consider using runInUnboundThread for this.
  • Use another function to run the Builder instead. Functions that don't return a lazy ByteString do not have this issue.
  • Link your program without -threaded.
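The first option can be sketched as follows, using runInUnboundThread from Control.Concurrent. The names consumeUnbound, builderLength and example are hypothetical, and the sketch assumes this module exports string7 as Data.ByteString.Builder does. The point is that the whole consumption of the lazy ByteString happens inside the action passed to runInUnboundThread.

```haskell
import Control.Concurrent (runInUnboundThread)
import Data.ByteString.FastBuilder (Builder, string7, toLazyByteString)
import qualified Data.ByteString.Lazy as L
import Data.Int (Int64)

-- Run a consumer of the lazy ByteString in an unbound thread.
-- (consumeUnbound is a hypothetical helper, not part of the library.)
consumeUnbound :: (L.ByteString -> IO a) -> Builder -> IO a
consumeUnbound consume b = runInUnboundThread (consume (toLazyByteString b))

-- Example consumer: force the whole output and count its bytes.
builderLength :: Builder -> IO Int64
builderLength = consumeUnbound (\lbs -> return $! L.length lbs)

-- Illustrative input: 100000 ASCII characters.
example :: IO Int64
example = builderLength (string7 (replicate 100000 'x'))
```

The consumer must fully force the part of the result it needs before returning. A lazy ByteString that escapes unevaluated would still be consumed by the caller's thread.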

toLazyByteStringWith :: Int -> Int -> Builder -> ByteString

Like toLazyByteString, but allows the user to specify the initial and the subsequent desired buffer sizes.
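A minimal usage sketch, with illustrative buffer sizes only (4 KiB initially, 64 KiB afterwards). The names renderSmallFirst and example are hypothetical, and intDec is assumed to be exported by this module as in Data.ByteString.Builder.

```haskell
import Data.ByteString.FastBuilder (Builder, intDec, toLazyByteStringWith)
import qualified Data.ByteString.Lazy as L

-- First argument: initial buffer size; second: size of subsequent buffers.
-- The concrete numbers are illustrative, not recommendations.
renderSmallFirst :: Builder -> L.ByteString
renderSmallFirst = toLazyByteStringWith 4096 65536

example :: L.ByteString
example = renderSmallFirst (intDec 2024)
```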

hPutBuilder :: Handle -> Builder -> IO ()

Output a Builder to a Handle.

hPutBuilderLen :: Handle -> Builder -> IO Int

Output a Builder to a Handle. Returns the number of bytes written.

hPutBuilderWith :: Handle -> Int -> Int -> Builder -> IO Int

Like hPutBuilderLen, but allows the user to specify the initial and the subsequent desired buffer sizes. This function may be useful for setting a large buffer when high-throughput I/O is needed.
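A minimal sketch of writing with explicit buffer sizes. The 1 MiB figure is purely illustrative, the names hPutBulk and example are hypothetical, and string7 is assumed to be exported by this module as in Data.ByteString.Builder.

```haskell
import Data.ByteString.FastBuilder (Builder, hPutBuilderWith, string7)
import System.IO (Handle, stdout)

-- Use 1 MiB for both the initial and the subsequent buffers.
-- (hPutBulk is a hypothetical helper; the size is an illustrative choice.)
hPutBulk :: Handle -> Builder -> IO Int
hPutBulk h = hPutBuilderWith h (1024 * 1024) (1024 * 1024)

-- Writes six bytes to stdout and returns the number of bytes written.
example :: IO Int
example = hPutBulk stdout (string7 "hello\n")
```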

Performance tuning

rebuild :: Builder -> Builder

rebuild b is equivalent to b, but it allows GHC to assume that b will be run at most once. This can enable various optimizations that greatly improve performance.

There are two typical situations where using rebuild is often a win:

  • When constructing a builder using a recursive function. e.g. rebuild $ foldr ....
  • When constructing a builder using a conditional expression. e.g. rebuild $ case x of ...
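Both situations can appear in one function. The sketch below serializes a small hypothetical Value type: rebuild wraps the case expression, and again wraps the foldr over the list elements. The type, the output format and the function names are assumptions made for illustration, as is the availability of intDec, string7 and char7 from this module.

```haskell
import Data.ByteString.FastBuilder
  (Builder, char7, intDec, rebuild, string7, toLazyByteString)
import qualified Data.ByteString.Lazy.Char8 as L

-- A hypothetical data type to serialize.
data Value = Null | Num Int | List [Value]

serializeValue :: Value -> Builder
serializeValue v = rebuild $ case v of      -- conditional: wrap the whole body
  Null    -> string7 "null"
  Num n   -> intDec n
  List vs ->
    char7 '['
      <> rebuild (foldr step mempty vs)     -- recursion over a list
      <> char7 ']'
  where
    -- Each element is followed by a single space.
    step x rest = serializeValue x <> char7 ' ' <> rest

renderValue :: Value -> String
renderValue = L.unpack . toLazyByteString . serializeValue
```

Whether rebuild actually pays off in a given function is best checked by benchmarking, since the effect depends on what GHC can inline.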

Basic builders

primBounded :: BoundedPrim a -> a -> Builder

Turn a value of type a into a Builder, using the given BoundedPrim.

primFixed :: FixedPrim a -> a -> Builder

Turn a value of type a into a Builder, using the given FixedPrim.
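A minimal sketch of both functions, assuming that BoundedPrim and FixedPrim here are the types from the bytestring package's Data.ByteString.Builder.Prim, so that its primitives (word32BE, intDec) can be passed in. The function names lengthPrefixed and renderBytes are hypothetical.

```haskell
import Data.ByteString.FastBuilder
  (Builder, primBounded, primFixed, toLazyByteString)
import qualified Data.ByteString.Builder.Prim as P
import qualified Data.ByteString.Lazy as L
import Data.Word (Word32, Word8)

-- A fixed-size 4-byte big-endian word followed by a decimal Int,
-- whose encoded size is only bounded, not fixed.
lengthPrefixed :: Word32 -> Int -> Builder
lengthPrefixed len n = primFixed P.word32BE len <> primBounded P.intDec n

renderBytes :: Builder -> [Word8]
renderBytes = L.unpack . toLazyByteString
```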

byteStringInsert :: ByteString -> Builder

Turn a ByteString into a Builder. When possible, the given ByteString will not be copied, and will be inserted directly into the output instead.

byteStringCopy :: ByteString -> Builder

Turn a ByteString into a Builder. The ByteString will be copied to the buffer, regardless of its size.

byteStringThreshold :: Int -> ByteString -> Builder

Turn a ByteString into a Builder. If the size of the ByteString is larger than the given threshold, avoid copying it as much as possible.
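One plausible way to combine these functions is to copy a short header and let the threshold decide for a payload of unknown size. The names frame and renderFrame, the 4096-byte threshold and the trailing newline are illustrative assumptions, as is the export of char7 from this module.

```haskell
import qualified Data.ByteString.Char8 as S
import Data.ByteString.FastBuilder
  (Builder, byteStringCopy, byteStringThreshold, char7, toLazyByteString)
import qualified Data.ByteString.Lazy as L

-- Always copy the (short) header; copy the payload only when it is no
-- larger than 4096 bytes. The threshold value is illustrative.
frame :: S.ByteString -> S.ByteString -> Builder
frame header payload =
  byteStringCopy header <> byteStringThreshold 4096 payload <> char7 '\n'

renderFrame :: S.ByteString -> S.ByteString -> S.ByteString
renderFrame header payload = L.toStrict (toLazyByteString (frame header payload))
```

The output bytes are the same whichever variant is used. Only the copying behaviour differs, and with it the shape of the resulting chunks.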

Single byte

Little endian

Big endian

Host-dependent size and byte order, non-portable

Decimal

Hexadecimal

Fixed-width hexadecimal

UTF-8

ASCII

ISO-8859-1