bytestring-plain-0.1.0.0: Plain byte strings ('ForeignPtr'-less 'ByteString's)

Portability: GHC
Stability: experimental
Safe Haskell: None

Data.ByteString.Plain

Description

This module is intended to be imported qualified in order to avoid name clashes with the Prelude, Data.ByteString, and Data.ByteString.Lazy modules. E.g.:

 import qualified Data.ByteString.Plain as PB

The plain ByteString type and representation

data ByteString

Compact heap representation that a (strict) ByteString can be wrapped into and unwrapped from.

This data type depends on the ordinary ByteString type to be useful but comes with a different cost model.

This representation avoids both the ForeignPtr indirection and the offset/length slice representation used for shared ByteStrings. It is therefore suitable when you need to store many small strings in data records or use them as keys in container types. On the other hand, string operations on a plain ByteString would require re-allocation and are thus not supported. If you need such operations, convert to a conventional ByteString and operate on that instead.
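The intended usage pattern can be sketched as follows. This is a minimal illustration, not part of the package: the User record, insertUser, and initials are hypothetical names, and the sketch assumes that the plain ByteString type provides an Ord instance so that it can serve as a Data.Map key.

```haskell
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Plain as PB
import qualified Data.Map.Strict as Map

-- Hypothetical record holding a small string in its compact form.
data User = User
  { userName :: {-# UNPACK #-} !PB.ByteString
  , userAge  :: !Int
  }

-- Assumption: PB.ByteString has an Ord instance (needed for Map keys).
type Index = Map.Map PB.ByteString User

-- Convert once at the boundary, then store only the plain form.
insertUser :: BC.ByteString -> Int -> Index -> Index
insertUser name age = Map.insert key (User key age)
  where
    key = PB.fromStrict name

-- String operations are performed on the conventional type
-- after converting back with toStrict.
initials :: User -> BC.ByteString
initials = BC.take 1 . PB.toStrict . userName

main :: IO ()
main = do
  let idx = insertUser (BC.pack "alice") 30 Map.empty
  mapM_ (BC.putStrLn . initials) (Map.elems idx)
```

The conversions sit at the edges of the data structure, so the long-lived values are the compact ones and the conventional ByteStrings are short-lived temporaries.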

This structure supports UNPACK. When unpacked, it has an overhead of only 3 words (beyond the word-padded storage of the byte-string payload): 1 word for the pointer plus the 2-word header of the MutableByteArray# it points to:

 data ByteString = PBS !(MutableByteArray# RealWorld)

In contrast, a single non-shared unpacked (PlainPtr-backed) ByteString field exhibits an overhead of 8 words:

 data ByteString = PS {-# UNPACK #-} !(ForeignPtr Word8) -- payload (2 words)
                      {-# UNPACK #-} !Int                -- offset (1 word)
                      {-# UNPACK #-} !Int                -- length (1 word)

 data ForeignPtr a = ForeignPtr Addr# ForeignPtrContents -- 2 words w/o info-ptr

 data ForeignPtrContents -- 1 word needed for info-ptr
     = PlainForeignPtr {...}
     | MallocPtr {...}
     | PlainPtr (MutableByteArray# RealWorld)  -- common case (1 word)

 data MutableByteArray# s -- 2 words + payload
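To make the two overhead figures concrete, the following sketch computes the total footprint of a single string under each representation. It takes the 3-word and 8-word overheads quoted in this description at face value and assumes a 64-bit machine (8-byte words); the function names are hypothetical, and the actual heap layout may differ between GHC versions.

```haskell
-- Words needed for an n-byte payload, rounded up to whole machine words.
payloadWords :: Int -> Int -> Int
payloadWords wordBytes n = (n + wordBytes - 1) `div` wordBytes

-- Total bytes per string: fixed overhead (in words) plus padded payload.
plainBytes, strictBytes :: Int -> Int -> Int
plainBytes  w n = w * (3 + payloadWords w n)  -- 3-word overhead (plain)
strictBytes w n = w * (8 + payloadWords w n)  -- 8-word overhead (strict)

main :: IO ()
main = mapM_ report [1, 5, 16, 64]
  where
    w = 8  -- assumed word size in bytes (64-bit platform)
    report n =
      putStrLn $
        show n ++ "-byte payload: plain " ++ show (plainBytes w n)
          ++ " bytes, strict " ++ show (strictBytes w n) ++ " bytes"
```

For a 5-byte payload this gives 32 bytes for the plain representation against 72 bytes for the conventional one, which is why the saving matters most for many small strings.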

As an optimization, all zero-length strings are mapped to the singleton empty value.

Introducing and eliminating ByteStrings

empty :: ByteString

The singleton value that every empty ByteString is mapped to and from.

fromStrict :: ByteString -> ByteString

Extract a plain ByteString from a strict ByteString.

If possible, the internally used MutableByteArray# is shared with the original ByteString in which case the conversion is cheap.

However, if necessary, a trimmed copy of the original ByteString will be created via copy resulting in a newly allocated MutableByteArray#.

N.B.: Because strict ByteStrings use pinned memory internally, plain ByteStrings use pinned memory as well. They thereby increase the potential for memory fragmentation, as the garbage collector is not allowed to move pinned memory areas.

Depending on the use case, it might be beneficial to apply some form of memoization to the fromStrict conversion (also known as hash consing or string interning).
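One possible shape for such interning is sketched below. The package itself does not provide this: InternTable and intern are hypothetical names, and the sketch assumes Eq and Ord instances for the plain ByteString type. Equal strings are mapped to a single shared value, so duplicates produced by fromStrict become garbage instead of being retained.

```haskell
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Plain as PB
import qualified Data.Map.Strict as Map

-- Hypothetical intern table mapping each string to its canonical copy.
-- Assumption: PB.ByteString has Eq and Ord instances.
type InternTable = Map.Map PB.ByteString PB.ByteString

-- Return the canonical plain ByteString for the input, together with
-- the (possibly extended) table.
intern :: BC.ByteString -> InternTable -> (PB.ByteString, InternTable)
intern s table =
  case Map.lookup p table of
    Just shared -> (shared, table)             -- reuse the existing copy
    Nothing     -> (p, Map.insert p p table)   -- first occurrence
  where
    p = PB.fromStrict s

main :: IO ()
main = do
  let (a, t1) = intern (BC.pack "key") Map.empty
      (b, t2) = intern (BC.pack "key") t1
  print (Map.size t2)  -- number of distinct strings in the table
  print (a == b)
```

A pure table threaded through the program keeps the sketch simple; a long-running application would more likely hold the table in a mutable reference and consider when entries may be dropped.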

toStrict :: ByteString -> ByteString

Convert a plain ByteString back into a strict ByteString.

This effectively wraps the plain ByteString in a ForeignPtr and an ordinary strict ByteString value.
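A round trip through both conversions can be sketched as follows. The example uses only empty, fromStrict, and toStrict from this module; the comment about copying reflects the fromStrict description and is an expectation, not something the sketch verifies.

```haskell
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Plain as PB

main :: IO ()
main = do
  let original = BC.pack "hello world"
      sliced   = BC.take 5 original    -- slice sharing the original buffer
      plain    = PB.fromStrict sliced  -- presumably a trimmed copy here
      back     = PB.toStrict plain     -- conventional ByteString again
  -- The round trip is expected to preserve the contents.
  print (back == sliced)
  -- The empty singleton converts back to an empty strict ByteString.
  print (PB.toStrict PB.empty == BC.empty)
```

Comparing on the strict side avoids relying on any instances of the plain type.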

Basic operations