bytestring: Fast, compact, strict and lazy byte strings with a list interface

[ bsd3, data, library ] [ Propose Tags ]

An efficient compact, immutable byte string type (both strict and lazy) suitable for binary or 8-bit character data.

The ByteString type represents sequences of bytes or 8-bit characters. It is suitable for high performance use, both in terms of large data quantities, or high speed requirements. The ByteString functions follow the same style as Haskell's ordinary lists, so it is easy to convert code from using String to ByteString.

Two ByteString variants are provided:

  • Strict ByteStrings keep the string as a single large array. This makes them convenient for passing data between C and Haskell.

  • Lazy ByteStrings use a lazy list of strict chunks which makes it suitable for I/O streaming tasks.

The Char8 modules provide a character-based view of the same underlying ByteString types. This makes it convenient to handle mixed binary and 8-bit character content (which is common in many file formats and network protocols).

The Builder module provides an efficient way to build up ByteStrings in an ad-hoc way by repeated concatenation. This is ideal for fast serialisation or pretty printing.

ByteStrings are not designed for Unicode. For Unicode strings you should use the Text type from the text package.

These modules are intended to be imported qualified, to avoid name clashes with Prelude functions, e.g.

import qualified Data.ByteString as BS

[Skip to Readme]

Downloads

Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.

Maintainer's Corner

Package maintainers

For package maintainers and hackage trustees

Candidates

  • No Candidates
Versions [RSS] 0.9, 0.9.0.1, 0.9.0.2, 0.9.0.3, 0.9.0.4, 0.9.1.0, 0.9.1.1, 0.9.1.2, 0.9.1.3, 0.9.1.4, 0.9.1.5, 0.9.1.6, 0.9.1.7, 0.9.1.8, 0.9.1.9, 0.9.1.10, 0.9.2.0, 0.9.2.1, 0.10.0.0, 0.10.0.1, 0.10.0.2, 0.10.2.0, 0.10.4.0, 0.10.4.1, 0.10.6.0, 0.10.8.0, 0.10.8.1, 0.10.8.2, 0.10.9.0, 0.10.10.0, 0.10.10.1, 0.10.12.0, 0.10.12.1, 0.11.0.0, 0.11.1.0, 0.11.2.0, 0.11.3.0, 0.11.3.1, 0.11.4.0, 0.11.5.0, 0.11.5.1, 0.11.5.2, 0.11.5.3, 0.12.0.0, 0.12.0.1, 0.12.0.2, 0.12.1.0
Dependencies base (>=4.2 && <4.15), deepseq (>=1.1 && <1.4), ghc-prim [details]
License BSD-3-Clause
Copyright Copyright (c) Don Stewart 2005-2009, (c) Duncan Coutts 2006-2012, (c) David Roundy 2003-2005, (c) Jasper Van der Jeugt 2010, (c) Simon Meier 2010-2011.
Author Don Stewart, Duncan Coutts
Maintainer Don Stewart <dons00@gmail.com>, Duncan Coutts <duncan@community.haskell.org>
Revised Revision 2 made by sjakobi at 2020-09-11T20:54:02Z
Category Data
Bug tracker dons00@gmail.com, duncan@community.haskell.org
Source repo head: darcs get http://darcs.haskell.org/bytestring/
Uploaded by DuncanCoutts at 2012-09-20T09:21:15Z
Distributions Arch:0.11.4.0, Fedora:0.11.4.0
Reverse Dependencies 5964 direct, 8665 indirect [details]
Downloads 113587 total (541 in the last 30 days)
Rating 2.75 (votes: 18) [estimated by Bayesian average]
Your Rating
  • λ
  • λ
  • λ
Status Docs uploaded by user
Build status unknown [no reports yet]

Readme for bytestring-0.10.2.0

[back to package description]
------------------------------------------------------------------------
               ByteString : Fast, packed strings of bytes
------------------------------------------------------------------------

This library provides the Data.ByteString library -- strict and lazy
byte arrays manipulable as strings -- providing very time and space
efficient string and IO operations.

For very large data requirements, or constraints on heap size,
Data.ByteString.Lazy is provided, a lazy list of bytestring chunks.
Efficient processing of multi-gigabyte data can be achieved this way.

Requirements:
        > Cabal
        > GHC 6.4 or greater, or hugs

Building:
        > runhaskell Setup.lhs configure --prefix=/f/g
        > runhaskell Setup.lhs build
        > runhaskell Setup.lhs install

After installation, you can run the testsuite as follows:
    
        > cd tests ; make
    or
        > cd tests ; make hugs

For the full test and benchmark suite, you need GHC and Hugs:

        > cd tests ; make everything

Authors:
    ByteString was derived from the GHC PackedString library,
    originally written by Bryan O'Sullivan, and then by Simon Marlow.
    It was adapted, and greatly extended for darcs by David Roundy, and
    others. Don Stewart cleaned up and further extended the implementation.
    Duncan Coutts wrote much of the .Lazy code. Don, Duncan and Roman
    Leshchinskiy wrote the fusion system.

------------------------------------------------------------------------

Performance, some random numbers (with GHC):

This table compares the performance of common operations ByteString,
from various string libraries.

Size of test data: 21256k, Linux 3.2Ghz P4

                          FPS7       SPS     PS      [a]    
++                        0.028      !       !       1.288   
length                    0.000      0.000   0.000   0.131   
pack                      0.303      0.502   0.337   -       
unpack                    3.319*     1.630   7.445   -       
compare                   0.000      0.000   0.000   0.000   
index                     0.000      0.000   0.000   0.000   
map                       2.762*     2.917   4.813   7.286   
filter                    0.304      2.805   0.954   0.305   
take                      0.000      0.000   0.024   0.005   
drop                      0.000      0.000   11.768  0.130   
takeWhile                 0.000      1.498   0.000   0.000   
dropWhile                 0.000      1.985   8.447   0.130   
span                      0.000      9.289   11.144  0.131   
break                     0.000      9.383   11.268  0.133   
lines                     0.052      1.114   1.367   2.790   
unlines                   0.048      !       !       10.950  
words                     1.344      2.128   5.644   4.184   
unwords                   0.016      !       !       1.305   
reverse                   0.024      12.997  13.018  1.622   
concat                    0.000      12.701  11.459  1.163   
cons                      0.016      2.064   8.358   0.131   
empty                     0.000      0.000   0.000   0.000   
head                      0.000      0.000   0.000   0.000   
tail                      0.000      0.000   14.490  0.130   
elem                      0.000      1.490   0.001   0.000   
last                      0.000      -       -       0.143   
init                      0.000      -       -       1.147   
inits                     0.414      -       -       !       
tails                     0.460      -       -       1.136   
intersperse               0.040      -       -       10.517  
any                       0.000      -       -       0.000   
all                       0.000      -       -       0.000   
sort                      0.168      -       -       !
maximum                   0.024      -       -       0.183
minimum                   0.025      -       -       0.185
replicate                 0.000      -       -       0.053   
findIndex                 0.096
find                      0.120      -       -       0.000   
elemIndex                 0.000      -       -       0.000   
elemIndicies              0.008      -       -       0.314   
foldl                     0.148
spanEnd                   0.000
snoc                      0.016
filterChar                0.031      
filterNotChar             0.124
join                      0.016      
split                     0.032      
findIndices               0.408      
splitAt                   0.000      
lineIndices               0.029      
breakOn                   0.000      
breakSpace                0.000 
splitWith                 0.329 
dropSpace                 0.000 
dropSpaceEnd              0.000 
joinWithChar              0.017
join /                    0.016 
zip                       0.960 
zipWith                   0.892 
isSubstringOf             0.039 
isPrefixOf                0.000 
isSuffixOf                0.000
count                     0.021

Key: FPS6 = FPS 0.6
     SPS  = Simon Marlow's packedstring prototype
     PS   = Data.PackedString
     [a]  = [Char]

     -    = no function exists
     !    = stack or memory exhaustion

------------------------------------------------------------------------

== Stress testing really big strings

Doing some stress testing of FPS, here are some results for 0.5G strings.

3.2Ghz box, 2G physical mem.

Size of test data: 524288k
Size of test data: 524288k
                Char8   Word8

Effectively O(1) or O(m) where m < n
    all             0.000   0.000   
    any             0.000   0.004   
    break           0.000   0.000   
    breakChar       0.000   0.000   
    breakSpace      0.000   
    compare         0.000   
    concat          0.000   
    drop            0.000   
    dropSpace       0.000   
    dropSpaceEnd    0.000   
    dropWhile       0.000   0.000   
    elem            0.000   0.000   
    elemIndex       0.000   0.000   
    elemIndexLast   0.000   0.000   
    empty           0.000   
    head            0.000   0.000   
    index           0.000   0.000   
    init            0.000   
    last            0.000   0.000   
    length          0.000   
    notElem         0.000   0.000   
    span            0.000   0.000   
    spanChar        0.000   0.000   
    spanEnd         0.000   0.000   
    splitAt         0.000   
    tail            0.000   
    take            0.000   
    takeWhile       0.000   0.000   
    isPrefixOf      0.000   
    isSuffixOf      0.000   
    addr1           0.000   
    addr2           0.000   

O(n)
    ++              0.676   
    map             6.080   5.868   
    cons            0.396   0.396   
    snoc            0.400   0.400   
    find            3.240   
    split           1.204   1.200   
    lines           2.000   
    foldl           3.804   
    unwords         0.552   
    reverse         0.884   
    findIndex       3.128   
    filterChar      0.756   0.732   
    filter/='f'     8.265   7.012   
    filterNotChar   4.456   3.388   
    join            0.400   
    sort            4.344   
    maximum         0.776   0.764   
    minimum         0.772   0.776   
    replicate       0.008   0.000   
    elemIndices     0.240   0.240   
    lineIndices     1.092   
    joinWithChar    0.400   0.400   
    isSubstringOf   0.052   
    count           0.748   

slow O(n)
    words           38.722  
    group           77.261  
    groupBy         96.226  
    inits           32.430  
    tails           23.225  
    findIndices     13.841  15.825  
    splitWith       18.445  19.225  
    zip             33.926  
    zipWith         33.562