Copyright	© Herbert Valerio Riedel 2017
License	BSD3
Maintainer	hvr@gnu.org
Stability	stable
Safe Haskell	Trustworthy
Language	Haskell2010

Data.Text.Short

Contents

The ShortText type
Basic operations
Conversions
- String
- Text
- ByteString

Description

Memory-efficient representation of Unicode text strings.

The `ShortText` type

data ShortText

A compact representation of Unicode strings.

This type relates to Text as ShortByteString relates to ByteString by providing a more compact type. Please consult the documentation of Data.ByteString.Short for more information.

Currently, a boxed unshared Text has a memory footprint of 6 words (i.e. 48 bytes on 64-bit systems) plus 2 or 4 bytes per code-point (due to the internal UTF-16 representation). Each Text value that can share its payload with another Text requires only 4 additional words. Unlike ByteString, Text uses unpinned memory.

In comparison, the footprint of a boxed ShortText is only 4 words (i.e. 32 bytes on 64-bit systems) plus 1, 2, 3, or 4 bytes per code-point (due to the internal UTF-8 representation). It can be shown that for realistic data UTF-16 has a space overhead of 50% over UTF-8.
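
To make the comparison concrete, the following sketch does the arithmetic with the figures quoted above. It assumes a 64-bit system (8-byte words), ignores any padding or alignment the runtime may add, and uses hypothetical helper names (`textFootprint`, `shortTextFootprint`) that are not part of any library.

```haskell
-- Rough footprint estimates based on the overheads quoted in the text.
-- All names below are illustrative only.

-- Assumption: 64-bit system, so one machine word is 8 bytes.
wordSize :: Int
wordSize = 8

-- Bytes needed to encode one code-point in UTF-8.
utf8Bytes :: Char -> Int
utf8Bytes c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where
    n = fromEnum c

-- Bytes needed to encode one code-point in UTF-16.
utf16Bytes :: Char -> Int
utf16Bytes c
  | fromEnum c < 0x10000 = 2
  | otherwise            = 4

-- Estimated bytes for a boxed, unshared Text (6 words + UTF-16 payload).
textFootprint :: String -> Int
textFootprint s = 6 * wordSize + sum (map utf16Bytes s)

-- Estimated bytes for a boxed ShortText (4 words + UTF-8 payload).
shortTextFootprint :: String -> Int
shortTextFootprint s = 4 * wordSize + sum (map utf8Bytes s)
```

Under these assumptions, `textFootprint "hello"` evaluates to 58 and `shortTextFootprint "hello"` to 37, so for short ASCII strings most of the saving comes from the smaller fixed overhead.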

Instances

Eq ShortText
Methods
  (==) :: ShortText -> ShortText -> Bool
  (/=) :: ShortText -> ShortText -> Bool

Ord ShortText
Methods
  compare :: ShortText -> ShortText -> Ordering
  (<) :: ShortText -> ShortText -> Bool
  (<=) :: ShortText -> ShortText -> Bool
  (>) :: ShortText -> ShortText -> Bool
  (>=) :: ShortText -> ShortText -> Bool
  max :: ShortText -> ShortText -> ShortText
  min :: ShortText -> ShortText -> ShortText

Read ShortText
Methods
  readsPrec :: Int -> ReadS ShortText
  readList :: ReadS [ShortText]
  readPrec :: ReadPrec ShortText
  readListPrec :: ReadPrec [ShortText]

Show ShortText
Methods
  showsPrec :: Int -> ShortText -> ShowS
  show :: ShortText -> String
  showList :: [ShortText] -> ShowS

IsString ShortText
Behaviour for `[U+D800 .. U+DFFF]` matches the `IsString` instance for `Text`.
Methods
  fromString :: String -> ShortText

Semigroup ShortText
Methods
  (<>) :: ShortText -> ShortText -> ShortText
  sconcat :: NonEmpty ShortText -> ShortText
  stimes :: Integral b => b -> ShortText -> ShortText

Monoid ShortText
Methods
  mempty :: ShortText
  mappend :: ShortText -> ShortText -> ShortText
  mconcat :: [ShortText] -> ShortText

Binary ShortText
The `Binary` encoding matches the one for `Text`.
Methods
  put :: ShortText -> Put
  get :: Get ShortText
  putList :: [ShortText] -> Put

NFData ShortText
Methods
  rnf :: ShortText -> ()

Hashable ShortText
Methods
  hashWithSalt :: Int -> ShortText -> Int
  hash :: ShortText -> Int
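
As a brief illustration of how some of these instances combine, the sketch below builds values through the `IsString`, `Semigroup`, `Monoid`, and `Ord` instances. It assumes the `OverloadedStrings` extension and a `base` version whose Prelude exports `(<>)`; the names `greeting`, `joined`, and `ordered` are made up for the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text.Short as TS

-- String literals become ShortText via IsString; (<>) comes from Semigroup.
greeting :: TS.ShortText
greeting = "hello" <> ", " <> "world"

-- mconcat comes from the Monoid instance.
joined :: TS.ShortText
joined = mconcat ["a", "b", "c"]

-- compare comes from the Ord instance.
ordered :: Ordering
ordered = compare ("apple" :: TS.ShortText) "banana"
```

Here `TS.toString greeting` yields `"hello, world"` and `ordered` is `LT`.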

Basic operations

null :: ShortText -> Bool

\(\mathcal{O}(1)\) Test whether a ShortText is empty.

length :: ShortText -> Int

\(\mathcal{O}(n)\) Count the number of Unicode code-points in a ShortText.

isAscii :: ShortText -> Bool

\(\mathcal{O}(n)\) Test whether ShortText contains only ASCII code-points (i.e. only U+0000 through U+007F).
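
A minimal usage sketch of the three basic operations follows. The module is imported qualified because `null` and `length` clash with the Prelude; `describe` and the example values are hypothetical names introduced only for illustration.

```haskell
import qualified Data.Text.Short as TS

-- Summarise a value as (is empty, number of code-points, is ASCII only).
-- 'describe' is an illustrative helper, not part of the library.
describe :: TS.ShortText -> (Bool, Int, Bool)
describe t = (TS.null t, TS.length t, TS.isAscii t)

-- A pure-ASCII value.
asciiInfo :: (Bool, Int, Bool)
asciiInfo = describe (TS.fromString "hello")

-- The same word with U+00E9 in place of 'e'; still five code-points,
-- even though the UTF-8 payload is six bytes.
accentInfo :: (Bool, Int, Bool)
accentInfo = describe (TS.fromString "h\233llo")
```

With these inputs, `asciiInfo` is `(False, 5, True)` and `accentInfo` is `(False, 5, False)`, which shows that `length` counts code-points rather than bytes.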

Conversions

`String`

fromString :: String -> ShortText

\(\mathcal{O}(n)\) Construct/pack from String

Note: This function is total because it replaces the (invalid) code-points U+D800 through U+DFFF with the replacement character U+FFFD.

toString :: ShortText -> String

\(\mathcal{O}(n)\) Convert to String
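
The surrogate handling described in the note can be observed by round-tripping a `String` through `ShortText`. This is a small sketch; `roundTrip` and the two example values are hypothetical names.

```haskell
import qualified Data.Text.Short as TS

-- Pack a String into a ShortText and unpack it again.
roundTrip :: String -> String
roundTrip = TS.toString . TS.fromString

-- Well-formed input (contains U+00EF) comes back unchanged.
wellFormed :: String
wellFormed = roundTrip "na\239ve"

-- A lone surrogate (U+D800) is replaced by U+FFFD, per the note above.
withSurrogate :: String
withSurrogate = roundTrip "a\xD800z"
```

Here `wellFormed == "na\239ve"`, while `withSurrogate == "a\xFFFDz"`, so the round trip is lossless only for strings without code-points in the surrogate range.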

`Text`

fromText :: Text -> ShortText

\(\mathcal{O}(n)\) Construct ShortText from Text

This is currently not \(\mathcal{O}(1)\) because Text uses UTF-16 as its internal representation. Should Text change its internal representation to UTF-8, this operation will become \(\mathcal{O}(1)\).

toText :: ShortText -> Text

\(\mathcal{O}(n)\) Convert to Text

This is currently not \(\mathcal{O}(1)\) because Text uses UTF-16 as its internal representation. Should Text change its internal representation to UTF-8, this operation will become \(\mathcal{O}(1)\).
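
A short sketch of converting between the two types is given below. It assumes the `text` package is available alongside this module; `viaShort` and `sample` are hypothetical names. Since each conversion is \(\mathcal{O}(n)\) as noted above, it is probably best to convert once at a boundary rather than repeatedly.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Short as TS

-- Convert a Text to ShortText and back again.
viaShort :: T.Text -> T.Text
viaShort = TS.toText . TS.fromText

-- An example Text value containing a non-ASCII code-point (U+00E9).
sample :: T.Text
sample = T.pack "h\233llo"
```

For this input, `viaShort sample == sample` holds, and `TS.length (TS.fromText sample)` agrees with `T.length sample` at 5 code-points.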