text-2.0.2: An efficient packed Unicode text type.
Copyright: (c) 2008, 2009 Tom Harper
(c) 2009, 2010 Bryan O'Sullivan
(c) 2009 Duncan Coutts
License: BSD-style
Maintainer: bos@serpentine.com
Stability: experimental
Portability: GHC
Safe Haskell: Safe-Inferred
Language: Haskell2010

Data.Text.Internal

Description

A module containing private Text internals. This exposes the Text representation and low level construction functions. Modules which extend the Text system may need to use this module.

You should not use this module unless you are determined to monkey with the internals, as the functions here do just about nothing to preserve data invariants. You have been warned!

Types

Internally, the Text type is represented as an array of Word8 UTF-8 code units. The offset and length fields in the constructor are in these units, not units of Char.

Invariants that all functions must maintain:

  • Since the Text type uses UTF-8 internally, it cannot represent characters in the reserved surrogate code point range U+D800 to U+DFFF. To maintain this invariant, the safe function maps Char values in this range to the replacement character (U+FFFD, '�').
  • The offset and length must delimit a valid UTF-8 byte sequence inside the array. Violating this invariant may cause invalid memory accesses or non-termination.
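
To make the distinction between bytes and Char concrete, here is a minimal sketch that reads the length field directly from the constructor and compares it with the Char count. The names byteLength, lengths and sample are hypothetical helpers, not part of this module, and the sketch assumes text >= 2.0 (earlier releases stored UTF-16 code units, so the numbers would differ).

```haskell
import qualified Data.Text as T
import Data.Text.Internal (Text (..))

-- | Length of the payload in UTF-8 code units (bytes), read straight
-- from the constructor. Hypothetical helper; assumes text >= 2.0.
byteLength :: Text -> Int
byteLength (Text _arr _off len) = len

-- | Pair the byte length with the number of Chars for comparison.
lengths :: Text -> (Int, Int)
lengths t = (byteLength t, T.length t)

-- | 'h', 'l', 'l' and 'o' take one byte each; U+00E9 takes two.
sample :: Text
sample = T.pack "h\233llo"
```

Under those assumptions, lengths sample evaluates to (6, 5): six bytes for five characters. Any code that computes offsets by counting Char values would therefore land in the wrong place for non-ASCII input.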

data Text Source #

A space efficient, packed, unboxed Unicode text type.

Constructors

Text 

Fields

  • !Array

    bytearray encoded as UTF-8

  • !Int

    offset in bytes (not in Char!), pointing to the start of a UTF-8 sequence

  • !Int

    length in bytes (not in Char!), ending at the end of a UTF-8 sequence

Instances

Data Text Source #

This instance preserves data abstraction at the cost of inefficiency. We omit reflection services for the sake of data abstraction.

This instance was created by copying the updated behavior of Data.Set.Set and Data.Map.Map. If you feel a mistake has been made, please feel free to submit improvements.

The original discussion is archived here: "could we get a Data instance for Data.Text.Text?"

The followup discussion that changed the behavior of Set and Map is archived here: "Proposal: Allow gunfold for Data.Map, ..."
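
As a rough illustration of what "preserves data abstraction" means for generic code, the sketch below inspects a Text through the Data API only. The helpers childCount and constrName are hypothetical names introduced here. The expectation that a Text appears as a single constructor applied to one String child reflects my reading of the instance and should be treated as an assumption.

```haskell
import Data.Data (Data, gmapQ, showConstr, toConstr)
import qualified Data.Text as T

-- | Number of immediate children a generic traversal would visit.
childCount :: Data a => a -> Int
childCount = length . gmapQ (const ())

-- | Name of the constructor that the Data instance reports.
constrName :: Data a => a -> String
constrName = showConstr . toConstr

-- | A value to inspect from GHCi.
example :: T.Text
example = T.pack "abc"
```

Generic traversals never see the Array, offset or length fields, which is the abstraction being protected. The cost is that each traversal round-trips through String.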

Instance details

Defined in Data.Text

Methods

gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> Text -> c Text #

gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c Text #

toConstr :: Text -> Constr #

dataTypeOf :: Text -> DataType #

dataCast1 :: Typeable t => (forall d. Data d => c (t d)) -> Maybe (c Text) #

dataCast2 :: Typeable t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c Text) #

gmapT :: (forall b. Data b => b -> b) -> Text -> Text #

gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> Text -> r #

gmapQr :: forall r r'. (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> Text -> r #

gmapQ :: (forall d. Data d => d -> u) -> Text -> [u] #

gmapQi :: Int -> (forall d. Data d => d -> u) -> Text -> u #

gmapM :: Monad m => (forall d. Data d => d -> m d) -> Text -> m Text #

gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> Text -> m Text #

gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> Text -> m Text #

IsString Text Source #

Performs replacement on invalid scalar values:

>>> :set -XOverloadedStrings
>>> "\55555" :: Text
"\65533"

Instance details

Defined in Data.Text

Methods

fromString :: String -> Text #

Monoid Text Source # 
Instance details

Defined in Data.Text

Methods

mempty :: Text #

mappend :: Text -> Text -> Text #

mconcat :: [Text] -> Text #

Semigroup Text Source #

Since: 1.2.2.0

Instance details

Defined in Data.Text

Methods

(<>) :: Text -> Text -> Text #

sconcat :: NonEmpty Text -> Text #

stimes :: Integral b => b -> Text -> Text #

IsList Text Source #

Performs replacement on invalid scalar values:

>>> :set -XOverloadedLists
>>> ['\55555'] :: Text
"\65533"

Since: 1.2.0.0

Instance details

Defined in Data.Text

Associated Types

type Item Text #

Methods

fromList :: [Item Text] -> Text #

fromListN :: Int -> [Item Text] -> Text #

toList :: Text -> [Item Text] #

Read Text Source # 
Instance details

Defined in Data.Text

Show Text Source # 
Instance details

Defined in Data.Text.Show

Methods

showsPrec :: Int -> Text -> ShowS #

show :: Text -> String #

showList :: [Text] -> ShowS #

PrintfArg Text Source #

Since: 1.2.2.0

Instance details

Defined in Data.Text

Binary Text Source #

Since: 1.2.1.0

Instance details

Defined in Data.Text

Methods

put :: Text -> Put #

get :: Get Text #

putList :: [Text] -> Put #

NFData Text Source # 
Instance details

Defined in Data.Text

Methods

rnf :: Text -> () #

Eq Text Source # 
Instance details

Defined in Data.Text

Methods

(==) :: Text -> Text -> Bool #

(/=) :: Text -> Text -> Bool #

Ord Text Source # 
Instance details

Defined in Data.Text

Methods

compare :: Text -> Text -> Ordering #

(<) :: Text -> Text -> Bool #

(<=) :: Text -> Text -> Bool #

(>) :: Text -> Text -> Bool #

(>=) :: Text -> Text -> Bool #

max :: Text -> Text -> Text #

min :: Text -> Text -> Text #

Lift Text Source #

Since: 1.2.4.0

Instance details

Defined in Data.Text

Methods

lift :: Quote m => Text -> m Exp #

liftTyped :: forall (m :: Type -> Type). Quote m => Text -> Code m Text #

type Item Text Source # 
Instance details

Defined in Data.Text

type Item Text = Char

Construction

text Source #

Arguments

:: Array

bytearray encoded as UTF-8

-> Int

offset in bytes (not in Char!), pointing to the start of a UTF-8 sequence

-> Int

length in bytes (not in Char!), ending at the end of a UTF-8 sequence

-> Text 

Construct a Text without invisibly pinning its byte array in memory if its length has dwindled to zero.
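
A minimal sketch of how text might be used to slice an existing value without copying. The helper asciiPrefix is hypothetical and is only safe when the caller knows the input is ASCII, because it cuts at a byte position and the invariants above require the cut to fall on a UTF-8 sequence boundary.

```haskell
import qualified Data.Text as T
import Data.Text.Internal (Text (..), text)

-- | Keep at most n leading bytes, sharing the underlying array.
-- Hypothetical helper: the caller must guarantee ASCII input so that
-- every byte position is also a code point boundary.
asciiPrefix :: Int -> Text -> Text
asciiPrefix n (Text arr off len) = text arr off (clamp n)
  where
    -- Never ask for a negative length or more bytes than are available.
    clamp k = max 0 (min k len)

-- | A value to slice from GHCi.
greeting :: Text
greeting = T.pack "hello"
```

When the clamped length is zero, going through text rather than the raw constructor means the result does not keep the original array alive.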

textP :: Array -> Int -> Int -> Text Source #

Deprecated: Use text instead

Safety

safe :: Char -> Char Source #

Map a Char to a Text-safe value.

Unicode surrogate code points are not included in the set of Unicode scalar values, but are unfortunately admitted as valid Char values by Haskell. They cannot be represented in a Text. This function remaps those code points to the Unicode replacement character (U+FFFD, '�'), and leaves other code points unchanged.
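
The remapping can be sketched on plain String values. The helpers sanitize and isReplaced are hypothetical names used for illustration only.

```haskell
import Data.Text.Internal (safe)

-- | Replace every surrogate code point in a String with U+FFFD.
sanitize :: String -> String
sanitize = map safe

-- | True when 'safe' would change the character, i.e. for surrogates.
isReplaced :: Char -> Bool
isReplaced c = safe c /= c
```

For example, sanitize ['a', '\xD800', 'z'] yields ['a', '\xFFFD', 'z'], while isReplaced 'a' is False.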

Code that must be here for accessibility

empty :: Text Source #

O(1) The empty Text.

empty_ :: Text Source #

A non-inlined version of empty.

append :: Text -> Text -> Text Source #

O(n) Appends one Text to the other by copying both of them into a new Text.

Utilities

firstf :: (a -> c) -> Maybe (a, b) -> Maybe (c, b) Source #

Apply a function to the first element of an optional pair.
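
One place this shape comes up is the result of Data.Text.uncons. The sketch below uses a hypothetical helper, capitalizeHead, to upper-case the first character while leaving the remainder untouched.

```haskell
import Data.Char (toUpper)
import qualified Data.Text as T
import Data.Text.Internal (firstf)

-- | Split off the first character and upper-case it.
-- Returns Nothing for empty input, exactly as 'T.uncons' does.
capitalizeHead :: T.Text -> Maybe (Char, T.Text)
capitalizeHead = firstf toUpper . T.uncons
```

Here capitalizeHead (T.pack "abc") gives Just ('A', "bc"), and the empty Text gives Nothing.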

Checked multiplication

mul :: Int -> Int -> Int infixl 7 Source #

Checked multiplication. Calls error if the result would overflow.

mul32 :: Int32 -> Int32 -> Int32 infixl 7 Source #

Checked multiplication. Calls error if the result would overflow.

mul64 :: Int64 -> Int64 -> Int64 infixl 7 Source #

Checked multiplication. Calls error if the result would overflow.
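
Because overflow is reported through error rather than a return value, callers that want to recover have to force the result inside IO. The following is a sketch only. checkedMul is a hypothetical wrapper, and it assumes the failure surfaces as an ErrorCall exception, which is what a call to error normally raises.

```haskell
import Control.Exception (ErrorCall, evaluate, try)
import Data.Text.Internal (mul)

-- | Multiply two Ints, returning Nothing instead of throwing when
-- 'mul' reports an overflow. Hypothetical wrapper for illustration.
checkedMul :: Int -> Int -> IO (Maybe Int)
checkedMul a b = do
  -- 'evaluate' forces the product here so the exception is raised
  -- inside 'try' rather than later at a use site.
  r <- try (evaluate (a `mul` b)) :: IO (Either ErrorCall Int)
  pure (either (const Nothing) Just r)
```

With this wrapper, checkedMul 6 7 produces Just 42 and checkedMul maxBound 2 produces Nothing. Size computations inside the library use mul so that an impossible allocation fails loudly instead of wrapping around.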

Debugging

showText :: Text -> String Source #

A useful show-like function for debugging purposes.

Conversions

pack :: String -> Text Source #

O(n) Convert a String into a Text. Performs replacement on invalid scalar values, so unpack . pack is not id:

>>> Data.Text.unpack (pack "\55555")
"\65533"