typed-encoding-0.4.1.0: Type safe string transformations

Safe HaskellSafe
LanguageHaskell2010

Data.TypedEncoding.Conv.Text.Encoding

Description

Since: 0.2.2.0

Synopsis

Documentation

>>> :set -XScopedTypeVariables -XOverloadedStrings -XDataKinds -XFlexibleContexts -XTypeApplications
>>> import Test.QuickCheck
>>> import Test.QuickCheck.Instances.Text()
>>> import Test.QuickCheck.Instances.ByteString()
>>> import qualified Data.ByteString.Char8 as B8
>>> import Data.Char
>>> import Data.Either
>>> import Data.TypedEncoding
>>> import Data.TypedEncoding.Conv.Text
>>> let emptyUTF8B = unsafeSetPayload () "" :: Enc '["r-UTF8"] () B.ByteString
>>> :{
instance Arbitrary (Enc '["r-UTF8"] () B.ByteString) where 
     arbitrary =  fmap (fromRight emptyUTF8B) 
                  . flip suchThat isRight 
                  . fmap (encodeFAll @'["r-UTF8"] @(Either EncodeEx) @(). toEncoding ()) $ arbitrary 
instance Arbitrary (Enc '["r-UTF8"] () T.Text) where 
     arbitrary =  fmap (unsafeSetPayload ()) 
                        arbitrary 
instance Arbitrary (Enc '["r-ASCII"] () B.ByteString) where 
     arbitrary =  fmap (unsafeSetPayload ()) 
                  . flip suchThat (B8.all isAscii) 
                       $ arbitrary 
instance Arbitrary (Enc '["r-ASCII"] () T.Text) where 
     arbitrary =  fmap (unsafeSetPayload ()) 
                  . flip suchThat (T.all isAscii) 
                       $ arbitrary 
:}

decodeUtf8 :: forall xs c t y ys encs. (UnSnoc xs ~ (,) ys y, Superset "r-UTF8" y, encs ~ RemoveRs ys, AllEncodeInto "r-UTF8" encs) => Enc xs c ByteString -> Enc xs c Text Source #

With given constraints decodeUtf8 and encodeUtf8 can be used on subsets of "r-UTF8"

Note: For example, the ByteString encoding of "xd800" (11101101 10100000 10000000 ed a0 80) is considered invalid UTF8 by the Text library To be consistent we make the same assumption of also restricting representable Unicode chars as in Unicode.D76.

>>> TE.decodeUtf8 "\237\160\128"
"*** Exception: Cannot decode byte '\xed': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream

The "xdfff" case (11101101 10111111 10111111 ed bf bf): >>> TE.decodeUtf8 "237191191" "*** Exception: Cannot decode byte '\xed': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream

>>> displ . decodeUtf8 $ (unsafeSetPayload () "Hello" :: Enc '["r-ASCII"] () B.ByteString)
"Enc '[r-ASCII] () (Text Hello)"

"r-UTF8" is redundant:

>>> displ . utf8Demote . decodeUtf8 $ (unsafeSetPayload () "Hello" :: Enc '["r-UTF8"] () B.ByteString)
"Enc '[] () (Text Hello)"

decodeUtf8 and encodeUtf8 form isomorphism

\x -> getPayload x == (getPayload . encodeUtf8 . decodeUtf8 @ '["r-UTF8"] @() $ x)
\x -> getPayload x == (getPayload . decodeUtf8 . encodeUtf8 @ '["r-UTF8"] @() $ x)

These nicely work as iso's for "r-ASCII" subset

\x -> getPayload x == (getPayload . encodeUtf8 . decodeUtf8 @ '["r-ASCII"] @() $ x)
\x -> getPayload x == (getPayload . decodeUtf8 . encodeUtf8 @ '["r-ASCII"] @() $ x)

Similarly to pack this function makes unverified assumption that the encoding stack xs does invalidate UTF8 byte layout. This is safe for any "r-" encoding as well as any of the "enc-" and "do-" encodings that can be currently found in this library. Future versions of this method are likely to introduce constraints that guarantee better type safety.

See Data.TypedEncoding.Conv for more detailed discussion.

Since: 0.4.0.0

encodeUtf8 :: forall xs c t y ys encs. (UnSnoc xs ~ (,) ys y, Superset "r-UTF8" y, encs ~ RemoveRs ys, AllEncodeInto "r-UTF8" encs) => Enc xs c Text -> Enc xs c ByteString Source #

>>> displ $ encodeUtf8 $ utf8Promote $ toEncoding () ("text" :: T.Text)
"Enc '[r-UTF8] () (ByteString text)"

See decodeUtf8. Similar type safety concerns apply.

See Data.TypedEncoding.Conv for more detailed discussion.

Since: 0.4.0.0