Copyright	Otakar Smrz 2005-2016
License	GPL
Maintainer	otakar-smrz users.sf.net
Stability	provisional
Portability	portable
Safe Haskell	Safe
Language	Haskell98

Encode

Contents

Classes
Types
Methods

Description

The Haskell analogue of the Encode module in Perl: http://search.cpan.org/dist/Encode/

Encode.Arabic Encode.Mapper Encode.Unicode

Synopsis

Classes

class Encoding e where

Encodings are represented as distinct datatypes that are instances of the Encoding class, which defines two essential methods:

encode: turning a list of 'internal code points' into a String, and
decode: turning a String back into a list of 'internal code points'.

Developing a new encoding means writing a new module with a structure similar to this:

   module MyEncModule (MyEncType (..)) where

   import Encode

   data MyEncType = MyEncName | MyEncAlias deriving (Enum, Show)

   instance Encoding MyEncType where

       -- 'data' is a reserved word in Haskell, so the arguments are
       -- named 'points' and 'chars' here
       encode _ points = show points         -- your choices ...

       decode _ chars = map (toEnum . fromEnum) chars

Encode.Unicode.UTF8 is one concrete implementation that realizes and illustrates this template. Encode.Arabic.Buckwalter implements symmetric recoding using finite maps, and Encode.Arabic.ArabTeX makes use of monadic parsing and the PureFP library.
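As a rough illustration of the template combined with the finite-map idea, the following sketch defines a toy encoding backed by Data.Map. It is not the code of Encode.Arabic.Buckwalter: the names ToyTranslit, table, encoder and decoder are hypothetical, the table covers only three letters, and UPoint, CSpace and the Encoding class are local stand-ins that mirror the declarations documented on this page, so the block compiles without the Encode package installed.

```haskell
import qualified Data.Map as Map
import Data.Maybe (mapMaybe)
import Data.Tuple (swap)
import Data.Word (Word)

-- Local stand-ins for the declarations documented on this page
-- (assumption: the real library defines them along these lines).
type CSpace = Word

newtype UPoint = UPoint CSpace deriving (Eq, Ord, Show)

instance Enum UPoint where
    toEnum              = UPoint . fromIntegral
    fromEnum (UPoint w) = fromIntegral w

class Encoding e where
    encode :: e -> [UPoint] -> [Char]
    decode :: e -> [Char] -> [UPoint]

-- Hypothetical toy encoding: one ASCII letter per Arabic code point.
data ToyTranslit = ToyTranslit deriving (Enum, Show)

-- A single table drives both directions, which keeps them symmetric.
table :: [(Char, UPoint)]
table = [ ('A', toEnum 0x0627)   -- ARABIC LETTER ALEF
        , ('b', toEnum 0x0628)   -- ARABIC LETTER BEH
        , ('t', toEnum 0x062A)   -- ARABIC LETTER TEH
        ]

decoder :: Map.Map Char UPoint
decoder = Map.fromList table

encoder :: Map.Map UPoint Char
encoder = Map.fromList (map swap table)

instance Encoding ToyTranslit where
    -- Symbols missing from the table are silently dropped in this sketch.
    encode _ = mapMaybe (`Map.lookup` encoder)
    decode _ = mapMaybe (`Map.lookup` decoder)
```

Loaded in GHCi, encode ToyTranslit (decode ToyTranslit "bAt") gives back "bAt". A real encoding would also have to decide what to do with symbols outside its table instead of dropping them.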

Minimal complete definition

Nothing

Methods

encode :: e -> [UPoint] -> [Char]

decode :: e -> [Char] -> [UPoint]

Types

data UPoint

The datatype for the internal representation of Unicode code points is currently defined as newtype UPoint = UPoint CSpace. The shift from characters (Char) to code points (UPoint) is intentional: Unicode support in Haskell is not yet fully implemented, and code points are in any case different entities from characters. Since the UPoint type is an instance of the Enum class, the type's constructor and destructor functions are available as toEnum and fromEnum, respectively.
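A minimal sketch of how toEnum and fromEnum can serve as constructor and destructor is given below. The UPoint and CSpace declarations and the Enum instance are local stand-ins written from the description above, not copied from the library, and the helpers fromChar, toChar and codeOf are hypothetical names introduced only for this illustration.

```haskell
import Data.Word (Word)

-- Local stand-ins (assumption: the library's own Enum instance
-- behaves like this one).
type CSpace = Word

newtype UPoint = UPoint CSpace deriving (Eq, Ord, Show)

instance Enum UPoint where
    toEnum              = UPoint . fromIntegral   -- constructor: Int -> UPoint
    fromEnum (UPoint w) = fromIntegral w          -- destructor:  UPoint -> Int

-- Hypothetical helper: a Char viewed as a code point.
fromChar :: Char -> UPoint
fromChar = toEnum . fromEnum

-- Hypothetical helper: a code point viewed as a Char.
-- Fails at run time for values outside the range of Char.
toChar :: UPoint -> Char
toChar = toEnum . fromEnum

-- Hypothetical helper: the numeric value of a code point.
codeOf :: UPoint -> Int
codeOf = fromEnum
```

Because both Char and UPoint are Enum instances, the same composition toEnum . fromEnum converts in either direction; only the type signature selects which one is meant.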

The UPoint datatype should be the transfer point on the way from one encoding to another, not the terminal stop. Use the encode method systematically rather than show, even if show might temporarily produce appealing-looking results.
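The transfer-point idea can be sketched as a decode step followed by an encode step. In the block below, recode, Direct and Decimal are hypothetical names that are not part of the documented API, both encodings are deliberately trivial toys, and UPoint, CSpace and the Encoding class are local stand-ins so that the example is self-contained.

```haskell
import Data.Word (Word)

-- Local stand-ins for the declarations documented on this page.
type CSpace = Word

newtype UPoint = UPoint CSpace deriving (Eq, Ord, Show)

instance Enum UPoint where
    toEnum              = UPoint . fromIntegral
    fromEnum (UPoint w) = fromIntegral w

class Encoding e where
    encode :: e -> [UPoint] -> [Char]
    decode :: e -> [Char] -> [UPoint]

-- Toy encoding 1: every Char stands for its own code point.
data Direct = Direct

instance Encoding Direct where
    encode _ = map (toEnum . fromEnum)
    decode _ = map (toEnum . fromEnum)

-- Toy encoding 2: code points written as space-separated decimal numbers.
-- 'read' fails at run time on malformed input; acceptable for a sketch only.
data Decimal = Decimal

instance Encoding Decimal where
    encode _ = unwords . map (show . fromEnum)
    decode _ = map (toEnum . read) . words

-- Hypothetical helper: recode a String by passing through [UPoint].
-- The code points are an intermediate value and never shown directly.
recode :: (Encoding a, Encoding b) => a -> b -> String -> String
recode from to = encode to . decode from
```

With these definitions, recode Direct Decimal "Hi" evaluates to "72 105", and recode Decimal Direct "72 105" gives back "Hi". The final String always comes from encode, never from show applied to the intermediate list.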

Instances

Enum UPoint
Eq UPoint
Ord UPoint
Show UPoint

type CSpace = Word

The CSpace type denotes the code space and is a synonym for Word.

Instances of Encoding

Encoding ArabTeX
Encoding ZDMG
Encoding Buckwalter
Encoding Parkinson
Encoding Habash
Encoding ISIRI3342
Encoding ASMO449
Encoding DOSFarsi
Encoding DOSArabic
Encoding MacFarsi
Encoding MacArabic
Encoding ISOArabic
Encoding WinArabic
Encoding UTF8
Encoding Unicode