pdf-toolbox-content-0.0.5.0: A collection of tools for processing PDF files

Safe Haskell: None
Language: Haskell98

Pdf.Toolbox.Content.UnicodeCMap

Description

A Unicode CMap defines the mapping from glyph codes to text

Documentation

data UnicodeCMap

Unicode character map

A font dictionary can contain a "ToUnicode" key, which is a reference to a stream holding a Unicode CMap

parseUnicodeCMap :: ByteString -> Either String UnicodeCMap

Parse the content of a Unicode CMap

unicodeCMapNextGlyph :: UnicodeCMap -> ByteString -> Maybe (Int, ByteString)

Take the next glyph code from the string; also returns the rest of the string
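
Because the result pairs a glyph code with the remaining input, the function can be applied repeatedly to split a whole string into glyph codes. The following is a minimal sketch of that loop, not part of this module: glyphCodes is a hypothetical helper, written generically over the step function so that it needs only base. It assumes that Nothing means there is nothing more to read; the documentation above does not say whether Nothing can also signal an unrecognised code.

```haskell
import Data.List (unfoldr)

-- | Hypothetical helper: collect every glyph code from a string by
-- repeatedly applying a "next glyph" step. A 'Nothing' from the step
-- is treated as end of input (an assumption, see the text above).
glyphCodes :: (s -> Maybe (Int, s)) -> s -> [Int]
glyphCodes = unfoldr

-- Intended use with this module, given an already parsed cmap:
--   glyphCodes (unicodeCMapNextGlyph cmap) bytes
```

The step function has exactly the shape that unfoldr expects once the UnicodeCMap argument is applied, which is why no explicit recursion is needed.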

unicodeCMapDecodeGlyph :: UnicodeCMap -> Int -> Maybe Text

Convert a glyph code to text

Note: one glyph can represent more than one char, e.g. for ligatures
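
Since each glyph decodes to a piece of text rather than a single character, decoded pieces are best concatenated instead of collected one char at a time. The sketch below illustrates this under stated assumptions: decodeGlyphs is a hypothetical helper, not part of this module, and the fallback used for glyphs that decode to Nothing is a caller-supplied choice. It is generic over any Monoid so that it depends only on base; Text is one such type.

```haskell
import Data.Maybe (fromMaybe)

-- | Hypothetical helper: decode a list of glyph codes and concatenate
-- the results. Glyphs for which the decoder returns 'Nothing' are
-- replaced by the given fallback value.
decodeGlyphs :: Monoid t => (Int -> Maybe t) -> t -> [Int] -> t
decodeGlyphs decode fallback = foldMap (fromMaybe fallback . decode)

-- Intended use with this module, given an already parsed cmap and a
-- fallback Text value chosen by the caller:
--   decodeGlyphs (unicodeCMapDecodeGlyph cmap) fallback codes
```

A ligature glyph whose decoder result is a two-character text therefore contributes both characters to the output.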