Extended character code handling.
Wumpus uses SVG-style escaping to embed character codes or names in regular strings:
"regular ascii text &#egrave; more ascii text"
i.e. character names and codes are delimited by &# on the left and ; on the right.
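The delimiter rule can be sketched as a small lexer. This is only an illustration, not Wumpus's own parser: the Chunk type and the lexEscapes function are hypothetical names, and treating an all-digit body as a character code (and anything else as a glyph name) is an assumption about how the two kinds of escape are told apart.

```haskell
import Data.Char (isDigit)

-- Hypothetical type for illustration; not part of Wumpus.
data Chunk
  = Plain String    -- ordinary text between escapes
  | EscCode Int     -- numeric escape, e.g. &#232;
  | EscName String  -- named escape, e.g. &#egrave;
  deriving (Eq, Show)

-- Split a string into plain text and escapes delimited by "&#" and ";".
-- An escape with no closing ';' or with an empty body is kept as plain text.
lexEscapes :: String -> [Chunk]
lexEscapes = go ""
  where
    -- 'acc' holds the pending plain text in reverse order.
    go acc []             = flush acc []
    go acc ('&':'#':rest) =
      case break (== ';') rest of
        (body, ';':more)
          | not (null body) -> flush acc (classify body : go "" more)
        _                   -> go ('#':'&':acc) rest
    go acc (c:cs)         = go (c:acc) cs

    -- Emit the pending plain text, if any, ahead of the remaining chunks.
    flush []  ks = ks
    flush acc ks = Plain (reverse acc) : ks

    -- Assumption: an all-digit body is a code, anything else is a name.
    classify body
      | all isDigit body = EscCode (read body)
      | otherwise        = EscName body
```

With these definitions, lexEscapes "regular ascii text &#egrave; more ascii text" yields [Plain "regular ascii text ", EscName "egrave", Plain " more ascii text"].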
Both character names and character codes can be embedded in Wumpus strings. It seems conventional for PostScript to use names, e.g.:
(myst) show /egrave glyphshow (re) show
... and SVG to use codes, e.g.:
myst&#232;re
To accommodate both, Wumpus defines a TextEncoder record, which provides a two-way mapping between character codes and glyph names for a character set.
An instance needs:
- The functions for looking up codes by glyph-name and glyph-name by code.
- The name of the encoding - this is printed in the XML prologue of the SVG file as the encoding attribute. Latin 1's official name is "ISO-8859-1".
- Fallback glyph-names and char codes in case lookup fails.
Wumpus.Core.TextLatin1 defines an implementation for Latin 1.
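As a rough sketch of the three ingredients listed above, a record of this shape might look as follows. The Encoder type, its field names, and the helper functions are hypothetical and chosen for illustration; they are not the actual TextEncoder definition. The table is a three-entry excerpt of Latin 1, and using space as the fallback is likewise an assumption.

```haskell
import Data.Maybe (fromMaybe)

type GlyphName = String
type CharCode  = Int

-- Hypothetical record; field names are illustrative, not Wumpus's own.
data Encoder = Encoder
  { encodingName :: String                       -- e.g. "ISO-8859-1"
  , nameForCode  :: CharCode -> Maybe GlyphName  -- code to glyph name
  , codeForName  :: GlyphName -> Maybe CharCode  -- glyph name to code
  , fallbackName :: GlyphName                    -- used when nameForCode fails
  , fallbackCode :: CharCode                     -- used when codeForName fails
  }

-- A tiny excerpt of Latin 1, enough for the examples in this section.
latin1Excerpt :: [(CharCode, GlyphName)]
latin1Excerpt = [(0x20, "space"), (0xE8, "egrave"), (0xE9, "eacute")]

sampleEncoder :: Encoder
sampleEncoder = Encoder
  { encodingName = "ISO-8859-1"
  , nameForCode  = \c -> lookup c latin1Excerpt
  , codeForName  = \n -> lookup n [ (nm, c) | (c, nm) <- latin1Excerpt ]
  , fallbackName = "space"
  , fallbackCode = 0x20
  }

-- Resolve a code to a glyph name, falling back when the lookup fails.
glyphNameFor :: Encoder -> CharCode -> GlyphName
glyphNameFor enc c = fromMaybe (fallbackName enc) (nameForCode enc c)

-- Resolve a glyph name to a code, falling back when the lookup fails.
charCodeFor :: Encoder -> GlyphName -> CharCode
charCodeFor enc n = fromMaybe (fallbackCode enc) (codeForName enc n)
```

Keeping the two lookups as functions rather than exposing a concrete table leaves each encoding free to choose its own representation, and the fallbacks make both directions total.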