Maintainer	Andrew Cowie
Stability	Experimental
Safe Haskell	None
Language	Haskell2010

Data.Locator

Contents

English16: locators humans can exchange
Latin25: a visually distinct character set
Base62: binary without punctuation
Deprecated functions

Description

Background

We had a need for identifiers that could be used by humans.

The requirement to be able to say these over the phone complicates matters. Most people have approached this problem by using a phonetic alphabet. The trouble comes when you hear people saying stuff like "A as in ... uh, Apple?" (should be Alpha, of course) and "U as in ... um, what's a word that starts with U?" It gets worse. Ever been to a GPG keysigning? Listen to people attempt to read out the digits of their key fingerprints. ...C 3 E D 0 0 0 0 0 0 0 2 B D B D... "Did you say 'C' or 'D'?" and "how many zeros was that?" Brutal.

So what we need is a symbol set where each digit is unambiguous and doesn't collide with the phonetics of another symbol. This package provides English16, a set of 16 letters and numbers that, when spoken in English, have unique pronunciations and have been very successful in verbal communications over noisy links.

Ironically, however, when used in written applications the English16 set is a bit restrictive. The symbols don't have much visual variety (they turned out to be very blocky, so much so that you have to squint). If the application is visual transcription or identification, then the criterion is shapes that are distinct, rather than sounds. For these uses we provide Latin25, a set of 25 symbols useful for identifiers in automated systems that nevertheless have to be operated or debugged by humans.

Finally, also included is code to work in base 62, which is simply ['0'-'9', 'A'-'Z', and 'a'-'z']. These are frequently used to express short codes in URL redirectors; you may find them a more useful encoding for expressing numbers than base 16 hexadecimal.

Synopsis

class (Ord α, Enum α, Bounded α) => Locator α where
- locatorToDigit :: α -> Char
- digitToLocator :: Char -> α
data English16
- = Zero
- | One
- | Two
- | Charlie
- | Four
- | Foxtrot
- | Hotel
- | Seven
- | Eight
- | Nine
- | Kilo
- | Lima
- | Mike
- | Romeo
- | XRay
- | Yankee
fromEnglish16 :: [Char] -> Int
toEnglish16 :: Int -> String
toEnglish16a :: Int -> Int -> String
hashStringToEnglish16a :: Int -> ByteString -> ByteString
data Latin25
- = Zero'
- | One'
- | Three'
- | Four'
- | Seven'
- | Eight'
- | Nine'
- | Alpha'
- | Charlie'
- | Echo'
- | Golf'
- | Hotel'
- | Juliet'
- | Kilo'
- | Lima'
- | Mike'
- | November'
- | Papa'
- | Sierra'
- | Tango'
- | Victor'
- | Whiskey'
- | XRay'
- | Yankee'
- | Zulu'
fromLatin25 :: String -> Int
toLatin25 :: Int -> String
hashStringToLatin25 :: Int -> ByteString -> ByteString
toBase62 :: Integer -> String
fromBase62 :: String -> Integer
padWithZeros :: Int -> String -> String
hashStringToBase62 :: Int -> ByteString -> ByteString
fromLocator16 :: [Char] -> Int
toLocator16 :: Int -> String
toLocator16a :: Int -> Int -> String
hashStringToLocator16a :: Int -> ByteString -> ByteString

English16: locators humans can exchange

This was somewhat inspired by the record locators used by the civilian air travel industry, but with the restriction that the symbol set is carefully chosen (aviation locators do heroic things like excluding 'I' but not much else) and, in the case of English16a, that symbols are not repeated. They're not a reversible encoding, but assuming you're just generating identifiers and storing them somewhere, they're quite handy.

TODO link to paper with pronunciation study when published.

class (Ord α, Enum α, Bounded α) => Locator α where Source #

Methods

locatorToDigit :: α -> Char Source #

digitToLocator :: Char -> α Source #

Instances

Locator English16 Source #
Instance details Defined in Data.Locator.English16 Methods locatorToDigit :: English16 -> Char Source # digitToLocator :: Char -> English16 Source #
Locator Latin25 Source #
Instance details Defined in Data.Locator.Latin25 Methods locatorToDigit :: Latin25 -> Char Source # digitToLocator :: Char -> Latin25 Source #

data English16 Source #

A symbol set with sixteen uniquely pronounceable digits.

The fact there are sixteen symbols is more an indication of a certain degree of bullheadedness on the part of the author, and less of any kind of actual requirement. We might have a slightly better readback score if we dropped to 15 or 14 unique characters. It does mean you can match up with hexadecimal, which is not entirely without merit.

The grouping of letters and numbers was the hard part; having come up with the set and deconflicted the choices, the ordering is then entirely arbitrary. Since there are some numbers, we might as well have them at the same place they correspond to in base 10; the letters were then allocated in alphabetical order in the remaining slots.

Constructors

Zero	`'0'` 0th
One	`'1'` 1st
Two	`'2'` 2nd
Charlie	`'C'` 3rd
Four	`'4'` 4th
Foxtrot	`'F'` 5th
Hotel	`'H'` 6th
Seven	`'7'` 7th
Eight	`'8'` 8th
Nine	`'9'` 9th
Kilo	`'K'` 10th
Lima	`'L'` 11th
Mike	`'M'` 12th
Romeo	`'R'` 13th
XRay	`'X'` 14th
Yankee	`'Y'` 15th

Instances

Bounded English16 Source #
Instance details Defined in Data.Locator.English16 Methods minBound :: English16 # maxBound :: English16 #
Enum English16 Source #
Instance details Defined in Data.Locator.English16 Methods succ :: English16 -> English16 # pred :: English16 -> English16 # toEnum :: Int -> English16 # fromEnum :: English16 -> Int # enumFrom :: English16 -> [English16] # enumFromThen :: English16 -> English16 -> [English16] # enumFromTo :: English16 -> English16 -> [English16] # enumFromThenTo :: English16 -> English16 -> English16 -> [English16] #
Eq English16 Source #
Instance details Defined in Data.Locator.English16 Methods (==) :: English16 -> English16 -> Bool # (/=) :: English16 -> English16 -> Bool #
Ord English16 Source #
Instance details Defined in Data.Locator.English16 Methods compare :: English16 -> English16 -> Ordering # (<) :: English16 -> English16 -> Bool # (<=) :: English16 -> English16 -> Bool # (>) :: English16 -> English16 -> Bool # (>=) :: English16 -> English16 -> Bool # max :: English16 -> English16 -> English16 # min :: English16 -> English16 -> English16 #
Show English16 Source #
Instance details Defined in Data.Locator.English16 Methods showsPrec :: Int -> English16 -> ShowS # show :: English16 -> String # showList :: [English16] -> ShowS #
Locator English16 Source #
Instance details Defined in Data.Locator.English16 Methods locatorToDigit :: English16 -> Char Source # digitToLocator :: Char -> English16 Source #

fromEnglish16 :: [Char] -> Int Source #

Given a number encoded in English16, convert it back to an integer.

toEnglish16 :: Int -> String Source #

Given a number, convert it to a string in the English16 base 16 symbol alphabet. You can use this as a replacement for the standard '0'-'9' 'A'-'F' symbols traditionally used to express hexadecimal, though really the fact that we came up with 16 total unique symbols was a nice coincidence, not a requirement.
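The mapping described above can be illustrated with a small standalone sketch. It is not the library's implementation: the helper names `encodeEnglish16` and `decodeEnglish16` are hypothetical, the alphabet is copied from the constructor table, and returning `Nothing` for an unknown symbol is an assumption made for the example.

```haskell
import Data.List (elemIndex)

-- English16 digits in enum order, copied from the constructor table.
english16Alphabet :: String
english16Alphabet = "012C4FH789KLMRXY"

-- Hypothetical helper: write a non-negative number in base 16,
-- using the English16 symbols in place of 0-9 and A-F.
encodeEnglish16 :: Int -> String
encodeEnglish16 n
  | n < 0 = error "encodeEnglish16: negative input"
  | n < 16 = [english16Alphabet !! n]
  | otherwise = encodeEnglish16 (n `div` 16) ++ [english16Alphabet !! (n `mod` 16)]

-- Hypothetical helper: read the digits back, giving Nothing if a
-- character is not one of the sixteen symbols (an assumption of this sketch).
decodeEnglish16 :: String -> Maybe Int
decodeEnglish16 = foldl step (Just 0)
  where
    step acc c = do
      total <- acc
      digit <- elemIndex c english16Alphabet
      pure (total * 16 + digit)
```

With this sketch, 255 (0xFF) is written with the two highest symbols, and 2748 (0xABC) uses the symbols in positions 10, 11, and 12 of the table.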

toEnglish16a :: Int -> Int -> String Source #

Represent a number in English16a format. This uses the English16 symbol set, and additionally specifies that no symbol can be repeated. The a in English16a indicates that this transformation is done on the cheap; when converting, if we end up with '9' '9' we simply pick the subsequent digit in the enum, in this case getting you '9' 'K'.

Note that the transformation is not reversible. A number like 4369 (which is 0x1111, incidentally) encodes as 12C4. So do 4370, 4371, and 4372. The point is not uniqueness, but readability in adverse conditions. So while you can count locators, they don't map continuously to base 10 integers.

The first argument is the number of digits you'd like in the locator; if the number passed in is less than 16^limit, then the result will be padded.

>>> toEnglish16a 6 4369
12C40F
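The behaviour described above can be reproduced with a short standalone sketch. It assumes the digits are processed left to right, that a repeated symbol is bumped to the next unused one (wrapping from the last symbol to the first), and that padding zeros sit to the right of the number's digits, which is what the documented output 12C40F suggests. The name `english16aSketch` is hypothetical and this is not the library's code.

```haskell
import Data.List (mapAccumL)

-- English16 digits in enum order, copied from the constructor table.
alphabet :: String
alphabet = "012C4FH789KLMRXY"

-- Hypothetical sketch of the non-repeating encoding.
-- The limit should be at most 16, since there are only 16 distinct symbols.
english16aSketch :: Int -> Int -> String
english16aSketch limit n = map (alphabet !!) (take limit unique)
  where
    -- Base 16 digits of the number, followed by `limit` padding zeros.
    digits = toDigits (abs n) (replicate limit 0)

    -- Walk left to right, remembering which symbols were already used.
    (_, unique) = mapAccumL pick [] digits

    toDigits :: Int -> [Int] -> [Int]
    toDigits 0 acc = acc
    toDigits i acc = let (q, r) = i `divMod` 16 in toDigits q (r : acc)

    -- If the digit was seen before, try the next one in the enum, wrapping around.
    pick :: [Int] -> Int -> ([Int], Int)
    pick seen d
      | d `elem` seen = pick seen ((d + 1) `mod` 16)
      | otherwise = (d : seen, d)
```

Under these assumptions the sketch agrees with the examples in the text: 4369 through 4372 all give 12C4 at a width of four, and 0x99 gives 9K at a width of two.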

hashStringToEnglish16a :: Int -> ByteString -> ByteString Source #

Take an arbitrary sequence of bytes, hash it with SHA1, then format as a short digits-long English16a string.

>>> hashStringToEnglish16a 6 "Hello World"
M48HR0

Latin25: a visually distinct character set

data Latin25 Source #

A symbol set with twenty-five visually distinct characters.

These are not protected against similar pronunciations; if you need to read your identifiers aloud use English16 instead.

Constructors

Zero'	`'0'` 0th
One'	`'1'` 1st
Three'	`'3'` 2nd
Four'	`'4'` 3rd
Seven'	`'7'` 4th
Eight'	`'8'` 5th
Nine'	`'9'` 6th
Alpha'	`'A'` 7th
Charlie'	`'C'` 8th
Echo'	`'E'` 9th
Golf'	`'G'` 10th
Hotel'	`'H'` 11th
Juliet'	`'J'` 12th
Kilo'	`'K'` 13th
Lima'	`'L'` 14th
Mike'	`'M'` 15th
November'	`'N'` 16th
Papa'	`'P'` 17th
Sierra'	`'S'` 18th
Tango'	`'T'` 19th
Victor'	`'V'` 20th
Whiskey'	`'W'` 21st
XRay'	`'X'` 22nd
Yankee'	`'Y'` 23rd
Zulu'	`'Z'` 24th

Instances

Bounded Latin25 Source #
Instance details Defined in Data.Locator.Latin25 Methods minBound :: Latin25 # maxBound :: Latin25 #
Enum Latin25 Source #
Instance details Defined in Data.Locator.Latin25 Methods succ :: Latin25 -> Latin25 # pred :: Latin25 -> Latin25 # toEnum :: Int -> Latin25 # fromEnum :: Latin25 -> Int # enumFrom :: Latin25 -> [Latin25] # enumFromThen :: Latin25 -> Latin25 -> [Latin25] # enumFromTo :: Latin25 -> Latin25 -> [Latin25] # enumFromThenTo :: Latin25 -> Latin25 -> Latin25 -> [Latin25] #
Eq Latin25 Source #
Instance details Defined in Data.Locator.Latin25 Methods (==) :: Latin25 -> Latin25 -> Bool # (/=) :: Latin25 -> Latin25 -> Bool #
Ord Latin25 Source #
Instance details Defined in Data.Locator.Latin25 Methods compare :: Latin25 -> Latin25 -> Ordering # (<) :: Latin25 -> Latin25 -> Bool # (<=) :: Latin25 -> Latin25 -> Bool # (>) :: Latin25 -> Latin25 -> Bool # (>=) :: Latin25 -> Latin25 -> Bool # max :: Latin25 -> Latin25 -> Latin25 # min :: Latin25 -> Latin25 -> Latin25 #
Show Latin25 Source #
Instance details Defined in Data.Locator.Latin25 Methods showsPrec :: Int -> Latin25 -> ShowS # show :: Latin25 -> String # showList :: [Latin25] -> ShowS #
Locator Latin25 Source #
Instance details Defined in Data.Locator.Latin25 Methods locatorToDigit :: Latin25 -> Char Source # digitToLocator :: Char -> Latin25 Source #

fromLatin25 :: String -> Int Source #

Given a number encoded in Latin25, convert it back to an integer.

toLatin25 :: Int -> String Source #

Given a number, convert it to a string in the Latin25 base 25 symbol alphabet. This is useful for primary keys and object identifiers that you need to scan for in log output, for example.

hashStringToLatin25 :: Int -> ByteString -> ByteString Source #

Take an arbitrary sequence of bytes, hash it with SHA1, then format as a short limit-long Latin25 string.

>>> hashStringToLatin25 5 "You'll get used to it. Or, you'll have a psychotic episode"
XSAV1

17 characters is the widest hash you can request.

Base62: binary without punctuation

toBase62 :: Integer -> String Source #

fromBase62 :: String -> Integer Source #
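As a rough illustration of what a base 62 conversion involves, here is a standalone sketch using only `base`. The names `encodeBase62` and `decodeBase62` are hypothetical, the digit order ('0'-'9', then 'A'-'Z', then 'a'-'z') follows the description in the Background section, and the `Maybe` result on decoding is an assumption of the sketch rather than the behaviour of `fromBase62`.

```haskell
import Data.List (elemIndex)

-- Digit order assumed from the module description: 0-9, A-Z, a-z.
base62Alphabet :: String
base62Alphabet = ['0' .. '9'] ++ ['A' .. 'Z'] ++ ['a' .. 'z']

-- Hypothetical helper: write a non-negative Integer in base 62.
encodeBase62 :: Integer -> String
encodeBase62 n
  | n < 0 = error "encodeBase62: negative input"
  | n < 62 = [digit n]
  | otherwise = encodeBase62 (n `div` 62) ++ [digit (n `mod` 62)]
  where
    digit i = base62Alphabet !! fromIntegral i

-- Hypothetical helper: read base 62 digits back, giving Nothing
-- on any character outside the alphabet.
decodeBase62 :: String -> Maybe Integer
decodeBase62 = foldl step (Just 0)
  where
    step acc c = do
      total <- acc
      d <- elemIndex c base62Alphabet
      pure (total * 62 + fromIntegral d)
```

In this sketch 61 is the single digit 'z' and 62 rolls over to "10", just as 16 does in hexadecimal.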

padWithZeros :: Int -> String -> String Source #

Utility function to prepend '0' characters to a string representing a number. This allows you to ensure a fixed width for numbers that are less than the desired width in size. This comes up frequently when representing numbers in other bases greater than 10 as they are inevitably presented as text, and not having them evenly justified can (at best) be ugly and (at worst) actually lead to parsing and conversion bugs.
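A minimal sketch of the padding idea, assuming that strings already at or beyond the requested width are returned unchanged. The name `padSketch` is hypothetical and this is not the library's definition.

```haskell
-- Hypothetical helper: left-pad a numeric string with '0' up to the given width.
-- replicate with a non-positive count yields an empty list, so longer
-- strings pass through untouched.
padSketch :: Int -> String -> String
padSketch width str = replicate (width - length str) '0' ++ str
```

For example, padding "1F" to five characters puts three zeros in front of it, so columns of such numbers line up in log output.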

hashStringToBase62 :: Int -> ByteString -> ByteString Source #

Take an arbitrary string, hash it, then pad it with zeros to make a digits-long string in base 62.

You may be interested to know that the 160-bit SHA1 hash used here can be expressed without loss as 27 digits of base 62, for example:

>>> hashStringToBase62 27 "Hello World"
1T8Sj4C5jVU6iQXCwCwJEPSWX6u
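The figure of 27 digits can be checked with a little arithmetic: we want the smallest d such that 62^d is at least 2^160. The helper below is a hypothetical sketch for that calculation and is not part of the package.

```haskell
-- Hypothetical helper: the smallest number of digits in the given base
-- that can represent every value of the given bit width.
-- It counts the powers base^0, base^1, ... that are still below 2^bits.
digitsNeeded :: Integer -> Int -> Int
digitsNeeded base bits = length (takeWhile (< 2 ^ bits) (iterate (* base) 1))
```

Evaluating `digitsNeeded 62 160` gives 27, compared with 40 digits for the same hash in hexadecimal.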

Deprecated functions

fromLocator16 :: [Char] -> Int Source #

Deprecated: Use fromEnglish16 instead

toLocator16 :: Int -> String Source #

Deprecated: Use toEnglish16 instead

toLocator16a :: Int -> Int -> String Source #

Deprecated: Use toEnglish16a instead

hashStringToLocator16a :: Int -> ByteString -> ByteString Source #

Deprecated: Use hashStringToEnglish16a instead