cjk- Data about Chinese, Japanese and Korean characters and languages

Safe Haskell: None




cangjie :: Char -> Maybe CangjieInputCode

The cangjie input code for the character

data CheungBauer




cheungBauer :: Char -> [CheungBauer]

Data regarding the character in Cheung Kwan-hin and Robert S. Bauer, _The Representation of Cantonese with Chinese Characters_, Journal of Chinese Linguistics, Monograph Series Number 18, 2002

cihai :: Char -> [Text]

The position(s) of this character in the Cihai (辭海) dictionary, single volume edition, published in Hong Kong by the Zhonghua Bookstore, 1983 (reprint of the 1947 edition), ISBN 962-231-005-2.

The position is indicated by a decimal number. The digits to the left of the decimal point are the page number. The first digit after the decimal point is the row on the page, and the remaining two digits are the position within that row.
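The layout described above can be decoded mechanically. The following is a minimal sketch, not part of this library: the `CihaiPosition` record and the `parseCihai` function are hypothetical names, and the input is taken as a `String` (a `Text` value from `cihai` could be converted with `Data.Text.unpack`). It assumes exactly three digits follow the decimal point, as the description states.

```haskell
import Data.Char (isDigit)

-- | A decoded Cihai position. Hypothetical helper type, not part of the library.
data CihaiPosition = CihaiPosition
  { cihaiPage   :: Int  -- ^ digits to the left of the decimal point
  , cihaiRow    :: Int  -- ^ first digit after the decimal point
  , cihaiColumn :: Int  -- ^ remaining two digits: position within the row
  } deriving (Eq, Show)

-- | Split a position string such as "1234.101" into page, row and position.
-- Returns Nothing when the string does not have the described shape.
parseCihai :: String -> Maybe CihaiPosition
parseCihai s =
  case break (== '.') s of
    (page, '.' : [row, c1, c2])
      | not (null page), all isDigit (page ++ [row, c1, c2]) ->
          Just (CihaiPosition (read page) (read [row]) (read [c1, c2]))
    _ -> Nothing
```

With this sketch, an illustrative (made-up) value such as `"1234.101"` decodes to page 1234, row 1, position 1.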

data Fenn




fennSoothill :: Maybe Int

Soothill number of the character's phonetic, if any

fennFrequency :: Maybe Int

Number from 1 to 11 indicating roughly which group of 500 most popular characters this character is included in (i.e. 1 is the first 500 characters, 2 the next 500 characters etc). Nothing if the character is rare.


fenn :: Char -> [Fenn]

Data on the character from The Five Thousand Dictionary (aka Fenn’s Chinese-English Pocket Dictionary) by Courtenay H. Fenn, Cambridge, Mass.: Harvard University Press, 1979.
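The `fennFrequency` grouping can be turned into an approximate rank range. This is a small sketch under the assumption that group `n` covers ranks `500 * (n - 1) + 1` through `500 * n`, which is how the field description reads; `fennRankRange` is a hypothetical helper, not a function exported by this module.

```haskell
-- | Approximate popularity-rank range for a Fenn frequency group.
-- Group 1 is ranks 1-500, group 2 is ranks 501-1000, and so on up to group 11.
-- Returns Nothing for values outside 1..11.
fennRankRange :: Int -> Maybe (Int, Int)
fennRankRange n
  | n >= 1 && n <= 11 = Just (500 * (n - 1) + 1, 500 * n)
  | otherwise         = Nothing
```

Because `fennFrequency` is itself a `Maybe Int`, the two can be chained with `>>=`, for example `fennFrequency f >>= fennRankRange` for some `Fenn` value `f`.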

fourCornerCode :: Char -> [Text]

The four-corner code(s) for the character

The four-corner system assigns each character a code of four digits, each from 0 through 9. Each digit is derived from the “shape” of one of the four corners of the character (upper-left, upper-right, lower-left, lower-right). An optional fifth digit can be used to further distinguish characters; it is derived from the shape in the character’s center or in the region immediately to the left of the fourth corner.

The four-corner system is now used only rarely. Full descriptions are available online, e.g., at http://en.wikipedia.org/wiki/Four_corner_input.
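As an illustration of the code structure, the sketch below splits a four-corner code into its corner digits and optional fifth digit. It assumes the textual form used in the Unihan database, four digits optionally followed by a period and one more digit (for example `"1234"` or `"1234.5"`); whether `fourCornerCode` returns exactly this form is an assumption. `FourCorner` and `parseFourCorner` are hypothetical names, and the input is a `String` (convert from `Text` with `Data.Text.unpack`).

```haskell
import Data.Char (digitToInt, isDigit)

-- | A decoded four-corner code. Hypothetical helper type, not part of the library.
data FourCorner = FourCorner
  { cornerDigits :: (Int, Int, Int, Int)  -- ^ upper-left, upper-right, lower-left, lower-right
  , extraDigit   :: Maybe Int             -- ^ optional fifth digit
  } deriving (Eq, Show)

-- | Parse "dddd" or "dddd.d" (assumed format); anything else yields Nothing.
parseFourCorner :: String -> Maybe FourCorner
parseFourCorner s =
  case s of
    [a, b, c, d]
      | all isDigit [a, b, c, d] ->
          Just (FourCorner (corners a b c d) Nothing)
    [a, b, c, d, '.', e]
      | all isDigit [a, b, c, d, e] ->
          Just (FourCorner (corners a b c d) (Just (digitToInt e)))
    _ -> Nothing
  where
    corners a b c d = (digitToInt a, digitToInt b, digitToInt c, digitToInt d)
```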

frequency :: Char -> Maybe Int

A rough frequency measurement for the character based on analysis of traditional Chinese USENET postings; characters with a kFrequency of 1 are the most common, those with a kFrequency of 2 are less common, and so on, through a kFrequency of 5.

gradeLevel :: Char -> Maybe Int

The primary grade in the Hong Kong school system by which a student is expected to know the character; this data is derived from 朗文初級中文詞典, Hong Kong: Longman, 2001

hkGlyph :: Char -> [Int]

The index of the character in 常用字字形表 (二零零零年修訂本),香港: 香港教育學院, 2000, ISBN 962-949-040-4. This publication gives the “proper” shapes for 4759 characters as used in the Hong Kong school system.

phonetic :: Char -> [Text]

The phonetic index for the character from _Ten Thousand Characters: An Analytic Dictionary_, by G. Hugh Casey, S.J. Hong Kong: Kelley and Walsh, 1980

totalStrokes :: Char -> Maybe (StrokeCount, StrokeCount)

The total number of strokes in the character (including the radical), that is, the stroke count most commonly associated with the character in modern text using customary fonts.

The first value is preferred for zh-Hans (CN) and the second is preferred for zh-Hant (TW).
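Selecting the appropriate component of the pair can be wrapped in a small helper. The sketch below is not part of the library: the `Script` type and `preferredStrokes` are hypothetical names, and the function is kept polymorphic so it does not depend on how `StrokeCount` is defined.

```haskell
-- | Which orthography the caller is working with. Hypothetical helper type.
data Script
  = Simplified   -- ^ zh-Hans (CN)
  | Traditional  -- ^ zh-Hant (TW)
  deriving (Eq, Show)

-- | Pick the preferred element of a (zh-Hans, zh-Hant) pair.
preferredStrokes :: Script -> (a, a) -> a
preferredStrokes Simplified  = fst
preferredStrokes Traditional = snd
```

Since `totalStrokes` returns a `Maybe`, the helper would typically be applied with `fmap`, for example `fmap (preferredStrokes Traditional) (totalStrokes c)` for some character `c`.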