Copyright | (c) Colin Woodbury, 2015 |
---|---|
License | GPL3 |
Maintainer | Colin Woodbury <colingw@gmail.com> |
Safe Haskell | Safe |
Language | Haskell98 |
A library for analysing the density of Kanji in given texts, according to their Level classification, as defined by the Japan Kanji Aptitude Testing Foundation (日本漢字能力検定協会).
- class AsKanji a where
- newtype Kanji = Kanji {}
- kanji :: Traversal' Char Kanji
- allKanji :: [[Kanji]]
- isKanji :: Char -> Bool
- hasLevel :: [Level] -> Kanji -> Bool
- kanjiDensity :: AsKanji a => a -> Float
- elementaryKanjiDensity :: AsKanji a => a -> Float
- percentSpread :: [Kanji] -> [(Kanji, Float)]
- data Level = Level {}
- type Rank = Float
- rankNums :: [Rank]
- level :: [Level] -> Kanji -> Maybe Level
- levels :: [Level]
- isKanjiInLevel :: Level -> Kanji -> Bool
- levelDist :: [Level] -> [Kanji] -> [(Rank, Float)]
- levelFromRank :: [Level] -> Rank -> Maybe Level
- averageLevel :: [Level] -> [Kanji] -> Float
Kanji
Anything that can be transformed into a list of Kanji.
_Kanji :: Traversal' a Kanji
Traverse into this type to find 0 or more Kanji.
Despite what the Haddock documentation says, this is part of the minimal complete definition.
How long is this input source?
asKanji :: a -> [Kanji]
Transform this string type into a list of Kanji. The source string and the resulting list might not have the same length, if the source contained any Char outside the legal Unicode range for Kanji.
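The filtering behaviour described above can be sketched with `base` alone. This is only an illustration, not the library's implementation: `KanjiSketch`, `isKanjiSketch` and `asKanjiSketch` are hypothetical names, and the range U+4E00 to U+9FFF (the CJK Unified Ideographs block) is an assumption about what counts as Kanji.

```haskell
import Data.Char (ord)

-- Hypothetical stand-in for the library's Kanji newtype.
newtype KanjiSketch = KanjiSketch Char
  deriving (Eq, Ord, Show)

-- Assumption: a Char is Kanji when it lies in the CJK Unified
-- Ideographs block, U+4E00 through U+9FFF.
isKanjiSketch :: Char -> Bool
isKanjiSketch c = ord c >= 0x4E00 && ord c <= 0x9FFF

-- Keep only the characters that pass the range check, so the result
-- can be shorter than the input.
asKanjiSketch :: String -> [KanjiSketch]
asKanjiSketch = map KanjiSketch . filter isKanjiSketch
```

For an input such as `"日本語を勉強する"`, the three hiragana characters are dropped, so the resulting list is shorter than the source string.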
A single symbol of Kanji. Japanese Kanji were borrowed from China over several waves during the past millennium. Japan names 2136 of these as its standard set, with rarer characters being the domain of academia and esoteric writers.
Japanese has several Japan-only Kanji, including:
- 畑 (a type of rice field)
- 峠 (a narrow mountain pass)
- 働 (to do physical labour)
kanji :: Traversal' Char Kanji
Deprecated: Use _Kanji instead
All Kanji, grouped by their Level (級) in ascending order, that is, from the lowest level (10) to the highest (1).
kanjiDensity :: AsKanji a => a -> Float
What is the density d of Kanji characters in a given String-like type, where 0 <= d <= 1?
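As a rough illustration of the density calculation, the sketch below divides the number of Kanji by the total number of characters in a plain `String`. The names `isKanjiSketch` and `densitySketch` are hypothetical, the U+4E00 to U+9FFF range is an assumption, and returning 0 for empty input is a choice made here; the library may treat these cases differently.

```haskell
import Data.Char (ord)

-- Assumption: Kanji are the characters in U+4E00 through U+9FFF.
isKanjiSketch :: Char -> Bool
isKanjiSketch c = ord c >= 0x4E00 && ord c <= 0x9FFF

-- Fraction of characters that are Kanji, always between 0 and 1.
-- Empty input is mapped to 0 to avoid dividing by zero (an assumption).
densitySketch :: String -> Float
densitySketch [] = 0
densitySketch s  =
  fromIntegral (length (filter isKanjiSketch s)) / fromIntegral (length s)
```

With this definition, `densitySketch "日本語を勉強する"` counts 5 Kanji among 8 characters, giving 0.625.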
elementaryKanjiDensity :: AsKanji a => a -> Float
As above, but only the first 1006 Kanji are counted (those learned in elementary school in Japan).
percentSpread :: [Kanji] -> [(Kanji, Float)]
The distribution of each Kanji in a set of them. The distribution values must sum to 1.
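One way to picture such a distribution is to count how often each character occurs and divide by the total. The sketch below uses plain `Char` values in place of the library's `Kanji` type, and `spreadSketch` is a hypothetical name; the ordering of the result (sorted by code point) is an artefact of this sketch, not a documented property of `percentSpread`.

```haskell
import Data.List (group, sort)

-- Share of each distinct character in the input.
-- Sorting brings equal characters together so that `group` can count them.
spreadSketch :: String -> [(Char, Float)]
spreadSketch ks =
  [ (k, fromIntegral (length g) / total) | g@(k:_) <- group (sort ks) ]
  where
    total = fromIntegral (length ks) :: Float
```

For `"日日本語"` this gives a share of 0.5 for 日 and 0.25 each for 本 and 語, which sum to 1.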
Levels
A Level or Kyuu (級) of Japanese Kanji ranking. There are 12 of these, from 10 to 1, including intermediate levels between 3 and 2, and 2 and 1.
Japanese students will typically have Level-5 ability by the time they finish elementary school. Level-5 accounts for 1006 characters.
By the end of middle school, they would have covered up to Level-3 (1607 Kanji) in their Japanese class curriculum.
While Level-2 (2136 Kanji) is considered "standard adult" ability, many adults could not pass the Level-2, or even the Level-Pre2 (1940 Kanji) exam without considerable study.
Level data for Kanji above Level-2 is currently not provided by this library.
levelDist :: [Level] -> [Kanji] -> [(Rank, Float)]
How much of each Level is represented by a group of Kanji?
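A minimal sketch of the idea, under stated assumptions: `LevelSketch`, `rankOf`, `members`, `toyLevels` and `levelDistSketch` are all hypothetical, characters stand in for the library's `Kanji` type, and the two toy levels are illustrative data only. The sketch also assumes that levels do not share characters.

```haskell
type RankSketch = Float

-- Hypothetical level record: a rank number plus the characters it contains.
data LevelSketch = LevelSketch
  { rankOf  :: RankSketch
  , members :: [Char]
  }

-- Toy data for illustration only; not the library's level tables.
toyLevels :: [LevelSketch]
toyLevels = [LevelSketch 10 "日本", LevelSketch 9 "語"]

-- For each level that occurs in the input, the fraction of the input
-- characters belonging to it. Levels with no matches are omitted.
levelDistSketch :: [LevelSketch] -> [Char] -> [(RankSketch, Float)]
levelDistSketch lvls ks =
  [ (rankOf l, fromIntegral n / total)
  | l <- lvls
  , let n = length (filter (`elem` members l) ks)
  , n > 0
  ]
  where
    total = fromIntegral (length ks) :: Float
```

In this sketch, characters that belong to no known level still count towards the total, so the fractions can sum to less than 1. Whether `levelDist` behaves the same way is not stated in this documentation.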