nanq-1.0.0: Performs 漢字検定 (National Kanji Exam) level analysis on given Kanji.

Copyright: (c) Colin Woodbury, 2015
License: GPL3
Maintainer: Colin Woodbury <colingw@gmail.com>
Safe Haskell: Safe
Language: Haskell98

Data.Kanji

Description

A library for analysing the density of Kanji in given texts, according to their Level classification, as defined by the Japan Kanji Aptitude Testing Foundation (日本漢字能力検定協会).

Kanji

class IsString a => AsKanji a where

All String types can be transformed into a list of Kanji.

Methods

asKanji :: a -> [Kanji]

Transform this string type into a list of Kanji. The source string and the resulting list may differ in length if the source contains any Char outside the legal Unicode range for Kanji.
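
The length mismatch described above can be sketched as follows. This is only an illustration, not the library's implementation: the local Kanji newtype and the names toKanji and asKanjiString are hypothetical stand-ins, and the range check assumes the bounds documented for isKanji are inclusive.

```haskell
import Data.Maybe (mapMaybe)

-- Hypothetical stand-in for the library's Kanji newtype.
newtype Kanji = Kanji Char deriving (Eq, Show)

-- Keep a Char only if its code point lies in 19968..40959
-- (assumption: both bounds are inclusive).
toKanji :: Char -> Maybe Kanji
toKanji c
  | n >= 19968 && n <= 40959 = Just (Kanji c)
  | otherwise                = Nothing
  where
    n = fromEnum c

-- Stand-in for asKanji at type String: non-Kanji characters are dropped,
-- so the result can be shorter than the input.
asKanjiString :: String -> [Kanji]
asKanjiString = mapMaybe toKanji
```

With these definitions, the 8-character string "日本語を勉強する" yields a list of 5 Kanji, because the three hiragana are discarded.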

newtype Kanji

A single symbol of Kanji. Japanese Kanji were borrowed from China in several waves over the past millennium. Japan designates 2136 of these as its standard set, with rarer characters being the domain of academia and esoteric writers.

Japanese also has several Japan-only Kanji, including:

  • 畑 (a dry, non-paddy field for crops)
  • 峠 (a narrow mountain pass)
  • 働 (to do physical labour)

Constructors

Kanji 

Fields

_kanji :: Char

kanji :: Traversal' Char Kanji

Traverse into a Char to find a Kanji.
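
A rough sketch of what such a traversal could look like, written in the van Laarhoven form used by lens-style libraries so that no lens package is required. The body of kanji here is an assumption about the behaviour (visit the Char only when it is a Kanji), and targets is a hypothetical helper playing the role that toListOf plays in lens.

```haskell
import Data.Functor.Const (Const (..))

-- Local stand-in for the library's Kanji newtype.
newtype Kanji = Kanji { _kanji :: Char } deriving (Eq, Show)

-- Assumed range check (bounds taken as inclusive).
isKanji :: Char -> Bool
isKanji c = n >= 19968 && n <= 40959
  where
    n = fromEnum c

-- Traversal' Char Kanji with the type synonym expanded:
-- the function is applied only when the Char is a Kanji.
kanji :: Applicative f => (Kanji -> f Kanji) -> Char -> f Char
kanji f c
  | isKanji c = _kanji <$> f (Kanji c)
  | otherwise = pure c

-- Hypothetical helper: collect every target a traversal visits.
targets :: ((a -> Const [a] a) -> s -> Const [a] s) -> s -> [a]
targets t = getConst . t (\a -> Const [a])
```

Under these assumptions, targets kanji '日' returns a one-element list and targets kanji 'a' returns an empty list.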

allKanji :: [[Kanji]]

All Kanji, grouped by their Level (級) in ascending order, that is, from the lowest Level (10) towards the highest (1).

isKanji :: Char -> Bool

Legal Kanji appear between Unicode code points 19968 (U+4E00) and 40959 (U+9FFF).
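
The range check can be sketched with fromEnum, which returns a Char's code point. The name isKanjiSketch is hypothetical, and treating both bounds as inclusive is an assumption.

```haskell
-- Sketch of the range check; bounds assumed inclusive.
isKanjiSketch :: Char -> Bool
isKanjiSketch c = n >= 19968 && n <= 40959
  where
    n = fromEnum c  -- code point of the character
```

For example, map isKanjiSketch "一あa" evaluates to [True,False,False]: 一 sits exactly on the lower bound (U+4E00), while hiragana and Latin letters fall outside the range.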

hasLevel :: [Level] -> Kanji -> Bool

Is the Level of a given Kanji known?

kanjiDensity :: [Char] -> Float

What is the density d of Kanji characters in a given String, where 0 <= d <= 1?
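
One plausible way to compute such a density is the number of Kanji divided by the total number of characters. The sketch below uses hypothetical names, and returning 0 for an empty string is an assumption rather than documented behaviour.

```haskell
-- Assumed range check (bounds taken as inclusive).
isKanjiChar :: Char -> Bool
isKanjiChar c = n >= 19968 && n <= 40959
  where
    n = fromEnum c

-- Fraction of characters that are Kanji, always within 0..1.
kanjiDensitySketch :: String -> Float
kanjiDensitySketch [] = 0  -- assumption: avoid dividing by zero
kanjiDensitySketch s =
  fromIntegral (length (filter isKanjiChar s)) / fromIntegral (length s)
```

Here kanjiDensitySketch "日本語abc" evaluates to 0.5, since three of the six characters are Kanji.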

elementaryKanjiDensity :: [Char] -> Float

As kanjiDensity, but only the first 1006 Kanji (those learned in elementary school in Japan) are counted.

percentSpread :: [Kanji] -> [(Kanji, Float)]

The share of each distinct Kanji within a given list of Kanji. The resulting values sum to 1.
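
A minimal sketch of such a distribution, counting occurrences with sort and group from Data.List. The name percentSpreadSketch and the local Kanji type are hypothetical, and the output order (by code point) is a side effect of sorting, not something the library is known to guarantee.

```haskell
import Data.List (group, sort)

-- Local stand-in for the library's Kanji newtype.
newtype Kanji = Kanji Char deriving (Eq, Ord, Show)

-- Share of each distinct Kanji: occurrences divided by the list length.
percentSpreadSketch :: [Kanji] -> [(Kanji, Float)]
percentSpreadSketch ks =
  [ (k, fromIntegral (length g) / total) | g@(k:_) <- group (sort ks) ]
  where
    total = fromIntegral (length ks)
```

For the input map Kanji "日日本語", 日 receives 0.5 and 本 and 語 receive 0.25 each, which sums to 1.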

Levels

data Level

A Level or Kyuu (級) of Japanese Kanji ranking. There are 12 of these, from 10 to 1, including intermediate levels between 3 and 2, and 2 and 1.

Japanese students will typically have Level-5 ability by the time they finish elementary school. Level-5 accounts for 1006 characters.

By the end of middle school, they would have covered up to Level-3 (1607 Kanji) in their Japanese class curriculum.

While Level-2 (2136 Kanji) is considered "standard adult" ability, many adults could not pass the Level-2 exam, or even the Level-Pre2 (1940 Kanji) exam, without considerable study.

Level data for Kanji above Level-2 is currently not provided by this library.

Constructors

Level 

Fields

_allKanji :: Set Kanji

_rank :: Rank

type Rank = Float

A numeric representation of a Level.

rankNums :: [Rank]

Numerical representations of the 12 ranks.

level :: [Level] -> Kanji -> Maybe Level

What Level does a Kanji belong to?

levels :: [Level]

All Levels, with all their Kanji, ordered from Level-10 to Level-2.

isKanjiInLevel :: Level -> Kanji -> Bool

Does a given Kanji belong to the given Level?
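
Given the Level fields documented above, membership and lookup could be sketched with Data.Set and Data.List. All names ending in Sketch are hypothetical, the types are local stand-ins, and toyLevels holds only a tiny illustrative subset of Kanji, not the real Level data.

```haskell
import Data.List (find)
import qualified Data.Set as Set

-- Local stand-ins mirroring the documented types.
newtype Kanji = Kanji Char deriving (Eq, Ord, Show)

type Rank = Float

data Level = Level
  { _allKanji :: Set.Set Kanji
  , _rank     :: Rank
  } deriving (Eq, Show)

-- Membership test against a single Level.
isKanjiInLevelSketch :: Level -> Kanji -> Bool
isKanjiInLevelSketch l k = Set.member k (_allKanji l)

-- First Level, if any, that contains the Kanji.
levelSketch :: [Level] -> Kanji -> Maybe Level
levelSketch ls k = find (`isKanjiInLevelSketch` k) ls

-- Toy data only: a few Kanji per Level for illustration.
toyLevels :: [Level]
toyLevels =
  [ Level (Set.fromList (map Kanji "一二三")) 10
  , Level (Set.fromList (map Kanji "引羽雲")) 9
  ]
```

With this toy data, fmap _rank (levelSketch toyLevels (Kanji '雲')) gives Just 9.0, while a Kanji absent from every Level gives Nothing.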

levelDist :: [Level] -> [Kanji] -> [(Rank, Float)]

How much of each Level is represented by a group of Kanji?
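
One possible reading of this function is "what fraction of the given Kanji falls into each Level". The sketch below follows that reading, which is an assumption. It replaces the [Level] argument with a hypothetical lookup table (toyRanks) from plain Chars to Ranks, and it counts Kanji of unknown Level in the denominator, which may differ from the library.

```haskell
import Data.List (group, sort)
import Data.Maybe (mapMaybe)

type Rank = Float

-- Hypothetical lookup table standing in for the library's [Level] data.
toyRanks :: [(Char, Rank)]
toyRanks =
  [ ('一', 10), ('二', 10), ('三', 10)
  , ('引', 9), ('羽', 9), ('雲', 9)
  ]

-- Fraction of the input that belongs to each Rank.
-- Characters missing from the table still count towards the total.
levelDistSketch :: [(Char, Rank)] -> String -> [(Rank, Float)]
levelDistSketch table ks =
  [ (r, fromIntegral (length g) / total) | g@(r:_) <- group (sort ranks) ]
  where
    ranks = mapMaybe (`lookup` table) ks
    total = fromIntegral (length ks)
```

For the input "一二雲鬱" this gives [(9.0,0.25),(10.0,0.5)]; the remaining 0.25 corresponds to 鬱, which is not in the toy table.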

levelFromRank :: [Level] -> Rank -> Maybe Level

Is there a Level that corresponds with a given Rank value?

averageLevel :: [Level] -> [Kanji] -> Float

Find the average Level of a given set of Kanji.
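
As an illustration, an average could be taken as the arithmetic mean of the Ranks of those Kanji whose Level is known. This is a sketch under assumptions: toyRanks is a hypothetical lookup table standing in for [Level], Kanji without a known Level are skipped, and an input with no known Kanji yields 0. The library may handle these cases differently.

```haskell
import Data.Maybe (mapMaybe)

type Rank = Float

-- Hypothetical lookup table standing in for the library's [Level] data.
toyRanks :: [(Char, Rank)]
toyRanks =
  [ ('一', 10), ('二', 10), ('三', 10)
  , ('引', 9), ('羽', 9), ('雲', 9)
  ]

-- Mean Rank over the characters found in the table.
averageLevelSketch :: [(Char, Rank)] -> String -> Float
averageLevelSketch table ks
  | null ranks = 0  -- assumption: nothing to average
  | otherwise  = sum ranks / fromIntegral (length ranks)
  where
    ranks = mapMaybe (`lookup` table) ks
```

With the toy table, averageLevelSketch toyRanks "一雲" evaluates to 9.5, halfway between Rank 10 and Rank 9.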