Sonnex-0.1.0.3: Sonnex is an alternative to Soundex for the French language

Copyright: © 2014 Frédéric BISSON
License: GPL-3
Maintainer: zigazou@free.fr
Stability: alpha
Portability: POSIX
Safe Haskell: Safe-Inferred
Language: Haskell2010

Text.Sonnex

Description

This package computes Sonnex codes for French words or phrases. It is an alternative to the Soundex algorithm for the French language.

Characters of the Sonnex code

The Sonnex code contains the following characters:

  • 1 ← un, ein, in, ain
  • 2 ← en, an
  • 3 ← on
  • a ← a, à, â
  • b ← b, bb
  • C ← ch
  • d ← d, dd
  • e ← e, eu
  • E ← ê, é, è, ai, ei
  • f ← f, ff, ph
  • g ← gu
  • i ← î, i, ille
  • j ← j, ge
  • k ← k, c, qu, ck
  • l ← l, ll
  • m ← m, mm
  • n ← n, nn
  • o ← o, ô
  • p ← p, pp
  • r ← r, rr
  • s ← s, ss
  • t ← t, tt
  • u ← u, ù, û
  • v ← v, w
  • z ← z, s
  • U ← ou

The apostrophe is ignored. Every other character not understood by the Sonnex algorithm is copied unchanged.
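To make the table and the two rules above concrete, here is a minimal sketch of a table-driven rewrite. It is not the package's implementation: `toyRules` and `toyEncode` are hypothetical names, the rule list is a small subset of the table, and the matching is context-free. The real algorithm is evidently context-sensitive (the table maps `s` to both `s` and `z`, and the examples drop silent letters), so this only illustrates the shape of the mapping, the ignored apostrophe, and the pass-through of unknown characters.

```haskell
import Data.List (isPrefixOf)

-- A handful of rules copied from the table above, longest patterns first.
-- This subset is illustrative only; it is not the package's rule set.
toyRules :: [(String, String)]
toyRules =
  [ ("ch", "C")
  , ("ou", "U")
  , ("ph", "f")
  , ("ll", "l")
  , ("a", "a")
  , ("l", "l")
  , ("o", "o")
  ]

-- Hypothetical helper: rewrite the input left to right, taking the first
-- rule whose pattern is a prefix of the remaining input.
toyEncode :: String -> String
toyEncode [] = []
toyEncode ('\'' : rest) = toyEncode rest      -- the apostrophe is ignored
toyEncode input@(c : rest) =
  case [ (pat, code) | (pat, code) <- toyRules, pat `isPrefixOf` input ] of
    ((pat, code) : _) -> code ++ toyEncode (drop (length pat) input)
    []                -> c : toyEncode rest   -- unknown characters are copied
```

Because two-letter patterns are listed before their one-letter prefixes, taking the first match amounts to a longest-match strategy for this small table.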

Examples

Here are a few examples of sonnex results:

>>> sonnex "champ"
C2
>>> sonnex "chant"
C2
>>> sonnex "boulot"
bUlo
>>> sonnex "bouleau"
bUlo
>>> sonnex "compte"
k3t
>>> sonnex "comte"
k3t
>>> sonnex "conte"
k3t

Documentation

sonnex :: String -> String

Compute a Sonnex code for a French word.

The string must contain only one word. Each character is treated as pronounced, not silent.

length (sonnex w) <= length w

sonnexPhrase :: String -> String

Compute a Sonnex code for a French phrase.

It applies the sonnex function to each word in the phrase. Because it uses the words/unwords pair, superfluous space characters are removed.
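The words/unwords behaviour described above can be sketched as follows. This is an assumption about the general shape, not the package's source: `encodePhraseWith` is a hypothetical helper that takes the per-word encoder as a parameter so the sketch stays self-contained.

```haskell
-- Hypothetical helper: apply a per-word encoder to every word of a phrase.
-- `words` splits on runs of whitespace and drops leading and trailing
-- whitespace; `unwords` rejoins the results with single spaces.
encodePhraseWith :: (String -> String) -> String -> String
encodePhraseWith encodeWord = unwords . map encodeWord . words
```

With the package installed, `encodePhraseWith sonnex` would presumably behave like `sonnexPhrase`. Passing `id` instead shows the whitespace normalisation on its own: `encodePhraseWith id "  le   petit  chat "` yields `"le petit chat"`.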