Copyright | See LICENSE file |
---|---|
License | BSD3 |
Maintainer | Brad Neimann |
Safe Haskell | Safe-Inferred |
Language | Haskell2010 |
Brassica.SFM.MDF
Description
This module contains types and functions for working with the MDF dictionary format. For more on the MDF format, see for example Coward & Grimes (2000).
Synopsis
- data MDFLanguage
- = English
- | National
- | Regional
- | Vernacular
- | Other
- fieldLangs :: Map String MDFLanguage
- mdfHierarchy :: Hierarchy
- mdfAlternateHierarchy :: Hierarchy
- tokeniseMDF :: [String] -> SFM -> Either (ParseErrorBundle String Void) [Component PWord]
- tokeniseField :: [String] -> Field -> Either (ParseErrorBundle String Void) [Component PWord]
- duplicateEtymologies :: (String -> String) -> SFMTree -> SFMTree
Documentation
data MDFLanguage
The designated language of an MDF field.
Constructors
- English
- National
- Regional
- Vernacular
- Other
Instances
- Show MDFLanguage (defined in Brassica.SFM.MDF)
  Methods:
  showsPrec :: Int -> MDFLanguage -> ShowS
  show :: MDFLanguage -> String
  showList :: [MDFLanguage] -> ShowS
- Eq MDFLanguage (defined in Brassica.SFM.MDF)
fieldLangs :: Map String MDFLanguage
A Map from the most common field markers to the language of their values.
(Note: this is currently hardcoded in the source code, based on the values in the MDF definitions from SIL Toolbox. The exception is et, which is assigned as Other rather than Vernacular. There is probably a more principled way of defining this, but hardcoding should suffice for now.)
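As a rough illustration of how such a table might be queried, the sketch below uses a small local stand-in rather than the real fieldLangs. The type is redeclared so the snippet compiles on its own, and sampleLangs, langOf and isVernacular are hypothetical names. Only the et entry is taken from the note above; the other entries, and the assumption that keys are markers without the leading backslash, are guesses for the sake of the example.

```haskell
import qualified Data.Map as Map

-- Local copy of the constructors listed above, so the sketch is self-contained.
data MDFLanguage = English | National | Regional | Vernacular | Other
    deriving (Show, Eq)

-- Hypothetical stand-in for fieldLangs; the real table covers many more
-- markers. Keys are assumed to be markers without the leading backslash.
sampleLangs :: Map.Map String MDFLanguage
sampleLangs = Map.fromList
    [ ("lx", Vernacular)  -- assumed entry
    , ("se", Vernacular)  -- assumed entry
    , ("ge", English)     -- assumed entry
    , ("et", Other)       -- stated in the note above
    ]

-- Look up the language of a marker; unknown markers give Nothing.
langOf :: String -> Maybe MDFLanguage
langOf marker = Map.lookup marker sampleLangs

-- True only for markers the table classifies as Vernacular.
isVernacular :: String -> Bool
isVernacular marker = langOf marker == Just Vernacular
```

With the real module, the same lookup would presumably be written as Map.lookup marker fieldLangs, using the Eq instance listed above for the comparison.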
mdfHierarchy :: Hierarchy
Standard MDF hierarchy, with lx > se > ps > sn.
Intended for use with toTree.
mdfAlternateHierarchy :: Hierarchy
Alternate MDF hierarchy, with lx > sn > se > ps.
Intended for use with toTree.
tokeniseMDF
Arguments
:: [String] | List of available multigraphs (as with |
-> SFM | |
-> Either (ParseErrorBundle String Void) [Component PWord] |
Convert an SFM document to a list of Components representing the same textual content. Vernacular field values are tokenised as if using tokeniseWords; everything else is treated as a Separator, so that it is not disturbed by operations such as rule application or rendering to text.
(This is a simple wrapper around tokeniseField.)
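The idea of protecting non-vernacular text by wrapping it in separators can be sketched with a toy model. Nothing below is Brassica's API: Piece is a simplified stand-in for Component, words are split on whitespace only (multigraphs are ignored), and tokeniseValue, tokeniseFieldSketch, mapWords and render are hypothetical helper names.

```haskell
import Data.Char (isSpace)

-- Simplified stand-in for Component: a word that rules may change,
-- or a separator that must pass through untouched.
data Piece = Word String | Separator String
    deriving (Show, Eq)

-- Split a vernacular value into words, keeping whitespace runs as separators.
tokeniseValue :: String -> [Piece]
tokeniseValue "" = []
tokeniseValue s@(c:_)
    | isSpace c = let (ws, rest) = span isSpace s
                  in Separator ws : tokeniseValue rest
    | otherwise = let (w, rest) = break isSpace s
                  in Word w : tokeniseValue rest

-- Tokenise one field given as (marker, value). The marker and the line
-- break are always separators; the value is split into words only when
-- the field is vernacular.
tokeniseFieldSketch :: Bool -> (String, String) -> [Piece]
tokeniseFieldSketch vernacular (marker, value) =
    Separator ("\\" ++ marker ++ " ")
        : (if vernacular then tokeniseValue value else [Separator value])
        ++ [Separator "\n"]

-- Apply a transformation to words only, as rule application would.
mapWords :: (String -> String) -> [Piece] -> [Piece]
mapWords f = map go
  where
    go (Word w) = Word (f w)
    go sep      = sep

-- Render the pieces back to text.
render :: [Piece] -> String
render = concatMap go
  where
    go (Word w)      = w
    go (Separator s) = s
```

In this model, rendering the untransformed pieces reproduces the input text, and a transformation passed to mapWords only ever touches the vernacular words.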
tokeniseField :: [String] -> Field -> Either (ParseErrorBundle String Void) [Component PWord]
Like tokeniseMDF, but for a single Field rather than a whole SFM file.
duplicateEtymologies
Arguments
:: (String -> String) | Transformation to apply to etymologies, e.g. |
-> SFMTree | |
-> SFMTree |
Add etymological fields to an MDF file by duplicating the values in lx, se and ge fields. For example:
\lx kapa
\ps n
\ge parent
\se sakapa
\ge father
would become:
\lx kapa
\ps n
\ge parent
\et kapa
\eg parent
\se sakapa
\ge father
\et sakapa
\eg father
This can be helpful when applying sound changes to an MDF file: the vernacular words can be copied as etymologies, and then the sound changes can be applied leaving the etymologies as is.
Note that the hierarchy must already be resolved before this function can be used, as it depends on the tree structure to know where the etymologies should be placed.
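To see why the tree structure matters, here is a minimal sketch of the duplication on a toy tree. It is not Brassica's implementation: Node is a simplified stand-in for SFMTree, and duplicateSketch and flatten are hypothetical names. The sketch assumes that each ge field takes its etymology from the nearest enclosing lx or se field, which is consistent with the example above.

```haskell
-- Simplified stand-in for a resolved SFM tree: a marker, its value,
-- and the fields nested beneath it.
data Node = Node String String [Node]
    deriving (Show, Eq)

-- Walk the tree, remembering the value of the nearest enclosing lx or se
-- field. After each ge field, add an et field (the transformed headword)
-- and an eg field (a copy of the gloss).
duplicateSketch :: (String -> String) -> [Node] -> [Node]
duplicateSketch f = go Nothing
  where
    go headword = concatMap (step headword)

    step headword (Node marker value children)
        | marker `elem` ["lx", "se"] =
            [Node marker value (go (Just value) children)]
        | marker == "ge", Just w <- headword =
            [ Node marker value (go headword children)
            , Node "et" (f w) []
            , Node "eg" value []
            ]
        | otherwise =
            [Node marker value (go headword children)]

-- Flatten the tree back to a list of (marker, value) pairs in document order.
flatten :: [Node] -> [(String, String)]
flatten = concatMap (\(Node m v cs) -> (m, v) : flatten cs)
```

Applied with id to a tree in which sakapa is nested under kapa, this sketch yields the same field order as the example above. Any other String -> String function changes only the et values; prepending an asterisk with ('*' :) is one illustrative choice.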