maxent-learner-hw-0.2.1: Hayes and Wilson's maxent learning algorithm for phonotactic grammars.

Copyright: © 2016-2017 George Steel and Peter Jurgec
Safe Haskell: None




Data structures and functions for working with phonological features and natural classes.

Feature tables are designed to work with strings represented as lists of SegRef indices into their internal segment lists, enabling processing with fast array lookups even with non-contiguous sets of segments.


Phonological Features

data FeatureTable sigma

Type for a phonological feature table. Segments and features are referred to by index, so the structure includes lookup tables for both.


Eq sigma => Eq (FeatureTable sigma)


(==) :: FeatureTable sigma -> FeatureTable sigma -> Bool

(/=) :: FeatureTable sigma -> FeatureTable sigma -> Bool

Show sigma => Show (FeatureTable sigma)


showsPrec :: Int -> FeatureTable sigma -> ShowS

show :: FeatureTable sigma -> String

showList :: [FeatureTable sigma] -> ShowS

(Ord a, NFData a) => NFData (FeatureTable a)


rnf :: FeatureTable a -> ()

srBounds :: FeatureTable sigma -> (SegRef, SegRef)

Bounds for segment references

ftlook :: FeatureTable sigma -> SegRef -> Int -> FeatureState

Shortcut for feature table array access

segsToRefs :: Ord sigma => FeatureTable sigma -> [sigma] -> [SegRef]

Convert a string of raw segments to a string of SegRefs. Skips unrecognisable segments.

refsToSegs :: FeatureTable sigma -> [SegRef] -> [sigma]

Convert a string of SegRefs back to segments
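
The two conversions above can be illustrated with a minimal, self-contained sketch. It does not use the library: Ref, inventory, toRefs and toSegs are hypothetical stand-ins for SegRef, the table's internal segment list, segsToRefs and refsToSegs, and plain lists are used where the real table presumably uses arrays and lookup tables.

```haskell
import Data.List (elemIndex)
import Data.Maybe (mapMaybe)

-- Hypothetical stand-in for SegRef: a plain index into the segment list.
type Ref = Int

-- Hypothetical segment inventory; in the library this lives inside the FeatureTable.
inventory :: [String]
inventory = ["a", "n", "t"]

-- Map raw segments to indices, silently dropping segments not in the inventory.
toRefs :: [String] -> [Ref]
toRefs = mapMaybe (`elemIndex` inventory)

-- Map indices back to segments (assumes every index is in range).
toSegs :: [Ref] -> [String]
toSegs = map (inventory !!)
```

Under these assumptions, toRefs ["t", "a", "x", "n"] yields [2, 0, 1] because the unknown segment "x" is skipped, and toSegs [2, 0, 1] yields ["t", "a", "n"]. The round trip is therefore lossy whenever the input contains unrecognised segments.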

csvToFeatureTable :: Ord sigma => (String -> sigma) -> String -> Maybe (FeatureTable sigma)

Parse feature table from CSV.

To use a feature table other than the default IPA one, you may define it in CSV format (RFC 4180). The first row defines the segment names, which may be any strings as long as they are all distinct. The first column defines the feature names, which are not hard-coded. Data cells should contain +, -, or 0 for binary features, and + or 0 for privative features (features for which we do not want a minus set that could form classes).

As a simple example, consider the following CSV file, defining three segments (a, n, and t), and two features (vowel and nasal).


If a row contains a different number of cells (separated by commas) than the header line, it is rejected as invalid and does not define a feature (and will not be displayed in the formatted feature table). If the entered CSV has duplicate segment names, no segments, or no valid features, the entire table is rejected (indicated by a red border around the text area; green is normal) and the last valid table is used and displayed.
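
The validation rules above can be sketched as a small standalone parser. This is an illustration under stated assumptions, not the library's implementation: FState, Table, parseTable and the helper functions are hypothetical names, RFC 4180 quoting is not handled, and rejecting a row that contains a cell other than +, - or 0 is an assumption that the text above does not confirm. The cell values in example are likewise made up for the three segments and two features mentioned earlier.

```haskell
import Data.List (nub)
import Data.Maybe (mapMaybe)

-- Feature values as described above: +, - or 0.
data FState = Plus | Minus | Off deriving (Eq, Show)

-- Hypothetical simplified table: segment names plus (feature name, values) rows.
data Table = Table { segNames :: [String], featRows :: [(String, [FState])] }
  deriving (Eq, Show)

-- Split a line on commas; quoting is not handled in this sketch.
splitCommas :: String -> [String]
splitCommas s = case break (== ',') s of
  (cell, [])       -> [cell]
  (cell, _ : rest) -> cell : splitCommas rest

parseCell :: String -> Maybe FState
parseCell "+" = Just Plus
parseCell "-" = Just Minus
parseCell "0" = Just Off
parseCell _   = Nothing  -- assumption: any other cell invalidates its row

-- Parse one feature row; Nothing if its width differs from the header's.
parseRow :: Int -> String -> Maybe (String, [FState])
parseRow width line = case splitCommas line of
  (name : cells) | length cells == width -> (,) name <$> traverse parseCell cells
  _ -> Nothing

-- Reject the whole table on duplicate segments, no segments, or no valid features.
parseTable :: String -> Maybe Table
parseTable csv = case lines csv of
  [] -> Nothing
  (header : rows) ->
    let segs  = drop 1 (splitCommas header)  -- skip the empty corner cell
        feats = mapMaybe (parseRow (length segs)) rows
    in if null segs || segs /= nub segs || null feats
         then Nothing
         else Just (Table segs feats)

-- Hypothetical input with segments a, n, t and features vowel, nasal.
example :: String
example = unlines [",a,n,t", "vowel,+,-,-", "nasal,0,+,-"]
```

With these definitions, parseTable example returns a table with three segments and two feature rows, a row with too few cells is dropped while the rest of the table survives, and a header such as ",a,a" makes the whole parse return Nothing.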

featureTableToCsv :: (sigma -> String) -> FeatureTable sigma -> String

Save a modified feature table to CSV format.

Natural Classes

classToSeglist :: FeatureTable sigma -> NaturalClass -> SegSet SegRef

Convert a class to a SegSet
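
As a rough illustration of what converting a class to a set of segments involves, the sketch below treats a natural class as a conjunction of required feature values and collects the segments that satisfy all of them. The representation is an assumption: Class, table and classSegs are hypothetical, and the library's NaturalClass and SegSet types are not shown in this section and may differ.

```haskell
data FState = Plus | Minus | Off deriving (Eq, Show)

-- Hypothetical feature table: each segment paired with its feature values.
table :: [(String, [(String, FState)])]
table =
  [ ("a", [("vowel", Plus),  ("nasal", Off)])
  , ("n", [("vowel", Minus), ("nasal", Plus)])
  , ("t", [("vowel", Minus), ("nasal", Minus)])
  ]

-- Hypothetical class representation: a list of required (feature, value) pairs.
type Class = [(String, FState)]

-- Segments whose feature values satisfy every requirement of the class.
classSegs :: Class -> [String]
classSegs cls =
  [ seg
  | (seg, feats) <- table
  , all (\(f, v) -> lookup f feats == Just v) cls
  ]
```

Here classSegs [("vowel", Minus)] selects "n" and "t", classSegs [("nasal", Plus)] selects only "n", and the empty class matches every segment.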

cgMatchCounter :: FeatureTable sigma -> ClassGlob -> ShortDFST SegRef

Create a DFST which counts the matches of the glob.