Copyright  © 2016-2017 George Steel and Peter Jurgec 

License  GPL2+ 
Maintainer  george.steel@gmail.com 
Safe Haskell  None 
Language  Haskell2010 
Functions for generating sets of candidate constraints. For basic use, CandidateSettings and candidateGrammar should suffice, while the other functions provide more fine-grained control.
The classesByGenerality function enumerates the classes defined by a feature table in a sensible order, removing duplicate descriptions of the same class. The ug functions then take these classes and combine them into globs in various ways. For efficiency, classes are represented as (NaturalClass, SegSet SegRef) pairs and constraints are output as (ClassGlob, ListGlob SegRef) pairs, avoiding the need for repeated conversions and copying of classes.
 data CandidateSettings = CandidateSettings {}
 candidateGrammar :: FeatureTable sigma -> CandidateSettings -> (Int, Int, [(ClassGlob, ListGlob SegRef)])
 ngrams :: Int -> [a] -> [[a]]
 classesByGenerality :: FeatureTable sigma -> Int -> [(Int, (NaturalClass, SegSet SegRef))]
 ugSingleClasses :: [(Int, (NaturalClass, SegSet SegRef))] -> [(ClassGlob, ListGlob SegRef)]
 ugBigrams :: [(Int, (NaturalClass, SegSet SegRef))] -> [(ClassGlob, ListGlob SegRef)]
 ugEdgeClasses :: [(Int, (NaturalClass, SegSet SegRef))] -> [(ClassGlob, ListGlob SegRef)]
 ugEdgeBigrams :: [(Int, (NaturalClass, SegSet SegRef))] -> [(ClassGlob, ListGlob SegRef)]
 ugLimitedTrigrams :: [(Int, (NaturalClass, SegSet SegRef))] -> [(NaturalClass, SegSet SegRef)] -> [(ClassGlob, ListGlob SegRef)]
 ugLongDistance :: [(Int, (NaturalClass, SegSet SegRef))] -> [(NaturalClass, SegSet SegRef)] -> [(ClassGlob, ListGlob SegRef)]
 ugHayesWilson :: [(Int, (NaturalClass, SegSet SegRef))] -> [(NaturalClass, SegSet SegRef)] -> [(ClassGlob, ListGlob SegRef)]
Documentation
data CandidateSettings
Settings for grammar generation
CandidateSettings  

candidateGrammar :: FeatureTable sigma -> CandidateSettings -> (Int, Int, [(ClassGlob, ListGlob SegRef)])
Generate a reasonable set of candidate constraints based on single classes, bigrams, and the additional constraint types specified in the settings. The first and second return values are the number of classes and the number of candidates in the grammar; the third is the set of candidates.
ngrams :: Int -> [a] -> [[a]]
Given a number n and a sequence, returns all subsequences of length n.
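As an illustration of the behaviour described above, here is a minimal sketch. It assumes that "subsequences" means contiguous runs of length n, which the description does not state explicitly. The name ngramsSketch is hypothetical and is not the library's implementation.

```haskell
import Data.List (tails)

-- Hypothetical stand-in for ngrams. Assumption: the n-grams are the
-- contiguous runs of length n, in left-to-right order.
ngramsSketch :: Int -> [a] -> [[a]]
ngramsSketch n xs
  | n <= 0    = []
  | otherwise = [ take n t | t <- tails xs, hasAtLeast n t ]
  where
    -- Check only the first k elements instead of the full length of the tail.
    hasAtLeast k ys = length (take k ys) == k
```

Under that assumption, ngramsSketch 2 "abcd" yields ["ab", "bc", "cd"], and a sequence shorter than n yields the empty list.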
classesByGenerality :: FeatureTable sigma -> Int -> [(Int, (NaturalClass, SegSet SegRef))]
Enumerate all classes (and their inverses) up to a certain number of features, in descending order of the number of segments the uninverted class contains. Discards duplicates (classes having the same set of segments).
Each class is returned as a triple with the (negated for sorting) number of segments in the class, the class label, and the set of segments it contains.
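The ordering and deduplication described above can be sketched with plain lists. This is a simplified illustration, not the library's code: the Label and Segs types are stand-ins for NaturalClass and SegSet SegRef, the function name orderClasses is hypothetical, and class inversion and the feature-count limit are ignored.

```haskell
import Data.List (nubBy, sortOn)

-- Stand-in types (assumptions for illustration only).
type Label = String
type Segs  = String  -- assumed sorted and duplicate-free

-- Drop classes whose segment set was already seen (keeping the first label),
-- then sort by negated size so that the largest classes come first.
orderClasses :: [(Label, Segs)] -> [(Int, (Label, Segs))]
orderClasses cs =
  sortOn fst
    [ (negate (length segs), (lbl, segs))
    | (lbl, segs) <- nubBy sameSegs cs ]
  where
    sameSegs (_, a) (_, b) = a == b
```

Storing the negated size means an ordinary ascending sort already produces the most general classes first, which is presumably why the count is negated in the result.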
ugSingleClasses :: [(Int, (NaturalClass, SegSet SegRef))] -> [(ClassGlob, ListGlob SegRef)]
Given a set of classes, return a set of globs matching those classes.
ugBigrams :: [(Int, (NaturalClass, SegSet SegRef))] -> [(ClassGlob, ListGlob SegRef)]
Given a set of classes, return a set of globs matching class pairs, ordered by total weight. At most one class may be inverted.
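A rough sketch of ordering pairs by total weight follows. It rests on an assumption the text does not confirm: that a pair's weight is the sum of the Int keys attached to its two classes (the negated segment counts). The name bigramsByWeight is hypothetical, and the restriction on inverted classes is not modelled.

```haskell
import Data.List (sortOn)

-- Pair every class with every class and sort by the summed keys.
-- Assumption: "total weight" is the sum of the two Int keys.
bigramsByWeight :: [(Int, c)] -> [(Int, (c, c))]
bigramsByWeight cs =
  sortOn fst
    [ (w1 + w2, (c1, c2))
    | (w1, c1) <- cs
    , (w2, c2) <- cs ]
```

Because the keys are negated sizes, an ascending sort places pairs of large, general classes before pairs of small, specific ones.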
ugEdgeClasses :: [(Int, (NaturalClass, SegSet SegRef))] -> [(ClassGlob, ListGlob SegRef)]
Given a set of classes, return a set of globs matching those classes at word boundaries. At most one class may be inverted.
ugEdgeBigrams :: [(Int, (NaturalClass, SegSet SegRef))] -> [(ClassGlob, ListGlob SegRef)]
Given a set of classes, return a set of globs matching class pairs at word boundaries, ordered by total weight. At most one class may be inverted.
ugLimitedTrigrams :: [(Int, (NaturalClass, SegSet SegRef))] -> [(NaturalClass, SegSet SegRef)] -> [(ClassGlob, ListGlob SegRef)]
Given a set of classes and a smaller subset, return a set of globs matching trigrams of classes from the set where at least one class is contained in the subset. At most one class may be inverted.
ugLongDistance :: [(Int, (NaturalClass, SegSet SegRef))] -> [(NaturalClass, SegSet SegRef)] -> [(ClassGlob, ListGlob SegRef)]
Given two sets of classes, return globs matching a pair of classes in the first set separated by any number of occurrences of a class in the second set. At most one class may be inverted. This can lead to fairly large grammar DFAs when multiple such constraints are merged.
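To make the shape of such a constraint concrete, the sketch below tests whether a word contains a segment from class a, then a run of segments from class c, then a segment from class b. It assumes that "any number" includes zero, represents classes as plain lists of segments, and ignores inversion. The name matchesLongDistance is hypothetical and is not part of the library.

```haskell
import Data.List (tails)

-- True when some position in the word starts a match of the pattern
-- a c* b, with each class given as a list of its member segments.
matchesLongDistance :: Eq s => [s] -> [s] -> [s] -> [s] -> Bool
matchesLongDistance a c b word = any startsHere (tails word)
  where
    -- A match must begin with a segment from the first class.
    startsHere (x : rest) | x `elem` a = go rest
    startsHere _ = False
    -- Skip intervening segments from c until a segment from b is found.
    go (y : ys)
      | y `elem` b = True
      | y `elem` c = go ys
    go _ = False
```

For example, with a = "s", c = "aei" and b = "S", the words "saaS" and "sS" match, while "stS" does not because "t" is outside the intervening class.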
ugHayesWilson :: [(Int, (NaturalClass, SegSet SegRef))] -> [(NaturalClass, SegSet SegRef)] -> [(ClassGlob, ListGlob SegRef)]
Combine the above functions (not including ugLongDistance) into the original candidate generator from the Hayes and Wilson paper.