Safe Haskell	None
Language	Haskell2010

NLP.GenI.LexicalSelection

Contents

Lexical selection algorithms
Selecting candidate lemmas
Anchoring
Post-processing
- Paraphrase selection

Description

This module performs the core of lexical selection and anchoring.

Synopsis

Lexical selection algorithms

data CustomSem sem Source #

This aims to support users who want to do lexical selection directly from an input other than GenI style flat semantics.

The requirement here is for you to provide some means of converting the custom semantics to a GenI semantics

Constructors

CustomSem

Fields

fromCustomSemInput :: sem -> Either Text SemInput
Conversion from custom semantics to GenI semantic input
customSelector :: LexicalSelector sem
Lexical selection function
customSemParser :: Text -> Either Text (TestCase sem)
customSuiteParser :: FilePath -> Text -> Either Text [TestCase sem]
List of named inputs intended to act as a substitute for test suites (FilePath argument is for reporting error messages only)
customRenderSem :: sem -> Text

type LexicalSelector sem = Macros -> Lexicon -> sem -> IO LexicalSelection Source #

See Configuration if you want to use GenI with a custom lexical selection function.

data LexicalSelection Source #

The result of the lexical selection process

Constructors

LexicalSelection
Fields lsAnchored :: [TagElem] the main result: a set of elementary trees (ie. anchored trees) lsLexEntries :: [LexEntry] if available, lexical entries that were used to produce anchored trees (useful for identifying anchoring failure) lsWarnings :: GeniWarnings HINT: use `mempty` to initialise to empty

defaultLexicalSelector :: Macros -> Lexicon -> SemInput -> IO LexicalSelection Source #

Performs standard GenI lexical selection as described in http://projects.haskell.org/GenI/manual/lexical-selection.html

This is just defaultLexicalSelection lifted into IO

defaultLexicalSelection :: Macros -> Lexicon -> SemInput -> LexicalSelection Source #

Helper for defaultLexicalSelector (Standard GenI lexical selection is actually pure)

This is just

missingLexEntries :: [TagElem] -> [LexEntry] -> [LexEntry] Source #

missingLexEntries ts lexs returns any of the lexical candidates lexs that were apparently not anchored succesfully.

TODO: it does this by (wrongly) checking for each lexical item to see if any of the anchored trees in ts have identical semantics to that lexical item. The better way to do this would be to throw a subsumption check on top of items reported missing, because it's possible for the trees to add semantics through unification.

Selecting candidate lemmas

defaultLexicalChoice :: Lexicon -> SemInput -> [LexEntry] Source #

Select and returns the set of entries from the lexicon whose semantics subsumes the input semantics.

chooseCandI :: Sem -> [LexEntry] -> [LexEntry] Source #

chooseCandI sem l attempts to unify the semantics of l with sem If this succeeds, we use return the result(s); if it fails, we reject l as a lexical selection candidate.

mergeSynonyms :: [LexEntry] -> [LexEntry] Source #

mergeSynonyms is a factorisation technique that uses atomic disjunction to merge all synonyms into a single lexical entry. Two lexical entries are considered synonyms if their semantics match and they point to the same tree families.

FIXME: 2006-10-11 - note that this is no longer being used, because it breaks the case where two lexical entries differ only by their use of path equations. Perhaps it's worthwhile just to add a check that the path equations match exactly.

Anchoring

type LexCombine a = MaybeT (Writer [LexCombineError]) a Source #

The LexCombine monad supports warnings during lexical selection and also failure via Maybe

runLexCombine :: LexCombine a -> (Maybe a, [LexCombineError]) Source #

lexTell :: LexCombineError -> LexCombine () Source #

Note an anchoring error

defaultAnchoring :: SemInput -> Macros -> [LexEntry] -> LexicalSelection Source #

defaultAnchoring schemata lex sem implements the later half of lexical selection (tree anchoring and enrichement). It assumes that lex consists just of the lexical items that have been selected, and tries to combine them with the tree schemata.

This function may be useful if you are implementing your own lexical selection functions, and you want GenI to take over after you've given it a [LexEntry]

Combination

combineList Source #

Arguments

:: Sem
-> Macros
-> LexEntry
-> ([LexCombineError], [TagElem])	any warnings, plus the results

Given a lexical item, looks up the tree families for that item, and anchor the item to the trees.

combineOne :: Sem -> LexEntry -> SchemaTree -> LexCombine [TagElem] Source #

Combine a single tree with its lexical item to form a bonafide TagElem. This process can fail, however, because of filtering or enrichement

Enrichment

enrich :: LexEntry -> SchemaTree -> LexCombine SchemaTree Source #

See http://kowey.github.io/GenI/manual/lexical-selection.html on enrichement

`enrich` helpers

data EnrichmentResult Source #

Constructors

EnrSuccess SchemaTree Subst
EnrNotFound
EnrFailed

enrichBy :: SchemaTree -> PathEqPair -> LexCombine SchemaTree Source #

Helper for enrich (enrich by single path equation)

maybeEnrichBy :: SchemaTree -> PathEqPair -> EnrichmentResult Source #

Helper for enrichBy

enrichFeat :: MonadUnify m => AvPair GeniVal -> Flist SchemaVal -> m (Flist SchemaVal, Subst) Source #

enrichFeat av fs attempts to unify av with fs

Note here that fs is an Flist [GeniVal] rather than the usual Flist GeniVal you may expect. This is because it comes from SchemaTree which allows non-atomic disjunctions of GeniVal which have to be flatten down to at most atomic disjunctions once lexical selection is complete.

missingCoanchors :: LexEntry -> SchemaTree -> [Text] Source #

missingCoanchors l t returns the list of coanchor node names from l that were not found in t

lexEquations :: LexEntry -> Writer [LexCombineError] ([AvPair GeniVal], [PathEqPair]) Source #

Split a lex entry's path equations into interface enrichement equations or (co-)anchor modifiers

seekCoanchor :: NodePathEqLhs -> SchemaTree -> Maybe (GNode SchemaVal) Source #

seekCoanchor lhs t returns Just node if t contains exactly one node that can be identified by lhs, Nothing if it contains none.

It crashes if there is more than one such node, because this should have been caught earlier by GenI.

matchNodeName :: NodePathEqLhs -> GNode SchemaVal -> Bool Source #

matchNodeName lhs n is True if the lhs refers to the node n

matchNodeNameHelper :: Text -> GNode SchemaVal -> Bool Source #

matchNodeNameHelper recognises “anchor“ by convention; otherwise, it does a name match

Lemanchor mechanism

setLemAnchors :: Tree (GNode GeniVal) -> Tree (GNode GeniVal) Source #

The lemanchor mechanism is described in http://projects.haskell.org/manual/lexical-selection

_lemanchor :: Text Source #

The name of the lemanchor attribute (by convention; see source)

setOrigin :: Text -> Tree (GNode v) -> Tree (GNode v) Source #

setOrigin n t marks the nodes in t as having come from a tree named n

Post-processing

defaultPostProcessing :: SemInput -> LexicalSelection -> LexicalSelection Source #

Standard post-processing/filtering steps that can take place after lexical selection. Right now, this only consists of paraphrase selection

Paraphrase selection

preselectParaphrases :: [LitConstr] -> [TagElem] -> [TagElem] Source #

Rule out lexical selection results that violate trace constraints

respectsConstraints :: [LitConstr] -> TagElem -> Bool Source #

True if the tree fulfills the supplied trace constraints

Lexical selection algorithms

Selecting candidate lemmas

Anchoring

Combination

Enrichment

enrich helpers

Lemanchor mechanism

Post-processing

Paraphrase selection

`enrich` helpers