GenI-0.22: A natural language generator (specifically, an FB-LTAG surface realiser)

Safe Haskell: Safe-Inferred



The heavy lifting of GenI, the whole chart/agenda mechanism, can be implemented in many ways. To make it easier to write different algorithms for GenI and compare them, we provide a single interface for what we call Builders.

This interface is then used by the GenI module and by the graphical interface. Note that each builder has its own graphical interface, and that the graphical interface code uses a similar abstraction to make these GUIs interchangeable.



data Builder st it pa




init :: Input -> pa -> (st, Statistics)

initialise the machine from the semantics and lexical selection

step :: BuilderState st ()

run a realisation step

stepAll :: BuilderState st ()

run all realisation steps until completion

finished :: st -> GenStatus

determine if realisation is finished

unpack :: st -> [Output]

unpack chart results into a list of sentences

partial :: st -> [Output]

lexicalSelection :: TagDerivation -> [Text]

The names of lexically selected chart items used in a derivation

data FilterStatus a


NotFiltered a 

(>-->) :: Monad s => DispatchFilter s a -> DispatchFilter s a -> DispatchFilter s a

Sequence two dispatch filters.

defineSemanticBits :: Sem -> SemBitMap

Assign a bit vector value to each literal in the semantics. The resulting map can then be used to construct a bit vector representation of the semantics.
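The idea can be illustrated with a minimal sketch. It is not the library's implementation: `Literal`, `BitVector` and `SemBitMap` below are local stand-ins (literals as strings, bit vectors as `Integer`), and `defineBits` and `semToBits` are hypothetical names chosen for this example.

```haskell
import Data.Bits (bit, (.|.))
import qualified Data.Map as Map

-- Local stand-ins for this sketch only; the library's own types differ.
type Literal   = String
type BitVector = Integer
type SemBitMap = Map.Map Literal BitVector

-- Give the n-th literal of the semantics the bit vector with only bit n set.
defineBits :: [Literal] -> SemBitMap
defineBits sem = Map.fromList (zip sem (map bit [0 ..]))

-- Bit vector for a (sub)semantics: the bitwise OR of its literals' vectors.
-- Literals missing from the map contribute nothing.
semToBits :: SemBitMap -> [Literal] -> BitVector
semToBits bmap = foldr (.|.) 0 . map (\l -> Map.findWithDefault 0 l bmap)
```

With the input semantics `["love(e,j,m)", "name(j,john)", "name(m,mary)"]`, the three literals receive 1, 2 and 4, so a tree covering the first and third literals is represented by 5.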

type DispatchFilter s a = a -> s (FilterStatus a)

Dispatching consists of assigning a chart item to the right part of the chart (agenda, trash, results list, etc). This is implemented as a series of filters which can either fail or succeed. If a filter fails, it may modify the item before passing it on to future filters.
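As a rough illustration of chaining filters, here is a self-contained sketch. The types are redefined locally, the `Filtered` constructor is an assumption (only `NotFiltered a` is listed above), and `thenFilter` is a hypothetical stand-in for `(>-->)`, not the library's definition.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

-- Local stand-in; the Filtered constructor is assumed for this sketch.
data FilterStatus a = Filtered | NotFiltered a
  deriving (Eq, Show)

type DispatchFilter s a = a -> s (FilterStatus a)

-- Hypothetical sequencing: run the first filter and hand the item to the
-- second only if the first did not claim it.
thenFilter :: Monad s
           => DispatchFilter s a -> DispatchFilter s a -> DispatchFilter s a
thenFilter f g x = do
  status <- f x
  case status of
    Filtered       -> return Filtered  -- item already dispatched; stop
    NotFiltered x' -> g x'             -- pass the possibly modified item on

-- Toy filter: claims negative numbers, lets everything else through.
dropNegative :: DispatchFilter Identity Int
dropNegative x
  | x < 0     = return Filtered
  | otherwise = return (NotFiltered x)

-- Toy filter: never claims the item, but modifies it.
double :: DispatchFilter Identity Int
double x = return (NotFiltered (x * 2))

pipeline :: DispatchFilter Identity Int
pipeline = dropNegative `thenFilter` double

-- Convenience wrapper for trying the pipeline out in GHCi.
runPipeline :: Int -> FilterStatus Int
runPipeline = runIdentity . pipeline
```

Here `runPipeline 3` gives `NotFiltered 6`, while `runPipeline (-1)` gives `Filtered` because the first filter claims the item and the second never sees it.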

condFilter :: Monad s => (a -> Bool) -> DispatchFilter s a -> DispatchFilter s a -> DispatchFilter s a

If the item meets some condition, use the first filter, otherwise use the second one.
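A conditional filter of this shape could be written as below. This is a sketch under assumptions rather than the library's code: the types are local copies, the `Filtered` constructor is assumed, and `condFilterSketch`, `halve`, `discard` and `evensOnly` are hypothetical names.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

-- Local stand-ins; the Filtered constructor is assumed for this sketch.
data FilterStatus a = Filtered | NotFiltered a
  deriving (Eq, Show)

type DispatchFilter s a = a -> s (FilterStatus a)

-- Choose between two filters by inspecting the item itself.
condFilterSketch :: Monad s
                 => (a -> Bool)
                 -> DispatchFilter s a -> DispatchFilter s a
                 -> DispatchFilter s a
condFilterSketch cond f g x = if cond x then f x else g x

-- Toy filter: modifies the item and passes it on.
halve :: DispatchFilter Identity Int
halve x = return (NotFiltered (x `div` 2))

-- Toy filter: claims every item it sees.
discard :: DispatchFilter Identity Int
discard _ = return Filtered

-- Even items are halved and passed on; odd items are claimed.
evensOnly :: Int -> FilterStatus Int
evensOnly = runIdentity . condFilterSketch even halve discard
```

In this toy setup `evensOnly 10` yields `NotFiltered 5` and `evensOnly 7` yields `Filtered`.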

defaultStepAll :: Builder st it pa -> BuilderState st ()

Default implementation for the stepAll function in Builder
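Conceptually, a default of this kind amounts to repeating `step` until `finished` reports completion. The sketch below shows that loop in a deliberately simplified, pure setting. It assumes a hypothetical `MiniBuilder` record and `Status` type in place of `Builder`, `BuilderState` and `GenStatus`, whose definitions are not reproduced here.

```haskell
-- Hypothetical status type; the constructors of GenStatus are not shown
-- in this documentation, so these are assumptions.
data Status = Active | Finished
  deriving (Eq, Show)

-- Hypothetical, simplified builder: no state monad and no statistics.
data MiniBuilder st = MiniBuilder
  { mbStep     :: st -> st      -- one realisation step
  , mbFinished :: st -> Status  -- has the machine finished?
  }

-- Keep stepping until the builder reports that it has finished.
stepAllSketch :: MiniBuilder st -> st -> st
stepAllSketch b = until ((== Finished) . mbFinished b) (mbStep b)

-- Toy builder whose state is a counter that runs down to zero.
countdown :: MiniBuilder Int
countdown = MiniBuilder
  { mbStep     = subtract 1
  , mbFinished = \n -> if n <= 0 then Finished else Active
  }
```

For example, `stepAllSketch countdown 5` ends in state `0`, and a state that is already finished is returned unchanged.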

data Input

To simplify interaction with the backend, we provide a single data structure which represents all the inputs a backend could take.




inSemInput :: SemInput
inLex :: [LexEntry]

for the debugger

inCands :: [(TagElem, BitVector)]

tag tree


unlessEmptySem :: Input -> Params -> a -> a

Equivalent to id unless the input contains an empty or uninstantiated semantics

type SentenceAut = NFA Int LemmaPlus

A SentenceAut represents a set of sentences in the form of an automaton. The labels of the automaton are the words of the sentence. Note, however, that a “word” in the sentence is in fact a tuple (lemma, inflectional feature structures). Normally, the states are integers, the only requirement being that each one is unique.
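To make the representation concrete, the sketch below encodes a small set of sentences as an automaton and reads them back. It does not use the library's NFA type: `MiniAut`, `Label` and `sentences` are hypothetical names, features are simplified to attribute-value string pairs, and the enumeration assumes the automaton has no cycles.

```haskell
-- Hypothetical label type: (lemma, inflectional features).
type Label = (String, [(String, String)])

-- Hypothetical, simplified automaton with integer states.
data MiniAut = MiniAut
  { start  :: Int
  , finals :: [Int]
  , edges  :: [(Int, Label, Int)]  -- (from, label, to)
  }

-- Enumerate the lemma sequences of every path from the start state to a
-- final state. Assumes the automaton is acyclic.
sentences :: MiniAut -> [[String]]
sentences aut = go (start aut)
  where
    go s =
      [ [] | s `elem` finals aut ] ++
      [ lemma : rest
      | (from, (lemma, _), to) <- edges aut
      , from == s
      , rest <- go to
      ]

-- Two sentences sharing a first and last word: "the cat sleep" and
-- "the dog sleep", with inflection carried as features on the labels.
example :: MiniAut
example = MiniAut
  { start  = 0
  , finals = [3]
  , edges  =
      [ (0, ("the",   []),                0 + 1)
      , (1, ("cat",   [("num", "sg")]),   2)
      , (1, ("dog",   [("num", "sg")]),   2)
      , (2, ("sleep", [("tense", "pres")]), 3)
      ]
  }
```

Sharing states between alternatives is what keeps the representation compact: the two sentences above need four transitions rather than six words.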

run :: Builder st it Params -> Input -> Params -> (st, Statistics)

Performs surface realisation from an input semantics and a lexical selection.

Statistics tracked

  • pol_used_bundles - number of bundled paths through the polarity automaton (see automatonPathSets)
  • pol_used_paths - number of paths through the final automaton
  • pol_seed_paths - number of paths through the seed automaton (i.e. with no polarities). This is normally just 1, unless you have multi-literal semantics
  • pol_total_states - combined number of states in all the polarity automata
  • pol_total_tras - combined number of transitions in all polarity automata
  • pol_max_states - number of states in the polarity automaton with the most states
  • pol_max_tras - number of transitions in the polarity automaton with the most transitions
  • sem_literals - number of literals in the input semantics
  • lex_trees - total number of lexically selected trees