The delta-h package

Tags: bsd3, library, natural-language-processing, program

Implementation of the model described in Grzegorz Chrupała and Afra Alishahi, Online Entropy-based Model of Lexical Category Acquisition, CoNLL 2010



Versions 0.0.1, 0.0.2, 0.0.3
Dependencies base (>=3 && <5), binary, bytestring, containers, monad-atom, nlp-scores, text [details]
License BSD3
Author Grzegorz Chrupala and Afra Alishahi
Category Natural Language Processing
Uploaded Sun Nov 27 07:40:01 UTC 2011 by GrzegorzChrupala
Distributions NixOS:0.0.3
Executables delta-h
Downloads 1231 total (28 in the last 30 days)
Status Docs not available [build log]
All reported builds failed as of 2016-12-26 [all 7 reports]



Readme for delta-h-0.0.1


Online entropy-based model of lexical category acquisition.
Grzegorz Chrupala and Afra Alishahi


Install the Haskell Platform.

On Linux, the following command will install the delta-h executable in the
bin directory under the current directory:

cabal install --prefix=`pwd`


The data directory has an example input file, data/goat.txt.
The other files are CHILDES data.

To induce a model (i.e. a set of clusters), execute the following:

> ./bin/delta-h learn '[-12,0,12]' data/goat.txt

The argument '[-12,0,12]' specifies the features to be used (in this case
preceding bigram, focus word, and following bigram). Feature ids can be
inspected in the source file src/Entropy/Features.hs

The model will be stored in data/goat.txt.[-12,0,12].learn.model
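
As an illustration of the naming pattern above, here is a minimal Haskell
sketch. It assumes that the feature argument uses Haskell list-of-Int syntax
and that the model path is the input path followed by the feature list and the
suffix ".learn.model". The names parseFeatureSpec and modelPath are
hypothetical and are not taken from the delta-h sources.

```haskell
import Text.Read (readMaybe)

-- | Parse a feature specification such as "[-12,0,12]".
-- Assumption: the argument is written as a Haskell list of Ints.
-- Returns Nothing if the string does not parse.
parseFeatureSpec :: String -> Maybe [Int]
parseFeatureSpec = readMaybe

-- | Hypothetical helper reproducing the file name pattern seen in this
-- README: <input>.<feature list>.learn.model
modelPath :: FilePath -> [Int] -> FilePath
modelPath input feats = input ++ "." ++ show feats ++ ".learn.model"
```

Under these assumptions, modelPath "data/goat.txt" [-12,0,12] yields
data/goat.txt.[-12,0,12].learn.model, the path used in the commands below.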

You can display the model in a human-readable format with:

> ./bin/delta-h display data/goat.txt.[-12,0,12].learn.model

The learned model can also be used to label input data, without
further learning:

> ./bin/delta-h label True True data/goat.txt.[-12,0,12].learn.model < \

The first argument specifies whether to use focus word for labeling,
the second argument whether to avoid outputting new cluster ids (not
in the model).

There is also a command which tests the learned model on the word
prediction task:

> ./bin/delta-h eval-mrr True True  data/goat.txt.[-12,0,12].learn.model < \

The first argument specifies whether to marginalize over all cluster
assignments, the second whether to output detailed information.
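
The command name eval-mrr suggests that word prediction is scored with mean
reciprocal rank (MRR); this README does not say so explicitly, so treat it as
an assumption. The sketch below shows how MRR is usually computed from ranked
guesses. The function names are hypothetical, and this is not the code in
src/Main.hs.

```haskell
import Data.List (elemIndex)

-- | Reciprocal rank of the correct word in a ranked list of guesses
-- (best guess first). Gives 0 if the word is not among the guesses.
reciprocalRank :: Eq a => a -> [a] -> Double
reciprocalRank target guesses =
  case elemIndex target guesses of
    Just i  -> 1 / fromIntegral (i + 1)
    Nothing -> 0

-- | Mean reciprocal rank over (correct word, ranked guesses) pairs.
-- An empty evaluation set is scored as 0 here by convention.
meanReciprocalRank :: Eq a => [(a, [a])] -> Double
meanReciprocalRank [] = 0
meanReciprocalRank xs =
  sum [reciprocalRank t gs | (t, gs) <- xs] / fromIntegral (length xs)
```

For example, if the correct word is ranked first for one token and second for
another, the reciprocal ranks are 1 and 1/2, and their mean is 0.75.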


There are some other (currently undocumented) commands; see src/Main.hs.

The main part of the model is implemented in src/Entropy/Algorithm.hs.